Abstract
Heterogeneous information spaces are typically created by merging data from a variety of applications and information sources. These sources often use different identifiers for data that describe the same real-world entity (for example, an artist, a conference, or an organization). In this paper we propose a new probabilistic Entity Linkage algorithm for identifying and linking data that refer to the same real-world entity.
Our approach focuses on managing entity linkage information in heterogeneous information spaces using probabilistic methods. We use a Bayesian network to model the evidence that supports the possible object matches, along with the interdependencies between them. This enables us to flexibly update the network when new information becomes available, and to cope with the different requirements imposed by applications built on top of information spaces.
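As a rough illustration of how probabilistic evidence for a match can be updated when new information arrives, the following minimal sketch combines pieces of evidence with Bayes' rule. It is not the algorithm from the paper: it assumes the pieces of evidence are conditionally independent given the match variable (a naive Bayes simplification, whereas the paper models interdependencies in a Bayesian network), and the function name, the prior, and all likelihood values are hypothetical.

```python
def match_posterior(prior, evidence):
    """Posterior probability that two records refer to the same entity.

    prior:    assumed prior probability of a match (hypothetical value).
    evidence: list of (p_given_match, p_given_nonmatch) likelihood pairs,
              assumed conditionally independent given the match variable.
    """
    p_match = prior
    p_nonmatch = 1.0 - prior
    for p_given_match, p_given_nonmatch in evidence:
        # Multiply in the likelihood of each observed piece of evidence.
        p_match *= p_given_match
        p_nonmatch *= p_given_nonmatch
    # Normalize over the two hypotheses (match / non-match).
    return p_match / (p_match + p_nonmatch)


# Hypothetical evidence: (P(evidence | match), P(evidence | non-match)).
name_similar = (0.9, 0.1)
same_affiliation = (0.7, 0.2)

# Belief after one piece of evidence, then after a second one arrives.
p_after_name = match_posterior(0.05, [name_similar])
p_after_both = match_posterior(0.05, [name_similar, same_affiliation])

# An application-specific threshold (hypothetical) decides whether to link.
should_link = p_after_both >= 0.6
```

Different applications could apply different thresholds to the same posterior, which is one simple way to read the abstract's remark about differing application requirements.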
Original language | English |
---|---|
Title of host publication | Advanced Information Systems Engineering |
Subtitle of host publication | 20th International Conference |
Publisher | Springer |
Pages | 556-570 |
Number of pages | 15 |
DOIs | |
Publication status | Published - 2008 |
Externally published | Yes |