Comparing Geospatial Entity Classes: An Asymmetric and Context-Dependent Similarity Measure
Andrea Rodríguez and Max Egenhofer
International Journal of Geographical Information Science, Vol. 18 (3): 229-256, 2004.
Abstract
Semantic similarity plays an important role in geographic information systems as it supports the identification of objects that are conceptually close, but not identical. Similarity assessments are particularly important for retrieval of geospatial data in such settings as digital libraries or the World Wide Web. Although some computational models for semantic similarity assessment exist, these models are typically limited by their inability to handle such important cognitive properties of similarity judgments as their inherent asymmetry and their dependence on context. This paper defines a new computational model -- the Matching-Distance (MD) model -- for semantic similarity among spatial entity classes that takes into account the distinguishing features of these classes (parts, functions, and attributes) and their semantic interrelations (is-a and part-whole relations). A matching process is combined with a semantic-distance calculation to obtain asymmetric values of similarity that depend on the degree of generalization of entity classes. The matching process used by the MD model is also driven by contextual considerations, where the context determines the relative importance of distinguishing features. Based on a human-subject experiment, the results of the MD model correlate well with people's judgments of similarity. When contextual information is used for determining the importance of distinguishing features, this correlation increases; however, the major component of the correlation between the MD model and people's judgments is due to a detailed definition of entity classes.
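To make the matching idea concrete, the following is a minimal sketch only, not the formulation given in the paper. It assumes a Tversky-style ratio for matching feature sets, with a parameter `alpha` standing in for the asymmetry that the MD model derives from the degree of generalization of the two classes, and a `weights` dictionary standing in for the context-dependent importance of parts, functions, and attributes. The function names, the example classes, and all feature values are hypothetical.

```python
def feature_match(a, b, alpha):
    """Ratio match of two feature sets; asymmetric whenever alpha != 0.5.

    Assumed form: |A & B| / (|A & B| + alpha*|A - B| + (1 - alpha)*|B - A|).
    """
    common = len(a & b)
    denom = common + alpha * len(a - b) + (1 - alpha) * len(b - a)
    return common / denom if denom else 0.0


def md_similarity(a, b, weights, alpha):
    """Weighted sum of per-feature-type matches (parts, functions, attributes).

    `weights` plays the role of context: it sets the relative importance of
    each feature type and is assumed to sum to 1.
    """
    return sum(w * feature_match(a[kind], b[kind], alpha)
               for kind, w in weights.items())


# Hypothetical entity classes, described by their distinguishing features.
stadium = {
    "parts": {"roof", "seats", "field"},
    "functions": {"play", "watch"},
    "attributes": {"capacity"},
}
building = {
    "parts": {"roof", "walls"},
    "functions": {"shelter"},
    "attributes": {"capacity", "height"},
}

# Two hypothetical contexts: equal importance vs. emphasis on functions.
equal_context = {"parts": 1 / 3, "functions": 1 / 3, "attributes": 1 / 3}
function_context = {"parts": 0.2, "functions": 0.6, "attributes": 0.2}

forward = md_similarity(stadium, building, equal_context, alpha=0.25)
backward = md_similarity(building, stadium, equal_context, alpha=0.25)
print(f"stadium -> building: {forward:.3f}")
print(f"building -> stadium: {backward:.3f}")
```

With `alpha` different from 0.5 the two directions give different values, which is the asymmetry the abstract refers to; swapping `equal_context` for `function_context` changes the result without changing the class definitions, which is how context enters this sketch.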