Distribution of Topological Relations in Geographic Datasets

John Florence and Max Egenhofer

Abstract

The speed at which a geographic information system processes spatial queries often depends on characteristics of the data sets. A fairly accurate estimate of the relative frequency of spatial relations is necessary for designing appropriate query processing strategies for spatial-relation queries, e.g., to answer the question, “What is the spatial relation between objects A and B?” where A and B may be geographic objects of type point, line, or area. This paper introduces a framework for analyzing the distribution of topological relations, which consists of a categorization of spatial configurations based on the topological relations that may exist between the objects queried. Preliminary results about the distribution of topological relations obtained from several test data sets indicate that in a partition of space with more than 25 objects, over 90% of the relations are disjoint. For configurations in which one category generally has much larger areal objects than the other, we found that the distribution depends on the ratio between the numbers of objects: the smaller the ratio, the fewer objects are disjoint (75%) and the more regions are contained (15%) and meet (10%).