Publication information
Assessing the Quality of Spatio-textual Datasets in the Absence of Ground Truth
Authors | |
---|---|
Year of publication | 2017 |
Type | Article in proceedings |
Conference | Proceedings of the 21st European Conference on Advances in Databases and Information Systems |
MU Faculty or unit | |
Citation | |
www | Springer, CORE B conference, SCOPUS, WoS, DBLP |
DOI | http://dx.doi.org/10.1007/978-3-319-67162-8_2 |
Field | Informatics |
Keywords | spatio-textual data; data quality; relative quality |
Description | The increasing availability of enriched geospatial data has opened up a new domain and enables the development of more sophisticated location-based services and applications. However, this development has also given rise to various data quality problems as it is very hard to verify the data for all real-world entities contained in a dataset. In this paper, we propose ARCI, a relative quality indicator which exploits the vast availability of spatio-textual datasets, to indicate how confident a user can be in the correctness of a given dataset. ARCI operates in the absence of ground truth and aims at computing the relative quality of an input dataset by cross-referencing its entries among various similar datasets. We also present an algorithm for computing ARCI and we evaluate its performance in a preliminary experimental evaluation using real-world datasets. |