Preprint
Article

Quality Assessment of Crowdsourced Data (CSD) Using Semantics and Geographical Information Retrieval (GIR) Techniques

A peer-reviewed article of this preprint also exists.

This version is not peer-reviewed

Submitted:

26 April 2018

Posted:

26 April 2018

Abstract
Crowdsourced Data (CSD) generated by citizens is becoming more popular, and its currency and availability are increasing its potential use in many applications. However, the quality of CSD, including its relevance, is often questioned because the data is neither generated by professionals nor collected following standard data collection procedures. The quality of CSD can be assessed against a range of attributes, including its relevance. Information relevance has been explored in the Geographic Information Retrieval (GIR) domain, where dedicated techniques are used to identify relevant information. This research tested a relevance assessment approach for CSD by adapting the relevance assessment techniques available in the GIR domain. Thematic and geographic relevance were assessed using Term Frequency-Inverse Document Frequency (TF-IDF), Vector Space Model (VSM) and Natural Language Processing (NLP) techniques. The thematic and geographic specificities of the queries were calculated as 0.44 and 0.67 respectively, which indicates that the queries used were more geographically specific than thematically specific. The Spearman's rho value of 0.62 indicated that the final ranked relevance lists showed reasonable agreement with a manually classified list, and confirmed the potential of the approach for CSD relevance assessment in other crowdsourced data analyses.
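To make the techniques named in the abstract concrete, the following is a minimal sketch of TF-IDF weighting, cosine similarity in a vector space model, and Spearman's rho for comparing two rankings. It is not the authors' implementation: the function names, the smoothed IDF variant (log(N/df) + 1), the choice to include the query in the IDF corpus, and the sample reports are all assumptions made for illustration, and the tie-free form of Spearman's rho is used.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per tokenised document.

    Assumption: IDF is log(N / df) + 1, one of several common variants.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    return [
        {term: (count / len(doc)) * idf[term] for term, count in Counter(doc).items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_by_relevance(query, docs):
    """Return document indices ordered from most to least similar to the query.

    Assumption: the query is weighted together with the documents.
    """
    vectors = tfidf_vectors([query] + docs)
    scores = [cosine(vectors[0], vec) for vec in vectors[1:]]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho for two rankings of the same items, assuming no ties."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


if __name__ == "__main__":
    # Hypothetical crowdsourced reports, already tokenised and lower-cased.
    reports = [
        ["flood", "water", "road", "closed"],
        ["traffic", "road", "delay"],
        ["flood", "river", "water", "rising"],
    ]
    order = rank_by_relevance(["flood", "water"], reports)
    print("ranked report indices:", order)
    print("rho against a manual ranking:", spearman_rho([1, 2, 3], [1, 3, 2]))
```

In a setting like the one described, a ranking of this kind would be produced separately for thematic and geographic terms and the resulting ranked list compared against a manually classified list; how the two scores are combined is not stated in the abstract and is left out of the sketch.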
Subject: Environmental and Earth Sciences  -   Other
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated