Version 1: Received: 22 August 2020 / Approved: 24 August 2020 / Online: 24 August 2020 (03:12:06 CEST)
How to cite:
Haynes, D.; Mitchell, P.; Shook, E. Developing the Geospatial Big Data Benchmark: A Comparative Framework for Evaluating Raster Analysis on Big Data Platforms. Preprints 2020, 2020080504. https://doi.org/10.20944/preprints202008.0504.v1
APA Style
Haynes, D., Mitchell, P., & Shook, E. (2020). Developing the Geospatial Big Data Benchmark: A Comparative Framework for Evaluating Raster Analysis on Big Data Platforms. Preprints. https://doi.org/10.20944/preprints202008.0504.v1
Chicago/Turabian Style
Haynes, D., Philip Mitchell, and Eric Shook. 2020. "Developing the Geospatial Big Data Benchmark: A Comparative Framework for Evaluating Raster Analysis on Big Data Platforms." Preprints. https://doi.org/10.20944/preprints202008.0504.v1
Abstract
Technologies around the world produce and interact with geospatial data instantaneously, from mobile web applications to satellite imagery that is collected and processed across the globe daily. Big raster data allows researchers to integrate and uncover new knowledge about geospatial patterns and processes. However, we are also at a critical moment, as an ever-growing number of big data platforms are being co-opted to support spatial analysis. The literature lacks a robust framework for assessing the capabilities of geospatial analysis on big data platforms. This research begins to address this gap by establishing a geospatial benchmark that employs freely accessible datasets to provide a comprehensive comparison across big data platforms. The benchmark is critical for evaluating the performance of spatial operations on big data platforms: it provides a common framework for comparing existing platforms and evaluating new ones. The benchmark is applied to three big data platforms and reports computing times and performance bottlenecks so that GIScientists can make informed choices regarding the performance of each platform. Each platform is evaluated on five raster operations (pixel count, reclassification, raster add, focal averaging, and zonal statistics) using three different datasets.
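For readers unfamiliar with the five raster operations named in the abstract, the following is a minimal sketch of what each one computes, written in plain Python on small in-memory grids. It is not the authors' implementation and does not reflect any of the benchmarked platforms. All function names and the sample grids are hypothetical, and the edge handling in the focal mean (clipping the window at the raster boundary) is an assumption chosen for simplicity.

```python
from collections import defaultdict


def pixel_count(raster, value):
    """Count the cells whose value equals `value`."""
    return sum(row.count(value) for row in raster)


def reclassify(raster, mapping):
    """Replace each cell v by mapping[v]; unmapped values are kept as-is."""
    return [[mapping.get(v, v) for v in row] for row in raster]


def raster_add(a, b):
    """Cell-wise sum of two rasters that share the same shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("rasters must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def focal_mean(raster, radius=1):
    """Mean of the square window around each cell.

    Assumption: the window is clipped at the raster edges, so border
    cells average over fewer neighbours.
    """
    rows, cols = len(raster), len(raster[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                raster[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def zonal_mean(values, zones):
    """Mean of `values` per zone id, where `zones` has the same shape."""
    totals = defaultdict(lambda: [0, 0])  # zone id -> [sum, cell count]
    for value_row, zone_row in zip(values, zones):
        for v, z in zip(value_row, zone_row):
            totals[z][0] += v
            totals[z][1] += 1
    return {z: s / n for z, (s, n) in totals.items()}


# Hypothetical 3x3 sample data: a value raster and a zone raster.
land = [[1, 1, 2],
        [2, 3, 3],
        [1, 2, 3]]
zones = [[0, 0, 1],
         [0, 1, 1],
         [0, 1, 1]]

print(pixel_count(land, 2))
print(reclassify(land, {3: 0}))
print(raster_add(land, land))
print(focal_mean(land)[1][1])
print(zonal_mean(land, zones))
```

The first three operations touch each cell independently, the focal mean needs a neighbourhood around each cell, and the zonal mean aggregates over a second raster. These access patterns differ, which is one plausible reason a benchmark would cover all five.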
Computer Science and Mathematics, Information Systems
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.