2017 article

Parallel Processing over Spatial-Temporal Datasets from Geo, Bio, Climate and Social Science Communities: A Research Roadmap

2017 IEEE 6TH INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS 2017), pp. 232–250.

By: S. Prasad*, D. Aghajarian*, M. McDermott*, D. Shah*, M. Mokbel*, S. Puri*, S. Rey*, S. Shekhar* ...

co-author countries: United States of America 🇺🇸
author keywords: High performance computing; Spatial data mining; Remote sensing data; Medical images; Spatial econometrics; Map-reduce systems; CyberGIS; Parallel algorithms and data structures
Source: Web Of Science
Added: August 6, 2018

This vision paper reviews the current state-ofart and lays out emerging research challenges in parallel processing of spatial-temporal large datasets relevant to a variety of scientific communities. The spatio-temporal data, whether captured through remote sensors (global earth observations), ground and ocean sensors (e.g., soil moisture sensors, buoys), social media and hand-held, traffic-related sensors and cameras, medical imaging (e.g., MRI), or large scale simulations (e.g., climate) have always been “big.” A common thread among all these big collections of datasets is that they are spatial and temporal. Processing and analyzing these datasets requires high-performance computing (HPC) infrastructures. Various agencies, scientific communities and increasingly the society at large rely on spatial data management, analysis, and spatial data mining to gain insights and produce actionable plans. Therefore, an ecosystem of integrated and reliable software infrastructure is required for spatialtemporal big data management and analysis that will serve as crucial tools for solving a wide set of research problems from different scientific and engineering areas and to empower users with next-generation tools. This vision requires a multidisciplinary effort to significantly advance domain research and have a broad impact on the society. The areas of research discussed in this paper include (i) spatial data mining, (ii) data analytics over remote sensing data, (iii) processing medical images, (iv) spatial econometrics analyses, (v) Map-Reducebased systems for spatial computation and visualization, (vi) CyberGIS systems, and (vii) foundational parallel algorithms and data structures for polygonal datasets, and why HPC infrastructures, including harnessing graphics accelerators, are needed for time-critical applications.