SPARSE MULTIDIMENSIONAL DATA PROCESSING IN GEOINFORMATICS

Andrey N. Kokoulin

doi:10.5593/sgem2018/2.2/s08.052

First published: 2018-06-20

Abstract

The authors describe a novel approach to improving the performance of scientific data analysis through a data-storing schema for distributed scientific research systems. This approach optimizes the workload on storage nodes and enhances computation performance. Managing the enormous output of scientific research systems is expected to be the most technically difficult part of recent projects that produce petabytes of imagery data. In this paper we describe a distributed storage structure with indexing techniques that can be effectively applied to scientific multidimensional data processing in geoinformatics. The basic principle of this project is a distributed (N,K)-block storage schema (LH*RS or SDDS). We develop a descendant of LH*RS specifically for multidimensional data arrays, using a multiscale representation of these arrays and efficient pre-processing algorithms. LH*RS is positioned as a general-purpose method whose efficiency does not depend on the data file type, but its distribution algorithm can be enhanced to accelerate performance for multidimensional or imagery data. The dataset is decomposed into data blocks of several levels using the wavelet transform. A dataset of the requested scale and resolution is reconstructed on the client's side from the corresponding set of downloaded blocks. To accelerate query processing, we can additionally use blocks of pre-computed statistical results and their hierarchical representation. The main principle of data preprocessing is to store the original data together with the results of the transformation algorithm in adjacent buckets of the same storage. These results are computed only once, during the data storing stage, simultaneously with data distribution and by the same computing unit.
The main advantage of this approach is that these results can be used together with the original data, or even separately, to serve different data queries with both value and dimension subsetting conditions. This can reduce the resource costs of the corresponding scientific problems. Another advantage of this schema is the possibility of processing sparse data on regular or irregular coordinate grids.
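
The multilevel decomposition and client-side reconstruction described in the abstract can be illustrated with a minimal sketch. It assumes a 2-D Haar wavelet with simple averaging, which the abstract does not specify, and it leaves out the distribution of blocks across storage nodes entirely. The function names (haar_step, haar_unstep, decompose, reconstruct) and the drop_finest parameter are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: Haar averaging is an assumption, not the paper's stated wavelet.

def haar_step(grid):
    """One 2-D Haar level: split an even-sized grid into a half-resolution
    approximation and the detail coefficients needed to undo the split."""
    approx, details = [], []
    for r in range(0, len(grid), 2):
        a_row, d_row = [], []
        for c in range(0, len(grid[0]), 2):
            a, b = grid[r][c], grid[r][c + 1]
            p, q = grid[r + 1][c], grid[r + 1][c + 1]
            a_row.append((a + b + p + q) / 4)          # block mean
            d_row.append(((a - b + p - q) / 4,         # horizontal detail
                          (a + b - p - q) / 4,         # vertical detail
                          (a - b - p + q) / 4))        # diagonal detail
        approx.append(a_row)
        details.append(d_row)
    return approx, details


def haar_unstep(approx, details):
    """Invert haar_step: double the resolution using one block of details."""
    rows = []
    for a_row, d_row in zip(approx, details):
        top, bottom = [], []
        for s, (h, v, d) in zip(a_row, d_row):
            top += [s + h + v + d, s - h + v - d]
            bottom += [s + h - v - d, s - h - v + d]
        rows += [top, bottom]
    return rows


def decompose(grid, levels):
    """Return the coarsest approximation plus one detail block per level.
    details[0] belongs to the finest level."""
    details, approx = [], grid
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details


def reconstruct(coarse, details, drop_finest=0):
    """Rebuild the grid, skipping the `drop_finest` finest detail blocks
    (a client asking for a coarser scale would not download them)."""
    grid = coarse
    for d in reversed(details[drop_finest:]):
        grid = haar_unstep(grid, d)
    return grid


grid = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
coarse, details = decompose(grid, levels=2)
full = reconstruct(coarse, details)                  # original 4x4 resolution
half = reconstruct(coarse, details, drop_finest=1)   # 2x2 overview
```

In this sketch a request for a lower resolution needs only the coarse block and the coarser detail blocks, which mirrors the idea of downloading only the set of blocks that corresponds to the requested scale.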

Publication details

Title

SPARSE MULTIDIMENSIONAL DATA PROCESSING IN GEOINFORMATICS

Authors

Andrey N. Kokoulin

Proceedings

SGEM International Multidisciplinary Scientific GeoConference EXPO Proceedings; 18th International Multidisciplinary Scientific GeoConference SGEM2018, Informatics, Geoinformatics and Remote Sensing

Publisher

STEF92 Technology

Year

2018

Pages

411-418

SWS Citekey

Kokoulin20188411418

ISSN

1314-2704

ISBN

978-619-7408-40-9

Language

en

Publication type

Conference Paper

Keywords

wavelet transform; Hilbert space-filling curve; indexing; multidimensional array; sparse data
