SPARSE MULTIDIMENSIONAL DATA PROCESSING IN GEOINFORMATICS

Andrey N. Kokoulin

First published: 2018-06-20
https://doi.org/10.5593/sgem2018/2.2/s08.052

Abstract

The author describes a novel approach to improving the performance of scientific data analysis through a data storage schema for distributed scientific research systems. This approach optimizes the workload on storage nodes and improves computation performance. Managing the enormous output of scientific research systems is expected to be the most technically difficult part of recent projects that produce petabytes of imagery data. In this paper we describe a distributed storage structure with indexing techniques that can be applied effectively to scientific multidimensional data processing in geoinformatics. The basic principle of this project is a distributed (N,K)-block storage schema (LH*RS or SDDS). We develop a descendant of LH*RS specifically for multidimensional data arrays, using a multiscale representation of these arrays and efficient pre-processing algorithms. LH*RS is positioned as a general-purpose method whose efficiency does not depend on the data file type, but its distribution algorithm can be enhanced to accelerate its performance for multidimensional or imagery data. The dataset is decomposed into data blocks of several levels using the wavelet transform. The required dataset at the requested scale and resolution is reconstructed on the client's side from the corresponding set of downloaded blocks. To accelerate query processing, we can additionally use blocks of pre-computed statistical results and their hierarchical representation. The main principle of data preprocessing is to merge the original data with the results of the transformation algorithm in adjacent buckets of the same storage. These results are computed only once, during the data storing stage, simultaneously with data distribution and on the same computing unit.
The main advantage of this approach is that these results can be used together with the original data, or even separately, to serve different data queries with both value and dimension subsetting conditions. This approach can reduce the resource costs of the corresponding scientific problems. Another advantage of this schema is the possibility of processing sparse data on regular or irregular coordinate grids.
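
The multilevel decomposition and client-side reconstruction described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say which wavelet is used, so the Haar wavelet is assumed here as the simplest case, a 1-D list stands in for a multidimensional array, and the names `haar_decompose`, `blocks_needed`, and `haar_reconstruct` are hypothetical. Distribution of the blocks across storage nodes and the (N,K) redundancy of LH*RS are left out.

```python
from typing import Dict, List


def haar_decompose(signal: List[float], levels: int) -> Dict[str, List[float]]:
    """Split a 1-D signal into one coarse block plus one detail block per level.

    Assumption: Haar averaging/differencing stands in for the unspecified
    wavelet transform mentioned in the abstract.
    """
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    blocks: Dict[str, List[float]] = {}
    approx = list(signal)
    for level in range(1, levels + 1):
        pairs = list(zip(approx[0::2], approx[1::2]))
        # The detail block keeps what is lost when the resolution is halved.
        blocks[f"detail_{level}"] = [(a - b) / 2 for a, b in pairs]
        approx = [(a + b) / 2 for a, b in pairs]
    blocks["approx"] = approx  # coarsest representation
    return blocks


def blocks_needed(levels: int, target_level: int) -> List[str]:
    """Names of the blocks a client must download for a given scale.

    target_level 0 is full resolution; target_level == levels is the coarsest.
    """
    if not 0 <= target_level <= levels:
        raise ValueError("target_level out of range")
    return ["approx"] + [f"detail_{lv}" for lv in range(levels, target_level, -1)]


def haar_reconstruct(
    blocks: Dict[str, List[float]], levels: int, target_level: int
) -> List[float]:
    """Rebuild the signal at target_level from the downloaded blocks only."""
    if not 0 <= target_level <= levels:
        raise ValueError("target_level out of range")
    approx = list(blocks["approx"])
    for level in range(levels, target_level, -1):
        refined: List[float] = []
        for a, d in zip(approx, blocks[f"detail_{level}"]):
            refined.extend([a + d, a - d])  # inverse of average/difference
        approx = refined
    return approx


if __name__ == "__main__":
    data = [4.0, 2.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
    stored = haar_decompose(data, levels=2)
    # A half-resolution query only needs two of the three stored blocks.
    wanted = blocks_needed(levels=2, target_level=1)
    downloaded = {name: stored[name] for name in wanted}
    print(wanted, haar_reconstruct(downloaded, levels=2, target_level=1))
```

In this sketch a coarse-scale query touches fewer blocks than a full-resolution one, which is the property the abstract relies on to reduce the workload on storage nodes.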

Publication details

Title
SPARSE MULTIDIMENSIONAL DATA PROCESSING IN GEOINFORMATICS
Authors
Andrey N. Kokoulin
Proceedings
SGEM International Multidisciplinary Scientific GeoConference EXPO Proceedings; 18th International Multidisciplinary Scientific GeoConference SGEM2018, Informatics, Geoinformatics and Remote Sensing
Publisher
STEF92 Technology
Year
2018
Pages
411-418
SWS Citekey
Kokoulin20188411418
ISSN
1314-2704
ISBN
978-619-7408-40-9
Language
en
Publication type
Conference Paper