|
SPARSE MULTIDIMENSIONAL DATA PROCESSING IN GEOINFORMATICS
|
|
|
A. N. Kokoulin;D. A. Kiryanov;M. R. Kamaltdinov;A. A. Yuzhakov
|
|
|
||
|
|
|
|
1314-2704
|
|
|
||
|
English
|
|
|
18
|
|
|
2.2
|
|
|
|
|
|
||
|
We describe a novel approach to improving the performance of scientific data analysis through a data storage schema for distributed scientific research systems. The approach optimizes the workload on storage nodes and improves computation performance. Managing the enormous output of scientific research systems is expected to be the most technically difficult part of recent projects that produce petabytes of imagery data.
In this paper we describe a distributed storage structure with indexing techniques that can be applied effectively to scientific multidimensional data processing in geoinformatics. The basic principle of the project is a distributed (N,K)-block storage schema (LH*RS or SDDS). We develop a descendant of LH*RS designed specifically for multidimensional data arrays, using a multiscale representation of these arrays and efficient pre-processing algorithms. LH*RS is positioned as a general-purpose method whose efficiency does not depend on the data file type, but its distribution algorithm can be enhanced to accelerate it for multidimensional or imagery data. The dataset is decomposed into data blocks at several levels using the wavelet transform. A dataset of the requested scale and resolution is reconstructed on the client's side from the corresponding set of downloaded blocks. To accelerate query processing, we can additionally use blocks of pre-computed statistical results and their hierarchical representation. The main principle of data preprocessing is to store the original data together with the results of the transformation algorithm in adjacent buckets of the same storage. These results are computed only once, during the data storing stage, simultaneously with data distribution and by the same computing unit. The main advantage of this approach is that the results can be used together with the original data, or separately, to serve different data queries with both value and dimension subsetting conditions. This approach can reduce the resource costs of the corresponding scientific problems. Another advantage of the schema is that it supports sparse data processing on regular or irregular coordinate grids. |
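The multilevel decomposition and client-side reconstruction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional signal, an unnormalised Haar wavelet (pairwise mean and half-difference), and hypothetical function names `haar_decompose` and `haar_reconstruct`. Each returned list stands in for one stored block, and a client that wants a coarser resolution fetches fewer detail blocks.

```python
def haar_decompose(signal, levels):
    """Split a 1-D signal into one coarse block plus one detail block per level.

    Assumption: len(signal) is divisible by 2**levels. Uses an unnormalised
    Haar step (pairwise mean and half-difference), chosen only for illustration.
    """
    approx = list(signal)
    details = []  # details[0] is the finest level, details[-1] the coarsest
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / 2 for a, b in pairs])
        approx = [(a + b) / 2 for a, b in pairs]
    return approx, details


def haar_reconstruct(coarse, details, use_levels):
    """Rebuild the signal using only the `use_levels` coarsest detail blocks."""
    approx = list(coarse)
    # Apply detail blocks from coarsest to finer, stopping at the requested level.
    for detail in reversed(details[len(details) - use_levels:]):
        restored = []
        for mean, half_diff in zip(approx, detail):
            restored.extend([mean + half_diff, mean - half_diff])
        approx = restored
    return approx


coarse, details = haar_decompose([4, 6, 10, 12, 8, 8, 0, 2], levels=2)
full = haar_reconstruct(coarse, details, use_levels=2)   # full resolution
half = haar_reconstruct(coarse, details, use_levels=1)   # half resolution
```

In this sketch a half-resolution request needs only the coarse block and the coarsest detail block, which is the kind of saving in downloaded data that the multiscale representation is meant to provide.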
|
|
conference
|
|
|
||
|
||
|
18th International Multidisciplinary Scientific GeoConference SGEM 2018
|
|
|
18th International Multidisciplinary Scientific GeoConference SGEM 2018, 02-08 July, 2018
|
|
|
Proceedings Paper
|
|
|
STEF92 Technology
|
|
|
International Multidisciplinary Scientific GeoConference-SGEM
|
|
|
Bulgarian Acad Sci; Acad Sci Czech Republ; Latvian Acad Sci; Polish Acad Sci; Russian Acad Sci; Serbian Acad Sci & Arts; Slovak Acad Sci; Natl Acad Sci Ukraine; Natl Acad Sci Armenia; Sci Council Japan; World Acad Sci; European Acad Sci, Arts & Letters; Ac
|
|
|
411-418
|
|
|
02-08 July, 2018
|
|
|
website
|
|
|
cdrom
|
|
|
653
|
|
|
wavelet transform; Hilbert space-filling curve; indexing; multidimensional array; sparse data
|
|