
Scholarly record

SPREADSHEET DATA EXTRACTION USING SEMANTIC NETWORK

Nino Tkeshelashvili

First published: 2019-06-20
https://doi.org/10.5593/sgem2019/2.1/s07.083

Abstract

Spreadsheets often contain valuable data that are difficult to process because their structure is unclear. For this reason, spreadsheets are also called semi-structured data. Table structure recognition and data extraction are an important area of research. Tables may contain statistical reports, schedules, grade books, research results, or product catalogues. Depending on their purpose, tables have different structures. In this paper, the authors propose an approach for extracting information from "list"-type spreadsheets. These tables store information about objects of the same type, and each column represents an object property. Price lists are a good example of this table type. The main idea of the proposed approach is to extract objects from the spreadsheet using a semantic network. The kernel of the semantic network graph is based on Wiktionary data and contains senses and the semantic relations between them. Every sense has its own word forms, their morphological characteristics, and instances of the sense, where an instance is an object of the given sense. For example, "2m" may be an instance of the sense "length". The semantic network is used to describe the object structure, while instances provide useful templates for the data. The developed program examines every row in the spreadsheet, matches properties to senses, and creates objects with the structure given in the semantic network. The approach was tested on a corpus of price lists typical of the IT distribution area.
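The row-by-row extraction described in the abstract might look roughly like the following Python sketch. It is an illustration only, not the authors' program: the `SENSES` dictionary is a hypothetical stand-in for the Wiktionary-based semantic network, and the regular-expression templates, the function names `match_sense` and `extract_objects`, and the sample price list are all assumptions made for this example.

```python
import re

# Hypothetical miniature "semantic network" (an assumption for illustration).
# Each sense has word forms, used to match column headers, and an instance
# template, used to recognise cell values such as "2m" for the sense "length".
SENSES = {
    "length": {"wordforms": {"length", "len"},
               "template": re.compile(r"^\d+(\.\d+)?\s?(mm|cm|m)$")},
    "price": {"wordforms": {"price", "cost"},
              "template": re.compile(r"^\d+(\.\d+)?\s?(USD|EUR)$")},
    "name": {"wordforms": {"name", "product", "item"},
             "template": re.compile(r"^.+$")},
}


def match_sense(header, values):
    """Match a column to a sense by header word form, else by instance template."""
    key = header.strip().lower()
    for sense, info in SENSES.items():
        if key in info["wordforms"]:
            return sense
    # Fall back to templates; skip the catch-all "name" template.
    for sense, info in SENSES.items():
        if sense != "name" and values and all(info["template"].match(v) for v in values):
            return sense
    return None


def extract_objects(rows):
    """Treat the first row as headers and build one object (dict) per data row."""
    header, *body = rows
    columns = list(zip(*body)) if body else [() for _ in header]
    senses = [match_sense(h, col) for h, col in zip(header, columns)]
    return [
        {sense: cell for sense, cell in zip(senses, row) if sense is not None}
        for row in body
    ]


# Made-up price list; the third column has no header and is matched by template.
price_list = [
    ["Item", "Cost", ""],
    ["Cable", "12.50 EUR", "2m"],
    ["Patch cord", "3 EUR", "0.5 m"],
]
objects = extract_objects(price_list)
```

In this sketch a header is matched against word forms first, and instance templates are consulted only when the header gives no match, which is one plausible way to combine the two sources of evidence the abstract mentions.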


Publication details

Title: SPREADSHEET DATA EXTRACTION USING SEMANTIC NETWORK
Authors: Nino Tkeshelashvili
Proceedings: SGEM International Multidisciplinary Scientific GeoConference EXPO Proceedings; 19th International Multidisciplinary Scientific GeoConference SGEM2019, Informatics, Geoinformatics and Remote Sensing
Publisher: STEF92 Technology
Year: 2019
Pages: 637-644
SWS Citekey: Tkeshelashvili20197637644
ISSN: 1314-2704
ISBN: 978-619-7408-79-9
Language: en
Publication type: Conference Paper

Citing literature

Number of times cited according to Crossref: 1
