Tree String Path Subsequences Automaton and Its Use for Indexing XML Documents
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21240%2F15%3A00238831" target="_blank" >RIV/68407700:21240/15:00238831 - isvavai.cz</a>
Result on the web
<a href="http://link.springer.com/chapter/10.1007/978-3-319-27653-3_17" target="_blank" >http://link.springer.com/chapter/10.1007/978-3-319-27653-3_17</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-27653-3_17" target="_blank" >10.1007/978-3-319-27653-3_17</a>
Alternative languages
Result language
angličtina
Original language name
Tree String Path Subsequences Automaton and Its Use for Indexing XML Documents
Original language description
The theory of indexing texts is well-researched, which does not hold for indexing other data structures, such as trees for example. In this paper a simple method of indexing a tree for subsequences of string paths in the tree by finite automaton is presented. The use of the index is shown on indexing XML documents for XPath descendant-or-self axis inspired queries. Given a subject tree T with n nodes, the tree is preprocessed and an index, which is a directed acyclic subsequence graph for a set of strings, is constructed. The searching phase uses the index, reads an input string path subsequence Q inspired by the specific XPath query of size m and computes the list of positions of all occurrences of Q in the tree T. The searching is performed in time O(m) and does not depend on n. Although the number of distinct valid queries is O(2n), the size of the index is O(hk), where h is the height of the tree T and k is the number of its leaves. Moreover, we discuss that in the case of indexing
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/GA13-03253S" target="_blank" >GA13-03253S: Text and Tree Structures Processing and Their Applications</a><br>
Continuities
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)
Others
Publication year
2015
Confidentiality
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Data specific for result type
Article name in the collection
Languages, Applications and Technologies
ISBN
978-3-319-27652-6
ISSN
1865-0929
e-ISSN
—
Number of pages
11
Pages from-to
171-181
Publisher name
Springer International Publishing AG
Place of publication
Cham
Event location
Madrid
Event date
Jun 18, 2015
Type of event by nationality
WRD - Celosvětová akce
UT code for WoS article
000370191100017