Inexact tree pattern matching with 1-degree edit distance using finite automata
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21240%2F23%3A00363988" target="_blank" >RIV/68407700:21240/23:00363988 - isvavai.cz</a>
Výsledek na webu
<a href="https://doi.org/10.1016/j.dam.2023.01.003" target="_blank" >https://doi.org/10.1016/j.dam.2023.01.003</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.dam.2023.01.003" target="_blank" >10.1016/j.dam.2023.01.003</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Inexact tree pattern matching with 1-degree edit distance using finite automata
Popis výsledku v původním jazyce
Given an input tree and a tree pattern, the inexact tree pattern matching problem is about finding all subtrees in the input tree that match the tree pattern with up to errors. To measure the number of errors between two labeled ordered trees, we use the 1-degree edit distance that uses operations node relabel, leaf insert, and leaf delete. The proposed solution is based on a finite automaton that reads a given input tree represented in linear, prefix bar, notation. First, we show its construction for a constrained variant of 1-degree edit distance where leaf insert/delete operations cannot be applied to the tree pattern recursively. Then, we extend this approach to both unit cost and non-unit cost 1-degree edit distance. The deterministic version of the proposed finite automaton finds all inexact occurrences of the tree pattern in time linear to the size of the input tree. However, since the size of such automaton can be exponential in the number of nodes of the tree pattern, it is practical only for small patterns. Therefore, we also consider the dynamic programming approach as a way of simulating the nondeterministic finite automaton. This approach comes with the space complexity and time complexity where is the number of nodes of the tree pattern, is the number of nodes of the input tree, and is the maximum number of errors allowed.
Název v anglickém jazyce
Inexact tree pattern matching with 1-degree edit distance using finite automata
Popis výsledku anglicky
Given an input tree and a tree pattern, the inexact tree pattern matching problem is about finding all subtrees in the input tree that match the tree pattern with up to errors. To measure the number of errors between two labeled ordered trees, we use the 1-degree edit distance that uses operations node relabel, leaf insert, and leaf delete. The proposed solution is based on a finite automaton that reads a given input tree represented in linear, prefix bar, notation. First, we show its construction for a constrained variant of 1-degree edit distance where leaf insert/delete operations cannot be applied to the tree pattern recursively. Then, we extend this approach to both unit cost and non-unit cost 1-degree edit distance. The deterministic version of the proposed finite automaton finds all inexact occurrences of the tree pattern in time linear to the size of the input tree. However, since the size of such automaton can be exponential in the number of nodes of the tree pattern, it is practical only for small patterns. Therefore, we also consider the dynamic programming approach as a way of simulating the nondeterministic finite automaton. This approach comes with the space complexity and time complexity where is the number of nodes of the tree pattern, is the number of nodes of the input tree, and is the maximum number of errors allowed.
Klasifikace
Druh
J<sub>imp</sub> - Článek v periodiku v databázi Web of Science
CEP obor
—
OECD FORD obor
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)
Návaznosti výsledku
Projekt
<a href="/cs/project/EF16_019%2F0000765" target="_blank" >EF16_019/0000765: Výzkumné centrum informatiky</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)<br>S - Specificky vyzkum na vysokych skolach<br>I - Institucionalni podpora na dlouhodoby koncepcni rozvoj vyzkumne organizace
Ostatní
Rok uplatnění
2023
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název periodika
Discrete Applied Mathematics
ISSN
0166-218X
e-ISSN
1872-6771
Svazek periodika
330
Číslo periodika v rámci svazku
May
Stát vydavatele periodika
NL - Nizozemsko
Počet stran výsledku
20
Strana od-do
78-97
Kód UT WoS článku
000924453300001
EID výsledku v databázi Scopus
2-s2.0-85149786736