Approximate Online Pattern Matching in Sublinear Time

Result identifiers

  • Result code in IS VaVaI

    RIV/00216208:11320/19:10404557 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F19%3A10404557)

  • Result on the web

    https://doi.org/10.4230/LIPIcs.FSTTCS.2019.10

  • DOI - Digital Object Identifier

    10.4230/LIPIcs.FSTTCS.2019.10 (http://dx.doi.org/10.4230/LIPIcs.FSTTCS.2019.10)

Alternative languages

  • Result language

    English

  • Title in the original language

    Approximate Online Pattern Matching in Sublinear Time

  • Result description in the original language

    We consider the approximate pattern matching problem under edit distance. In this problem we are given a pattern P of length m and a text T of length n over some alphabet Sigma, and a positive integer k. The goal is to find all the positions j in T such that there is a substring of T ending at j which has edit distance at most k from the pattern P. Recall, the edit distance between two strings is the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. For a position t in {1,...,n}, let k_t be the smallest edit distance between P and any substring of T ending at t. In this paper we give a constant factor approximation to the sequence k_1,k_2,...,k_n. We consider both offline and online settings. In the offline setting, where both P and T are available, we present an algorithm that for all t in {1,...,n}, computes the value of k_t approximately within a constant factor. The worst case running time of our algorithm is O~(n m^(3/4)). In the online setting, we are given P and then T arrives one symbol at a time. We design an algorithm that upon arrival of the t-th symbol of T computes k_t approximately within O(1)-multiplicative factor and m^(8/9)-additive error. Our algorithm takes O~(m^(1-(7/54))) amortized time per symbol arrival and takes O~(m^(1-(1/54))) additional space apart from storing the pattern P. Both of our algorithms are randomized and produce correct answer with high probability. To the best of our knowledge this is the first algorithm that takes worst-case sublinear (in the length of the pattern) time and sublinear extra space for the online approximate pattern matching problem. To get our result we build on the technique of Chakraborty, Das, Goldenberg, Koucký and Saks [FOCS'18] for computing a constant factor approximation of edit distance in sub-quadratic time.

  • Title in English

    Approximate Online Pattern Matching in Sublinear Time

  • Result description in English

    We consider the approximate pattern matching problem under edit distance. In this problem we are given a pattern P of length m and a text T of length n over some alphabet Sigma, and a positive integer k. The goal is to find all the positions j in T such that there is a substring of T ending at j which has edit distance at most k from the pattern P. Recall, the edit distance between two strings is the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. For a position t in {1,...,n}, let k_t be the smallest edit distance between P and any substring of T ending at t. In this paper we give a constant factor approximation to the sequence k_1,k_2,...,k_n. We consider both offline and online settings. In the offline setting, where both P and T are available, we present an algorithm that for all t in {1,...,n}, computes the value of k_t approximately within a constant factor. The worst case running time of our algorithm is O~(n m^(3/4)). In the online setting, we are given P and then T arrives one symbol at a time. We design an algorithm that upon arrival of the t-th symbol of T computes k_t approximately within O(1)-multiplicative factor and m^(8/9)-additive error. Our algorithm takes O~(m^(1-(7/54))) amortized time per symbol arrival and takes O~(m^(1-(1/54))) additional space apart from storing the pattern P. Both of our algorithms are randomized and produce correct answer with high probability. To the best of our knowledge this is the first algorithm that takes worst-case sublinear (in the length of the pattern) time and sublinear extra space for the online approximate pattern matching problem. To get our result we build on the technique of Chakraborty, Das, Goldenberg, Koucký and Saks [FOCS'18] for computing a constant factor approximation of edit distance in sub-quadratic time.
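To make the quantity k_t from the description concrete, the following Python sketch computes the sequence k_1,...,k_n exactly with the classic column-by-column dynamic program (often attributed to Sellers). This is not the sublinear approximation algorithm of the paper; it is only the textbook baseline, which spends O(m) time per arriving text symbol and keeps O(m) extra space. The function name `min_edit_distance_ending_at` is hypothetical, and the sketch assumes the empty substring is an allowed match, so k_t never exceeds m.

```python
def min_edit_distance_ending_at(pattern, text):
    """Return [k_1, ..., k_n], where k_t is the smallest edit distance between
    `pattern` and any substring of `text` ending at position t (1-indexed).

    Exact O(n*m) baseline, NOT the paper's approximation algorithm.
    Assumption: the empty substring counts as a candidate match.
    """
    m = len(pattern)
    # col[i] = smallest edit distance between pattern[:i] and a substring of
    # the text ending at the current position. Before any text symbol is
    # read, only the empty substring exists, so col[i] = i.
    col = list(range(m + 1))
    result = []
    for symbol in text:  # text symbols can arrive one at a time (online)
        prev_diag = col[0]  # col[i-1] as it was at the previous position
        col[0] = 0          # a match may start anywhere in the text
        for i in range(1, m + 1):
            old = col[i]    # col[i] as it was at the previous position
            cost = 0 if pattern[i - 1] == symbol else 1
            col[i] = min(
                prev_diag + cost,  # match or substitute
                old + 1,           # extra text symbol (insertion)
                col[i - 1] + 1,    # skipped pattern symbol (deletion)
            )
            prev_diag = old
        result.append(col[m])
    return result


if __name__ == "__main__":
    print(min_edit_distance_ending_at("abc", "xabcx"))
```

Given a threshold k, the positions j asked for in the problem statement are exactly those with k_j at most k.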

Classification

  • Type

    D - Article in proceedings

  • CEP field

  • OECD FORD field

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

  • Project

    GX19-27871X: Efficient approximation algorithms and circuit complexity (/cs/project/GX19-27871X)

  • Linkages

    P - Research and development project financed from public funds (with a link to CEP)

Other

  • Publication year

    2019

  • Data confidentiality code

    S - Complete and true data about the project are not subject to protection under special legal regulations

Data specific to the result type

  • Title of the article in the proceedings

    39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2019, December 11-13, 2019, Bombay, India

  • ISBN

    978-3-95977-131-3

  • ISSN

    1868-8969

  • e-ISSN

  • Number of pages

    15

  • Pages from-to

    1-15

  • Publisher name

    Schloss Dagstuhl - Leibniz-Zentrum für Informatik

  • Place of publication

    Dagstuhl, Germany

  • Event location

    Bombay, India

  • Event date

    December 11, 2019

  • Event type by nationality

    WRD - Worldwide event

  • UT WoS code of the article