All

What are you looking for?

All
Projects
Results
Organizations

Quick search

  • Projects supported by TA ČR
  • Excellent projects
  • Projects with the highest public support
  • Current projects

Smart search

  • That is how I find a specific +word
  • That is how I leave the -word out of the results
  • “That is how I can find the whole phrase”

Latent Tree Language Model

The result's identifiers

  • Result code in IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F16%3A43930573" target="_blank" >RIV/49777513:23520/16:43930573 - isvavai.cz</a>

  • Result on the web

  • DOI - Digital Object Identifier

Alternative languages

  • Result language

    angličtina

  • Original language name

    Latent Tree Language Model

  • Original language description

    In this paper we introduce Latent Tree Language Model (LTLM), a novel approach to language modeling that encodes syntax and semantics of a given sentence as a tree of word roles. The learning phase iteratively updates the trees by moving nodes according to Gibbs sampling. We introduce two algorithms to infer a tree for a given sentence. The first one is based on Gibbs sampling. It is fast, but does not guarantee to find the most probable tree. The second one is based on dynamic programming. It is slower, but guarantees to find the most probable tree. We provide comparison of both algorithms. We combine LTLM with 4-gram Modified Kneser-Ney language model via linear interpolation. Our experiments with English and Czech corpora show significant perplexity reductions (up to 46% for English and 49% for Czech) compared with standalone 4-gram Modified Kneser-Ney language model.

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

    JD - Use of computers, robotics and its application

  • OECD FORD branch

Result continuities

  • Project

    <a href="/en/project/LO1506" target="_blank" >LO1506: Sustainability support of the centre NTIS - New Technologies for the Information Society</a><br>

  • Continuities

    P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)

Others

  • Publication year

    2016

  • Confidentiality

    S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Data specific for result type

  • Article name in the collection

    International Conference on Empirical Methods in Natural Language Processing (EMNLP 2016)

  • ISBN

    978-1-945626-25-8

  • ISSN

  • e-ISSN

  • Number of pages

    11

  • Pages from-to

    436-446

  • Publisher name

    The Association for Computational Linguistics

  • Place of publication

    New York

  • Event location

    Austin, Texas, USA

  • Event date

    Nov 1, 2016

  • Type of event by nationality

    WRD - Celosvětová akce

  • UT code for WoS article