Semantic Spaces for Improving Language Modeling

The result's identifiers

  • Result code in IS VaVaI

    RIV/49777513:23520/14:43918497 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F14%3A43918497)

  • Result on the web

    http://dx.doi.org/10.1016/j.csl.2013.05.001

  • DOI - Digital Object Identifier

    10.1016/j.csl.2013.05.001

Alternative languages

  • Result language

    English

  • Original language name

    Semantic Spaces for Improving Language Modeling

  • Original language description

    Language models are crucial for many tasks in NLP (Natural Language Processing), and n-grams are the best way to build them. A huge effort is being invested in improving n-gram language models. By introducing external information (morphology, syntax, partitioning into documents, etc.) into the models, a significant improvement can be achieved. The models can, however, be improved with no external information, and smoothing is an excellent example of such an improvement. In this article we show another way of improving the models that also requires no external information. We examine patterns that can be found in large corpora by building semantic spaces (HAL, COALS, BEAGLE and others described in this article). These semantic spaces have never been tested in language modeling before. Our method uses semantic spaces and clustering to build classes for a class-based language model. The class-based model is then coupled with a standard n-gram model to create a very effective language model. Our

  • Czech name

  • Czech description

Classification

  • Type

    Jx - Unclassified - Peer-reviewed scientific article (Jimp, Jsc and Jost)

  • CEP classification

    JD - Use of computers, robotics and its application

  • OECD FORD branch

Result continuities

  • Project

    ED1.1.00/02.0090: NTIS - New Technologies for Information Society

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)
    S - Specific research at universities

Others

  • Publication year

    2014

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Name of the periodical

    Computer Speech and Language

  • ISSN

    0885-2308

  • e-ISSN

  • Volume of the periodical

    28

  • Issue of the periodical within the volume

    1

  • Country of publishing house

    US - UNITED STATES

  • Number of pages

    18

  • Pages from-to

    192-209

  • UT code for WoS article

  • EID of the result in the Scopus database