Vše

Co hledáte?

Vše
Projekty
Výsledky výzkumu
Subjekty

Rychlé hledání

  • Projekty podpořené TA ČR
  • Významné projekty
  • Projekty s nejvyšší státní podporou
  • Aktuálně běžící projekty

Chytré vyhledávání

  • Takto najdu konkrétní +slovo
  • Takto z výsledků -slovo zcela vynechám
  • “Takto můžu najít celou frázi”

From Genesis to Creole Language: Transfer Learning for Singlish Universal Dependencies Parsing and POS Tagging

Identifikátory výsledku

  • Kód výsledku v IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F19%3A10427063" target="_blank" >RIV/00216208:11320/19:10427063 - isvavai.cz</a>

  • Výsledek na webu

    <a href="https://doi.org/10.1145/3321128" target="_blank" >https://doi.org/10.1145/3321128</a>

  • DOI - Digital Object Identifier

Alternativní jazyky

  • Jazyk výsledku

    angličtina

  • Název v původním jazyce

    From Genesis to Creole Language: Transfer Learning for Singlish Universal Dependencies Parsing and POS Tagging

  • Popis výsledku v původním jazyce

    Singlish can be interesting to the computational linguistics community both linguistically, as a major low-resource creole based on English, and computationally, for information extraction and sentiment analysis of regional social media. In our conference paper, Wang et al. (2017), we investigated part-of-speech (POS) tagging and dependency parsing for Singlish by constructing a treebank under the Universal Dependencies scheme and successfully used neural stacking models to integrate English syntactic knowledge for boosting Singlish POS tagging and dependency parsing, achieving the state-of-the-art accuracies of 89.50% and 84.47% for Singlish POS tagging and dependency, respectively. In this work, we substantially extend Wang et al. (2017) by enlarging the Singlish treebank to more than triple the size and with much more diversity in topics, as well as further exploring neural multi-task models for integrating English syntactic knowledge. Results show that the enlarged treebank has achieved significant relative error reduction of 45.8% and 15.5% on the base model, 27% and 10% on the neural multi-task model, and 21% and 15% on the neural stacking model for POS tagging and dependency parsing, respectively. Moreover, the state-of-the-art Singlish POS tagging and dependency parsing accuracies have been improved to 91.16% and 85.57%, respectively. We make our treebanks and models available for further research.

  • Název v anglickém jazyce

    From Genesis to Creole Language: Transfer Learning for Singlish Universal Dependencies Parsing and POS Tagging

  • Popis výsledku anglicky

    Singlish can be interesting to the computational linguistics community both linguistically, as a major low-resource creole based on English, and computationally, for information extraction and sentiment analysis of regional social media. In our conference paper, Wang et al. (2017), we investigated part-of-speech (POS) tagging and dependency parsing for Singlish by constructing a treebank under the Universal Dependencies scheme and successfully used neural stacking models to integrate English syntactic knowledge for boosting Singlish POS tagging and dependency parsing, achieving the state-of-the-art accuracies of 89.50% and 84.47% for Singlish POS tagging and dependency, respectively. In this work, we substantially extend Wang et al. (2017) by enlarging the Singlish treebank to more than triple the size and with much more diversity in topics, as well as further exploring neural multi-task models for integrating English syntactic knowledge. Results show that the enlarged treebank has achieved significant relative error reduction of 45.8% and 15.5% on the base model, 27% and 10% on the neural multi-task model, and 21% and 15% on the neural stacking model for POS tagging and dependency parsing, respectively. Moreover, the state-of-the-art Singlish POS tagging and dependency parsing accuracies have been improved to 91.16% and 85.57%, respectively. We make our treebanks and models available for further research.

Klasifikace

  • Druh

    O - Ostatní výsledky

  • CEP obor

  • OECD FORD obor

    10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)

Návaznosti výsledku

  • Projekt

  • Návaznosti

Ostatní

  • Rok uplatnění

    2019

  • Kód důvěrnosti údajů

    S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů