From Genesis to Creole Language: Transfer Learning for Singlish Universal Dependencies Parsing and POS Tagging

The result's identifiers

Result code in IS VaVaI
RIV/00216208:11320/19:10427063 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F19%3A10427063)
Result on the web
https://doi.org/10.1145/3321128
DOI - Digital Object Identifier
—

Alternative languages

Result language
English
Original language name
From Genesis to Creole Language: Transfer Learning for Singlish Universal Dependencies Parsing and POS Tagging
Original language description
Singlish is of interest to the computational linguistics community both linguistically, as a major low-resource creole based on English, and computationally, for information extraction and sentiment analysis of regional social media. In our conference paper, Wang et al. (2017), we investigated part-of-speech (POS) tagging and dependency parsing for Singlish by constructing a treebank under the Universal Dependencies scheme, and we used neural stacking models to integrate English syntactic knowledge, achieving state-of-the-art accuracies of 89.50% and 84.47% for Singlish POS tagging and dependency parsing, respectively. In this work, we substantially extend Wang et al. (2017) by enlarging the Singlish treebank to more than triple its size with much greater topical diversity, and by further exploring neural multi-task models for integrating English syntactic knowledge. Results show that the enlarged treebank yields significant relative error reductions of 45.8% and 15.5% on the base model, 27% and 10% on the neural multi-task model, and 21% and 15% on the neural stacking model for POS tagging and dependency parsing, respectively. Moreover, the state-of-the-art Singlish POS tagging and dependency parsing accuracies improve to 91.16% and 85.57%, respectively. We make our treebanks and models available for further research.
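The description reports its gains as relative error reductions but does not spell out the formula. A minimal sketch follows, assuming the common definition (old error minus new error, divided by old error) and applying it only to the two pairs of headline accuracies quoted above. The function name relative_error_reduction is hypothetical. These headline comparisons are not the per-model figures (45.8%, 15.5%, and so on) given in the description, which compare training on the original and enlarged treebanks, and the two papers may have been evaluated on different test sets.

```python
def relative_error_reduction(old_accuracy: float, new_accuracy: float) -> float:
    """Relative error reduction in percent, given two accuracies in percent.

    Assumes the common definition: (old_error - new_error) / old_error.
    """
    old_error = 100.0 - old_accuracy
    new_error = 100.0 - new_accuracy
    if old_error <= 0:
        raise ValueError("old accuracy must be below 100%")
    return 100.0 * (old_error - new_error) / old_error


# Headline accuracies quoted in the description: Wang et al. (2017) vs. this work.
pos_rer = relative_error_reduction(89.50, 91.16)
dep_rer = relative_error_reduction(84.47, 85.57)

print(f"POS tagging: {pos_rer:.1f}%")         # prints "POS tagging: 15.8%"
print(f"Dependency parsing: {dep_rer:.1f}%")  # prints "Dependency parsing: 7.1%"
```

Expressing gains relative to the remaining error, rather than as raw accuracy differences, makes improvements comparable across tasks whose baselines differ, which is presumably why the description reports them this way.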
Czech name
—
Czech description
—

Classification

Type
O - Miscellaneous
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
—
Continuities
—

Others

Publication year
2019
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations

Similar results (10)

Cross-lingual Parsing with Polyglot Training and Multi-treebank Learning: A Faroese Case Study
Artificially Evolved Chunks for Morphosyntactic Analysis
SCTB-V2: the 2nd version of the Chinese treebank in the scientific domain
