An LP-based hyperparameter optimization model for language modeling
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27510%2F18%3A10239458" target="_blank" >RIV/61989100:27510/18:10239458 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/article/10.1007/s11227-018-2236-6" target="_blank" >https://link.springer.com/article/10.1007/s11227-018-2236-6</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s11227-018-2236-6" target="_blank" >10.1007/s11227-018-2236-6</a>
Alternative languages
Result language
English
Original language name
An LP-based hyperparameter optimization model for language modeling
Original language description
To find hyperparameters for a machine learning model, algorithms such as grid search or random search are used over the space of possible values of the model's hyperparameters. These search algorithms select the solution that minimizes a specific cost function. In language models, perplexity is one of the most popular cost functions. In this study, we propose a fractional nonlinear programming model that finds the optimal perplexity value. The special structure of the model allows us to approximate it by a linear programming model that can be solved using the well-known simplex algorithm. To the best of our knowledge, this is the first attempt to use optimization techniques to find perplexity values in the language modeling literature. We apply our model to find hyperparameters of a language model and compare it to the grid search algorithm. Furthermore, we illustrate that it results in lower perplexity values. We perform this experiment on a real-world dataset from SwiftKey to validate our proposed approach. © 2018 Springer Science+Business Media, LLC, part of Springer Nature
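As a rough illustration of the grid search baseline mentioned in the description, the sketch below scores a single hyperparameter by perplexity. It is not the paper's LP model. The interpolation weight `lam`, the function names, and the toy probabilities are all assumptions made for this example, since the description does not say which language model or hyperparameters were tuned.

```python
import math

def perplexity(probs):
    """Perplexity of a sequence, given the model probability of each token."""
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))

def interpolated_probs(lam, p_bigram, p_unigram):
    """Linear interpolation per token: lam * bigram + (1 - lam) * unigram."""
    return [lam * b + (1.0 - lam) * u for b, u in zip(p_bigram, p_unigram)]

def grid_search(p_bigram, p_unigram, steps=11):
    """Try evenly spaced weights in [0, 1] and keep the lowest perplexity."""
    best_lam, best_ppl = None, float("inf")
    for i in range(steps):
        lam = i / (steps - 1)
        ppl = perplexity(interpolated_probs(lam, p_bigram, p_unigram))
        if ppl < best_ppl:
            best_lam, best_ppl = lam, ppl
    return best_lam, best_ppl

# Hypothetical held-out token probabilities, invented for illustration only.
P_BIGRAM = [0.50, 0.01, 0.40, 0.02]
P_UNIGRAM = [0.10, 0.20, 0.05, 0.15]

if __name__ == "__main__":
    lam, ppl = grid_search(P_BIGRAM, P_UNIGRAM)
    print(f"best weight: {lam:.1f}, perplexity: {ppl:.3f}")
```

Grid search only evaluates the weights on the grid, so a finer grid costs proportionally more evaluations. That cost is the motivation the description gives for replacing the search with an optimization model.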
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10102 - Applied mathematics
Result continuities
Project
The result was created during the realization of more than one project.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2018
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Journal of Supercomputing
ISSN
0920-8542
e-ISSN
—
Volume of the periodical
74
Issue of the periodical within the volume
5
Country of publishing house
NL - THE KINGDOM OF THE NETHERLANDS
Number of pages
10
Pages from-to
2151-2160
UT code for WoS article
000430412400016
EID of the result in the Scopus database
2-s2.0-85040232951