A Subsampling Line-Search Method with Second-Order Results
Result identifiers
Result code in IS VaVaI
RIV/68407700:21230/22:00363766 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F22%3A00363766)
Result on the web
https://doi.org/10.1287/ijoo.2022.0072
DOI - Digital Object Identifier
10.1287/ijoo.2022.0072 (http://dx.doi.org/10.1287/ijoo.2022.0072)
Alternative languages
Result language
English
Title in original language
A Subsampling Line-Search Method with Second-Order Results
Result description in original language
In many contemporary optimization problems such as those arising in machine learning, it can be computationally challenging or even infeasible to evaluate an entire function or its derivatives. This motivates the use of stochastic algorithms that sample problem data, which can jeopardize the guarantees obtained through classical globalization techniques in optimization, such as a line search. Using subsampled function values is particularly challenging for the latter strategy, which relies upon multiple evaluations. For nonconvex data-related problems, such as training deep learning models, one aims at developing methods that converge to second-order stationary points quickly, that is, escape saddle points efficiently. This is particularly difficult to ensure when one only accesses subsampled approximations of the objective and its derivatives. In this paper, we describe a stochastic algorithm based on negative curvature and Newton-type directions that are computed for a subsampling model of the objective. A line-search technique is used to enforce suitable decrease for this model; for a sufficiently large sample, a similar amount of reduction holds for the true objective. We then present worst-case complexity guarantees for a notion of stationarity tailored to the subsampling context. Our analysis encompasses the deterministic regime and allows us to identify sampling requirements for second-order line-search paradigms. As we illustrate through real data experiments, these worst-case estimates need not be satisfied for our method to be competitive with first-order strategies in practice.
Title in English
A Subsampling Line-Search Method with Second-Order Results
Result description in English
In many contemporary optimization problems such as those arising in machine learning, it can be computationally challenging or even infeasible to evaluate an entire function or its derivatives. This motivates the use of stochastic algorithms that sample problem data, which can jeopardize the guarantees obtained through classical globalization techniques in optimization, such as a line search. Using subsampled function values is particularly challenging for the latter strategy, which relies upon multiple evaluations. For nonconvex data-related problems, such as training deep learning models, one aims at developing methods that converge to second-order stationary points quickly, that is, escape saddle points efficiently. This is particularly difficult to ensure when one only accesses subsampled approximations of the objective and its derivatives. In this paper, we describe a stochastic algorithm based on negative curvature and Newton-type directions that are computed for a subsampling model of the objective. A line-search technique is used to enforce suitable decrease for this model; for a sufficiently large sample, a similar amount of reduction holds for the true objective. We then present worst-case complexity guarantees for a notion of stationarity tailored to the subsampling context. Our analysis encompasses the deterministic regime and allows us to identify sampling requirements for second-order line-search paradigms. As we illustrate through real data experiments, these worst-case estimates need not be satisfied for our method to be competitive with first-order strategies in practice.
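The abstract outlines the algorithmic pattern: build a model of the objective from a subsample of the data, take a Newton-type direction when the subsampled Hessian has sufficiently positive curvature and a negative-curvature direction otherwise, and backtrack until the model decreases sufficiently. The Python sketch below is only an illustration of that pattern, not the paper's exact method; the batch callbacks `f_batch`, `grad_batch`, `hess_batch` and the simplified Armijo-style acceptance test are assumptions made for the example (the paper uses decrease conditions tailored to each direction type and explicit sample-size requirements).

```python
import numpy as np

def subsampled_linesearch_step(x, data, f_batch, grad_batch, hess_batch,
                               sample_size, rho=1e-4, backtrack=0.5,
                               max_ls=30, rng=None):
    """One illustrative iteration of a subsampled second-order line search.

    `data` is an array of samples; `f_batch`, `grad_batch`, `hess_batch`
    evaluate the objective, gradient, and Hessian on a mini-batch (assumed
    interfaces for this sketch).
    """
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(len(data), size=sample_size, replace=False)
    batch = data[idx]

    g = grad_batch(x, batch)          # subsampled gradient
    H = hess_batch(x, batch)          # subsampled Hessian
    eigvals, eigvecs = np.linalg.eigh(H)

    if eigvals[0] > 1e-8:
        # Model Hessian numerically positive definite: Newton-type direction.
        d = -np.linalg.solve(H, g)
    else:
        # Negative (or near-zero) curvature: follow the eigenvector of the
        # smallest eigenvalue, oriented to be a descent direction.
        v = eigvecs[:, 0]
        d = -v if g @ v > 0 else v

    # Backtracking line search enforcing sufficient decrease of the
    # subsampled objective (simplified Armijo-style condition).
    f0 = f_batch(x, batch)
    slope = g @ d
    t = 1.0
    for _ in range(max_ls):
        if f_batch(x + t * d, batch) <= f0 + rho * t * min(slope, 0.0):
            break
        t *= backtrack
    return x + t * d
```

In practice the sample size would be chosen large enough that, as the abstract states, a decrease achieved on the subsampled model also holds (with high probability) for the true objective.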
Classification
Type
J_ost - Other articles in peer-reviewed periodicals
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
EF16_019/0000765: Výzkumné centrum informatiky (Research Center for Informatics)
Continuities
P - Research and development project financed from public sources (with a link to CEP)
S - Specific research at universities
Others
Year of implementation
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
INFORMS JOURNAL ON OPTIMIZATION
ISSN
2575-1484
e-ISSN
2575-1492
Volume of the periodical
4
Issue of the periodical within the volume
4
Country of the publishing house
US - United States of America
Number of pages of the result
23
Pages from-to
403-425
UT WoS code of the article
—
EID of the result in the Scopus database
—