Finetuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F23%3APU149378" target="_blank" >RIV/00216305:26230/23:PU149378 - isvavai.cz</a>
Result on the web
<a href="https://pero.fit.vutbr.cz/publications" target="_blank" >https://pero.fit.vutbr.cz/publications</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-031-41685-9_17" target="_blank" >10.1007/978-3-031-41685-9_17</a>
Alternative languages
Result language
English
Original language name
Finetuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition
Original language description
In many machine learning tasks, a large general dataset and a small specialized dataset are available. In such situations, various domain adaptation methods can be used to adapt a general model to the target dataset. We show that in the case of neural networks trained for handwriting recognition using CTC, simple finetuning with data augmentation works surprisingly well in such scenarios and that it is resistant to overfitting even for very small target domain datasets. We evaluated the behavior of finetuning with respect to augmentation, training data size, and quality of the pre-trained network, both in writer-dependent and writer-independent settings. On a large real-world dataset, finetuning provided an average relative CER improvement of 25 % with 16 text lines for new writers and 50 % for 256 text lines.
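The relative CER improvement quoted in the description can be illustrated with a short sketch. It assumes the common definition of character error rate (character-level edit distance divided by the number of reference characters) and of relative improvement (the drop in CER divided by the baseline CER). The function names are hypothetical, the numbers in the usage note are invented for illustration, and the record does not say how the authors aggregated results across writers.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # drop a reference character
                curr[j - 1] + 1,         # add a hypothesis character
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(refs: list[str], hyps: list[str]) -> float:
    """Character error rate over paired reference/hypothesis text lines."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    chars = sum(len(r) for r in refs)
    return errors / chars


def relative_improvement(cer_before: float, cer_after: float) -> float:
    """Fraction of the baseline CER removed (assumed definition)."""
    return (cer_before - cer_after) / cer_before
```

With these definitions, a hypothetical model whose CER falls from 0.08 before finetuning to 0.06 after it shows a relative improvement of about 0.25, that is, 25 %.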
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organization
Others
Publication year
2023
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Document Analysis and Recognition - ICDAR 2023
ISBN
978-3-031-41684-2
ISSN
0302-9743
e-ISSN
—
Number of pages
18
Pages from-to
269-286
Publisher name
Springer Nature Switzerland AG
Place of publication
San José
Event location
San José, California, USA
Event date
Aug 21, 2023
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—