NewMove: Customizing Text-to-Video Models with Novel Motions
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21730%2F24%3A00380618" target="_blank" >RIV/68407700:21730/24:00380618 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1007/978-981-96-0917-8_7" target="_blank" >https://doi.org/10.1007/978-981-96-0917-8_7</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-981-96-0917-8_7" target="_blank" >10.1007/978-981-96-0917-8_7</a>
Alternative languages
Result language
English
Original language name
NewMove: Customizing Text-to-Video Models with Novel Motions
Original language description
We introduce an approach for augmenting text-to-video generation models with novel motions, extending their capabilities beyond the motions contained in the original training data. With a few video samples demonstrating specific movements as input, our method learns and generalizes the input motion patterns for diverse, text-specified scenarios. Our method finetunes an existing text-to-video model to learn a novel mapping between the depicted motion in the input examples and a new unique token. To avoid overfitting to the new custom motion, we introduce an approach for regularization over videos. Leveraging the motion priors in a pretrained model, our method can learn a generalized motion pattern that can be invoked with novel videos featuring multiple people doing the custom motion, or using the motion in combination with other motions. To validate our method, we quantitatively evaluate the learned custom motion and perform a systematic ablation study. We show that our method significantly outperforms prior appearance-based customization approaches when extended to the motion customization task. Project webpage: https://joaanna.github.io/customizing_motion/.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
N - Research activity supported from non-public sources
Others
Publication year
2024
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Computer Vision – ACCV 2024; 17th Asian Conference on Computer Vision, Hanoi, Vietnam, December 8–12, 2024, Proceedings, Part V
ISBN
978-981-96-0916-1
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
18
Pages from-to
1634-1651
Publisher name
Springer
Place of publication
Cham
Event location
Hanoi
Event date
Dec 8, 2024
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—