NewMove: Customizing Text-to-Video Models with Novel Motions
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21730%2F24%3A00380618" target="_blank" >RIV/68407700:21730/24:00380618 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1007/978-981-96-0917-8_7" target="_blank" >https://doi.org/10.1007/978-981-96-0917-8_7</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-981-96-0917-8_7" target="_blank" >10.1007/978-981-96-0917-8_7</a>
Alternative languages
Result language
English
Title in the original language
NewMove: Customizing Text-to-Video Models with Novel Motions
Result description in the original language
We introduce an approach for augmenting text-to-video generation models with novel motions, extending their capabilities beyond the motions contained in the original training data. Given a few video samples demonstrating specific movements as input, our method learns and generalizes the input motion patterns to diverse, text-specified scenarios. Our method finetunes an existing text-to-video model to learn a novel mapping between the motion depicted in the input examples and a new unique token. To avoid overfitting to the new custom motion, we introduce an approach for regularization over videos. Leveraging the motion priors in a pretrained model, our method can learn a generalized motion pattern that can be invoked in novel videos featuring multiple people performing the custom motion, or combining the motion with other motions. To validate our method, we quantitatively evaluate the learned custom motion and perform a systematic ablation study. We show that our method significantly outperforms prior appearance-based customization approaches when extended to the motion customization task. Project webpage: https://joaanna.github.io/customizing_motion/.
Title in English
NewMove: Customizing Text-to-Video Models with Novel Motions
Result description in English
We introduce an approach for augmenting text-to-video generation models with novel motions, extending their capabilities beyond the motions contained in the original training data. Given a few video samples demonstrating specific movements as input, our method learns and generalizes the input motion patterns to diverse, text-specified scenarios. Our method finetunes an existing text-to-video model to learn a novel mapping between the motion depicted in the input examples and a new unique token. To avoid overfitting to the new custom motion, we introduce an approach for regularization over videos. Leveraging the motion priors in a pretrained model, our method can learn a generalized motion pattern that can be invoked in novel videos featuring multiple people performing the custom motion, or combining the motion with other motions. To validate our method, we quantitatively evaluate the learned custom motion and perform a systematic ablation study. We show that our method significantly outperforms prior appearance-based customization approaches when extended to the motion customization task. Project webpage: https://joaanna.github.io/customizing_motion/.
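The description above gives the recipe only at a high level. The following minimal sketch illustrates one way the stated ingredients could fit together in code: register a new unique token for the custom motion, finetune the pretrained text-to-video model on the captioned example clips, and add a regularization loss over generic videos to preserve the model's motion prior. This is an assumption-laden illustration in the style of DreamBooth/Textual Inversion customization, not the authors' actual implementation; the model wrapper, its denoising_loss method, and the clip iterators are hypothetical placeholders.

import torch

def customize_motion(model, tokenizer, motion_clips, reg_clips,
                     new_token="<new-move>", steps=1000, lr=1e-5, reg_weight=1.0):
    """Hypothetical sketch: teach a pretrained text-to-video diffusion model
    that `new_token` denotes the motion shown in a few example clips.

    `motion_clips` and `reg_clips` are assumed to be iterators yielding
    (video_tensor, caption) pairs; `model.denoising_loss` stands in for the
    usual diffusion training loss (add noise at a random timestep, predict
    it, take the MSE). None of these names come from the paper.
    """
    # Register a unique token for the custom motion and make room for its
    # trainable embedding (standard HuggingFace-style calls).
    tokenizer.add_tokens([new_token])
    model.text_encoder.resize_token_embeddings(len(tokenizer))

    optim = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(steps):
        # Denoising loss on the motion examples, whose captions contain the
        # new token, e.g. "a person doing <new-move>".
        video, caption = next(motion_clips)
        loss = model.denoising_loss(video, caption)

        # Regularization over generic videos with ordinary captions, so the
        # model does not overfit the few examples and forget its motion prior.
        reg_video, reg_caption = next(reg_clips)
        loss = loss + reg_weight * model.denoising_loss(reg_video, reg_caption)

        optim.zero_grad()
        loss.backward()
        optim.step()
    return model

After finetuning, prompts containing the new token (for example, "two people doing <new-move> on a beach") would then invoke the learned motion in new, text-specified scenarios, matching the generalization behavior the description claims.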
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
N - Research activity supported from non-public sources
Others
Year of implementation
2024
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific to the result type
Proceedings title
Computer Vision – ACCV 2024; 17th Asian Conference on Computer Vision, Hanoi, Vietnam, December 8–12, 2024, Proceedings, Part V
ISBN
978-981-96-0916-1
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
18
Pages from-to
1634-1651
Publisher name
Springer
Place of publication
Cham
Event location
Hanoi
Event date
December 8, 2024
Event type by nationality
WRD - Worldwide event
UT WoS article code
—