MultIPAs: applying program transformations to introductory programming assignments for data augmentation
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21730%2F22%3A00364210" target="_blank" >RIV/68407700:21730/22:00364210 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1145/3540250.3558931" target="_blank" >https://doi.org/10.1145/3540250.3558931</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1145/3540250.3558931" target="_blank" >10.1145/3540250.3558931</a>
Alternative languages
Result language
English
Title in the original language
MultIPAs: applying program transformations to introductory programming assignments for data augmentation
Result description in the original language
There has been a growing interest, over the last few years, in the topic of automated program repair applied to fixing introductory programming assignments (IPAs). However, the datasets of IPAs publicly available tend to be small and with no valuable annotations about the defects of each program. Small datasets are not very useful for program repair tools that rely on machine learning models. Furthermore, a large diversity of correct implementations allows computing a smaller set of repairs to fix a given incorrect program rather than always using the same set of correct implementations for a given IPA. For these reasons, there has been an increasing demand for the task of augmenting IPAs benchmarks. This paper presents MultIPAs, a program transformation tool that can augment IPAs benchmarks by: (1) applying six syntactic mutations that conserve the program's semantics and (2) applying three semantic mutilations that introduce faults in the IPAs. Moreover, we demonstrate the usefulness of MultIPAs by augmenting with millions of programs two publicly available benchmarks of programs written in the C language, and also by generating an extensive benchmark of semantically incorrect programs.
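As a rough illustration of the two kinds of transformation named in the description above, the following Python sketch applies one semantics-preserving rewrite and one fault-introducing rewrite to a line of C. The specific rewrites (mirroring a comparison, swapping `<` for `<=`), the function names, and the regex-based approach are assumptions made for illustration only; the description does not say which mutations MultIPAs implements or how it implements them.

```python
import re

# Hypothetical toy transformations on C source text. A real tool would
# presumably operate on a parsed syntax tree; this regex only handles simple
# comparisons between two identifiers or literals, such as `i < n`.
COMPARISON = re.compile(r"\b(\w+)\s*(<=|>=|<|>)\s*(\w+)\b")

# Operator to use when the two operands are swapped.
MIRRORED = {"<": ">", ">": "<", "<=": ">=", ">=": "<="}

# Operator that changes the boundary case and therefore the behaviour.
FAULTY = {"<": "<=", "<=": "<", ">": ">=", ">=": ">"}


def mirror_comparisons(src: str) -> str:
    """Semantics-preserving rewrite: turn `a < b` into `b > a`."""
    return COMPARISON.sub(
        lambda m: f"{m.group(3)} {MIRRORED[m.group(2)]} {m.group(1)}", src
    )


def break_first_comparison(src: str) -> str:
    """Fault-introducing rewrite: change the first comparison operator."""
    return COMPARISON.sub(
        lambda m: f"{m.group(1)} {FAULTY[m.group(2)]} {m.group(3)}", src, count=1
    )


program = "for (i = 0; i < n; i++) sum += i;"

equivalent_variant = mirror_comparisons(program)
faulty_variant = break_first_comparison(program)

print(equivalent_variant)  # for (i = 0; n > i; i++) sum += i;
print(faulty_variant)      # for (i = 0; i <= n; i++) sum += i;
```

The first variant is a new, still-correct program, which is the kind of output that would enlarge a set of correct submissions. The second runs one iteration too many, which is the kind of output that would populate a benchmark of incorrect programs with a known defect.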
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
I - Institutional support for the long-term conceptual development of a research organization
Other
Year of publication
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the article in the proceedings
ESEC/FSE 2022: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
ISBN
978-1-4503-9413-0
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
1657-1661
Publisher name
ACM
Place of publication
New York
Event location
Singapore
Event date
14 November 2022
Event type by nationality of participants
WRD - Worldwide event
UT WoS code of the article
001118262900146