Early Stopping of Non-productive Performance Testing Experiments Using Measurement Mutations

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3A10474396" target="_blank" >RIV/00216208:11320/23:10474396 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1109/SEAA60479.2023.00022" target="_blank" >https://doi.org/10.1109/SEAA60479.2023.00022</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/SEAA60479.2023.00022" target="_blank" >10.1109/SEAA60479.2023.00022</a>

Alternative languages

Result language
English
Title in original language
Early Stopping of Non-productive Performance Testing Experiments Using Measurement Mutations
Result description in original language
Modern software projects often incorporate some form of performance testing into their development cycle, intending to detect changes in performance between commits or releases. Performance testing generally relies on experimental evaluation using various benchmark workloads. To detect performance changes reliably, benchmarks must be executed many times to account for variability in the measurement results. While considered best practice, this approach can become prohibitively expensive when the number of versions and benchmark workloads increases. To alleviate the cost of performance testing, we propose an approach for the early stopping of non-productive experiments that are unlikely to detect a performance bug in a particular benchmark. The stopping conditions are based on benchmark-specific thresholds determined from historical data modified to emulate the potential effects of software changes on benchmark performance. We evaluate the approach on the GraalVM benchmarking project and show that it can eliminate about 50% of the experiments if we can afford to ignore about 15% of the least significant performance changes.

Classification

Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
—
Continuities
S - Specific research at universities
I - Institutional support for the long-term conceptual development of a research organisation

Others

Publication year
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Title of the article in the proceedings
2023 49th Euromicro Conference on Software Engineering and Advanced Applications (SEAA)
ISBN
979-8-3503-4235-2
ISSN
2640-592X
e-ISSN
2376-9521
Number of pages
8
Pages from-to
86-93
Publisher name
IEEE
Place of publication
Los Alamitos
Event location
Durres, Albania
Event date
6 September 2023
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—

Similar results (10)

Automatizovaná detekce změn výkonu: Zkušenosti z projektu Mono (Automated Detection of Performance Changes: Experience from the Mono Project)
Context-Tailored Workload Model Generation for Continuous Representative Load Testing
Reducing Experiment Costs in Automated Software Performance Regression Detection
