Incorporating statistical and machine learning techniques into the optimization of correction factors for software development effort estimation
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F70883521%3A28140%2F23%3A63570759" target="_blank" >RIV/70883521:28140/23:63570759 - isvavai.cz</a>
Result on the web
<a href="https://onlinelibrary.wiley.com/doi/10.1002/smr.2611" target="_blank" >https://onlinelibrary.wiley.com/doi/10.1002/smr.2611</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1002/smr.2611" target="_blank" >10.1002/smr.2611</a>
Alternative languages
Result language
English
Title in original language
Incorporating statistical and machine learning techniques into the optimization of correction factors for software development effort estimation
Result description in original language
Accurate effort estimation is necessary for efficient management of software development projects, as it relates to human resource management. Ensemble methods, which combine multiple statistical and machine learning techniques, are more robust, reliable, and accurate than single effort estimation techniques. This study develops a stacking ensemble model based on optimization correction factors by integrating seven statistical and machine learning techniques (K-nearest neighbor, random forest, support vector regression, multilayer perceptron, gradient boosting, linear regression, and decision tree). The grid search optimization method is used to obtain valid search ranges and optimal configuration values, allowing more accurate estimation. We conducted experiments to compare the proposed method with related methods, such as use case points-based single methods, optimization correction factors-based single methods, and ensemble methods. The estimation accuracies of the methods were evaluated using statistical tests and unbiased performance measures on a total of four datasets, thus demonstrating the effectiveness of the proposed method more clearly. The proposed method maintained its estimation accuracy across the four experimental datasets and gave the best results in terms of the sum of squared errors, mean absolute error, root mean square error, mean balanced relative error, mean inverted balanced relative error, median of magnitude of relative error, and percentage of prediction (0.25). The p-value for the t-test showed that the proposed method is statistically superior to the other methods in terms of estimation accuracy. The results show that the proposed method is a comprehensive approach for improving estimation accuracy and minimizing project risks in the early stages of software development.
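The accuracy measures named in the description can be illustrated with a minimal sketch. It assumes the definitions commonly used in the effort-estimation literature (relative error taken against the actual effort; balanced and inverted balanced relative error taken against the smaller and larger of actual and predicted effort, respectively); the record does not give the formulas, so these may differ in detail from the ones used in the article. The function name `accuracy_measures` and the sample values are hypothetical, and all effort values are assumed to be positive.

```python
import math
from statistics import mean, median


def accuracy_measures(actual, predicted, level=0.25):
    """Return common effort-estimation accuracy measures as a dict.

    Assumes every actual and predicted effort is strictly positive.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

    # Absolute error per project.
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    # Magnitude of relative error (MRE): error relative to actual effort.
    mre = [e / a for e, a in zip(abs_err, actual)]

    return {
        "SSE": sum(e * e for e in abs_err),
        "MAE": mean(abs_err),
        "RMSE": math.sqrt(mean(e * e for e in abs_err)),
        # Balanced relative error: divide by the smaller of the two values.
        "MBRE": mean(e / min(a, p)
                     for e, a, p in zip(abs_err, actual, predicted)),
        # Inverted balanced relative error: divide by the larger value.
        "MIBRE": mean(e / max(a, p)
                      for e, a, p in zip(abs_err, actual, predicted)),
        "MdMRE": median(mre),
        # PRED(level): share of projects whose MRE is at most `level`.
        "PRED": sum(1 for m in mre if m <= level) / len(mre),
    }


# Hypothetical efforts for three projects (e.g. person-hours).
measures = accuracy_measures([100, 200, 400], [110, 150, 400])
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

Lower values are better for every measure except PRED, where a higher share of projects within the tolerance indicates a more accurate estimator.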
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
S - Specific university research<br>I - Institutional support for the long-term conceptual development of a research organisation
Other
Year of implementation
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Journal of Software-Evolution and Process
ISSN
2047-7473
e-ISSN
2047-7481
Volume of the periodical
Not stated
Issue of the periodical within the volume
Not stated
Country of the periodical's publisher
US - United States of America
Number of pages of the result
37
Pages from-to
1-37
UT WoS code of the article
001106698900001
EID of the result in the Scopus database
2-s2.0-85169480450