Towards improving the efficiency of software development effort estimation via clustering analysis

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F70883521%3A28140%2F22%3A63556538" target="_blank" >RIV/70883521:28140/22:63556538 - isvavai.cz</a>
Result on the web
<a href="https://ieeexplore.ieee.org/document/9803030" target="_blank" >https://ieeexplore.ieee.org/document/9803030</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ACCESS.2022.3185393" target="_blank" >10.1109/ACCESS.2022.3185393</a>

Alternative languages

Result language
English
Title in original language
Towards improving the efficiency of software development effort estimation via clustering analysis
Result description in original language
Introduction: The precise estimation of software effort is a significant difficulty that project managers encounter during software development. Inaccurate forecasting leads to either overestimating or underestimating software effort, which can be detrimental for stakeholders. The International Function Point Users Group's Function Point Analysis (FPA) method is one of the most important methods for software effort estimation. However, the practice of applying the FPA method in the same fashion across all software areas needs to be reexamined. Aim: We propose a model for evaluating the influence of data clustering on software development effort estimation and for finding the best clustering method. We call this model the effort estimation using machine learning applied to the clusters (EEAC) model. Method: We cluster the dataset according to each clustering method and then apply the FPA and EEAC methods to the resulting clusters for effort estimation. The clustering methods used in this study are five categorical variable criteria (Development Platform, Industry Sector, Language Type, Organization Type, and Relative Size) and the k-means clustering algorithm. Results: The experimental results show that the estimation accuracy obtained with clustering consistently outperforms the accuracy without clustering for both the FPA and EEAC methods. Notably, with the FPA method, the average improvement rate from clustering over no clustering was highest at 58.06%, according to the RMSE. With the EEAC method, this number reached 65.53%. The Industry Sector categorical variable achieves the best estimation accuracy compared to the other clustering criteria and k-means clustering. The improvement in accuracy in terms of the RMSE when applying this criterion is 63.68% for the FPA method and 72.02% for the EEAC method. Conclusion: Better results are obtained with dataset clustering than without it for both the FPA and EEAC methods. The Industry Sector is the most suitable clustering method among those tested.
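The comparison described in the abstract (estimate effort on the whole dataset, estimate again inside each cluster, then compare RMSE) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy records, the single-rate estimator `fit_rate` (a least-squares line through the origin, standing in for neither the FPA nor the EEAC method), and the `improvement_rate` formula, which the record itself does not define. The fit is also evaluated in-sample, which a real study would avoid by holding out test data.

```python
import math
from collections import defaultdict

# Hypothetical toy records: (industry_sector, function_points, actual_effort_hours).
# These values are invented for illustration and are not taken from the study.
PROJECTS = [
    ("Banking", 100, 1000),
    ("Banking", 200, 2100),
    ("Banking", 150, 1450),
    ("Telecom", 100, 400),
    ("Telecom", 300, 1300),
    ("Telecom", 250, 1000),
]


def fit_rate(rows):
    """Least-squares slope through the origin: effort ~ rate * size."""
    return sum(s * e for _, s, e in rows) / sum(s * s for _, s, _ in rows)


def rmse(errors):
    """Root mean squared error of a list of residuals."""
    return math.sqrt(sum(err * err for err in errors) / len(errors))


def errors_unclustered(rows):
    """Residuals when one rate is fitted to the whole dataset."""
    rate = fit_rate(rows)
    return [e - rate * s for _, s, e in rows]


def errors_clustered(rows):
    """Residuals when a separate rate is fitted inside each sector."""
    clusters = defaultdict(list)
    for row in rows:
        clusters[row[0]].append(row)  # group by the categorical variable
    errs = []
    for members in clusters.values():
        rate = fit_rate(members)
        errs.extend(e - rate * s for _, s, e in members)
    return errs


def improvement_rate(base, clustered):
    """Assumed definition: relative RMSE reduction, in percent."""
    return 100.0 * (base - clustered) / base


rmse_all = rmse(errors_unclustered(PROJECTS))
rmse_clu = rmse(errors_clustered(PROJECTS))
print(f"RMSE without clustering: {rmse_all:.1f}")
print(f"RMSE with clustering:    {rmse_clu:.1f}")
print(f"Improvement rate:        {improvement_rate(rmse_all, rmse_clu):.1f}%")
```

Swapping the grouping key for another categorical column, or for labels produced by a k-means run, would give the other clustering variants mentioned in the abstract under the same assumed evaluation.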

Classification

Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
—
Continuities
S - Specific university research<br>I - Institutional support for the long-term conceptual development of a research organisation

Others

Publication year
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Name of the periodical
IEEE Access
ISSN
2169-3536
e-ISSN
2169-3536
Volume of the periodical
10
Issue of the periodical within the volume
Not specified
Country of the periodical's publisher
US - United States of America
Number of pages
16
Pages from-to
83249-83264
UT WoS code of the article
000842087800001
EID of the result in the Scopus database
2-s2.0-85133809040

Similar results (10)

Analyzing the Effectiveness of the Gaussian Mixture Model Clustering Algorithm in Software Enhancement Effort Estimation
Improving the performance of effort estimation in terms of function point analysis by balancing datasets
Toward applying agglomerative hierarchical clustering in improving the software development effort estimation
