Cost-Sensitive Strategies for Data Imbalance in Bug Severity Classification: Experimental Results
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F17%3A00100027" target="_blank" >RIV/00216224:14330/17:00100027 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1109/SEAA.2017.71" target="_blank" >http://dx.doi.org/10.1109/SEAA.2017.71</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/SEAA.2017.71" target="_blank" >10.1109/SEAA.2017.71</a>
Alternative languages
Result language
English
Original language name
Cost-Sensitive Strategies for Data Imbalance in Bug Severity Classification: Experimental Results
Original language description
Context: Software Bug Severity Classification can help to improve the software bug triaging process. However, severity levels present a high level of data imbalance that needs to be taken into account. Aim: We investigate cost-sensitive strategies in multi-class bug severity classification to counteract data imbalance. Method: We transform datasets from three severity classification papers to a common format, totaling 17 projects. We test different cost-sensitive strategies to penalize majority classes. We adopt a Support Vector Machine (SVM) classifier that we also compare to a baseline "majority class" classifier. Results: A model weighting classes based on the inverse of instance frequencies yields a statistically significant improvement (low effect size) over the standard unweighted SVM model in the assembled dataset. Conclusions: Data imbalance should be taken more into consideration in future severity classification research papers.
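The weighting idea named in the description can be illustrated with a minimal sketch. It is not the authors' code: the record does not give the exact formula or the library used, so the formula below (n_samples / (n_classes * class_count)) and the label values are assumptions chosen for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class inversely to its frequency.

    Assumed formula: n_samples / (n_classes * class_count).
    The record does not state which formula the paper uses.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def majority_class_baseline(labels):
    """Return the label a 'majority class' classifier would always predict."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical, imbalanced severity labels (not taken from the paper's data).
severities = ["normal"] * 6 + ["major"] * 3 + ["critical"] * 1

weights = inverse_frequency_weights(severities)
baseline = majority_class_baseline(severities)
```

Rare classes receive larger weights, so errors on them cost more during training. A dictionary of this shape could, for example, be passed as the `class_weight` argument of scikit-learn's `SVC`, whose `class_weight="balanced"` option uses the same formula; whether the paper relied on that library is not stated here.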
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
S - Specific research at universities
Others
Publication year
2017
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
43rd Euromicro Conference on Software Engineering and Advanced Applications (SEAA) 2017
ISBN
9781538621400
ISSN
—
e-ISSN
—
Number of pages
4
Pages from-to
426-429
Publisher name
IEEE
Place of publication
Not specified
Event location
Vienna
Event date
Jan 1, 2017
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000426074600063