Data Mining of Job Requirements in Online Job Advertisements Using Machine Learning and SDCA Logistic Regression
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61988987%3A17310%2F21%3AA2202BH0" target="_blank" >RIV/61988987:17310/21:A2202BH0 - isvavai.cz</a>
Result on the web
<a href="https://www.mdpi.com/2227-7390/9/19/2475" target="_blank" >https://www.mdpi.com/2227-7390/9/19/2475</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3390/math9192475" target="_blank" >10.3390/math9192475</a>
Alternative languages
Result language
English
Title in the original language
Data Mining of Job Requirements in Online Job Advertisements Using Machine Learning and SDCA Logistic Regression
Result description in the original language
Many job portals currently offer job positions in the form of job advertisements. In this article, we propose an approach to mining data from job advertisements on job portals, focusing mainly on extracting job requirements from individual advertisements. The proposed system consists of a data mining module, a machine learning module, and a postprocessing module. The machine learning module is based on SDCA logistic regression. The postprocessing module includes several approaches to increase the success rate of job requirement identification. The proposed system was verified on the 20 most searched IT job positions from the selected job portal. In total, 9971 job advertisements were analyzed. In verification, the system found all job requirements in 80% of the analyzed advertisements. The detected job requirements were also compared with the Open Skills database. Based on this database and the extension of IT job positions with other typical job skills, we created a list of the most frequent job skills in the selected IT job positions. The main contribution is the development of a universal system to detect job requirements in job advertisements. The proposed approach can be used not only for IT positions but also for various other job positions. The presented data mining module can also be used for various job portals.
Title in English
Data Mining of Job Requirements in Online Job Advertisements Using Machine Learning and SDCA Logistic Regression
Result description in English
Many job portals currently offer job positions in the form of job advertisements. In this article, we propose an approach to mining data from job advertisements on job portals, focusing mainly on extracting job requirements from individual advertisements. The proposed system consists of a data mining module, a machine learning module, and a postprocessing module. The machine learning module is based on SDCA logistic regression. The postprocessing module includes several approaches to increase the success rate of job requirement identification. The proposed system was verified on the 20 most searched IT job positions from the selected job portal. In total, 9971 job advertisements were analyzed. In verification, the system found all job requirements in 80% of the analyzed advertisements. The detected job requirements were also compared with the Open Skills database. Based on this database and the extension of IT job positions with other typical job skills, we created a list of the most frequent job skills in the selected IT job positions. The main contribution is the development of a universal system to detect job requirements in job advertisements. The proposed approach can be used not only for IT positions but also for various other job positions. The presented data mining module can also be used for various job portals.
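The description names SDCA (stochastic dual coordinate ascent) logistic regression as the basis of the machine learning module but gives no implementation details. The following is a minimal sketch of the general technique only, not the authors' system. It assumes binary bag-of-words features without a bias term, L2 regularization, and a bisection search for each dual coordinate update. The function names, the parameter values, and the toy sentences are hypothetical and are not taken from the article.

```python
import math
import random

# Hypothetical toy data, invented for illustration only.
# Label +1 marks a job requirement, -1 marks any other sentence.
TOY_DATA = [
    ("experience with java and sql required", 1),
    ("knowledge of python required", 1),
    ("degree in computer science required", 1),
    ("experience with linux administration", 1),
    ("we offer flexible working hours", -1),
    ("our company offers free lunch", -1),
    ("we offer a friendly team", -1),
    ("the office is located near prague", -1),
]


def featurize(sentence):
    """Binary bag-of-words features as a sparse dict {word: 1.0}."""
    return {word: 1.0 for word in sentence.lower().split()}


def dot(w, x):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(k, 0.0) * v for k, v in x.items())


def train_sdca_logreg(xs, ys, lam=0.01, epochs=20, seed=0):
    """SDCA for L2-regularized logistic regression.

    Keeps one dual variable alpha[i] in (0, 1) per example and maintains
    w = (1 / (lam * n)) * sum_i alpha[i] * ys[i] * xs[i].
    """
    n = len(xs)
    alpha = [0.0] * n
    w = {}
    rng = random.Random(seed)
    order = list(range(n))
    for _epoch in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = xs[i], ys[i]
            margin = y * dot(w, x)
            curvature = sum(v * v for v in x.values()) / (lam * n)

            def dual_gradient(a):
                # Derivative of the dual objective in coordinate i,
                # evaluated at the candidate value a. It decreases in a.
                return math.log((1.0 - a) / a) - margin - (a - alpha[i]) * curvature

            # Bisection for the root of the decreasing gradient on (0, 1).
            lo, hi = 1e-12, 1.0 - 1e-12
            for _step in range(60):
                mid = 0.5 * (lo + hi)
                if dual_gradient(mid) > 0.0:
                    lo = mid
                else:
                    hi = mid
            new_alpha = 0.5 * (lo + hi)

            # Update the primal weights to match the new dual variable.
            scale = (new_alpha - alpha[i]) * y / (lam * n)
            for k, v in x.items():
                w[k] = w.get(k, 0.0) + scale * v
            alpha[i] = new_alpha
    return w


def predict(w, sentence):
    """Return +1 (requirement) or -1 (other) for a sentence."""
    return 1 if dot(w, featurize(sentence)) > 0.0 else -1


if __name__ == "__main__":
    xs = [featurize(s) for s, _ in TOY_DATA]
    ys = [y for _, y in TOY_DATA]
    weights = train_sdca_logreg(xs, ys)
    for text in ("experience with python", "we offer free lunch"):
        print(text, "->", predict(weights, text))
```

At the optimum each dual variable equals the logistic function of the negative margin, so the weights coincide with the regularized logistic regression solution. A production system would more likely rely on an existing library trainer than on a hand-written loop like this one.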
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
S - Specific university research
Other
Year of implementation
2021
Data confidentiality code
S - Complete and truthful data about the project are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
Mathematics
ISSN
2227-7390
e-ISSN
—
Periodical volume
9
Issue of the periodical within the volume
19
Country of the periodical's publisher
CH - Swiss Confederation
Number of pages of the result
32
Pages from-to
—
UT WoS code of the article
000707511500001
EID of the result in the Scopus database
2-s2.0-85116331370