Data Mining of Job Requirements in Online Job Advertisements Using Machine Learning and SDCA Logistic Regression
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61988987%3A17310%2F21%3AA2202BH0" target="_blank" >RIV/61988987:17310/21:A2202BH0 - isvavai.cz</a>
Result on the web
<a href="https://www.mdpi.com/2227-7390/9/19/2475" target="_blank" >https://www.mdpi.com/2227-7390/9/19/2475</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3390/math9192475" target="_blank" >10.3390/math9192475</a>
Alternative languages
Result language
English
Original language name
Data Mining of Job Requirements in Online Job Advertisements Using Machine Learning and SDCA Logistic Regression
Original language description
There are currently many job portals offering job positions in the form of job advertisements. In this article, we propose an approach to mining data from job advertisements on job portals, focusing mainly on extracting job requirements from individual advertisements. The proposed system consists of a data mining module, a machine learning module, and a postprocessing module. The machine learning module is based on SDCA logistic regression. The postprocessing module includes several approaches to increase the success rate of job requirement identification. The proposed system was verified on the 20 most searched IT job positions from the selected job portal. In total, 9971 job advertisements were analyzed. In verification, the system found all job requirements in 80% of the analyzed advertisements. The detected job requirements were also compared with the Open Skills database. Based on this database and the extension of IT job positions with other typical job skills, we created a list of the most frequent job skills in selected IT job positions. The main contribution is the development of a universal system to detect job requirements in job advertisements. The proposed approach can be used not only for IT positions but also for other job positions. The presented data mining module can also be used for various job portals.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
S - Specific research at higher education institutions
Others
Publication year
2021
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Mathematics
ISSN
2227-7390
e-ISSN
—
Volume of the periodical
9
Issue of the periodical within the volume
19
Country of publishing house
CH - SWITZERLAND
Number of pages
32
Pages from-to
—
UT code for WoS article
000707511500001
EID of the result in the Scopus database
2-s2.0-85116331370