Application of clustering methods to determine the areas of activity of candidates in recruitment for IT-companies

Authors

  • Olena Gavrylenko
  • Viktoriia Dvornyk

DOI:

https://doi.org/10.34185/1562-9945-3-134-2021-14

Keywords:

clustering, optimization, area of activity, candidate, recruiter, personnel selection

Abstract

Selecting suitable candidates from a large pool of applicants is a fundamental problem in recruitment. HR managers now have to handle extremely large amounts of data: portfolio reviews, social media screening, skill set identification, and, of course, resume analysis.
Prof. Sagar More, Bhamare Priyanka, Mali Puja and Kachave Kalyani considered the automated classification of resumes using clustering techniques. Their proposed solution relies on data mining methods: clustering is used for classification and calculation.
The aim of the article is to study clustering methods and the transformation of the clustering problem into an optimization problem, in order to improve the efficiency and quality of recommendations to recruitment managers.
In the task of determining the areas of activity of employees in recruitment for IT companies, the input information is summarized in text form. It contains all the information about the employee's professional career, as well as cover letters, essays and career guidance tests with free-form open answers.
The output is a set of professional areas of activity with the best resumes selected for each of them; that is, the input data are grouped by area of activity.
Text clustering methods are proposed for grouping and combining the input data. The c-means algorithm, a modification of the k-means method, can be used for clustering.
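As an illustration of how c-means assigns graded cluster membership, here is a minimal sketch. It assumes that the algorithm meant is fuzzy c-means, and that the resume texts have already been converted to numeric vectors by some vectorization step not shown here. The function name `fuzzy_c_means`, the fuzzifier `m`, the iteration count and the seed are hypothetical choices for this example, not values from the article.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Cluster equal-length numeric vectors; return (centers, memberships).

    memberships[i][j] is the degree to which point i belongs to cluster j.
    m > 1 is the fuzzifier (an assumed default, not taken from the article).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Start from random memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        # Memberships follow the inverse-distance rule, normalised per point.
        for i in range(n):
            dist = [max(math.dist(points[i], ctr), 1e-12) for ctr in centers]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u
```

Each row of the returned membership matrix sums to 1, so a resume that sits between two areas of activity gets a split membership instead of a single hard label.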
The method has one disadvantage: the number of clusters must be known in advance. To address this, it is proposed to present the clustering problem as an optimization problem. The «elbow» method or the «knee» method can be used to determine the optimal number of clusters.
Analysis of the results showed that the c-means method has an important advantage: it can determine the degree to which an element belongs to a cluster. In addition, the «elbow» method makes it possible to choose the optimal number of clusters.
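The «elbow» idea can be sketched as follows, under stated assumptions. Plain k-means is used here as a stand-in to compute the within-cluster sum of squared distances for each candidate number of clusters. The elbow is then taken as the point lying farthest below the straight line that joins the first and last points of that curve. This is one common heuristic and not necessarily the criterion used in the article. The function names and the deterministic farthest-point seeding are hypothetical choices, made so that the example is reproducible.

```python
import math

def kmeans_inertia(points, k, iters=50):
    """Run plain k-means; return the within-cluster sum of squared distances."""
    # Deterministic farthest-point seeding (an assumption of this sketch).
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the previous center if a cluster becomes empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)

def elbow_k(inertias):
    """Pick the elbow of an inertia curve, where inertias[i] is for k = i + 1.

    Returns the k whose point lies farthest below the chord joining the
    first and last points of the curve. Needs at least two values.
    """
    n = len(inertias)
    first, last = inertias[0], inertias[-1]

    def gap(k):
        chord = first + (last - first) * (k - 1) / (n - 1)
        return chord - inertias[k - 1]

    return max(range(1, n + 1), key=gap)
```

In use, one would compute `[kmeans_inertia(vectors, k) for k in range(1, k_max + 1)]` and pass the result to `elbow_k`. Here `vectors` and `k_max` are placeholders for the vectorized resumes and the largest number of clusters worth considering.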

References

Rout, Jayashree & Bagade, Sudhir & Yede, Pooja & Patil, Nirmiti. (2019). Personality Evaluation and CV Analysis using Machine Learning Algorithm. International Journal of Computer Sciences and Engineering. 7. 1852-1857. 10.26438/ijcse/v7i5.18521857.

Prof. Sagar More, Bhamare Priyanka, Mali Puja, Kachave Kalyani. (2019). Automated CV Classification using Clustering Technique. International Research Journal of Engineering and Technology (IRJET). Volume 6, Issue 6, Page No 302-305.

Cluster analysis [Electronic resource]. Available at: http://www.machinelearning.ru/wiki/index.php?title=Кластеризация

Dunn J.C. A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters // Journal of Cybernetics. 1973. 17 September (vol. 3, no. 3). pp. 32–57. ISSN 0022-0280. doi:10.1080/01969727308546046.

Chasovskikh A. Overview of data clustering algorithms [Electronic resource]. Available at: https://habr.com/ru/post/101338/

Korolev S., Kashnitsky Yu. Open machine learning course. Unsupervised learning: PCA and clustering [Electronic resource]. Available at: https://habr.com/ru/company/ods/blog/325654/

Published

2021-04-05