Thesis project: creating HR talent demand insights by classification of job requisitions by applying dictionary matching on job descriptions

Due to rapidly changing internal and external environments, organizations are constantly challenged to find the most qualified employees. In order to stay ahead of competitors, recruiters must find the talent needed to achieve organizational goals. Finding qualified employees requires strategic recruitment decisions, and for these decisions HR managers need insight into the external supply of talent and the internal demand for talent.

HR analysis enables data-driven decision making for HR managers by creating insights based on the available recruitment data. One of the data sources used by HR analysts is job requisition data, which is used to create insights for recruitment. The structured data in job requisitions alone does not contain the information HR analysts need to create insights about the organization's talent demand. This information is in the unstructured job description, which specifies the skills and study requirements that an applicant must meet. In order to use this information, job requisitions should be classified by the skills and study requirements mentioned in the job description.

Several methods were considered for extracting the studies and skills from the job description, but in the end dictionary matching was chosen. Literature research showed that dictionary matching was the most accurate method for smaller datasets. This method was applied in a case study with internal job requisition data, resulting in a process with four phases: data collection, data preparation, information extraction, and classification.
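As an illustration of the preparation, extraction, and classification phases, the following is a minimal sketch of dictionary matching, not the implementation used in the thesis. The dictionary entries, function names, and requisition ids are hypothetical, and whole-word matching on lowercased text is an assumption about how terms could be matched.

```python
import re

# Hypothetical dictionary: category label -> terms that signal it.
# These entries are illustrative only and are not taken from the thesis.
SKILL_DICTIONARY = {
    "python": ["python"],
    "sql": ["sql", "structured query language"],
    "project management": ["project management", "prince2"],
}


def prepare(text):
    """Data preparation: lowercase the text and collapse whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def extract_labels(description, dictionary):
    """Information extraction: return labels whose terms occur as whole words."""
    prepared = prepare(description)
    labels = set()
    for label, terms in dictionary.items():
        for term in terms:
            # Lookarounds stop "sql" from matching inside "nosql".
            pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
            if re.search(pattern, prepared):
                labels.add(label)
                break  # one matching term is enough for this label
    return labels


def classify(requisitions, dictionary):
    """Classification: map each requisition id to its matched labels."""
    return {req_id: extract_labels(text, dictionary)
            for req_id, text in requisitions.items()}


# Hypothetical job descriptions keyed by requisition id.
requisitions = {
    "R1": "Experience with Python and  SQL required.",
    "R2": "PRINCE2 certified; NoSQL knowledge is a plus.",
}
classified = classify(requisitions, SKILL_DICTIONARY)
```

A dictionary of this kind is easy to inspect and extend by hand, which may be part of why the approach suits smaller datasets where little labelled training data is available.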

After validating this process with a precision and recall analysis, it is concluded that dictionary matching on job descriptions is an accurate way to classify job requisitions. In addition, there are indications that the insights created from job requisitions classified by skills and study requirements are valuable for data-driven recruitment decisions.
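A precision and recall analysis of this kind can be sketched as follows. This is a hedged example that assumes the predicted labels are compared against manually assigned labels and that the counts are pooled over all requisitions (micro-averaging); the thesis does not state which averaging was used, and all names and data below are hypothetical.

```python
def precision_recall(predicted, actual):
    """Micro-averaged precision and recall over all labelled requisitions.

    predicted, actual: dict mapping requisition id -> set of labels.
    Only requisitions present in `actual` are evaluated.
    """
    true_pos = false_pos = false_neg = 0
    for req_id, gold in actual.items():
        pred = predicted.get(req_id, set())
        true_pos += len(pred & gold)   # labels found and correct
        false_pos += len(pred - gold)  # labels found but wrong
        false_neg += len(gold - pred)  # labels that were missed
    found = true_pos + false_pos
    relevant = true_pos + false_neg
    precision = true_pos / found if found else 0.0
    recall = true_pos / relevant if relevant else 0.0
    return precision, recall


# Hypothetical output of the matcher versus hypothetical manual labels.
predicted = {"R1": {"python", "sql"}, "R2": {"sql"}}
actual = {"R1": {"python"}, "R2": {"sql", "java", "excel"}}
precision, recall = precision_recall(predicted, actual)
```

In this toy data two of the three predicted labels are correct, while two of the four manually assigned labels are found, so precision is higher than recall.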

Rody Franken is working on this topic.

Thesis project: the prediction of voluntary employee turnover

This thesis examines how the prediction of voluntary employee turnover could bring value to organisations. A case study was performed with employee records from Deloitte Holding B.V. Four classification models were used as prediction methods, and CRISP-DM served as the guiding framework for applying data mining. The data set was re-sampled because it proved to be imbalanced. With the F1 score as the leading performance measure, Random Forest was concluded to be the best predictive model for Deloitte. The literature showed voluntary employee turnover to be dysfunctional. Hence, it was concluded that decision trees enable organisations to identify profiles that form a ‘risk’ for the organisation. Organisations can use the insights from decision trees to develop effective policies and strategies for retaining employees. However, voluntary employee turnover remains a complex phenomenon, and only a small percentage of the variance in the actual turnover decision can be explained.

Keywords: voluntary employee turnover, classification, imbalanced data
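The abstract does not say which re-sampling technique was applied. As one common option, the sketch below shows random oversampling of the minority class; the record layout, the `left` field, and the function name are assumptions made for illustration only.

```python
import random


def oversample_minority(records, label_key="left", seed=0):
    """Duplicate random minority-class records until both classes are equal in size."""
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    positives = [r for r in records if r[label_key]]
    negatives = [r for r in records if not r[label_key]]
    minority, majority = sorted([positives, negatives], key=len)
    if not minority:
        return list(records)  # only one class present, nothing to sample from
    shortfall = len(majority) - len(minority)
    extra = [rng.choice(minority) for _ in range(shortfall)]
    return majority + minority + extra


# Hypothetical employee records: 8 stayers and 2 voluntary leavers.
records = [{"id": i, "left": False} for i in range(8)]
records += [{"id": 8, "left": True}, {"id": 9, "left": True}]
balanced = oversample_minority(records)
```

Re-sampling is normally applied to the training split only, so that duplicated records do not leak into the data used to compute the F1 score.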

Koen Geerding is working on this topic.