CEU eTD Collection (2025); Joiya, Saad Wali Muhammad: Matching Employees to Projects

CEU Electronic Theses and Dissertations, 2025
Author Joiya, Saad Wali Muhammad
Title Matching Employees to Projects
Summary This project designs a data-driven employee–project matching system to support staffing decisions at scale. The client currently matches employees to incoming projects manually, a process that is slow, labor-intensive, hard to scale, and prone to human error. The client also lacks an organized framework for collecting project and employee data. A proof of concept is therefore needed that streamlines end-to-end data collection, from the feature list to feature data types, uses this data to build a framework for scoring employees against projects, and presents the results to the user interactively.
In large organizations, traditional project staffing often relies on intuition, which makes it difficult to assess candidate suitability objectively, especially when evaluating large teams or highly specialized roles. Where projects vary greatly in scope and skill requirements, matching resources to projects well is vital for efficiency and delivery. To address this, our system uses structured and unstructured data to score employees on their alignment with project requirements. The goal is to help project managers and HR teams identify the most suitable employees for a given project by evaluating multiple dimensions of fit, thereby reducing manual effort, bias, and inefficiency in resource allocation. Recommendations based on matching features can also make employee allocation more accurate and consistent. Because little prior work has been done in this area on the client’s side, the system leaves considerable room for improvement over time.
We simulated realistic employee and project datasets using thematically aligned roles, summaries, and domain-specific pools for skills, products, certifications, and languages. These were based on industry research and on job descriptions inspired by our client’s organizational structure. The thematic structure helped ensure that the matching logic made practical sense: for example, a legal advisor would not be matched to a cloud infrastructure rollout. Each employee and project record includes detailed features such as job descriptions, role summaries, skills with proficiency levels, language fluency, certifications, availability, and soft skills such as cultural awareness and leadership experience. We then defined corresponding features in both tables for scoring purposes.
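The thematic simulation described above can be illustrated with a minimal Python sketch. It is not the project's actual generator: the theme names, pool contents, proficiency levels, and the functions `make_employee` and `make_employees` are all hypothetical and chosen only to show how drawing roles and skills from the same themed pool keeps each record internally consistent.

```python
import random

# Hypothetical themed pools; the real project used pools derived from
# industry research and the client's organizational structure.
THEMES = {
    "cloud": {
        "roles": ["Cloud Engineer", "DevOps Engineer"],
        "skills": ["Kubernetes", "Terraform", "AWS", "Networking"],
    },
    "legal": {
        "roles": ["Legal Advisor", "Compliance Analyst"],
        "skills": ["Contract Law", "GDPR", "Risk Assessment"],
    },
}
LEVELS = ["basic", "intermediate", "expert"]  # assumed proficiency scale


def make_employee(rng: random.Random, emp_id: int) -> dict:
    """Build one simulated employee whose role and skills share a theme."""
    theme = rng.choice(sorted(THEMES))
    pool = THEMES[theme]
    skills = rng.sample(pool["skills"], k=2)
    return {
        "id": emp_id,
        "theme": theme,
        "role": rng.choice(pool["roles"]),
        "skills": {skill: rng.choice(LEVELS) for skill in skills},
    }


def make_employees(n: int, seed: int = 0) -> list[dict]:
    """Generate n employees reproducibly from a fixed seed."""
    rng = random.Random(seed)
    return [make_employee(rng, i) for i in range(n)]


if __name__ == "__main__":
    for record in make_employees(3, seed=42):
        print(record)
```

Because every field of a record is sampled from a single theme, a simulated legal advisor can never carry cloud infrastructure skills, which is the property the thematic structure is meant to guarantee.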
Matching is performed using a multi-criteria scoring framework. This includes similarity scoring between text-based features, fuzzy matching for fields with human-written inputs (e.g., skills, locations, languages), keyword overlap for list-based features, multi-step statistical scoring algorithms for numeric fields, and additional bonus scores for employee values and characteristics. We also incorporated logic to filter out employees who are not available within a project’s timeline. All these scores are weighted, so decision-makers can prioritize what matters most (skills, availability, certifications, etc.), and are rolled into a final score that ranks the best-fit employees for each project. The result is a flexible, interpretable, and fair system that offers a scalable foundation for intelligent workforce planning.
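The weighted scoring described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's implementation: it covers only two of the criteria (fuzzy skill matching and keyword overlap on certifications) plus the availability filter, it uses the standard library's `difflib.SequenceMatcher` as a stand-in for whatever fuzzy matcher the project used, and the class names, field names, the 0.8 threshold, and the weights are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher


@dataclass
class Employee:
    name: str
    skills: list[str]
    certifications: set[str]
    available_from: date


@dataclass
class Project:
    title: str
    required_skills: list[str]
    required_certifications: set[str]
    start: date


def fuzzy_ratio(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] for human-written strings."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def skill_score(emp: Employee, proj: Project, threshold: float = 0.8) -> float:
    """Share of required skills that fuzzily match any employee skill."""
    if not proj.required_skills:
        return 1.0
    hits = sum(
        1
        for required in proj.required_skills
        if any(fuzzy_ratio(required, s) >= threshold for s in emp.skills)
    )
    return hits / len(proj.required_skills)


def overlap_score(have: set[str], need: set[str]) -> float:
    """Keyword overlap: share of needed items the employee holds."""
    if not need:
        return 1.0
    return len(have & need) / len(need)


def rank(employees, project, weights, top_n=3):
    """Filter by availability, then rank by the weighted average score."""
    total_weight = sum(weights.values())
    ranked = []
    for emp in employees:
        if emp.available_from > project.start:
            continue  # not available when the project starts
        parts = {
            "skills": skill_score(emp, project),
            "certifications": overlap_score(
                emp.certifications, project.required_certifications
            ),
        }
        final = sum(weights[k] * parts[k] for k in parts) / total_weight
        ranked.append((emp.name, round(final, 3)))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]
```

Keeping each criterion as a separate function returning a value in [0, 1] means the per-criterion scores can be shown next to the final score, and changing the `weights` dictionary is enough to shift the ranking toward skills or certifications.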
For demonstration purposes, we implemented the complete process in a Streamlit dashboard that interactively displays the top employees for each project based on feature weights, with a detailed comparison and explanation for each proposed employee ranking. The dashboard was deployed on a VM using Docker for easy reproducibility and accessibility.
For easy reproducibility and collaboration, we produced a Capstone Project Report, a Jupyter Notebook, a Streamlit app, and a GitHub repository. The Capstone Project Report contains the project details and a complete overview of the data, the matching and scoring mechanisms, and the dashboard deployment. The Jupyter Notebook contains a proof of concept of the complete process, from data creation to final scoring, with all dependencies and explanations. The Streamlit app is deployed on a VM and is publicly accessible for demonstration purposes. Finally, all these artifacts, including the reports, code, and requirements, are available in a public GitHub repository.
The implemented approach is highly interpretable and easy to grasp even for people with non-technical backgrounds. Weights control how employees are ranked and which features contribute more than others to the final score. The scores were validated using an arena approach in collaboration with the client. The method does not have the usual limitations of a machine learning model, which makes it highly dynamic and extendable. It also gives the client room to adjust the model to their expectations in the future: for example, more features can be added over time, fuzzy-matching thresholds can be tuned, and scoring functions can be easily updated. Furthermore, the approach empowers the client to unlock the full potential of their human capital, fostering both project success and employee growth in an increasingly dynamic business environment. This Capstone Project not only addresses a tangible business problem but also contributes to the growing understanding of how AI can positively transform core HR functions.
Supervisor Rubia, Eduardo Arino de la
Department Economics MSc
Full text https://www.etd.ceu.edu/2025/joiya_saad.pdf


© 2007-2025, Central European University