How to implement Big Data and Machine Learning quickly and effectively in an applied business area to solve practical problems, while avoiding the common mistakes of Data Scientists: we look at this using the HR function as an example.
Preparing to implement Big Data in HR
Introducing any new technology, and especially a new methodology, is a long, iterative process consisting of several stages, as described by the Cross-Industry Standard Process for Data Mining (CRISP-DM). However, CRISP-DM is a methodological guide for Data Scientists. From a business point of view, and for an HR specialist in particular, we need a more abstract approach without the technical details. The Deming-Shewhart management decision-making cycle, Plan-Do-Check-Act (PDCA), is such an option: it covers all the necessary stages of a Big Data project.
How are the PDCA and CRISP-DM cycles connected?
Making and implementing management decisions according to the PDCA cycle is relevant to any business area, including IT projects in Big Data: first we plan the activity, then we carry out the plans, next we check how far the goals have been achieved, and finally we correct the discrepancies found between planned and actual results. The CRISP-DM standard includes similar steps, from formulating the business problem to deploying the software product. However, its 6 phases do not include continuous monitoring and correction of the resulting solutions, which, in turn, leads to the degradation of Machine Learning models.
Combining the two approaches helps avoid problems at the CRISP-DM deployment stage, thanks to the repeating steps of PDCA. One iteration of actually implementing Big Data in a business area is carried out at the Do stage, then monitored and adjusted, and then repeated with the necessary changes.
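The combined loop can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `Plan`, `do_crisp_dm`, `check`, `act`, and `run_pdca`, the target accuracy, and the simulated accuracy values are hypothetical and are not part of CRISP-DM or PDCA. The `do_crisp_dm` function merely stands in for a full pass through the six CRISP-DM phases.

```python
from dataclasses import dataclass, field

# The six CRISP-DM phases, run inside the "Do" step of each PDCA iteration.
CRISP_DM_PHASES = [
    "business understanding",
    "data understanding",
    "data preparation",
    "modeling",
    "evaluation",
    "deployment",
]


@dataclass
class Plan:
    target_accuracy: float  # hypothetical business goal set at the Plan step
    adjustments: list = field(default_factory=list)


def do_crisp_dm(plan, iteration, simulated_accuracy):
    """Stand-in for one full CRISP-DM pass; returns the measured quality."""
    # A real project would execute each phase; here we only record them.
    return {
        "iteration": iteration,
        "phases": list(CRISP_DM_PHASES),
        "accuracy": simulated_accuracy,
    }


def check(plan, result):
    """Check: compare the actual result with the planned target."""
    return result["accuracy"] >= plan.target_accuracy


def act(plan, result):
    """Act: record a correction to apply in the next iteration."""
    gap = round(plan.target_accuracy - result["accuracy"], 2)
    plan.adjustments.append(f"iteration {result['iteration']}: close gap {gap}")


def run_pdca(target_accuracy, simulated_accuracies):
    plan = Plan(target_accuracy)                # Plan
    history = []
    for i, acc in enumerate(simulated_accuracies, start=1):
        result = do_crisp_dm(plan, i, acc)      # Do
        ok = check(plan, result)                # Check
        history.append((i, ok))
        if not ok:
            act(plan, result)                   # Act
    return plan, history


# Assumed values: the third one simulates a model that degraded after deployment.
plan, history = run_pdca(0.80, [0.72, 0.81, 0.77])
print(history)
print(plan.adjustments)
```

In this sketch the second iteration meets the target, but the third falls below it again, which is exactly the situation that a single CRISP-DM pass without repeated Check and Act steps would miss.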
Stages of Big Data implementation in HR
Since the main purpose of HR is to staff the company so that it can carry out its core activity, it makes sense to consider these steps in an applied context:
- Defining a specific business problem, for example, increasing sales efficiency. HR analytics must find the factors that contribute to the high performance of sales managers, in order to attract and hire suitable people and then develop their potential.
- Filtering the data, creating a data dictionary, and cleaning the information: removing duplicates, outdated values, etc. For example, how should “staff turnover” be defined: should it count people who joined the team less than half a year ago, work part-time, or left the company on the last day of the year? Creating the data dictionary is a cross-functional project: it requires not only core HR data (hire date, age, experience, education) but also recruiting characteristics (pre-hire assessment, interview), performance information (ratings, work distribution), training data (completed programs, certification, grades), and leadership data (leadership skills, feedback).
- Formulating hypotheses and testing them. For example, how does the quality of interpersonal communication within a team affect its performance, or which factors provoke professional burnout among employees and how can it be prevented? At this stage a Data Scientist joins the HR analyst and organizes the further work according to the CRISP-DM standard, from business analysis to deployment of a Machine Learning model.
- Analyzing the results obtained, improving the quality of the models built, and then applying them to other business problems with the necessary adaptation.
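The data-cleaning step and the turnover question from the list above can be made concrete with a small Python sketch. Everything in it is an assumption for illustration: the record fields (`employee_id`, `hired`, `left`, `part_time`), the sample rows, the 182-day tenure threshold, and the choice to exclude part-time staff are hypothetical decisions that would have to be agreed on in the data dictionary. The formula used (leavers divided by employees in scope) is one simple definition, not a standard.

```python
from datetime import date

# Hypothetical raw HR records; the second row is an exact duplicate of the first.
RAW = [
    {"employee_id": 1, "hired": date(2018, 3, 1), "left": None, "part_time": False},
    {"employee_id": 1, "hired": date(2018, 3, 1), "left": None, "part_time": False},
    {"employee_id": 2, "hired": date(2019, 9, 15), "left": date(2019, 12, 31), "part_time": False},
    {"employee_id": 3, "hired": date(2017, 6, 1), "left": date(2019, 8, 20), "part_time": False},
    {"employee_id": 4, "hired": date(2016, 1, 10), "left": None, "part_time": True},
    {"employee_id": 5, "hired": date(2015, 5, 5), "left": None, "part_time": False},
]


def deduplicate(records):
    """Keep the first record seen for each employee_id."""
    seen, unique = set(), []
    for rec in records:
        if rec["employee_id"] not in seen:
            seen.add(rec["employee_id"])
            unique.append(rec)
    return unique


def turnover_rate(records, year_end, min_tenure_days=182, include_part_time=False):
    """Share of in-scope employees who left on or before year_end."""
    in_scope = []
    for rec in records:
        end = rec["left"] or year_end
        tenure = (end - rec["hired"]).days
        if tenure < min_tenure_days:
            continue  # recent joiners are excluded under this definition
        if rec["part_time"] and not include_part_time:
            continue  # part-time staff are excluded unless requested
        in_scope.append(rec)
    leavers = [r for r in in_scope if r["left"] is not None and r["left"] <= year_end]
    return len(leavers) / len(in_scope) if in_scope else 0.0


clean = deduplicate(RAW)
year_end = date(2019, 12, 31)
strict = turnover_rate(clean, year_end)
broad = turnover_rate(clean, year_end, min_tenure_days=0, include_part_time=True)
print(len(clean), round(strict, 2), round(broad, 2))
```

With these sample rows, the strict definition counts one leaver among three in-scope employees, while the broad definition counts two leavers among five. The same data therefore yields different turnover figures depending on the definition, which is why the definition belongs in the data dictionary before any modeling starts.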