Who is a Data Analyst in Big Data: what a data analyst needs to know
Continuing our conversation about where to start in big data and which IT specialties exist, today we explain what exactly a Big Data analyst does, what they should know and be able to do, and where and how to acquire the necessary professional competencies.
What does a Data Analyst do
As a rule, a Data Analyst works with data sets and independently performs a whole set of operations:
• data collection;
• preparation of data for analysis (sampling, cleaning, sorting);
• search for patterns in information sets;
• data visualization to quickly understand existing results and future trends;
• formulation of hypotheses to improve specific business metrics by changing other indicators.
All these tasks serve the main goal of data analytics: extracting information that is valuable to the business from data sets in order to make optimal management decisions.
In some companies, the duties of a data analyst also include data modeling and the development and testing of Machine Learning models. However, in most cases, Machine Learning is the responsibility of a researcher or data scientist. With a finer division of labor, machine learning is handled by a dedicated specialist.
It is also worth noting that a Data Analyst sometimes analyzes business processes and works very closely with other IT specialists to describe the flows and stores of corporate information. Thus, the responsibilities of a data analyst can also include Business Intelligence (BI) tasks and the optimization of production processes.
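The operations listed above can be sketched as a very small pipeline. This is only an illustrative sketch using the Python standard library: the data, the column names (`ad_spend`, `revenue`), and the helper functions are hypothetical, and a real project would more likely read from a database or files and use dedicated analysis and charting tools.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data; in practice it would come from a database, an API, or files.
RAW_CSV = """day,ad_spend,revenue
1,100,1100
2,150,1480
3,,1300
4,200,2050
5,250,2400
"""

def collect(text):
    """Data collection: read raw records into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def prepare(rows):
    """Preparation: drop incomplete rows, convert types, sort by day."""
    clean = [
        {
            "day": int(row["day"]),
            "ad_spend": float(row["ad_spend"]),
            "revenue": float(row["revenue"]),
        }
        for row in rows
        if row["ad_spend"] and row["revenue"]
    ]
    return sorted(clean, key=lambda row: row["day"])

def pearson(xs, ys):
    """Pattern search: Pearson correlation coefficient between two series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def text_chart(rows, scale=100):
    """Visualization: a crude text bar chart of revenue per day."""
    return "\n".join(
        f"day {row['day']}: " + "#" * int(row["revenue"] // scale) for row in rows
    )

rows = prepare(collect(RAW_CSV))
corr = pearson([row["ad_spend"] for row in rows], [row["revenue"] for row in rows])

print(text_chart(rows))
print(f"correlation(ad_spend, revenue) = {corr:.3f}")
```

A strong positive correlation here would not prove anything by itself, but it could motivate a hypothesis such as "increasing ad spend raises revenue", which the analyst would then test on more data.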
Image 1: Professional portrait of a Data Analyst
Data Analyst professional competencies: what a Data Analyst needs to know
Based on the tasks above, the following areas of knowledge are required of a data analyst:
• information technology: Data Mining methods and tools; programming languages (R, Python, etc.) and SQL-like languages for querying relational and non-relational databases; BI systems such as Tableau, Power BI, QlikView, etc.; ETL processes, data warehouses, and data marts; and the basics of the Apache Hadoop infrastructure;
• mathematics (statistics, probability theory, discrete mathematics);
• system analysis, quality management, project management and methods of analyzing business processes (lean manufacturing approaches, SWOT, ABC, PDCA, IDEF, EPC, BPMN, BSC, etc.).
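To illustrate the first item in the list above, here is a minimal sketch of querying a relational database from Python. It assumes an in-memory SQLite database and a hypothetical `orders` table invented for this example; in a real environment the analyst would connect to the company's actual database or warehouse.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("alice", "north", 120.0),
        ("bob", "north", 80.0),
        ("carol", "south", 200.0),
        ("dave", "south", 50.0),
        ("erin", "south", 50.0),
    ],
)

# Aggregate orders by region: a typical analytical query.
query = """
SELECT region,
       COUNT(*)    AS n_orders,
       SUM(amount) AS total,
       AVG(amount) AS average
FROM orders
GROUP BY region
ORDER BY total DESC
"""
summary = conn.execute(query).fetchall()

for region, n_orders, total, average in summary:
    print(f"{region}: {n_orders} orders, total {total:.2f}, average {average:.2f}")

conn.close()
```

The same `GROUP BY` pattern carries over to most SQL-like query languages, which is one reason SQL appears among the core competencies.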
In addition, applied knowledge and practical experience specific to the subject area in which the Data Analyst works will be very useful. For example, the basics of accounting are useful for a data analyst in a bank, and marketing methods help in analyzing information about customer needs or evaluating new markets.
Big Data adds further skills to these basic Data Analyst competencies: working with data lakes (Data Lakes), an understanding of information security and data management (Data Governance), and knowledge of typical digitalization (digital transformation) scenarios and of how big data technologies are applied in various subject areas (use cases).