Defining the need for Data Products
By definition, a Data Product delivers the output of a statistical analysis. Data Products use technology and relevant algorithms to automate complex data analysis. The technology could be as simple and ubiquitous as Excel, or as sophisticated and less pervasive as Apache Spark and Zeppelin.
I vividly remember creating and extensively using my first automation tool when I joined my first job after graduation. It was an Excel-based expense-splitting tool that made it easy for my flatmates and me to divide expenses. It saved us from keeping a notebook or a plain spreadsheet, automated the complex calculations at the end of the month, and also told us when someone was deep in debt.
A Data Product is something similar. It takes data (possibly a humongous amount of it) as input, runs complex statistical and/or machine learning analysis using state-of-the-art algorithms, and produces outputs that solve the specific business problem for which the product was built. The business problem could be recommending highly targeted content to visitors of a news website, predicting customer churn for an e-commerce store, finding the factors that influence a loss of brand share, and much more.
The next question that comes to mind is: ‘So how are data products more useful than manual statistical or analytical analysis?’
A Data Product is most useful in business scenarios that need solutions continuously, at frequent intervals, which makes manual analysis cumbersome, time-consuming and error-prone. One example is sales forecasting in a CPG (consumer packaged goods) company to plan manufacturing-related activities, manpower, inventory and raw material. This kind of forecasting need is best met by a Data Product that automates the forecasting process, uses data from various sources, runs and compares multiple forecasting algorithms, and returns the best result.
A Data Product for forecasting also makes it much faster to run the analysis and generate forecasts; it can cut the time needed compared with manual analysis by a factor of 5 or more. Not only that, selecting the input data, setting the various parameters and running the analysis do not require the end user to have detailed knowledge of what is going on inside the system, or to understand the algorithms (such as Triple Exponential Smoothing and ARIMA) that lead to the results. The user just needs a basic understanding of how to use the product to get the desired output. However, some experience in the domain and in business analysis helps.
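To make the "runs and compares multiple forecasting algorithms and gives the best result" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the three candidate methods (naive, moving average, simple exponential smoothing) are deliberately basic stand-ins for the heavier algorithms a real product would use, and the function names, the holdout size, the smoothing factor and the sample sales figures are all hypothetical.

```python
from statistics import mean


def naive_forecast(history, horizon):
    """Repeat the last observed value for every future period."""
    return [history[-1]] * horizon


def moving_average_forecast(history, horizon, window=3):
    """Repeat the mean of the last `window` observations."""
    return [mean(history[-window:])] * horizon


def exponential_smoothing_forecast(history, horizon, alpha=0.5):
    """Simple (single) exponential smoothing: a flat forecast at the final level."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


def mean_absolute_error(actual, predicted):
    return mean([abs(a - p) for a, p in zip(actual, predicted)])


# Hypothetical registry of candidate methods; a real product would plug in
# richer models here (for example Triple Exponential Smoothing or ARIMA).
CANDIDATES = {
    "naive": naive_forecast,
    "moving_average": moving_average_forecast,
    "exponential_smoothing": exponential_smoothing_forecast,
}


def pick_best_forecast(series, holdout=3, horizon=3):
    """Score each method on the last `holdout` points, then forecast with the winner."""
    if len(series) <= holdout:
        raise ValueError("series must be longer than the holdout window")
    train, test = series[:-holdout], series[-holdout:]
    scores = {
        name: mean_absolute_error(test, method(train, holdout))
        for name, method in CANDIDATES.items()
    }
    best = min(scores, key=scores.get)  # lowest holdout error wins
    return best, scores, CANDIDATES[best](series, horizon)


# Made-up monthly sales figures, purely for illustration.
monthly_sales = [120, 132, 128, 141, 150, 147, 158, 163, 160, 171, 178, 175]
best_method, errors, forecast = pick_best_forecast(monthly_sales)
print(best_method, forecast)
```

The end user only supplies the series and reads the forecast; the choice of algorithm happens inside `pick_best_forecast`, which is the point the paragraph above makes about not needing to understand the underlying methods.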
A well-built data product also helps avoid the inconsistencies that can arise in manual analysis, whether from human error while analyzing the data or from ulterior motives to tweak the analysis and results.
One might think all that is fine, but ‘how does a Data Product help a business analyst become a citizen data scientist?’
To answer this, we need to understand the skill sets and job roles of a business analyst and a data scientist, and then assess whether Data Products could help a business analyst bridge the long learning curve of becoming a data scientist.
A data scientist is an expert in the processes and systems used to extract useful knowledge and insights from data, convert them into an easily understandable form, and use them to drive business results or help solve business problems.
A business analyst needs an in-depth understanding of a business domain, its processes and its systems, and may need to understand the business model and its integration with technology. A data product helps a business analyst solve a particular business challenge using the related data, without knowing the algorithms that go inside the product. If the business analyst has a good understanding of the business problem and of the desired result, then for that particular problem the business analyst equipped with the data product is as effective as any data scientist working on it.
Given a set of cognitive data products that solve multiple problems for a business, a business analyst could perform most of the functions of a data scientist for that business, and do so more efficiently and consistently. The challenge lies in developing that comprehensive set of data products, which is time-consuming and requires people with rare skill sets to work together.