Importance of a White Paper in a Data Science Pipeline

Kopal Sharma
Sep 1, 2020

Q. What is a white paper?
Ans. It is an authoritative report that informs readers concisely about a complex issue.

Q. Why do we need it? And especially, why in a data science pipeline?

Ans. Now here are my two cents on it.

What one needs to understand is that data science is not just about collecting data, applying an algorithm, and boom, here are some xyz results. Nope.

Before one even thinks of applying data science techniques, one needs to formulate a ‘problem statement’. And to formulate this problem statement, one needs ‘research’. This is where the white paper comes in.

The need for this arises when a person works for an organization, in my case the government.
I cannot just go ahead and ‘suggest a solution’ to them.

Before I give them my solution, I have to tell them the problem, and before that I need to tell them why that particular thing is a problem and how big it is.

Founders, CEOs, and other decision makers do not think in the terms of a data scientist. For them the equations are pretty simple: why is it a problem, how big is it, how much funding do I invest to tackle it, and what is the solution?

To tackle the first two questions, we write this entire paper. Its purpose is to present research and fact-based findings, so that the decision makers can decide whether they even want to go ahead with a solution.

Another purpose it serves: during the extensive research, the data scientist also finds out whether the problems he assumed needed tackling are really problems at all. Do they even need to be tackled? Or was he just assuming?

A white paper is not always necessary, but when working on a big problem like tackling COVID-19 and suggesting solutions to the government, a white paper is no longer an option but a mandate.
