Not Everything Data Related Should Be a Data Product

Sep 01, 2022

The best part of applying a framework like data products internally at a company is that it creates a way of thinking about data science investments and their sustainability. The worst part is that the framework can make it feel like everything valuable is, or should become, a data product. I strongly disagree with that perspective, and I think it is worth talking about why. Just to be clear, I am focusing here on internal data products.

In many organizations, a key driver of becoming more senior is the “multiplier effect” that an individual has on the organization. For example, senior staff roles at many companies have less to do with increasing levels of technical expertise than with scaling the impact of that expertise across the company. There’s a parallel here with data products: they should help scale a capability to a broader group of people if that capability has a multiplier effect on the organization.

Experimentation is a useful example. For many early-stage companies, in particular those in the B2B space, running experiments might be interesting, but it is not practically useful. The number of observations is too low and the time needed to inform a decision is likely too long. If an experiment is necessary, it is likely valuable to have a data scientist set up a fairly hands-on process to design, run and interpret that experiment.
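To make the "too few observations" point concrete, here is a rough sketch of the standard normal-approximation sample size calculation for comparing two conversion rates. The function names and every number in it (10% baseline, 12% target, 50 new accounts a week) are hypothetical, chosen only to illustrate the shape of the problem for a low-volume B2B company.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size_per_arm(baseline_rate, expected_rate, alpha=0.05, power=0.8):
    """Approximate observations needed per arm for a two-sided test
    comparing two proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


def weeks_to_decision(n_per_arm, weekly_observations):
    """Weeks needed to fill both arms at a given weekly volume."""
    return ceil(2 * n_per_arm / weekly_observations)


# Hypothetical inputs: 10% baseline conversion, hoping to detect a lift to 12%,
# with 50 new accounts entering the experiment each week.
n = required_sample_size_per_arm(0.10, 0.12)
print(n, weeks_to_decision(n, weekly_observations=50))
```

With these assumed inputs, each arm needs several thousand observations, which at 50 accounts a week means waiting years for an answer. That is the situation where a hands-on, one-off process beats an experimentation platform.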

Another example is metric definition and management. If a company’s key metrics number in the single digits, it is probably worth having a subject matter expert manage individual metric definitions and updates to those definitions, perhaps without the support of a platform. Having all the bells and whistles of a metrics product probably saves some time, but it is arguable whether the organization really needs it.

Experimentation and metrics are just two examples, and if you are reading this, I’m certain you can generate at least 3-4 more in a few minutes. The key question here is not whether something can be a data product in an organization. With enough engineering, data science and product support, it probably can. The question is whether it should become a data product. This is where the role of a data product manager and leaders in the company really matters: they should decide not only what to invest in, but also explicitly define what is not worth investing in now but may be worth investing in later.

There are myriad questions that can help define this boundary, like “who in the organization needs this capability?”, “how many people in the organization need it?”, “what is the return we get for going from manual to a product that supports it?” and “what is the cost?”. Productizing something that was done ad hoc or manually before sounds great in theory, but there’s a tipping point in every organization for when that transition actually makes sense. In some cases, organizations decide early on that they will definitely hit that point and invest before it becomes a problem. In other cases, they wait until they’ve experienced pain for years before investing. I’m not here to say there’s a right way.
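The return and cost questions above can be roughed out as a break-even estimate. This is only a back-of-the-envelope sketch: the function, its parameters and the dollar figures are all hypothetical, and a real decision would weigh things this ignores, such as opportunity cost and the value of keeping humans close to the data.

```python
from math import ceil
from typing import Optional


def breakeven_months(users: int,
                     hours_saved_per_user_per_month: float,
                     hourly_cost: float,
                     build_cost: float,
                     monthly_maintenance_cost: float) -> Optional[int]:
    """Months until cumulative time savings cover the build cost.

    Returns None when monthly savings never exceed maintenance,
    i.e. the product would never pay for itself.
    """
    monthly_savings = users * hours_saved_per_user_per_month * hourly_cost
    monthly_net = monthly_savings - monthly_maintenance_cost
    if monthly_net <= 0:
        return None
    return ceil(build_cost / monthly_net)


# Hypothetical: the same tool, built for 3 users versus 60 users.
print(breakeven_months(3, 2, 100, 150_000, 2_000))
print(breakeven_months(60, 2, 100, 150_000, 2_000))
```

Under these assumed numbers, the three-user case never breaks even while the sixty-user case pays back in a little over a year. The tipping point depends mostly on how many people need the capability, which is why "how many people in the organization need it?" is worth asking first.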

My point in writing this is to elevate the importance of discussing whether something should become a data product. It is really tempting to automate everything, to democratize a capability, so to speak. It is much harder to say “I don’t think that’s the right investment for us, let’s keep doing it more manually.” But for many companies and organizations, that might be the right answer.

I’m hopeful that as a community, we can talk about how challenging these decisions are and also admit that we get them wrong fairly often. I definitely have.

Burned Out SWO Shipbuilder

Sep 3, 2022

Also consider that ‘productizing’ the manual or semi-manual process behind a metric’s assembly and dissemination can remove the experience of categorizing and interpreting data, which has inherent value. LSS visual controls are all founded on the idea that the metrics developed (the visual controls) are touched by a human, both to understand root causes and trends and to create the opportunity to throw out ‘wacky’ data. Data literacy also includes the ability to understand what needs to be a finished data product and what needs to stay ‘raw’, ready for humans to sort through before it is consumed as a metric.


From Data to Product
