Question: doesn't actively collecting potential dark data run the risk of creating a "data swamp"?
Great question. My opinion on this is a bit more nuanced.
To sum it up: I think you should collect only "good data", meaning data aligned with a data strategy that is derived from your company strategy. But of that "good data" you should collect vast amounts! And some of it might come from your already existing dark data.
In detail:
1. YES! I don't think you should ever just randomly collect data.
2. That's not because I think the "data swamp" is a real fear, but because I think a data strategy should be derived from the company strategy (I wrote about this quite some time ago: https://towardsdatascience.com/data-strategy-good-data-vs-bad-data-d40f85d7ba4e?gi=a95e36d2522e). I call this "good data" :-)
3. But since data is becoming more and more important for every company, ONCE the company has a data strategy in place that is derived from the company strategy, dark data is very much a good place to look, simply because (as the article above explains) getting data is hugely expensive.
Does that make sense?
Yes that definitely makes sense.
Follow up question: Then how would you define a data strategy? What would a data strategy look like, and how would you sell this to your company as an endeavour worth pursuing?
You're not going to like the answer: you're not going to sell it if your company doesn't already have a good goal-setting process in place, one that invites a top-down & bottom-up approach (like OKRs, or Amazon's OP1 & OP2).
Good data strategies need to be built into the whole company, not just the data department, so if you don't know how to sell it, you're likely not going to succeed with it. (I've tried in the past and failed miserably ;) so I might be biased here.)
Hahaha good to know. What would be an example of a good set of OKRs we could set at the company level in order to promote a data-first mindset? We currently have OKRs at the company and we're quite good at top-down & bottom-up, we just lack subject matter experts with a lot of experience.
Hm, not sure I have good examples ready, and I was hoping someone else would pitch in ;)
But what I can point you to is my article on Kolibri Games, where I describe their journey, guided by three different goals (as I understood it): https://towardsdatascience.com/start-up-data-mesh-blueprint-3-steps-for-becoming-a-data-driven-start-up-through-the-data-mesh-ae9540c1a846