3 Actionable Tactics To Create A Good Data Strategy
This is the Three Data Point Thursday, making your business smarter with data & AI.
Want to share anything with me? Hit me up on Twitter @sbalnojan or LinkedIn.
Let’s dive in!
Actionable Insights
If you only have a few minutes, here’s what will make your business smarter with a good data strategy inspired by Netflix:
Netflix turned itself into a data company. Netflix is known today as a data powerhouse, but unlike companies like Google, it wasn’t built on data in its early days. There are ways to turn yourself into a data company.
All tactics for coming up with good data strategies are about leverage. No matter what you do to develop your data strategy, your one task is to identify leverage, multipliers, or power you can use to turn data into value.
Tactic 1: Map your value chain and focus on transitions. It’s as easy as that: map out your complete value chain from inputs (supply) to outputs (demand) and then brainstorm methods to improve the transitions.
Tactic 2: Play Perfect World. Take any problem; imagine you have all the information in the world, and now look at how your solution plays out. Now, take a step back and consider how to use data to get close to that “perfect information.”
Tactic 3: Do a Pipeline Analysis. Collect numbers for your value chain or a sub-chain. Then increase one conversion rate, absolute number, or any other figure, and see what happens to the total value you’re producing. That’s how you find many more examples of leverage inside your company.
In 1998, Bill Clinton became the second U.S. president ever to be impeached, in his case over the Lewinsky scandal. Al-Qaeda bombed the U.S. embassies in Kenya and Tanzania. And two much more innocent events took place: Google was founded, and Netflix launched.
Fast forward to today, and these businesses have become data powerhouses. But while Google was founded on using the PageRank algorithm to crunch data, Netflix was far from it.
Netflix was born as a DVD-by-mail company, a new concept then. Famously, the two founders, Reed Hastings and Marc Randolph, mailed themselves a disc to see whether it arrived intact. It did, and Netflix was born.
However, in 1999, Netflix turned to its famous all-you-can-eat subscription model, and a year later, it started to slowly but steadily turn itself into one of the biggest data companies of our time.
Finding Good Data Strategy Ideas
Good data strategies, like all good business strategies, are simple: They find a fulcrum, a lever, and then push hard! If the data strategy is simple, why do so few companies have one? Because the hard part is finding these sources of power inside your business!
But no worries, that’s why I’m here. I will share three tactics you can use to brainstorm potential good (!) data strategies, just like Netflix did.
Whether you’re a CEO, founder, or product manager, these tactics will help you brainstorm effective business strategies for your data-heavy products.
But first, let’s take a quick look at Netflix and data…
The early signs of a data strategy
While Netflix wasn’t born a data company, a particular affinity for data always existed. Reed came in with a master’s degree in computer science, and very early on he hired Stan Lanning as chief of recommendations to work on the data and algorithms.
Netflix developed experimentation systems in 1999 (one year after launching), a very successful recommendation algorithm called Cinematch in 2000, and a rating system to make it even stronger in 2001. In 2002 it began running multiple algorithms, and much more followed, culminating in the famous $1 million Netflix Prize in 2006 for anyone who could improve the accuracy of the recommendation algorithm by 10%.
So, how did Netflix develop its data strategy? I’d argue it all started with the 1999 pivot into the subscription model.
Tactic 1: Mapping out the value chain
Reed and Marc always had the value chain of their new business in their heads.
This graphic depicts Netflix’s value chain in 1999. At that point, Netflix’s business model consisted of DVD acquisition and customer acquisition. The steps in the graphic form the value chain, going from inputs to outputs, from supply to demand, with each step adding value to the final product Netflix delivers. There was nothing fancy in between: they got DVDs, marketed to customers, and then got the DVDs to their customers. That’s it.
Transitioning
Value chains add value to your product at each step, each segment inside the chain. Between two steps, there is a transition where something is handed over.
Marc and Reed didn’t create content, but once content was created, they still needed to find content to acquire it; that’s the transition from Content Creation => Content Acquisition.
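One way to make the transitions explicit is to write the chain down as an ordered list and pair up neighboring steps. The sketch below is only an illustration: the stage names come from the text, while the `transitions` helper is a hypothetical name introduced for this example.

```python
# Stages of Netflix's 1999 value chain as named in the text, from supply to demand.
value_chain = ["Content Creation", "Content Acquisition", "Customer Acquisition"]

def transitions(chain):
    """Pair every step with the step that follows it."""
    return list(zip(chain, chain[1:]))

for upstream, downstream in transitions(value_chain):
    # Each pair is a candidate spot to ask: how could data make this hand-off smoother?
    print(f"{upstream} => {downstream}")
```

Listing the pairs this way gives you one brainstorming prompt per transition instead of one per step.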
Inside the Toyota Production System, the concept of the Andon Cord is now famous. It is a rope that any worker at the assembly line can and will pull to stop the whole production line, knowing that each defect must be fixed immediately or the downstream consequences will get worse.
Look at it through the lens of data strategy, however, and you see one truth: Toyota realized that the transitions inside the value chain are fulcrums. You’re toast if you pass a defective part from one step to the next. The data collected from the individual workers is the lever that works on these fulcrums. And sometimes, they push with all their weight (literally).
Netflix’s transitions
In 1999, Marc and Reed looked at their value chain and realized it was weak compared to the physical DVD rental place. The value they added was lower than that of the physical store; they started with limited DVDs and customers and weren’t great at the transition phase.
So they asked themselves:
How can we use data to improve these transitions?
Well, in 1999, there were three potential transitions Netflix could’ve tackled:
finding better content
targeting potential customers better
OR the one transition they already controlled completely: content acquisition to customer acquisition.
The key to Netflix’s subscription business in 1999 was simple: get content, get customers, and match the two. A vast collection of DVDs doesn’t do any good if customers don’t know which to choose. Likewise, many customers with only a few DVDs won’t work. The question is: which DVD should be recommended to which customer?
Reed and Marc decided to focus on their core transition. However, that doesn’t mean you should discard any other transition. You’ll have to figure out for yourself what the impact of the others would be. Reed and Marc reasoned they would get the most impact from working on their core transition.
The matching transition is an opportunity to use data. And that’s just what Reed did when he hired Stan Lanning, one of the very early employees, to work on the algorithms and data that match the two.
“Pretty quickly, the men realized that if the website could be programmed to act similar to a savvy video-store clerk—directing people to movies they loved, not just ones they liked—subscribers would keep returning. But unlike a Blockbuster employee who had an entire store’s worth of VHS movies to refer to, Netflix’s business rested entirely on DVDs, and the number of titles on the new format was limited. So the company’s virtual clerk had to really know customers’ tastes to match them to the thin catalog of films.” (Source)
Netflix’s first data strategy
The Andon Cord at Toyota isn’t exactly a big data tool. In fact, it is a data collection tool that only rarely collects data, tiny data, so to speak. But that is precisely the magic of Toyota; they didn’t try to use lots of data; they created the simplest solution, the simplest lever to push on the fulcrum they discovered.
Netflix did just that. Stan Lanning, fully capable of building machine learning systems at scale, didn’t build one. He didn’t try to use masses of data, colossal DVD catalogs, or customer data. Instead, he saw the transition, the fulcrum, and designed the most straightforward lever he could think of. The true challenge was to derive the customers' tastes, so he built a simple rating system, a plain data collection tool.
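To see why a plain rating system is already enough of a lever, here is a minimal sketch of matching customers to titles from star ratings alone. This is not Cinematch and makes no claim about how Netflix did it; the customers, titles, ratings, and the `recommend` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical star ratings: (customer, title, stars from 1 to 5).
ratings = [
    ("ann", "Movie A", 5), ("ann", "Movie B", 4),
    ("bob", "Movie A", 5), ("bob", "Movie C", 5),
    ("cat", "Movie B", 2), ("cat", "Movie C", 4),
]

def recommend(customer, ratings, min_stars=4):
    """Suggest unseen titles liked by customers who share liked titles."""
    liked = defaultdict(set)   # customer -> titles rated min_stars or higher
    seen = set()               # everything the target customer already rated
    for who, title, stars in ratings:
        if stars >= min_stars:
            liked[who].add(title)
        if who == customer:
            seen.add(title)

    mine = liked.get(customer, set())
    scores = defaultdict(int)
    for other, titles in liked.items():
        if other == customer:
            continue
        overlap = len(mine & titles)      # shared taste with this customer
        for title in titles - seen:
            scores[title] += overlap

    # Best match first; drop titles with no taste overlap behind them.
    return sorted((t for t, s in scores.items() if s > 0),
                  key=lambda t: (-scores[t], t))

print(recommend("ann", ratings))
```

The point of the sketch is the input, not the algorithm: without the collected ratings, there is nothing to match on.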
Data strategy 1: Any data strategy derived from a value chain transition is plain and simple: Increase the value to the customer by making one key transition smoother.
Netflix carries this strategy to the extreme, now matching customers with TV shows, movies, and games, matching potential movie scripts, and much more. This tactic creates more data-powered products and features, thus driving profit at Netflix. It can do the same for your business, too.
Find one key transition as a lever, then push the lever with all your data weight.
Note: From today’s perspective, the whole business model looks like a platform model, but back in the day, it wasn’t. It could’ve gone in a million directions, and that’s where this brainstorming exercise helps.
Tactic 2: The Perfect World
The Cinematch rating system was all about getting something Netflix didn’t have at the time: the customers’ tastes. After all, the team didn’t talk to their customers like a DVD store clerk did. But they needed to know how much each customer liked a particular DVD. Seen this way, information is a hidden input inside the value chain. At every step, you need information to make it run smoothly. In particular, if you consider the steps outside your domain, like content creation for Netflix, it is easy to miss how much you genuinely rely on information.
The Perfect World
Besides the Andon Cord, another essential concept of the Toyota Production System is Just-in-Time (JIT). But for car manufacturers like Toyota, JIT ain’t as easy as it sounds, because they rely on hidden inputs. Cars are assembled by the manufacturer, which relies on a dense network of suppliers and distributors. So, to pull off JIT, you’ll need to know tons of information from both suppliers and distributors. That information is hard to get; Toyota had to help suppliers and distributors get set up with good tracking systems for inventory and then get them to transmit that data to shared interfaces.
Anyone working on interfacing IT projects will realize this is a huge task.
It is not that no one before Toyota had thought about getting better data from suppliers or distributors; I bet all car manufacturers would’ve loved that. They probably tried to get their partners to deliver it but failed.
The difference for Toyota was a change in mindset: They played Perfect World. Instead of thinking, “More information would be nice,” they went all in. They asked themselves:
Imagine a perfect world where you know everything, literally everything. Which steps of your value chain would become 10x more valuable? Which problem you’re tackling would go away?
They didn’t just think about distributor information, which is almost worthless without the supplier data. They thought through multiple layers and identified that the change in outcome would be huge if they could get supplier and distributor data in close to real-time.
Netflix’s Perfect World
If you think through Netflix’s value chain in the early 2000s, you know they matched existing content to existing customers. They used the data from their rating system to recommend content they had to more customers. Of course, many more shows and movies are produced than what Netflix offers. New content is vital to the game Netflix plays.
But at the same time, something unexpected happened; the big studios started to block Netflix by raising the content prices astronomically. Netflix was forced to find other ways of getting content. One prominent idea was for Netflix to produce the content itself. But, the follow-up question became: What content could Netflix produce that would do great?
Without an exercise in Perfect Worldism, producing content itself would have seemed insane to Netflix. How could they compete with professional studios with a hundred years of experience?
But just like Toyota, Netflix thought further: in a perfect world, they should know what content does well before it is produced!
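The Perfect World question can be turned into a rough number: compare the decision you make with today’s information against the decision you would make if you knew everything. The sketch below uses entirely made-up shows and audience figures, and `pick` is a hypothetical helper; it only illustrates the thought experiment.

```python
# Hypothetical candidate shows: what we currently guess vs. what a perfect
# world would reveal about the audience (millions of viewers). All made up.
candidates = {
    "Show A": {"current_guess": 8, "true_audience": 3},
    "Show B": {"current_guess": 5, "true_audience": 12},
    "Show C": {"current_guess": 6, "true_audience": 7},
}

def pick(candidates, key):
    """Choose the show that looks best according to the given information."""
    return max(candidates, key=lambda name: candidates[name][key])

today = pick(candidates, "current_guess")     # decision with today's information
perfect = pick(candidates, "true_audience")   # decision in the perfect world

# The gap is the most that closing the information hole could be worth.
gap = candidates[perfect]["true_audience"] - candidates[today]["true_audience"]
print(today, perfect, gap)
```

A large gap suggests the information is worth chasing with data; a small one suggests the problem lies elsewhere.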
Netflix’s 2nd data strategy
Netflix saw that producing and sourcing content themselves, combined with the fulcrum of finding the right content and the lever of the large masses of data they had by then, would be the kernel of a good data strategy.
“Based on its knowledge of members' tastes, Netflix predicts that 100 million members will watch Stranger Things and invested $500 million in the series.
The data science team predicted 20 million watchers for the quirky adult cartoon Bojack Horseman, so Netflix invested $100 million in this animated TV series.
Based on a prediction that one million members will watch Everest climbing documentaries, Netflix invests $5 million in this genre.
Netflix has a huge advantage in its ability to right-size its original content investment, fueled by its ability to forecast how many members will watch a specific movie, documentary, or TV show.” - A brief history of personalization at Netflix
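A quick piece of arithmetic on the figures in the quote shows what “right-sizing” can look like: all three examples come out at the same investment per predicted viewer. The source does not say that Netflix budgets with a fixed per-viewer figure, so treat the hypothetical `dollars_per_viewer` helper as a check on the quoted numbers only.

```python
# Figures as given in the quote: (predicted viewers, investment in US dollars).
titles = {
    "Stranger Things": (100_000_000, 500_000_000),
    "Bojack Horseman": (20_000_000, 100_000_000),
    "Everest climbing documentaries": (1_000_000, 5_000_000),
}

def dollars_per_viewer(viewers, investment):
    """Investment divided by the predicted audience."""
    return investment / viewers

per_viewer = {name: dollars_per_viewer(v, i) for name, (v, i) in titles.items()}

for name, value in per_viewer.items():
    print(f"{name}: ${value:.2f} per predicted viewer")
```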
Data strategy 2: Solve the bottleneck in your value chain, the big business problem that relies on getting information, by throwing data & algorithms at it.
The ideas you get from Perfect World probing usually only work if you go deep, go to the outer corners of your value chain, and think through additional information that would be nice.
As you can see with JIT at Toyota and content creation at Netflix, these strategies tend to combine a chain of steps, not just a single one with more data. The data alone wasn’t valuable enough for car manufacturers to invest in getting it. The idea was to build up a huge JIT-based factory to use this data.
Similarly, Netflix didn’t just build a system to rate movies they wanted to buy; they also invested heavily in producing them.
Tactic 3: Pipeline Analysis
Andrew Ng, the godfather of machine learning, teaches a simple, entirely underrated technique in his machine learning 101 classes: pipeline analysis (his course calls it ceiling analysis).
In a pipeline analysis, you take a straightforward approach; you map out all the steps you take to derive a result, an outcome.
Next, you take any step and simply imagine you 10x its output, or bring its quality or accuracy to 100%, even if you don’t know how to do it.
Then you ask yourself: what would happen to the outcome?
Pipeline Analysis
If I make breakfast, I go through a pipeline:
I make coffee to wake me up.
I drink my coffee ;)
I put plates, forks, etc., on the table.
I make an omelet.
I arrange everything on the table.
These steps happen in sequence, and the order matters. I cannot drink a coffee before it is made. And my wife would get furious if I put an omelet on the table before the plate was there.
Most people think a sequence means each step is essential, and each one is. But the steps are not equally important! If you just look from step to step, like from plate to omelet, you might think each step is equally important because each step is vital for the next one.
But, in the final outcome, not all steps are equal. One or two will be 10x more important than all the others. And all that matters is the final outcome.
Imagine I made the best coffee in the world. Would it change a lot? Well, the only purpose of the coffee is to wake me up, so it would delight me, but that’s not why I’m making breakfast. The reason I’m making breakfast is to make my family happy. So, it likely wouldn’t change the outcome much.
If I make a fantastic omelet, however, my family will be delighted, so a 10% quality increase in the omelet is much better than working on the coffee.
The critical insight for the pipeline analysis is that steps depend on the output of the previous one. The breakfast does rely on me being awake, and without plates, there’s no food on the table. But a 10% increase in one step might not produce a 10% increase in the next, and certainly not in the outcome. There are bottlenecks, and you gotta find them, and the Pipeline Analysis is an excellent way of doing so.
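The breakfast example can be sketched as a tiny what-if calculation. The weights below are made up, and a weighted sum is the simplest model I could assume for “family happiness”; `outcome` and `what_if` are hypothetical helpers, not part of any course material.

```python
# Made-up weights: how much each step contributes to the outcome
# "family happiness". Quality 1.0 is today's baseline for every step.
weights = {"coffee": 0.1, "table": 0.2, "omelet": 0.7}
baseline = {"coffee": 1.0, "table": 1.0, "omelet": 1.0}

def outcome(quality):
    """Weighted sum of the step qualities."""
    return sum(weights[step] * quality[step] for step in weights)

def what_if(step, factor):
    """Relative change in the outcome when one step improves by `factor`."""
    changed = {**baseline, step: baseline[step] * factor}
    return outcome(changed) / outcome(baseline) - 1

for step in weights:
    print(f"10% better {step}: {what_if(step, 1.10):+.1%} outcome")
```

Under these assumed weights, the same 10% improvement moves the outcome seven times more when it lands on the omelet than when it lands on the coffee.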
Andrew Ng uses this to find the critical step in a data pipeline to increase the accuracy of classifiers, but it works equally well to define excellent data strategies.
Pipelines at Netflix
By 2005, Netflix had invested heavily in data, with machine learning teams building recommendation systems for every part of the system, data scientists recommending actions to management, and marketing automation working its magic on me.
The most important pipeline had become the show & movie pipeline:
(1) Content & Idea Sourcing
(2) Content Creation
(3) Matching Content to Customers
(4) Customer Sourcing
Netflix had a solid system to pull off (3). After all, it was their first good data strategy, which they repeatedly improved.
I’m almost certain someone at Netflix asked this question in 2006: If I go through our value chain and increase any output by a couple of percentage points, what increases the watch time the most?
For Netflix to grow, they knew they had to focus on one outcome only: Increase the number of minutes people watched content. What Netflix almost certainly did was pull a bunch of numbers from their systems and attach them to these steps:
Content & Idea Sourcing: 10% better content => 0-10% more minutes watched
Content Creation: 10% faster content creation => 10% lower costs, more money available => 0-1% more minutes watched
Matching Content to Customers: 10% better matching => 20-30% more minutes watched (!)
Customer Sourcing: 10% better sourcing => 10% more money available => 0-1% more minutes watched
The pipeline has a simple bottleneck: you can save money by making steps more efficient, but even if you create better content, you’re not going anywhere if you cannot match it to the right customers. Matching content to customers is the bottleneck. If you improve that, you get a multiplier on your efforts.
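Ranking the steps by their estimated effect makes the bottleneck obvious. The sketch below copies the ranges from the list above, which are this article’s illustrative estimates rather than published Netflix figures; ranking by the midpoint of each range is my own simplifying assumption.

```python
# Estimated lift in minutes watched (low %, high %) for a 10% improvement
# in each step, copied from the list above.
estimated_lift = {
    "Content & Idea Sourcing": (0, 10),
    "Content Creation": (0, 1),
    "Matching Content to Customers": (20, 30),
    "Customer Sourcing": (0, 1),
}

def midpoint(low_high):
    """Middle of an estimated range, used as a single comparable score."""
    low, high = low_high
    return (low + high) / 2

ranked = sorted(estimated_lift,
                key=lambda step: midpoint(estimated_lift[step]),
                reverse=True)
bottleneck = ranked[0]
print(bottleneck)
```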
Netflix’s 3rd data strategy
A fraction of a percentage point of accuracy on the recommendation algorithm would drive up margins by millions. But that insight alone was not enough. Netflix realized it couldn’t push the accuracy any higher on its own. But since they had identified a fulcrum, they knew they could throw serious money at it. And thus, the Netflix Prize was created: a competition to beat the baseline created by Netflix for a cool million dollars.
This approach differs from the other two techniques in one key way: it relies on many numbers. You’ll need to take your value chain or a sub-value chain apart and add many numbers. How many customers come to your website? What’s the conversion rate? What is the churn rate? All of those numbers will give you a map to work with.
Once you have the map, take a stab at hypothetically increasing a few numbers to find the points that matter most and that you want to work on.
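Here is a minimal sketch of such a map for a subscription business, with a few hypothetical what-if scenarios. Every number is invented, and the model assumes an average customer lifetime of 1 / monthly churn, which is a common simplification rather than anything taken from Netflix.

```python
# Hypothetical monthly numbers for a subscription funnel. All made up.
baseline = {
    "visitors": 10_000,      # people reaching the website
    "signup_rate": 0.02,     # share of visitors who subscribe
    "monthly_churn": 0.05,   # share of subscribers who cancel each month
    "price": 10.0,           # monthly subscription price
}

def lifetime_revenue(m):
    """Revenue from one month's new subscribers over their expected lifetime."""
    subscribers = m["visitors"] * m["signup_rate"]
    expected_months = 1 / m["monthly_churn"]   # simplifying assumption
    return subscribers * expected_months * m["price"]

# Changes you think are realistically achievable, one number at a time.
scenarios = {
    "10% more visitors": {"visitors": 11_000},
    "signup rate +1 point": {"signup_rate": 0.03},
    "churn -1 point": {"monthly_churn": 0.04},
}

def lift(change):
    """Relative change in lifetime revenue when one number is changed."""
    return lifetime_revenue({**baseline, **change}) / lifetime_revenue(baseline) - 1

lifts = {name: lift(change) for name, change in scenarios.items()}
best = max(lifts, key=lifts.get)

for name, value in sorted(lifts.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:+.0%}")
```

With these invented numbers, the sign-up rate is the point to dig into first; your own map will point somewhere else.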
The magic in a data strategy derived from this process is that with algorithms and data, you’ll quickly get an idea of which strategies can work and which won’t.
Data Strategy 3: Use data & algorithms to improve the bottleneck component of your value pipelines by a meaningful percentage.
The creators of the Netflix Prize knew what they needed to do but didn’t know how. They didn’t know how to use data and algorithms to improve the recommendation system.
That is also true for this technique. This technique won’t give you a silver bullet, just many starting points to explore.
It’s about generating potential ideas!
What you get from these three techniques are ideas that can form the kernel of a good data strategy. Those strategies help you create unique data-heavy products. But there’s no silver bullet!
All of the ideas generated here will share many of the core elements needed to make great data strategies; they will be naturally simple, focus on a few leverage points, and thus create multiplier effects.
However, it is still your job to review the list of 20+ ideas and create good data strategies and product ideas. That means you still need to do the hard product work, talk to customers, and look seriously at the competition and your company.
But if you do, you might just end up with an excellent data strategy that puts you ahead by 10x.
Notes on resources
If you’re interested in the idea of kernels of good (business) strategies in general, I can recommend the book “Good Strategy Bad Strategy” by Richard Rumelt, a brilliant and clear strategy thinker. It is one of the most underrated business books ever written, and I reread it regularly.
You can also check out the HBS case on the Netflix Prize for more details on the business thinking behind it.