Thoughtful Friday #10: The Collison Brothers, Diataxis & Documentation

Apr 15, 2022

“At YC we use the term “Collison installation” for the technique they invented. More diffident founders ask “Will you try our beta?” and if the answer is yes, they say “Great, we’ll send you a link.” But the Collison brothers weren’t going to wait. When anyone agreed to try Stripe they’d say “Right then, give me your laptop” and set them up on the spot.

There are two reasons founders resist going out and recruiting users individually. One is a combination of shyness and laziness. They’d rather sit at home writing code than go out and talk to a bunch of strangers and probably be rejected by most of them.” from “Do Things that Don’t Scale by Paul Graham”

I’m Sven, and this is Thoughtful Friday. The email that helps you understand and shape the one thing that will power the future: data. I’m also writing a book about the data mesh part of that.

But unlike the Thursday mails, this is a deep dive into one of my thoughts. It’s rough and my perspective on this will likely change. So for you, I think you should deeply think about this topic this week:

Why documentation is so important to any company’s success
Why sales material is mixed up with docs so often
Why the data space has a lot of improvement potential here
How the diataxis framework can be used to get substantial improvements, in less time

🔮🔮🔮🔮🔮🔮🔮🔮🔮🔮🔮🔮🔮🔮🔮🔮🔮

The data space right now is filled with hundreds of start-ups, many of them having an inclination to openness and in particular open-source. Their challenge is to both, make the company and its product, as well as an open-source project successful.

And yet, most of them suck at documentation. Why should that be important? Because without great documentation, no one will use either your product or your OS project.

Key Point: Documentation is about getting people from the decision to use your project, to full usage of most of the features.

This space however has a second challenge I observe, because it’s not just about documentation, getting people to use the product extensively. It’s also about getting them to use the product or project at all. For that companies utilize demos, free trials, end-to-end examples, case studies, lots and lots of content, and (possibly also human) resources.

And yet, a lot of them fail here as well, because they are mixing up the sales & marketing material with the actual documentation.

Key Point: Sales & Marketing material is about showing the value proposition. About getting people to make the decision to use your project or product. It’s step 0, before documentation.

The convolution of sales & marketing material seems more common than I thought at first. Because it happens everywhere where there are no dedicated Sales & Marketing resources, and no clear boundaries. Start-ups, open-source projects, evangelists, internal development teams, and product teams working on new stuff, all of them are subject to this convolution problem.

Key Point: At the intersection of data, technical topics, open-source, and start-ups, the data space right now is particularly prone to bad documentation and the convolution of marketing & sales materials with documentation itself.

So what I’d like to do here is to show you, how a common way of thinking about documentation called “diataxis”, created by Daniele Procida, can be applied to think and structure all of these materials better, and faster.

The Knowledge System

“It’s not actually a secret and it certainly shouldn’t be: documentation needs to include and be structured around its four different functions: tutorials, how-to guides, technical reference, and explanation. Each of them requires a distinct mode of writing. People working with software need these four different kinds of documentation at different times, in different circumstances - so software usually needs them all, and they should all be integrated into your documentation.”

What I like to do is to add a fifth category here, because as said above, this is about people “working” with a piece of software. Step 0 however is always to make them aware of a piece of software, to get them to try it!

Second, I don’t like to call it a “documentation system”. Let’s call it a knowledge system, because docs kind of imply written things, but knowledge can be transmitted verbally, through videos, working demos, and so many other mediums.

But what’s the point of having a framework without knowing why? If the point of the docs framework is to get people to use it, I’d say the point of the whole framework is to:

1. Get people from not even knowing of a product’s existence to full usage

2. , as fast as possible!

To be able to do that we need to do two things, which the Collison brothers excel:

1. To provide the knowledge inside all five categories, well-separated

2. To then focus on moving between the categories as fast as possible

The Collison brothers in Action: Category 0->1

What the Collison brothers did differently than other YC founders was that they didn’t just focus on Step 0, selling people on trying something.

They went one step further and got them, on the spot, to learn how to use Stripe. They had both pieces of knowledge and moved extremely fast between them.

Another example: When I do workshops/talks I really like to have them in two parts, one being focused on selling the value of for instance “Infrastructure as Code”, and a second one focused on getting people to experience it, getting started as quickly as possible. So that at the end of the day, people know why to use it, and also know how to do the very first steps.

What the Collison brothers did differently than other YC founders was that they didn’t just focus on Step 0, selling people on trying something.

They went one step further and got them, on the spot, to learn how to use Stripe. They had both pieces of knowledge and moved extremely fast between them.

Category 1: Get Started as quickly as possible

The Django framework does a great job at organizing their knowledge to get people to start as quickly as possible.

They leverage three ideas. One is the idea of an “Overview”, but a short one. Second, they separated “from scratch” and “first useful steps” into different categories, because after all, you might’ve already created an account or installed the software earlier.

Third, they break their lessons into super small pieces, each one of them an actual individual useful lesson.

Let’s take a look at a set of docs that might need some improvement in this first step, https://docs.kensu.io/.

Note: All that stuff is still new as far as I know, so I think Andy et. al. don’t mind me poking around it a bit.

I was playing around with Kensu and looked at the docs and simply got the feeling, that stuff is missing. So let’s see, using this framework, what is missing.

Ah, so basically, kensu.io right now only has a small subset of the “tutorial” docs ready. They provided something they call a “use case”. But I am not sure, what category it aims at.

If you look at the docs, it will be really hard to get started. It will also be equally hard to answer a specific question because there are no “how-tos”. I love “use cases” but they are for me mostly marketing material, category 0.

So what’s missing here is a good and filled tutorial section, something that shows me in multiple little lessons how to use Kensu.io and its library.

Category 2: Solving specific problems with How-Tos

Ok now once I get started with something, I learned that cooking is important, and I learned how to handle pots, and spaghetti in general, I will need a spaghetti & tomato sauce recipe to solve my actual problem.

This is a very practical thing, I got started, and did my first few steps, but they weren’t relevant to my actual problem. Now I got an actual problem and want it solved.

Let us pick a good problem: I want to use hex and create a graphic from a CSV data set. That’s a pretty specific key problem I hope. Head over to the docs and try to find the solution.

This is a very practical thing, I got started, and did my first few steps, but they weren’t relevant to my actual problem. Now I got an actual problem and want it solved.

Let us pick a good problem: I want to use hex and create a graphic from a CSV data set. That’s a pretty specific key problem I hope. Head over to the docs and try to find the solution.

This is a very practical thing, I got started, and did my first few steps, but they weren’t relevant to my actual problem. Now I got an actual problem and want it solved.

Let us pick a good problem: I want to use hex and create a graphic from a CSV data set. That’s a pretty specific key problem I hope. Head over to the docs and try to find the solution.

Note: I really like how hex is working on organizing their knowledge base, so again I don’t think they mind me poking around in it a bit for the purposes of this thought.

As you can see, for this main recipe, I’ll have to do some serious work to get it done. And that is the case for all major use cases because from what I can tell, the hex docs right now throw a lot of these categories together. And that makes it really hard to find what you’re looking for.

So what could a cool doc space look like?

If you happen to know the hex docs, you might notice that they actually have the content there, it’s just not organized into these categories.

Key point: organizing knowledge using this 5 cell system is actually faster & easier because you end up writing the content anyways, it might just be lost without a clear system that aligns with the reader’s intent.

Now that we know how to make spaghetti, let’s look at organizing the rest of the knowledge base.

Category 3 & 4: References & Explanation

Turns out, if you look at the core of the hex.tech documentation, a lot of it is a mix of references & explanations. That’s very much like the django documentation, the explanations, or topic guides as they call them, make up a lot of the framework itself. It’s opinionated and you’ll need to understand the reasons behind things.

But there is one key difference, the django docs aim to clearly separate the reference from the explanations because references always aim at a very practical “I’m working on this thing and I need to know what this attribute means” kind of question whereas explanations aim to answer things like “I just want to know in general what these views are” kind of questions.

These are very different intents of the reader of the docs, and yet this is very often thrown into the same bucket.

I just stumbled through the meltano documentation and while it’s pretty messy, it at least has the placeholders for these two categories:

One in-depth “concepts” explanation section and one reference section.

Key point: References & Explanations are often thrown together, but they serve very different readers’ intents, so it’s important to separate them. That makes writing them a lot easy btw.! References can usually be generated out of code automatically, while explanations are a whole different thing and might even contain architectural decisions.

Separating Category 0 from the rest

I also stumbled over the starting page of the hex docs, which you can see below. Interestingly this with the Collison story above prompted me to include a category 0 into the framework. Notice the “use cases” section below?

Turns out, if you think about it, use cases and “end-to-end” examples usually are often meant for marketing. In some cases, they end up being like a tutorial or a how-to guide, but most of the ones I see, like the one from Kensu above, are meant to spike your interest in the product.

But here a lot of products fall into a fallacy, they think because they already covered “how to load CSVs into the system and turn it into a graph” inside some use case that they use for marketing, they don’t need to cover it in the docs.

But the truth is, since use cases aim at a very different audience (usually), they leave out important parts, and at the same time don’t tell the whole story.

So it is important to realize this category is there and needs to be separated. And FWIW, as you can see, the hex team already separated the use cases out of the “true docs”, they just appear on the starting page.

Key point: Don’t throw together use cases (meant to get people to consider your product) with the rest of your knowledge base (meant to get people to use your product to the full extent).

—

Are you ready to get more people onto your product by thinking about your knowledge base differently? If so, don’t leave feedback and just get to work! Otherwise, I’d love some feedback.

It is terrible | It’s pretty bad | average newsletter… | good content… | I love it, will forward!

By Sven Balnojan

Data; Business Intelligence; Machine Learning, Artificial Intelligence; Everything about what powers our future.

Tweet Share

In order to unsubscribe, click here.

If you were forwarded this newsletter and you like it, you can subscribe here.

Three Data Point Thursday

Discussion about this post