How to NOT ChatGPT; Three Data Point Thursday #86
Investment is being funnelled into products with poor long-term prospects. More data mesh research. OS data start-ups might be the place for you!
I’m Sven and I’m writing this to help you (1) build excellent data companies, (2) build great data-heavy products, (3) become a high-performance data team & (4) build great things with open source.
Every other Thursday, I share my opinion on three pieces of content about the data world.
Let’s dive in!
“As always, awesome stuff! I enjoy learning about completely foreign concepts in easy-to-swallow pieces of information” - (anonymous rating of a Thoughtful Friday issue)
This response made my heart jump for joy. My first response was to click the like button. Then I realized, there isn’t one ;) So consider this a “like”, whoever you are. It’s exactly what I try to do here.
gAI products & startups are appearing everywhere
Most of the investment (other than in OpenAI) is funneled into products with poor long-term prospects
More data mesh research is happening in Germany
Open-source data startups might be the best investment for your time
🐰🐰🐰🐰🐰🐰🐰🐰🐰🐰🐰🐰🐰🐰🐰
(1) How to build the wrong thing with generative AI
What: The Economist takes a stab at ChatGPT. Outside of OpenAI, we’ve already seen $11bn poured into generative-AI startups. Most of that went into chatbots and text generation, as well as other platforms like ChatGPT. OpenAI, of course, tops it all with a new $10bn investment from Microsoft. According to the article, the timing of the crypto winter, together with the emergence of ChatGPT, makes for a Cambrian explosion of investments and startups in the generative-AI sector.
My perspective: Here is a very simple guide on what to build and what not to build with generative AI:
Do build platforms (“you can’t go wrong selling picks and shovels in a gold rush”).
Do build aids for humans to bridge the gap between a bottom 50% professional and a top 5% one (GitHub Copilot, etc.).
Do build replacements for humans with gAI (arguably art created by AI, as more and more newsletters use images created by DALL-E instead of stock images; Descript AI).
Do not build human aids for things we don’t want to be experts in (Ask Seneca, Notion AI).
Do not build something based on generically available data & models (Descript AI again...)
Why?
Sadly, most of the investment seems to fall into the last two categories, (4) and (5). Products in category (5) have business models that are easy to attack. As standalone products, they have next to no chance of lasting long-term.
Descript AI, the cheap alternative to Adobe Premiere for video editing, has a great transcription feature, as long as you love the rest of its features. But once you hit a wall there, you will upgrade to Premiere Pro and enjoy its equally good transcription feature. After all, voice-to-text isn’t proprietary technology.
I enjoy that Gmail proposes text snippets to use, but I wouldn’t pay for it. It’s a nice cherry on top, not a standalone value. The reason is simple: it’s not taking me out of the loop, and I’m not trying to be an email-writing expert, so the value of having this is relatively small. The same goes for assisted driving. We want autonomous driving, not assisted.
That’s a big difference from things we do want to be experts in. I’m a big believer in coding assistance like that provided by GitHub Copilot.
If you do want to provide gAI for things humans don’t want to be experts in, you’ll have to build something that actually replaces the human.
Resource: https://www.economist.com/business/2023/02/28/investors-are-going-nuts-for-chatgpt-ish-artificial-intelligence (paywalled)
(2) More Data Mesh Research
What: A team of researchers from Germany conducted another investigation into Data Mesh, uncovering motivations, best practices, and challenges along the way.
The paper is just 12 pages long and packed with results. It is based on 15 semi-structured interviews, so take it with a grain of salt.
I applaud every effort toward more fundamental Data Mesh research.
Resource: https://arxiv.org/abs/2302.01713
(3) Runa Capital ROSS Index
What: Runa Capital publishes the “ROSS” index of the fastest-growing OS startups, based on GitHub star growth.
My perspective: What I enjoy most about these indices is that I know almost none of these startups. I just got a bunch of new interesting case studies to look into ;) The share of data startups is very low though, just 13/44. That’s a stark contrast to the successful OS companies that go public, where the share of data companies is more like 50%.
Sounds like either things have changed since 2021, or you should go into data if you are considering an OS startup.
Resource: https://runacap.com/ross-index/annual-2022/
Shameless plugs of things by me
Check out Data Mesh in Action (co-author, book)
and Build a Small Dockerized Data Mesh (author, liveProject in Python).
I’m also on Medium with more unique content.
I truly believe that you can take a lot of shortcuts by reading pieces from people with real experience who are able to condense their wisdom into words.
And that’s what I’m collecting here, little pieces of wisdom from other smart people.
You’re welcome to email me with questions or to raise issues I should discuss. If you know a great topic, let me know about it.
If you feel like this might be worthwhile to someone else, go ahead and pass it along. Finding good reads is always a challenge, so they will appreciate it.
Until next week,
Sven