In Case You Missed It, July Recap
FDE, Surgical Fine-Tuning, Top DE Newsletters, LLM architectures & how to become a data business
This is Three Data Point Thursday, making your business smarter with data & AI.
This special in-between issue serves as a recap of the last editions and more of my writing. It features a few quotes from each piece so you can get a feel for them and quickly jump around the Three Data Point Thursday universe.
Enjoy!
What is Surgical Fine-Tuning and Why You Should Care
“TL;DR: Surgical fine-tuning makes an ML algorithm relevant to your specific business context faster & better through precision changes.
That’s an advancement over regular fine-tuning.
Fine-tuning: take any general-purpose ML model (ChatGPT, image models, ...). Then give it your data set so it learns your "kinks".”
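To make "precision changes" concrete: surgical fine-tuning is usually described as updating only a chosen subset of a pretrained model's layers while the rest stay frozen. The snippet below is a minimal sketch of that idea in plain Python, not the method from the original piece. The layer names (`block1`, `block2`, `head`), the weights, and the gradients are all made up for illustration.

```python
def sgd_step(params, grads, trainable, lr=0.1):
    """Return updated parameters, changing only the layers named in `trainable`."""
    updated = {}
    for name, weights in params.items():
        if name in trainable:
            # Selected layer: take a plain gradient-descent step.
            updated[name] = [w - lr * g for w, g in zip(weights, grads[name])]
        else:
            # Frozen layer: copied over unchanged.
            updated[name] = list(weights)
    return updated


# Hypothetical three-layer model with made-up weights and gradients.
params = {"block1": [0.5, -0.2], "block2": [1.0, 0.3], "head": [0.1, 0.4]}
grads = {"block1": [1.0, 1.0], "block2": [2.0, -1.0], "head": [0.5, 0.5]}

# "Surgical": only block1 is allowed to move.
tuned = sgd_step(params, grads, trainable={"block1"})
```

Regular fine-tuning would correspond to passing every layer name in `trainable`; which subset works best depends on your data and is something to test, not assume.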
How to choose your LLM architecture - Yes, you should have one
“Building something with LLMs? If not, you should. Here’s how from a technical perspective.
TL;DR: There’s a clear emerging architecture for starting out.
Architecture:
Use in-context learning;
Use OpenAI's API to create an embedding;
Use Pinecone to store your embedding;
Use an OpenAI model as LLM.
And use either LangChain or LlamaIndex for the orchestration.”
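The steps above can be sketched end to end without any external service. This is a toy illustration under loud assumptions: `embed` is a bag-of-words stand-in for an embeddings API call, `InMemoryVectorStore` is a stand-in for a hosted vector database such as Pinecone, and the final call to an LLM is left out. None of these names come from OpenAI, Pinecone, LangChain, or LlamaIndex.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding; a stand-in for an embeddings API call."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a hosted vector database."""

    def __init__(self):
        self._items = {}

    def upsert(self, doc_id, text):
        # Insert or replace the document and its embedding.
        self._items[doc_id] = (text, embed(text))

    def query(self, question, top_k=1):
        # Rank stored documents by similarity to the question.
        q = embed(question)
        ranked = sorted(
            self._items.items(),
            key=lambda item: cosine(q, item[1][1]),
            reverse=True,
        )
        return [(doc_id, text) for doc_id, (text, _) in ranked[:top_k]]


def build_prompt(question, store, top_k=1):
    """In-context learning: put the retrieved snippets into the prompt itself."""
    context = "\n".join(text for _, text in store.query(question, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


store = InMemoryVectorStore()
store.upsert("refunds", "Refunds are processed within 14 days of the return.")
store.upsert("shipping", "Standard shipping takes 3 to 5 business days.")

# In a real setup this prompt would be sent to the LLM of your choice.
prompt = build_prompt("How long does shipping take?", store)
```

In a real build you would swap `embed` and the store for the hosted services, and an orchestration library would handle the glue that `build_prompt` does by hand here.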
How to become a data business
“What’s a data business? Data business = the Amazons, Netflixes, Airbnbs, and Spotifys of this world. The businesses that bring value to customers with the help of data.
Why care? We have a simple opinion: Every business has to become a data business or go under. There is no good reason to build any other startup or invest in new products that are not data-heavy.
Founding is easy, but where do you start as an established business?
The six typical use cases: [...] but only one is a good one!”
Functional Data Engineering 101
“Are you a data engineer looking to level up your game? By bringing hot software engineering practices into your life? Then you’ve come to the right place. Functional Data Engineering (FDE), pioneered by Maxime Beauchemin, is a vast topic - we’re here so you don’t waste any time getting started.
Are your data systems troubled by… a lack of reproducibility? No service continuity? Lacking testability? Or aren’t your data pipelines as disposable as you’d like?
Functional data engineering (FDE) might be your way out.”
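A small sketch of what that can look like in practice, assuming a dict stands in for the warehouse and a made-up `daily_revenue` table: the task is a pure function of its input, and each run overwrites one whole partition instead of appending to it. All names and data here are hypothetical.

```python
from collections import defaultdict


def daily_revenue(orders):
    """Pure task: the output depends only on the input rows.

    No clock reads, no hidden state, and no mutation of `orders`.
    """
    totals = defaultdict(float)
    for order in orders:
        totals[order["customer"]] += order["amount"]
    return dict(sorted(totals.items()))


def run_partition(warehouse, source, day):
    """Rebuild one day's partition and overwrite it wholesale (no appends)."""
    warehouse[("daily_revenue", day)] = daily_revenue(source[day])


# Hypothetical source data, partitioned by day.
source = {
    "2023-07-01": [
        {"customer": "acme", "amount": 10.0},
        {"customer": "acme", "amount": 5.0},
        {"customer": "globex", "amount": 7.5},
    ]
}
warehouse = {}

run_partition(warehouse, source, "2023-07-01")
first_run = dict(warehouse)
run_partition(warehouse, source, "2023-07-01")  # a rerun or backfill
```

Because the rerun produces the same partition, the pipeline is reproducible and easy to test: you can throw a partition away and rebuild it from the source at any time.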
Top Data Engineering Newsletters
Interested in data engineering per se? We collected our top choices for data engineering-focused newsletters we love, besides the dozens of surprising sources we dig through to create this newsletter.