Data Mesh Research, BeautifulPlaces.ai, a16z data architectures in 2022; ThDPTh #70
I’m Sven, and this is the Three Data Point Thursday. The email that helps you understand and shape the one thing that will power the future: data. I’m also writing a book about the data mesh part of that.
Time to Read: 9 min
Another week of data thoughts:
- Current data meshes vary a lot, but all fall into one generic framework
- We’re still not dealing with the data that will be so crucial 5-10 years from now
- A16z updated their great default data architectures to the 2022 standard
A Master’s Thesis on the Data Mesh
What: I’m sharing the summary of a master’s thesis about data meshes, written at the Technical University of Berlin by Jan Nitsche. His LinkedIn still says he’s in “discovery mode”, and he truly is. His thesis explores the principles behind the data mesh and how they connect to each other: Is one more important than the others? Which one comes first? To get to his conclusions, Jan conducted a series of interviews with data meshy people.
My Perspective: I think it’s great that some actual research into the data mesh idea is finally happening. The main outcomes of his work, to put them into my words:
- The four principles as outlined by Zhamak can be used independently and to varying degrees, and they are.
- Distributed domain ownership is usually the first step in the data mesh journey, as a kind of foundation for the other principles.
- The other three principles (platforms, data as a product, and federated computational governance) are then slowly added to an already working socio-technical architecture to solve scale issues that may or may not come up.
- Decentralization is the central tool the data mesh uses to close the value gaps between data producers & consumers.
I really enjoyed reading through his summary and encourage you to do the same.
Oh yeah, and Jan can now put “Master of the Data Mesh” on his CV which might serve him well.
Resource: the included PDF below (which will probably drop the open rate for this email from 90% to 40%!)
BeautifulPlaces.ai
What: This website features a collection of papers that use deep learning to understand the beauty of places. Sounds cool, right? It’s run by Chanuki Illushka Seresinhe, the current head of data science at Zoopla.
My perspective: I see the research done by Seresinhe et al. as a great example of how AI creeps into every corner of our existence, and yet how little of it has been turned into “proper” businesses. I’ve previously shared one implementation in the same direction by the German comparison website idealo.de, which basically judges whether an image is of “good quality” or “bad quality”.
And I am still wondering why these large engines aren’t provided for me to use. In essence, most of these efforts bring “structure to unstructured data”, and since unstructured data will make up almost all of the data the world produces in a couple of years, these tools are exactly what we need to dig through our mountains of data.
Just imagine the ability to run queries like
“SELECT MOST_BEAUTIFUL_IMG(Limit 10) FROM UNSTRUCTURED_IMAGE_BASE”
on your unstructured object storage to analyze your “unstructured” data.
So I’m waiting for that: structure in the unstructured. I haven’t seen it implemented yet, even though almost everyone already needs it.
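To make the wish a bit more concrete, here is a rough Python sketch of what a query like the one above would boil down to: score every image, keep the top N. Everything in it is made up for illustration. The function name most_beautiful, the image keys, and the scores are all hypothetical, and the scoring function is a stand-in for a real model (like the ones in the papers above), which is exactly the part that is still missing as a ready-to-use service.

```python
import heapq
from typing import Callable, Iterable, List, Tuple


def most_beautiful(
    image_keys: Iterable[str],
    score: Callable[[str], float],
    limit: int = 10,
) -> List[Tuple[str, float]]:
    """Return the `limit` highest-scoring images as (key, score) pairs."""
    # Score lazily, so we never hold more than `limit` candidates at once.
    scored = ((key, score(key)) for key in image_keys)
    return heapq.nlargest(limit, scored, key=lambda pair: pair[1])


# Hypothetical stand-in for a real "beauty" model: a lookup of made-up scores.
FAKE_SCORES = {"alps.jpg": 0.92, "parking_lot.jpg": 0.11, "beach.jpg": 0.78}


def fake_score(key: str) -> float:
    return FAKE_SCORES[key]


top = most_beautiful(FAKE_SCORES.keys(), fake_score, limit=2)
print(top)
```

In a real setup, the keys would come from listing your object storage and the scoring function would call a trained model. The ranking part is trivial; the model is the hard part.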
Modern Data Infrastructure 2022 a16z
What: a16z updated their great post about modern data infrastructures to catch up with two years of change.
My perspective: Apparently, stuff became more complex. Seriously though, I love to see this update of a great post. It simply provides good guidance in a world where so much legacy stuff is haunting all of us and so much is happening that it’s really hard to see the “good default” options. I encourage everyone to check it out.
🎄 Thanks => Feedback!
Thanks for reading! I’d also love it if you shared this newsletter with people whom you think might be interested in it.
And of course, please provide me with feedback: