🐰 #38 Data Meshes, Data Meshes, and more Data Meshes! ThDPTh #38 🐰
Today we will take a closer look at a bunch of data meshes: at Saxo Bank, at ABN AMRO, and finally an example from a great talk given by Samia Rahman et al.
Data will power every piece of our existence in the near future. I collect “Data Points” to help understand & shape this future.
If you want to support this, please share it on Twitter, LinkedIn, or Facebook.
… so, data meshes are a (“socio-technological”) tool, so when is it the right tool???
… some would say, “when you have complex domains/complex data”
… from what I’ve seen, I tend to disagree here.
That’s a problem-focused perspective that doesn’t account for the value of data. In fact, I don’t think complex domains make data meshes work well at all. Complex domains make data meshes expensive.
Data meshes are a great tool when you have a lot of use for complex data! That’s a pretty sharp difference, as I see a lot of companies applying data meshes simply because they have complex data. But it turns out they usually just have that complex data, and no one who actually wants to use it (at least not at the cost it comes at…).
Sometimes your company’s data journey is stuck simply because your company doesn’t really want to, or can’t, extract that much value from data after all. That might feel really frustrating inside an energetic data team, but for the company overall, that’s totally fine.
FWIW, I outlined in my articles on good data vs bad data that I think every company & industry will sooner or later align its strategy with data and make it the core of everything. If yours hasn’t, you’re simply not there yet.
(1) 🔮 Saxo Bank’s Data Mesh
As a bank, Saxo Bank seems to be a good candidate for a data mesh. In contrast to what is sometimes said about the data mesh (“it’s good for when you have complex domains”), I believe that the data mesh is potentially a good tool for when you have “complex data value propositions”. You can have all the complex data domains in the world, but if you will not derive value from them, there is no good reason to implement a data mesh.
With that said, Saxo Bank seems to be a good place to start a data mesh. That is also highlighted by the fact that they started their journey before Zhamak published her data mesh articles. I’ll just highlight a few points I find interesting:
“Given the pace of change across the organisation, we knew that we couldn’t rely on a central team to create and populate a canonical data model for the enterprise. Our approach must scale. Instead, we federated the ownership of domain data and their representations and centralised oversight.”
They carried over the idea of platform thinking from DevOps….
“a high DevOps evolution correlates strongly with self-service capabilities, as it allows application teams to be more efficient, controls to be improved, and platform teams to focus on continuous infrastructure improvement.”
They also took a shot at standardization across data language, a key problem in many orgs…
“A standard language to emerge and ensure that information can be efficiently used across the business; this standard or “ubiquitous” language is central to the idea of domain-driven design (DDD) as a means for removing barriers between developers and domain experts”
I like the focus they put on schemata and the idea that the optimal schema is one that allows for semantics…
“One consideration that is often overlooked is the ease with which semantic annotations (aka metadata) can be embedded into the schema.”
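To make the idea of semantics living inside the schema a bit more concrete, here is a minimal sketch in Python. It is not Saxo Bank’s schema or tooling: the record, the field names, and the annotation keys (pii, unit, glossary_term) are all made up for illustration. The layout is loosely modeled on an Avro-style record, where extra keys on a field can carry metadata.

```python
# Hypothetical schema: every field name and annotation key is invented for illustration.
TRADE_SCHEMA = {
    "type": "record",
    "name": "Trade",
    "fields": [
        {"name": "trade_id", "type": "string", "doc": "Unique trade identifier"},
        {
            "name": "counterparty",
            "type": "string",
            "doc": "Legal entity on the other side of the trade",
            "pii": True,
            "glossary_term": "Counterparty",
        },
        {"name": "notional", "type": "double", "doc": "Trade size", "unit": "EUR"},
    ],
}

# Keys that describe structure; everything else on a field is treated as semantics.
STRUCTURAL_KEYS = {"name", "type"}


def semantic_annotations(schema):
    """Return {field name: annotations} with the structural keys dropped."""
    return {
        field["name"]: {k: v for k, v in field.items() if k not in STRUCTURAL_KEYS}
        for field in schema["fields"]
    }


def fields_tagged(schema, key, value=True):
    """List the names of all fields whose annotation `key` equals `value`."""
    return [f["name"] for f in schema["fields"] if f.get(key) == value]
```

Because the annotations travel with the schema, a catalog or a governance check could answer questions like “which fields are personal data?” with something as small as fields_tagged(TRADE_SCHEMA, "pii"), without a separate metadata store.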
Interestingly, they built their data mesh on top of Kafka, just like the company Gloo.us, albeit on a much larger scale. Anyway, if you’re interested in a finance application of data meshes or in a Kafka-specific tech stack, go check the article out!
Resources:
(2) 🎁 ABN AMRO’s Data Mesh on Azure
ABN AMRO, the third-largest bank in the Netherlands, has a really interesting data mesh story to share, partly because they started their journey in 2016, and partly because the architect behind it (Piethein Strengholt) already wrote a book about it! Their data mesh is built on top of Azure. Again I’ll just share a few highlights. The one thing I like the most is their take on communication between Providers & Consumers. Basically, they don’t see the “data” or the data mesh as a separate thing; it is simply integrated into the three different ways of communication. I really like the idea of not having something “special for data” but rather simply optimizing the communication across teams inside the company. (And yes, the data mesh with its “data ports” certainly tries to make data something special and different.)
(Image from ABN Amro, Piethein Strengholt)
I like the way they describe the typical challenges that arise from traditional data integration styles ….
“Chances are relatively high that the meaning of data differs across different domains, departments, and systems. Data elements can have the same names, but their meaning and definitions differ, so we either end up creating many variations or just accepting the differences and inconsistencies. The more data we add, and the more conflicts and inconsistencies in definitions that arise, the more difficult it will be to harmonize.”
“Enterprise data warehouses (EDWs) behave like integration databases. They act as data stores for multiple data-consuming applications. This means that they are a point of coupling between all the applications that want to access it”
I find this quote particularly important: data warehouses introduce coupling. That’s kind of the point of them, and it’s also something we really try to avoid in software engineering!
“ABN AMRO, is an architecture which allows domains or teams to change and exchange data more independently in a federated and self-service model.”
They really understand the idea of a platform that enables other teams to speed up and yet still keep a keen eye on governance….
“By ensuring that everything flows through the same single logical layer, maximum transparency and increased speed of consumption is created. Within ABN AMRO, we deployed a sheer of data management capabilities in this single logical layer, for security, observability, discoverability, linage and linkage, quality monitoring, orchestration, notification, and so on.”
I also really like how they go deep into metadata.
Governance is an important topic in data mesh, and people often feel like decentralization means losing central governance. I like a quote from ABN AMRO which basically says that data mesh is the way to ensure governance:
“We have established a data governance body within our company, and no data is allowed to be distributed or consumed without clear ownership. So, for each data set we onboard on the central platform, we want to ensure data accountability.”
That’s an interesting thought to end on: data mesh means ownership, and only with ownership comes true governance.
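The “no ownership, no distribution” rule from the quote is easy to picture as a gate at onboarding time. The following is only a sketch under my own assumptions, not ABN AMRO’s platform: the DataSet, Platform, and OwnershipError names are hypothetical.

```python
from dataclasses import dataclass, field


class OwnershipError(ValueError):
    """Raised when a data set is offered to the platform without an owner."""


@dataclass
class DataSet:
    name: str
    owner: str = ""  # accountable team or person; empty means unowned


@dataclass
class Platform:
    # Hypothetical central catalog: data set name -> DataSet
    catalog: dict = field(default_factory=dict)

    def onboard(self, dataset):
        """Register a data set, but only if somebody is accountable for it."""
        if not dataset.owner.strip():
            raise OwnershipError(f"{dataset.name!r} has no owner and cannot be onboarded")
        self.catalog[dataset.name] = dataset

    def consume(self, name):
        """Consumers can only reach data that went through onboarding."""
        if name not in self.catalog:
            raise KeyError(f"{name!r} is not available on the platform")
        return self.catalog[name]
```

Since consumption only goes through the catalog, every data set a consumer can reach has an accountable owner by construction, which is one way to read “only with ownership comes true governance”.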
Resources:
ABN Amro’s data journey with Piethein Strengholt
(3) 😍 A Starter Data Mesh by Samia Rahman
Samia Rahman & Stefanie Cappello put up a really good example for “Women Who Code” of a data mesh and its self-serve infrastructure. It’s supplied with code and really goes into the details.
So in essence, you get a little starter data mesh just by following along with them. I like how they explain a lot of concepts with examples.
One thing I see a little differently, for the reasons explained above, is a part at the beginning where they say (I’m paraphrasing) “data meshes work well when you have complex domains”.
The hands-on approach is great. They explain the use of config files, which are an extremely simple way of abstracting away lots of things, as well as the pros & cons of using a Pull Request model. It’s a great walk through the basics of self-serve platforms and managing the first 1–2 iterations.
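To show why config files abstract away so much, here is a small sketch of the pattern. It is not the code from the talk: the config keys, the allowed schedules, and the validate/plan helpers are assumptions of mine. A team describes what it wants in a few lines, and the platform checks the config (the kind of check that could run on every Pull Request) and expands it into concrete steps.

```python
import json

# Hypothetical config a domain team would submit via Pull Request.
CONFIG_TEXT = """
{
  "pipeline": "orders_daily",
  "owner": "orders-team",
  "source": "orders_raw",
  "destination": "orders_clean",
  "schedule": "daily"
}
"""

REQUIRED_KEYS = {"pipeline", "owner", "source", "destination", "schedule"}
ALLOWED_SCHEDULES = {"hourly", "daily", "weekly"}


def validate(config):
    """Return a list of problems; an empty list means the config is acceptable."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(config))]
    schedule = config.get("schedule")
    if schedule is not None and schedule not in ALLOWED_SCHEDULES:
        problems.append(f"unknown schedule: {schedule}")
    return problems


def plan(config):
    """Expand the short config into the steps the platform would run."""
    return [
        f"read {config['source']}",
        f"write {config['destination']}",
        f"schedule {config['pipeline']} {config['schedule']}",
    ]


config = json.loads(CONFIG_TEXT)
```

The team only ever touches the five config keys, while everything behind plan() stays with the platform team. The trade-off of the Pull Request model shows up here too: validate() gives fast, automated feedback, but a human still has to review and merge each change.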
Resources:
Women Who Code Self-serve infrastructure talk by Samia Rahman & Stefanie Cappello.
🎄 In Other News & Thanks
Thanks for reading this far! I’d also love it if you shared this newsletter with people whom you think might be interested in it.
P.S.: I share things that matter, not the most recent ones. I share books, research papers, and tools. I try to provide a simple way of understanding all these things. I tend to be opinionated. You can always hit the unsubscribe button!
Data; Business Intelligence; Machine Learning, Artificial Intelligence; Everything about what powers our future.