🐰 #29 Sorry JPMorgan! MeltanoHub, Data Platforms; ThDPTh #29 🐰
Data Mesh at JP Morgan Chase, the MeltanoHub launch, and a quick and dirty guide to data platforms.
Data will power every piece of our existence in the near future. I collect “Data Points” to help understand & shape this future.
If you want to support this, please share it on Twitter, LinkedIn, or Facebook.
🎁 (1) Data Mesh at JP Morgan Chase
I just watched JPMC’s talk about their data mesh journey. It’s an interesting watch with some important parts in it, but I’d like to highlight four things:
JPMC is moving from an on-premise network to the cloud and calls it “Data Lake via Data Mesh”, so they are doing both at once: a cloud migration and a data mesh transformation.
They seem to lack a connection to company strategy. I really like the HelloFresh approach I shared in a previous newsletter, which derives the need for a data mesh from their flywheel (which is essential to the company strategy). In the JPMC case, their “goals” actually seem to be somewhat at odds with the data mesh paradigm. They seem to aim for lower cost (in terms of storage?), whereas a data mesh is of course more expensive than a centralized data lake.
They seem to be building a very AWS-centric data mesh, using a lot of AWS products. I usually warn against such a high level of coupling, because it looks to me like they are coupling on both the individual components and the cloud provider. That will leave them unable to switch out any component, should a better solution emerge.
I still think that at their scale they will very likely need a data mesh; they might just end up building the wrong one on their first try.
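To illustrate the coupling concern from the third point: one common way to keep a component swappable is to let consumers depend on a small interface instead of a specific vendor service. This is only a minimal sketch. `QueryEngine`, `InMemoryEngine`, and `row_count` are hypothetical names chosen for illustration, not anything shown in the JPMC talk.

```python
from typing import Protocol


class QueryEngine(Protocol):
    """What consumers depend on; any engine with this method can be plugged in."""

    def run(self, table: str) -> list[dict]: ...


class InMemoryEngine:
    """Toy engine; a vendor-specific adapter would expose the same method."""

    def __init__(self, tables: dict[str, list[dict]]) -> None:
        self._tables = tables

    def run(self, table: str) -> list[dict]:
        # Return a copy so callers cannot mutate the stored rows list.
        return list(self._tables[table])


def row_count(engine: QueryEngine, table: str) -> int:
    # Consumer code only knows the interface, so the engine can be replaced.
    return len(engine.run(table))


engine = InMemoryEngine({"trades": [{"id": 1}, {"id": 2}]})
print(row_count(engine, "trades"))
```

Swapping the engine then means writing one new adapter class, while every caller of `row_count` stays untouched.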
JPMC is building the data mesh on top of AWS, using a common infrastructure that also handles data ingestion, and then putting every domain/data product inside one AWS account. They use AWS Lake Formation to manage access rights and the like for their new data lake, but they are also keeping things open for a hybrid cloud strategy for quite a while. They will use AWS Athena for SQL-based access and GraphQL for application-level access.
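The layout described above can be sketched in a few lines. This is a toy model, not JPMC's implementation, and it calls no AWS API. It assumes that "one AWS account" means one account per data product, and it uses a central catalog as a stand-in for the access-rights layer. `DataProduct`, `Catalog`, the product names, and the account IDs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProduct:
    """One data product, isolated in its own (hypothetical) cloud account."""

    name: str
    domain: str
    account_id: str


@dataclass
class Catalog:
    """Central registry of products and read grants (stand-in for a governance layer)."""

    products: dict = field(default_factory=dict)
    grants: set = field(default_factory=set)

    def register(self, product: DataProduct) -> None:
        # Assumption: enforce one account per data product.
        for existing in self.products.values():
            if existing.account_id == product.account_id:
                raise ValueError(
                    f"account {product.account_id} already hosts {existing.name}"
                )
        self.products[product.name] = product

    def grant_read(self, consumer: str, product_name: str) -> None:
        # Grants can only point at registered products.
        if product_name not in self.products:
            raise KeyError(product_name)
        self.grants.add((consumer, product_name))

    def can_read(self, consumer: str, product_name: str) -> bool:
        return (consumer, product_name) in self.grants


catalog = Catalog()
catalog.register(DataProduct("trades", "markets", "111111111111"))
catalog.register(DataProduct("customers", "retail", "222222222222"))
catalog.grant_read("risk-reporting", "trades")
```

The point of the sketch is the separation of concerns: storage is isolated per product, while access decisions live in one shared place.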
One thing they figured out early on was that “data product thinking” is the hardest part of all of this. They put quite some thought into that, so be sure to listen to these segments.
Three very senior leaders at JPMorgan Chase …
🔥 (2) Meltano Hub launched!
In one of my earlier newsletters, I talked about the three challenges that Singer currently doesn’t address and that Meltano & Airbyte have to address to even put up a fight. It turns out Meltano is now starting to address number two (after already building a CDK): a hub or package management system for “taps”.
The hub is decent: it offers some information on recency, open PRs, etc. It also features both Meltano & Singer taps, so it’s a good starting point!
MeltanoHub for Singer launches with the largest library of open source connectors of any protocol.
🔮🔮🔮 Data Company Corner 🔮🔮🔮
Stuff that might be interesting for anyone at the front line of the data world, inside a data company, inspired by much positive feedback from my article on commercial open source software data companies.
📣 (3) Sven: Developing & Pricing OSS Products
(This is an article by me.) I just followed up on my open-source data article with a detailed dive into pricing & product development models, following a lot of what GitLab’s CEO Sid Sijbrandij talks about.
It’s all about how to avoid the danger of “service-wrapping” by the hyper-clouds, which I don’t think is a real danger, and about how the different pricing & product development models in use at Databricks, GitLab, or dbt Labs work.
Be sure to read it if you enjoyed my last article!
dbt Labs, formerly Fishtown Analytics, recently raised a large Series C. In the announcement blog post, Tristan Handy, CEO of dbt Labs, outlined the major risk…
🎄 In Other News & Thanks
Thanks for reading this far! I’d also love it if you shared this newsletter with people whom you think might be interested in it.
P.S.: I share things that matter, not the most recent ones. I share books, research papers, and tools. I try to provide a simple way of understanding all these things. I tend to be opinionated. You can always hit the unsubscribe button!
Data; Business Intelligence; Machine Learning, Artificial Intelligence; Everything about what powers our future.