🐰 #33 Data Visualizations with Bertini, dbt Codegen, Open Source Commercialization on a16z; ThDPTh #33 🐰

Aug 19, 2021

How data visualizations help machine learning, a useful dbt tool, and an a16z talk on open source commercialization.

Data will power every piece of our existence in the near future. I collect “Data Points” to help understand & shape this future.

If you want to support this, read it.

(1) Prof. Bertini on Data Visualization

“A basic scatter plot usually does the job for me …. No one wants my upsell on the idea…maybe visualization is our hope to help with machine learning interpretability/understandability…”

I really enjoyed this episode of the Data Skeptic podcast, which I found on RudderStack’s summer reading list. I recently sat down to write up a few product management lessons I had learned over the last couple of years, and one that came up is a “visualizations first/GUIs first” approach to data product management.

Because you can neither touch nor see data, the best you can do is give people something they can see or feel. With that in mind, I start the products and features I build with a draft of a GUI, even if I don’t intend to build a GUI at all, because the visual story matters. This episode is largely about that, so check it out.

Data Skeptic: AI, ML, DS

A podcast about data science, artificial intelligence, machine learning, and data.

dataskeptic.com

(2) dbt Codegen

We all love dbt. The first step in onboarding a new data source or table is to stage it in a raw format. That usually requires writing a plain and simple SQL-Jinja statement, which can feel quite repetitive depending on the number of tables you want to stage.

dbt Codegen is a cool tool that takes away some of that hassle by generating the base models, model YAMLs, and so on. If you work with dbt, I encourage you to check it out; it looks really helpful to me.
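To make the "repetitive staging SQL" point concrete, here is a minimal Python sketch of what generating such a model could look like. It is not dbt Codegen's implementation (the package itself is a set of dbt macros); the function name `render_staging_model`, the source name `raw`, the table `orders`, and the column list are all made up for illustration.

```python
def render_staging_model(source_name: str, table_name: str, columns: list[str]) -> str:
    """Return SQL-Jinja text for a simple staging model.

    Illustrative only: a hypothetical helper, not part of dbt or dbt Codegen.
    """
    # One lowercased column per line, comma-separated.
    column_lines = ",\n".join(f"        {col.lower()}" for col in columns)
    return (
        "with source as (\n\n"
        # Doubled braces in the f-string emit the literal Jinja {{ ... }} call.
        f"    select * from {{{{ source('{source_name}', '{table_name}') }}}}\n\n"
        "),\n\n"
        "renamed as (\n\n"
        "    select\n"
        f"{column_lines}\n\n"
        "    from source\n\n"
        ")\n\n"
        "select * from renamed\n"
    )


if __name__ == "__main__":
    # Hypothetical source and columns, just to show the generated text.
    print(render_staging_model("raw", "orders", ["ID", "Amount", "Created_At"]))
```

Writing this by hand for one table is trivial; doing it for fifty tables is exactly the kind of chore a generator is good at.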

GitHub - dbt-labs/dbt-codegen: Macros that generate dbt code

github.com

🔮🔮🔮 Data Company Corner 🔮🔮🔮

Stuff that might be interesting for anyone at the front line of the data world, inside a data company, inspired by much positive feedback from my article on commercial open source software data companies.

(3) Open-Source Talk on a16z

Peter Levine lays out a very extensive framework for going from open-source projects to commercial companies. There are really just three things I’d like to say before letting you dive into the talk:

1. I wish I had read this earlier.
2. I love how he explains that “your code is not a competitive advantage, your community is!” (which very much underpins my light criticism of dbt Labs’ current strategy).
3. I wish he had made one point clearer: If you’re in the data space, you have to enter the open-source space. If you’re in the open-source space, you’re either playing to WIN or not playing at all. Remember, don’t try to become AltaVista.

Open Source: From Community to Commercialization - Andreessen Horowitz

We are in the midst of an open source renaissance. But how do you turn an open source project

a16z.com

🎄 In Other News & Thanks

Thanks for reading this far! I’d also love it if you shared this newsletter with people who you think might be interested in it.

P.S.: I share things that matter, not the most recent ones. I share books, research papers, and tools. I try to provide a simple way of understanding all these things. I tend to be opinionated. You can always hit the unsubscribe button!

By Sven Balnojan

Data; Business Intelligence; Machine Learning, Artificial Intelligence; Everything about what powers our future.

Three Data Point Thursday