💥 Neural IMA, Explainable AI, Time Series Forecasting; ThDPTh #14 💥

Apr 08, 2021

Lots of Machine Learning Libraries to assess image quality, produce explanations for your models, or forecast & classify time series.

Data will power every piece of our existence in the near future. I collect “Data Points” to help understand & shape this future.

If you want to support this, please share it on Twitter, LinkedIn, or Facebook.

(1) 🚀 Image Quality Assessment Implementation

The German price comparison website idealo.de provides an implementation of some interesting applied Google research from 2018 called “NIMA: Neural Image Assessment”. The idealo team open-sourced two neural networks based on the paper. The first network rates the aesthetic quality of an image, while the second takes a guess at its technical quality.

So basically, these two networks help you determine how pretty an image is or whether its quality sucks. I tried the code myself and found it easy to use. The typical use cases I can think of involve images you don’t have good control over, like catalogs of homes, articles, etc. that get uploaded to your platform. If you happen to work with any of these use cases, pay the GitHub repository a visit.
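To make “how pretty is this image” a bit more concrete: NIMA-style networks predict a distribution over the ratings 1 to 10 rather than a single number, and the headline score is the mean of that distribution. Here is a minimal sketch of that last step. The probabilities are made up for illustration, and the function names are my own, not part of the idealo repository.

```python
# Hypothetical output of a NIMA-style network: one probability per rating 1..10.
# These numbers are invented for illustration, not produced by a real model.
predicted_distribution = [0.01, 0.02, 0.05, 0.10, 0.20, 0.25, 0.20, 0.10, 0.05, 0.02]


def mean_score(distribution):
    """Expected rating: sum of rating * probability over ratings 1..10."""
    return sum(rating * p for rating, p in enumerate(distribution, start=1))


def score_std(distribution):
    """Spread of the predicted ratings around the mean score."""
    mu = mean_score(distribution)
    variance = sum(
        p * (rating - mu) ** 2 for rating, p in enumerate(distribution, start=1)
    )
    return variance ** 0.5


print(f"mean score: {mean_score(predicted_distribution):.2f}")
print(f"std dev:    {score_std(predicted_distribution):.2f}")
```

A mean near the top of the scale would suggest a good-looking image, and a large standard deviation hints that raters would disagree about it. Where you draw the line for “needs a better photo” is up to your use case.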

GitHub - idealo/image-quality-assessment

Convolutional Neural Networks to predict the aesthetic and technical quality of images.

github.com

(2) 📣 ELI5 for Explainable AI

Netflix launched a nice feature about a year ago: over each recommendation you now see something like “Because you watched Marvel’s The Avengers”. In short, an explanation of why you get this recommendation, and I love them! They make the recommendations much more appealing to me.

So I found the Python package ELI5 just as appealing when I stumbled over it. Basically, ELI5 allows you to get some kind of explanation for the predictions that common frameworks like Keras, XGBoost, etc. produce. Of course, these are nowhere near perfect, since they often come straight from feature importances and weights. Still, I believe having any kind of “explanation” is much better than having none.

A typical example could be a sales forecast you provide. I believe these things get used much more if we help people understand how they are produced. Telling them that “last week’s sales + current interest rates” are the most important determinants of next week’s sales goes a long way.

Another example would be any kind of customer classification you run to select targets for certain marketing campaigns. In all of these cases, displaying the 3–4 most important features that selected one particular customer goes a long way in earning the trust of the sales/marketing person on the other side, just as Netflix’s “Because you watched ...” phrases do for me.
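Here is a minimal sketch of what such a “because of ...” explanation boils down to for a simple linear sales model: multiply each weight by the feature value and show the largest contributions. This is a hand-rolled illustration of the idea, not ELI5’s API (ELI5 offers functions such as `eli5.explain_prediction` for real models). All weights, feature names, and values below are invented.

```python
# Hypothetical linear sales model: weights and bias are invented for illustration.
weights = {
    "last_week_sales": 0.8,
    "interest_rate": -120.0,
    "is_holiday_week": 35.0,
    "avg_temperature": 0.5,
}
bias = 50.0


def top_contributions(features, top_n=3):
    """Return the prediction and the top_n features by absolute contribution."""
    # For a linear model, each feature contributes weight * value to the output.
    contributions = {name: weights[name] * value for name, value in features.items()}
    prediction = bias + sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return prediction, ranked[:top_n]


# Made-up inputs for the week we want to forecast.
next_week = {
    "last_week_sales": 1000.0,
    "interest_rate": 0.5,
    "is_holiday_week": 1.0,
    "avg_temperature": 20.0,
}

prediction, top = top_contributions(next_week)
print(f"Forecast: {prediction:.0f}")
for name, contribution in top:
    print(f"  because of {name}: {contribution:+.0f}")
```

For non-linear models the contributions are harder to compute, which is exactly the part a package like ELI5 takes off your hands.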

Overview — ELI5 0.11.0 documentation

ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions. It provides support for the following machine learning frameworks and packages:

eli5.readthedocs.io

(3) 🐰 tsfresh for easy time series forecasting

I believe for 90% of machine learning use cases good enough is good enough. That means out-of-the-box solutions, with a bit of tweaking by a few skilled machine learners, are all that’s needed.

So I enjoy every single out-of-the-box solution that hits the open-source market. Tsfresh makes machine learning on time series much easier. Why is it “hard” in the first place? For the trained machine learning engineer, it actually isn’t really hard. But in my opinion, it is cumbersome, because time series simply don’t come in the typical pieces & batches. You have to cut them yourself, e.g. by calculating rolling time windows. In addition, feature calculation isn’t as straightforward as it is with other types of data sets.

That’s what tsfresh tackles: it takes care of cutting your batches, and it calculates a big standard set of features. In essence, you can get a first classifier to train with 3–5 lines of code instead of the 50 it takes without it.
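To show the cumbersome part, here is a minimal sketch of doing it by hand: cutting a series into rolling windows and computing a handful of features per window. This is plain Python for illustration, not tsfresh code. The sales numbers and the five features are my own picks, and tsfresh’s standard feature set is far larger.

```python
from statistics import mean, stdev


def rolling_windows(series, window_size):
    """Cut a series into overlapping fixed-length windows, moving one step at a time."""
    return [series[i:i + window_size] for i in range(len(series) - window_size + 1)]


def window_features(window):
    """Compute a few simple summary features for one window."""
    return {
        "mean": mean(window),
        "std": stdev(window),
        "min": min(window),
        "max": max(window),
        "last_minus_first": window[-1] - window[0],
    }


# Made-up daily sales figures, used only for illustration.
daily_sales = [10, 12, 11, 15, 14, 18, 17]

# One feature row per window; these rows are what you would feed to a classifier.
feature_rows = [window_features(w) for w in rolling_windows(daily_sales, window_size=3)]

for row in feature_rows:
    print(row)
```

With tsfresh, the feature step is handled by its `extract_features` function, which works on a pandas DataFrame; check the documentation for the exact parameters your data layout needs.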

tsfresh — tsfresh documentation

tsfresh automatically calculates a large number of time series characteristics, the so called features. Further the package contains methods to evaluate the explaining power and importance of such characteristics for regression or classification tasks.

tsfresh.readthedocs.io

🎄 In Other News & Thanks

Thanks for reading this far! I’d also love it if you shared this newsletter with people who you think might be interested in it.

P.S.: I share things that matter, not the most recent ones. I share books, research papers, and tools. I try to provide a simple way of understanding all these things. I tend to be opinionated. You can always hit the unsubscribe button!

By Sven Balnojan

Data; Business Intelligence; Machine Learning, Artificial Intelligence; Everything about what powers our future.

Three Data Point Thursday
