Three Data Point Thursday

💥 Neural IMA, Explainable AI, Time Series Forecasting; ThDPTh #14 💥

Sven Balnojan
Apr 8, 2021

Lots of Machine Learning Libraries to assess image quality, produce explanations for your models, or forecast & classify time series.

Data will power every piece of our existence in the near future. I collect “Data Points” to help understand & shape this future.

If you want to support this, please share it on Twitter, LinkedIn, or Facebook.

(1) 🚀 Image Quality Assessment Implementation

The German price comparison website idealo.de provides an implementation of some interesting applied Google research from 2018 called “NIMA: Neural Image Assessment”. Based on the paper, the idealo team open-sourced two neural networks. The first rates the aesthetic quality of an image, while the second estimates its technical quality.

So basically, these two networks help you determine how pretty an image is or whether its quality sucks. I tried the code myself and found it easy to use. The typical use cases I can think of deal with images you don’t have good control over, like photos of homes, articles, etc. that get uploaded to catalogs on your platform. If you happen to work on any of these use cases, pay the GitHub repository a visit.
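For context, the NIMA paper has each network predict a distribution over the ratings 1 to 10 instead of a single number, and summarises that distribution by its mean and spread. Here is a minimal sketch of that last step, assuming a made-up distribution. The names `mean_and_std` and `example_distribution` are hypothetical and not taken from the idealo repository.

```python
from math import sqrt


def mean_and_std(distribution):
    """Mean and standard deviation of a rating distribution over scores 1..N."""
    total = sum(distribution)
    if total <= 0:
        raise ValueError("distribution must have positive mass")
    probs = [p / total for p in distribution]  # normalise defensively
    scores = range(1, len(probs) + 1)
    mean = sum(s * p for s, p in zip(scores, probs))
    variance = sum(p * (s - mean) ** 2 for s, p in zip(scores, probs))
    return mean, sqrt(variance)


# Hypothetical model output: one probability per score from 1 to 10.
example_distribution = [0.0, 0.0, 0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05, 0.0]
mean_score, spread = mean_and_std(example_distribution)
print(f"mean score: {mean_score:.2f}, spread: {spread:.2f}")
```

A higher mean reads as “prettier” (or technically better), and a wide spread hints that the rating is less clear-cut.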

GitHub - idealo/image-quality-assessment

Convolutional Neural Networks to predict the aesthetic and technical quality of images. - idealo/image-quality-assessment

github.com

(2) 📣 ELI5 for Explainable AI

About a year ago, Netflix launched a nice feature that really makes their recommendations appealing: above each one you now see something like “Because you watched Marvel’s The Avengers”. In short, you get an explanation of why you are seeing this recommendation, and I love it! These explanations make the recommendations much more appealing to me.

So I found the Python package ELI5 just as appealing when I stumbled over it. Basically, ELI5 allows you to get some kind of explanation for the predictions that common frameworks like Keras, XGBoost, etc. produce. Of course, these explanations are nowhere near perfect, since they often result from feature importances and weights. Still, I believe having any kind of “explanation” is much better than having none.

A typical example could be a sales forecast you provide. I believe forecasts like this get used much more if we help people understand how they are produced. Telling them that “last week’s sales + current interest rates” are the most important determinants of next week’s sales goes a long way.

Another example would be any kind of customer classification you run to select targets for certain marketing campaigns. In all cases, displaying the 3–4 most important features that selected one particular customer goes a long way in earning the trust of the sales/marketing person on the other side, just as Netflix’s “Because you watched …” phrases do for me.
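To make the “3–4 most important features” idea concrete, here is a minimal sketch for the simplest possible case, a linear model, where each feature’s contribution is just its weight times its value. It does not use ELI5’s API. The weights, the feature names, and the helper `top_contributions` are all hypothetical, chosen for illustration only.

```python
def top_contributions(weights, features, k=3):
    """Return the k features with the largest absolute contribution (weight * value)."""
    contributions = {name: weights[name] * value for name, value in features.items()}
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# Hypothetical weights of a linear campaign-targeting model.
weights = {
    "orders_last_90_days": 0.8,
    "newsletter_opens": 0.3,
    "days_since_last_visit": -0.05,
    "support_tickets": -0.4,
}

# Hypothetical feature values for one particular customer.
customer = {
    "orders_last_90_days": 4,
    "newsletter_opens": 10,
    "days_since_last_visit": 12,
    "support_tickets": 1,
}

for name, contribution in top_contributions(weights, customer):
    print(f"{name}: {contribution:+.2f}")
```

A short ranked list like this is something you could show next to each selected customer. For non-linear models the contributions have to be estimated differently, which is where a package like ELI5 comes in.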

Overview — ELI5 0.11.0 documentation

ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions. It provides support for the following machine learning frameworks and packages:

eli5.readthedocs.io

(3) 🐰 tsfresh for easy time series forecasting

I believe that for 90% of machine learning use cases, good enough is good enough. That means out-of-the-box solutions, with a bit of tweaking by a few skilled machine learning engineers, are all that’s needed.

So I enjoy every single out-of-the-box solution that hits the open-source market. Tsfresh makes machine learning on time series much easier. Why is it “hard” in the first place? For the trained machine learning engineer, it actually isn’t really hard. But in my opinion, it is cumbersome, because time series simply don’t come in the typical pieces & batches. You have to cut them yourself, e.g. by calculating rolling time windows. In addition, feature calculation isn’t as straightforward as it is with other types of data sets.

That’s what tsfresh tackles: it takes care of cutting your batches, and it calculates a big standard set of features for them. In essence, you can train a first classifier with 3–5 lines of code instead of the 50 it takes without it.
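To give a feeling for the manual work described above, here is a minimal sketch of cutting rolling windows and computing a handful of features by hand, using only the standard library. This is not tsfresh’s API. The helpers `rolling_windows` and `window_features` and the sales numbers are hypothetical, and a real feature set would be far larger.

```python
from statistics import mean, pstdev


def rolling_windows(series, size):
    """Yield every consecutive window of `size` values from `series`."""
    for start in range(len(series) - size + 1):
        yield series[start:start + size]


def window_features(window):
    """A few hand-picked summary features for one window."""
    return {
        "mean": mean(window),
        "std": pstdev(window),
        "min": min(window),
        "max": max(window),
        "last_minus_first": window[-1] - window[0],
    }


# Hypothetical daily sales figures.
sales = [10, 12, 11, 15, 14, 18, 17]

# One feature row per rolling window of three days.
feature_rows = [window_features(w) for w in rolling_windows(sales, size=3)]
for row in feature_rows:
    print(row)
```

Each row could then be fed to any standard classifier or regressor. Multiply the five features here by the many more you would want in practice, and the appeal of having a library do this for you becomes clear.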

tsfresh — tsfresh documentation

tsfresh automatically calculates a large number of time series characteristics, the so called features. Further the package contains methods to evaluate the explaining power and importance of such characteristics for regression or classification tasks.

tsfresh.readthedocs.io

🎄 In Other News & Thanks

Thanks for reading this far! I’d also love it if you shared this newsletter with people who you think might be interested in it.

P.S.: I share things that matter, not the most recent ones. I share books, research papers, and tools. I try to provide a simple way of understanding all these things. I tend to be opinionated. You can always hit the unsubscribe button!

By Sven Balnojan

Data; Business Intelligence; Machine Learning, Artificial Intelligence; Everything about what powers our future.
