PART 1: Building a Food Image Classifier
USING TABLEAU PREP AND EINSTEIN PLATFORM SERVICES
Connecting To Unstructured Data with Tableau Prep
Chances are you have mainly been connecting Tableau Prep to structured data so far. By structured data we mean “classic” rectangular tables, or “normal data” as you might like to call it. But what if we want to analyse unstructured data like images, video or even sound using Tableau products? Is that even possible?
Isn’t that just one step beyond what Tableau Prep can do? And if it is possible, how would that work? Stay with us: in this series we will investigate the potential of combining Tableau Prep and Python with the Einstein Platform Services.
The Einstein logo as used by Salesforce.
Computer Vision: Giving Tableau Prep a neat set of eagle eyes.
Einstein what? The Einstein product line and brand are becoming increasingly present in the Salesforce ecosystem. The name covers the smart AI features across the Salesforce products that try to make our lives easier, a nod to the intellect of the world-famous scientist Albert Einstein. What is less well known is that these AI features are also available as standalone API services, ready to be implemented in your apps, software and... Tableau Prep.
Using these services, one can deploy neat endpoints that perform Computer Vision and NLP tasks.
In this example we focus on how to harness the power of image recognition to solve an array of use-cases using pre-trained classifiers or custom models.
Einstein Object Detection
Einstein Image Classification
Food Classifier <--
More specifically we’ll have a look at this pre-trained food classifier which is capable of identifying up to 500 different sorts of food.
An internal Biztory use-case: A bunch of food lovers united
Like most people in this world, our consultants at Biztory LOVE (but do not necessarily share) food. So none of us hesitates to show off our dishes in our beloved #biztory-food channel on Slack. However, due to our very diverse backgrounds, it may not always be straightforward to guess what our colleagues are eating. And at this point... you might be too afraid to ask?
Congratulations, human: you have correctly guessed “Square Pizza’s”.
So, wouldn’t it be neat if we had our own tidy Tableau dashboard that visualizes the foods our colleagues are eating this week, with Einstein helping us predict what each dish actually is?
Computer says: Let’s give it a try!
Let’s build that classifier
Before you start, you’ll need an Einstein Platform Services account. Einstein Platform Services has a free tier that allows you to run hundreds of queries per month, so it is ideal to get going and play around. You can get it at: www.einstein.ai
Before continuing, make sure that:
You’ve signed up to the Einstein Platform Services and created an access token
You’ve set up Tableau Prep to connect to Python
Note that we are going to make an “image classifier”, which means it classifies each image as a whole. It is important at this stage to know the difference with “object detection”, which not only classifies the objects in an image but also tells you where in the image each object is found. In this use case we are not generating “bounding boxes”; we merely let the computer guess which foods it sees in the image as a whole. In that sense we are performing a classification without any localisation.
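To make the difference concrete, here is a small Python sketch of what each kind of result carries. The class names, field names and numbers are made up for illustration and are not the API's own response format: a classification is just a label with a probability, while a detection adds a bounding box.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    """One guess about the image as a whole."""
    label: str
    probability: float


@dataclass
class BoundingBox:
    """Pixel coordinates of a rectangle inside the image."""
    min_x: int
    min_y: int
    max_x: int
    max_y: int


@dataclass
class Detection:
    """One guess about one object, including where it sits in the image."""
    label: str
    probability: float
    box: BoundingBox


# Illustrative values only (not real API output).
whole_image = [Classification("churro", 0.99)]
detected = [Detection("churro", 0.97, BoundingBox(40, 60, 310, 420))]
```

In this post we only ever work with the first shape: label plus probability, no box.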
STEP 1: COLLECT IMAGES WITH A GOOGLE FORM
We create a Google Form for people to upload their food images to. This stores the uploaded images in the form creator’s Google Drive and adds a link to each file in the response sheet. In practice this can be any file system, as long as it is possible to generate a link that points directly to the actual image. As we will see, the default URL Google generates needs a bit of tweaking before the API endpoint can deal with it.
Google Form with file upload possibility
Each uploaded image gets its own URL
Note that all these images are stored in the same Google Drive location as the Google Form we created. So yes, you are right: we are not directly connecting Tableau Prep to actual images. But hey, this allows us to use the neat Google Drive connector to “connect to images” without too much coding, because in this specific case our endpoint only requires the URL of the image.
STEP 2: CONNECT TO THE GOOGLE FORM RESPONSE SHEET FROM TABLEAU PREP
In Tableau Prep, select Google Drive as the data connection option and locate the response sheet. When pulled in, it should look something like this:
Loading the Google drive data into Tableau Prep
STEP 3: ADJUST THE GOOGLE DRIVE IMAGE LINK
The image URL is not in the correct format to be ingested by the API endpoint. This can easily be adjusted with a calculated field, in which we squeeze “uc?export=download” into the URL. This field has been named ‘source’.
the id of the image is concatenated with the base-url
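We did this with a calculated field in Tableau Prep, but the same tweak can be sketched in Python to show what happens to the link. This is a sketch under assumptions: it assumes the response sheet stores links of the form https://drive.google.com/open?id=<file id>, and the function name to_direct_link is ours.

```python
from urllib.parse import parse_qs, urlparse

# Base URL with the "uc?export=download" part squeezed in.
DRIVE_DOWNLOAD_BASE = "https://drive.google.com/uc?export=download&id="


def to_direct_link(form_url):
    """Concatenate the file id from a Google Form upload link with the base URL."""
    query = parse_qs(urlparse(form_url).query)
    file_ids = query.get("id")
    if not file_ids:
        raise ValueError(f"No 'id' parameter found in: {form_url}")
    return DRIVE_DOWNLOAD_BASE + file_ids[0]


# EXAMPLE_FILE_ID is a placeholder, not a real file.
print(to_direct_link("https://drive.google.com/open?id=EXAMPLE_FILE_ID"))
# prints https://drive.google.com/uc?export=download&id=EXAMPLE_FILE_ID
```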
STEP 4: ADD A PYTHON SCRIPT TO THE PREP FLOW
This script can be broken down into 4 sections:
Import the required packages: you’ll need ‘pandas’ to treat your data as a dataframe and ‘requests’ to query the Einstein endpoints.
Define the output schema. This is always needed when using python with Tableau Prep. It tells Tableau Prep what we expect back from the API endpoint. In this case, we want a label (the computer’s prediction of what the food is), a probability (how sure the computer is that this is correct) and the source (the image URL).
Note: we will get 5 predictions (label + probability) back for each image
Query the Einstein endpoint. FoodImageClassifier is an out of the box, pretrained algorithm in the Einstein offering. Nothing needs to be adjusted – it is ready to use. The return line says that if there is no label for the image then return ERROR with a probability of 1.
Output as a tableau data frame – this is what will be sent back to tableau prep.
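The four sections above could be sketched roughly as follows. Treat this as a minimal sketch rather than the exact script: the endpoint URL, the form field names sampleLocation and modelId, the response key probabilities, the placeholder token and the function names parse_predictions, classify_image and classify_food are assumptions on our side, so check them against the Einstein Vision documentation before relying on them. prep_string() and prep_decimal() are supplied by Tableau Prep when it runs the script.

```python
import pandas as pd
import requests

# Assumed values: replace the token with your own and verify the endpoint URL.
EINSTEIN_PREDICT_URL = "https://api.einstein.ai/v2/vision/predict"
ACCESS_TOKEN = "<YOUR_ACCESS_TOKEN>"
MODEL_ID = "FoodImageClassifier"


def get_output_schema():
    """Tell Tableau Prep which columns come back from the script."""
    # prep_string() / prep_decimal() only exist inside Tableau Prep's runtime.
    return pd.DataFrame({
        "source": prep_string(),
        "label": prep_string(),
        "probability": prep_decimal(),
    })


def parse_predictions(source, payload):
    """Turn one decoded API response into rows of source, label, probability."""
    predictions = payload.get("probabilities", [])
    if not predictions:
        # No label for this image: return ERROR with a probability of 1.
        return [{"source": source, "label": "ERROR", "probability": 1.0}]
    return [
        {"source": source, "label": p["label"], "probability": float(p["probability"])}
        for p in predictions
    ]


def classify_image(source):
    """POST one image URL to the endpoint and return its prediction rows."""
    response = requests.post(
        EINSTEIN_PREDICT_URL,
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        # (None, value) sends a plain multipart form field without a file name.
        files={"sampleLocation": (None, source), "modelId": (None, MODEL_ID)},
        timeout=30,
    )
    try:
        payload = response.json()
    except ValueError:
        payload = {}  # body was not JSON; treated as "no label" below
    return parse_predictions(source, payload)


def classify_food(df):
    """Function named in the Prep script step; receives and returns a DataFrame."""
    rows = []
    for source in df["source"]:
        rows.extend(classify_image(source))
    return pd.DataFrame(rows, columns=["source", "label", "probability"])
```

Keeping the parsing in its own function means the ERROR fallback can be checked without calling the API at all.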
STEP 5: CHECK THE OUTPUT OF THE PYTHON SCRIPT
Selecting one source image here will show the 5 predictions below.
The computer does not make just one food classification guess per image. It makes plenty of guesses, each with a probability between 0 and 1. The word “between” is key here: a genuine prediction should never have a probability of exactly 1 (in our script that value is reserved for the ERROR rows). The API returns only the top 5 of that list of labels and probabilities. What you will see is that probabilities above 90% are not always required for accurate classifications. Probabilities as low as 20% or 30% can already give you the right classification. But of course a lot of factors are in play here, and as a human you want to end up with only one choice. Which cut-off works best for you depends on a range of factors and on the use case you have in mind for the classifier.
As a final step in your flow, write to a published data source on Tableau Server in order to start vizzing.
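As a sketch of how such a cut-off could be applied, the snippet below keeps one label per image: the one with the highest probability, provided it reaches the cut-off. The function name best_guesses, the default cut-off of 0.2 and the sample rows are illustrative assumptions, not values from our flow.

```python
def best_guesses(rows, cutoff=0.2):
    """Return {source: label} with the top label per image, if it reaches the cut-off."""
    best = {}
    for row in rows:
        if row["label"] == "ERROR":
            continue  # skip the rows our script returns when no label came back
        current = best.get(row["source"])
        if current is None or row["probability"] > current["probability"]:
            best[row["source"]] = row
    return {
        source: row["label"]
        for source, row in best.items()
        if row["probability"] >= cutoff
    }


# Made-up sample rows in the shape the Prep script outputs.
sample = [
    {"source": "img_a", "label": "pizza", "probability": 0.31},
    {"source": "img_a", "label": "focaccia", "probability": 0.22},
    {"source": "img_b", "label": "ramen", "probability": 0.08},
    {"source": "img_c", "label": "ERROR", "probability": 1.0},
]

print(best_guesses(sample))               # {'img_a': 'pizza'}
print(best_guesses(sample, cutoff=0.05))  # {'img_a': 'pizza', 'img_b': 'ramen'}
```

Lowering the cut-off gives you a guess for more images at the price of more wrong guesses, which is exactly the trade-off to tune for your own use case.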
STEP 6: VISUALIZING THE PREP OUTPUT
Let’s be honest, the output in itself isn’t the sexiest data you’ve ever witnessed. So let’s visualize it in Tableau Desktop! By combining a dynamic parameter holding the source URLs with a web object, you can select and show the source images on Google Drive. We then created a bar chart in the bottom right of our dashboard that shows the five predicted labels together with their probabilities.
Hey look, it’s Astro checking your classifier results.
Right below Astro we created a sheet that shows Einstein’s best guess by always returning the label with the highest probability. What we learn here is that a CHURRO must have very discriminating “features”, as the Einstein API gives the label “CHURRO” a probability of no less than 99%! What is also striking is that the 2nd guess is “YOU TIAO”.
“YOU TIAO” or according to google: “Chinese Donuts”
Let’s be honest, I needed to google that one too, but you’ll surely agree that it isn’t the worst second bet either, given the visual characteristics these “Chinese Donuts” share with a “Churro”.
Yes, Tableau Prep can be used to perform image classification tasks when combined with the Einstein Platform Services. With minimal knowledge of Python and the ‘requests’ package, one can quickly get set up and start playing around with the endless possibilities the Computer Vision API offers.
With the convergence of Salesforce and Tableau products, exciting times are ahead of us: we could foresee a future where these kinds of services can be called natively from within Tableau Prep, without the need for a Python script that performs a POST request for us.
Also, it’s important to point out that we have only scratched the surface of what is possible here, as we started from an out-of-the-box pre-trained model. The Einstein APIs also allow you to programmatically train, test and deploy your own custom image classifiers. Because which company on earth is in actual need of a food classifier, right? I could think of a few though!
I could imagine that a lot of companies in quality control, security or even marketing need to analyse huge piles of images and are not sure whether they should buy or build a solution tailored to their specific use case. In my opinion, services like those provided by the Einstein Platform sit in the sweet spot of this make-or-buy dichotomy: they allow you to jump-start data-science-driven applications with enough versatility to adjust them to your own business-specific needs.
But, most importantly:
Now you know how to find out what your colleagues are eating during your next virtual meeting! :)
Technical Consultant @ Biztory
Stay with us in PART 2 where we will dive into how we can make use of Tableau Prep + Einstein Vision API to read text from images and store them in a database! Exciting!