Back in 2013, Tableau 8.1 was released with the ability to connect to R, a popular statistics software; Python integration followed later in version 10.1. This blog series aims to demonstrate just a fraction of the capabilities of R integration. We will kick off this journey here with steps on how to get the connection up and running, and some more information on how to get started.
R is a free, open-source language for statistical analysis. There are many libraries, packages and even saved models available in R. It is possible to utilise those in Tableau by using a calculated field to call the R engine through the Rserve package (a server allowing external programs to use R). Tableau passes values to R as an array; once R has calculated its results, they are returned to Tableau to be used in visualisations.
It is recommended that you are already somewhat proficient in coding to utilise R in Tableau. It is definitely beneficial to have some proficiency in R programming to take advantage of R’s more complex capabilities. I am currently doing a series of courses on DataCamp on R and I would recommend it if you like to learn by practicing.
So now that all the preamble is out of the way, let’s get started with the set up. First things first, you’ll need to have R on your computer. It can be downloaded here, and you can choose which version and installer of R to download (depending on your operating system).
Now that your R download is complete, a fun thing to know about R is that each version has a release name. The person who chooses the names is a Peanuts (the comics) fan, so many of the names are references to the comics or films. The current stable version is Action of the Toes; some of my favourites are: Warm Puppy, Sock it to Me, Very Secure Dishes, and Sincere Pumpkin Patch (the very first version of R I downloaded).
Getting back on track, the next step is to install Rserve. We need to open the R console, which will look something like the image below.
In the console we need to install the Rserve package by typing install.packages("Rserve") and then hitting enter. You may be prompted to select a CRAN mirror; R advises us to select the mirror that is closest to our location to minimise the load. It will look like this:
Once the package is installed, we have to load and run it before we can use it. This is done with the following line of code: library(Rserve); Rserve()
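If you would rather keep the install and start steps together in a script, here is a minimal sketch of how that could look. The helper name start_rserve is my own invention, the CRAN mirror URL is just one option, and the "--no-save" argument is an assumption that suits most setups rather than a requirement.

```r
# Hypothetical helper: installs Rserve only if it is missing, then starts it.
# Assumptions: the machine can reach the CRAN mirror below, and the default
# Rserve port (6311) is free.
start_rserve <- function() {
  if (!requireNamespace("Rserve", quietly = TRUE)) {
    install.packages("Rserve", repos = "https://cloud.r-project.org")
  }
  # "--no-save" tells the R process behind the server not to save a workspace.
  Rserve::Rserve(args = "--no-save")
}

# Uncomment to actually start the server:
# start_rserve()
```

The call is left commented out so that sourcing the script does not start a server by accident.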
Now that the R server is up and running, let's hop into Tableau Desktop and define the connection to integrate R. Once in Desktop, go to the Help menu and, under Settings and Performance, locate the "Manage External Service Connection..." option (shown below).
This will open the External Service Connection dialogue box. Change the external service drop-down to Rserve, then specify the server as localhost and the port as 6311 (as in the gif below).
You can check whether Tableau can connect to Rserve by hitting the Test Connection button. If you get a ‘Success’ message then you’re all set up and ready to start using R in Tableau!
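If the test fails, it can help to check from the R side whether anything is listening on the port at all. The sketch below uses only base R; the function name rserve_listening is hypothetical, and it assumes the same localhost and 6311 settings as above. It can only tell you that something accepts connections on that port, not that it is definitely Rserve.

```r
# Hypothetical helper: TRUE if something accepts TCP connections on host:port,
# FALSE otherwise. It does not verify that the listener is actually Rserve.
rserve_listening <- function(host = "localhost", port = 6311, timeout = 2) {
  suppressWarnings(tryCatch({
    con <- socketConnection(host = host, port = port, open = "r+b",
                            blocking = TRUE, timeout = timeout)
    close(con)
    TRUE
  }, error = function(e) FALSE))
}

print(rserve_listening())
```

Run this from a second R session, since the session that started Rserve may be busy on some platforms.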
In this blog I have demonstrated the scripting in RGui (the R graphical user interface), which is not the friendliest UI for writing or drafting scripts that are longer than a couple of lines. For scripting R code, RStudio is a great IDE (integrated development environment) to use. In RStudio you can write multiple scripts, get help with debugging and even write reports. It can be downloaded from their website. Note that on Mac OS X, some R packages also require XQuartz to be downloaded.
As mentioned above, R is integrated into Tableau through calculated fields. There are four different functions that you can use to call R:
- SCRIPT_REAL: the output is a real number
- SCRIPT_INT: the output is an integer
- SCRIPT_STR: returns a string output
- SCRIPT_BOOL: returns a Boolean result
The calculation is then formed of your choice of function (one of the four listed above), your R script, and the arguments to go in the script.
The example calculated field shows the Tableau instructions and an example of how to format the script. The R code is put inside " ", and the fields/arguments in the script are replaced by .arg#, where the # is replaced with consecutive integers starting from 1, for however many arguments you intend to have (the example shows two). To tell Tableau which fields are the inputs for the R script, you just list them after the code, in the same order as the .arg# placeholders. Note that the fields have to be aggregated.
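To make the .arg# mapping more concrete, here is a rough sketch of what happens on the R side. The calculated field in the comment is a hypothetical one (the field names Profit and Sales and the values are made up for illustration), and the eval(parse()) call is only a stand-in for what Rserve does with the script string.

```r
# Hypothetical Tableau calculated field (field names are illustrative):
#   SCRIPT_REAL(".arg1 / .arg2", SUM([Profit]), SUM([Sales]))

# Stand-ins for the aggregated values Tableau would send, one element per row
# of the view.
.arg1 <- c(20, 45, 12)    # SUM([Profit])
.arg2 <- c(100, 150, 60)  # SUM([Sales])

# The text between the quotes is evaluated as R code against those vectors.
script <- ".arg1 / .arg2"
profit_ratio <- eval(parse(text = script))

print(profit_ratio)
```

Because R works on whole vectors, the division produces one ratio for each pair of inputs, which is the shape Tableau expects back.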
Another consideration is the output from the calculation. When you start using more complex or interesting functions in R, the output tends to be a data frame, but a calculation in Tableau can only have one result for each row of data. Since SCRIPT_ functions are table calculations, each row of data in your view can only receive a single value from the result. In the above script, the output will give one result per pair of arguments. Other scripts will require the desired result to be extracted from the data frame, or concatenated, so that Tableau can process the output.
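As an illustration of extracting a usable result, here is a small sketch using a linear model. The input values are made up, and lm is just one example of a function whose output is a richer object than Tableau can display directly.

```r
# Stand-ins for two aggregated fields passed from Tableau (values are made up).
.arg1 <- c(2.1, 3.9, 6.2, 7.8, 10.1)
.arg2 <- c(1, 2, 3, 4, 5)

# lm() returns a model object, which Tableau cannot use as it is.
fit <- lm(.arg1 ~ .arg2)

# Option 1: pull out a plain numeric vector with one value per input row.
fitted_values <- as.numeric(fitted(fit))

# Option 2: pull out a single number, such as the slope of the fitted line.
slope <- unname(coef(fit)[2])

print(length(fitted_values))
print(length(slope))
```

In a calculated field, whichever expression comes last in the script string is what gets returned, so the script would end with the extracted vector rather than the model itself.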
The example above is easily doable in Tableau of course, but there are many more complex possibilities that R is capable of that we can utilise in Tableau. We have more blogs about R and Tableau here.