Back in 2013, Tableau 8.1 was released with the ability to connect to R, a popular statistics language, and later, in version 10.1, Python integration became available too. This blog series will demonstrate just a fraction of the capabilities of the R integration. We will kick off the journey here with the steps to get the connection up and running, plus some more information on how to get started.
R is a free, open-source language for statistical analysis. There are many libraries, packages and even saved models available in R. It is possible to utilise those in Tableau by using a calculated field to call the R engine through the Rserve package (a server that allows external programs to use R). Values are passed from Tableau to R as arrays; once R has calculated its results, they are returned to Tableau to be used in visualisations.
It is recommended that you are already somewhat proficient in coding before using R in Tableau, and some proficiency in R programming is definitely beneficial if you want to take advantage of R's more complex capabilities. I have taken online courses on DataCamp and Coursera, and I previously wrote a blog with my thoughts on Coursera here. I am currently doing a series of courses on R on DataCamp, and I would recommend it if you like to learn by practising.
So now that all the preamble is out of the way, let's get started with the setup. First things first, you'll need to have R on your computer. It can be downloaded here, where you can choose which version and installer of R to download (depending on your operating system).
While your R download completes, here is a fun thing about R: each version has a release name. The person who chooses the names is a fan of Peanuts (the comics), so many of the names are references to the comics or films. The current one (at time of writing) is Eggshell Igloo; some of my favourites are Warm Puppy, Sock it to Me, Very Secure Dishes, and Sincere Pumpkin Patch (the very first version of R I downloaded).
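If you are curious which release you ended up with, R can tell you itself. The snippet below is a small sketch that only uses the built-in `R.version` list, so it should work in any R console; the variable name `release_name` is just for illustration.

```r
# Print the full version string of the running R installation.
print(R.version.string)

# The release nickname is stored in the built-in R.version list.
release_name <- R.version$nickname
print(release_name)
```

It is also a quick way to confirm that the installation worked before moving on.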
Getting back on track, the next step is to install Rserve. We need to open the R console, which will look something like the image below.
In the console we need to install the Rserve package by typing install.packages("Rserve") and then hitting Enter. You may be prompted to select a CRAN mirror; R advises selecting the mirror closest to your location to minimise the load. It will look like this:
Once the package is installed, we have to load and start it before we can use it. This is done by running library(Rserve) followed by Rserve() in the console.
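The install and start-up steps can also be combined into a short script that is safe to re-run. This is a minimal sketch rather than the only way to do it: it assumes a CRAN mirror is reachable, and the `--no-save` argument is an optional choice that asks the spawned R process not to save its workspace on exit.

```r
# Install Rserve only if it is not already available
# (assumes a CRAN mirror is reachable from this machine).
if (!requireNamespace("Rserve", quietly = TRUE)) {
  install.packages("Rserve")
}

# Load the package and start the server.
# With default settings Rserve listens on port 6311.
library(Rserve)
Rserve(args = "--no-save")
```

Note that the server has to be started again each time you restart your machine or stop the Rserve process.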
Now that the R server is up and running, let's hop into Tableau Desktop and define the connection to integrate R. Once in Desktop, go to the Help menu and locate the "Manage External Service Connection..." option under Settings and Performance (shown below).
This will open the External Service Connection dialogue box. Change the external service drop-down to Rserve, and specify the server as localhost and the port as 6311 (as in the image below).
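If Tableau reports that it cannot connect, it can help to check from R whether anything is actually listening on that port. The helper below is a hypothetical sketch (the name `rserve_is_listening` is my own, not part of Rserve or Tableau). It relies on the fact that Rserve, in its default configuration, greets new clients with an ID string beginning with "Rsrv"; run it from a second R session while the server is up.

```r
# Hypothetical helper: returns TRUE if something that identifies itself as
# Rserve answers on the given host and port, FALSE otherwise.
rserve_is_listening <- function(host = "localhost", port = 6311) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+b",
                       blocking = TRUE, timeout = 2)
    ),
    error = function(e) NULL
  )
  if (is.null(con)) {
    return(FALSE)  # nothing accepted the connection
  }
  on.exit(close(con))

  # Read the first four bytes of the greeting and compare them with "Rsrv".
  greeting <- tryCatch(
    suppressWarnings(readBin(con, what = "raw", n = 4)),
    error = function(e) raw(0)
  )
  identical(greeting, charToRaw("Rsrv"))
}

ok <- rserve_is_listening()
print(ok)
```

A result of FALSE usually means the server was never started, has been stopped, or is running on a different port from the one entered in Tableau.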
Once that is all set up, you are ready to start using R in Tableau!
In this blog I have demonstrated the scripting in RGui (the R graphical user interface), which is not the friendliest UI for writing or drafting scripts longer than a couple of lines. For scripting R code, RStudio is a great IDE (integrated development environment) to use. In RStudio you can write multiple scripts, get help with debugging, and even write reports. It can be downloaded from their website. Also note that on Mac OS X some R packages require XQuartz to be downloaded as well.
As mentioned above, R is integrated through calculated fields in Tableau. There are four different functions that you can use to call R: SCRIPT_BOOL, SCRIPT_INT, SCRIPT_REAL and SCRIPT_STR. They differ only in the type of result they return: a Boolean, an integer, a real number or a string respectively.
The calculation will then be formed of your choice of script (one of the four listed above), your R script, and the arguments to go in the script.
The example calculated field shows the Tableau instructions and an example of how to format the script. The R code is put inside quotation marks (" "), and the fields or arguments in the script are replaced by .arg#, where the # is replaced with consecutive integers starting from 1, for however many arguments you intend to have (the example shows two). To tell Tableau which fields are the inputs for the R script, you list them after the code, in the same order as the .arg# placeholders. Note that the fields have to be aggregated.
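To make the .arg# mechanics concrete, here is a small illustration that can be run in the R console without Tableau. It is a sketch under assumptions: the field names Profit and Sales and the numbers are made up, and the calculated field is shown only as a comment because it is Tableau syntax rather than R.

```r
# Tableau side, shown as a comment because it is not R.
# The field names are hypothetical:
#   SCRIPT_REAL(".arg1 / .arg2", SUM([Profit]), SUM([Sales]))

# Stand-ins for the aggregated values Tableau would send,
# one element per row in the view.
.arg1 <- c(120, 45, 300)     # plays the role of SUM([Profit])
.arg2 <- c(1000, 900, 1500)  # plays the role of SUM([Sales])

# The script body from the calculated field, evaluated as plain R.
profit_ratio <- .arg1 / .arg2
print(profit_ratio)
```

Testing the script body this way, with hand-made vectors in place of .arg1 and .arg2, is a handy way to debug it before pasting it into a calculated field.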
Another consideration is the output from the calculation. When you start using more complex or interesting functions in R, the output tends to be a data frame; however, a calculation in Tableau can only have one result for each row of data. Because SCRIPT_ functions are table calculations, the script has to return either a single value or a vector with one element per row in the partition. In the above script, the output gives one result per pair of arguments. Other scripts will need the desired result to be extracted from the data frame, or concatenated, so that Tableau can process the output.
The example above is easily doable in Tableau alone, of course, but R is capable of much more complex things that we can utilise in Tableau. We will explore some of these capabilities in more detail in the following weeks.
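As a rough illustration of extracting or concatenating a result, the sketch below fits a simple linear model, which produces a multi-column output, and then reduces it to one value per row. The data and variable names are invented for the example; in a calculated field, the last expression of the script is what gets returned to Tableau.

```r
# Stand-ins for two aggregated Tableau fields (made-up numbers).
.arg1 <- c(10, 12, 15, 19, 24)  # the measure to model
.arg2 <- c(1, 2, 3, 4, 5)       # the predictor

input <- data.frame(y = .arg1, x = .arg2)
fit <- lm(y ~ x, data = input)

# predict() with a confidence interval returns three columns
# (fit, lwr, upr), which is too much for one calculated field.
pred <- as.data.frame(predict(fit, interval = "confidence"))

# Option 1: extract a single numeric column (suits SCRIPT_REAL).
fitted_only <- pred$fit

# Option 2: concatenate columns into one string per row (suits SCRIPT_STR).
interval_text <- paste(round(pred$lwr, 1), round(pred$upr, 1), sep = " to ")

print(fitted_only)
print(interval_text)
```

Either way, the vector handed back has the same length as the inputs, which is what lets Tableau line the results up with the rows in the view.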
If you would like to find out more or want bespoke training on using R in Tableau please contact us.