If you are a runner/cyclist and enjoy working with data (like myself), you might enjoy the insights that Runkeeper, Strava and other similar sports tracking websites offer. Personally, I find that while all of them provide a few interesting dashboards, none of them really cover the whole gamut of insights. Some focus on deeper analytics on a specific run or ride, while others are more suitable for tracking your growth and evolution. Now the goal isn't to plug sports tracking websites, but I'd suggest you check out Smashrun for the broadest range of insights.
Anyway, this post isn't about the pre-built charts and visualizations that these websites offer, but about the fact that you can download the data behind them and... That's right, connect to that data in Tableau to build your own stuff!
- If you use Runkeeper, go to your Account Settings and select "Export Data" on the bottom left; or log in and use the following link: https://runkeeper.com/exportDataForm. Then select a range of dates, and export the data. The page will auto-refresh and you'll obtain a download link.
- If you use Strava, there is a similar option under Settings, on the bottom right ("Download Your Data"). The procedure will generate an export of all your data, and send you an email when it's done.
- Other sites offer similar export features. Refer to their documentation or reference pages if necessary. Not all of them do though, but it's possible to transfer your data from one site to another using tools like Tapiriik.
Runkeeper and Strava both present you with a zip file containing one GPX file per run or ride. A GPX file contains very basic information (open one with Notepad to get an idea). It basically tracks your location every N seconds, depending on the settings of your tracking app or device. If you have a heart rate monitor or other devices, this data will show up with each measurement (track point) as well:
Now, there is just one issue: Tableau cannot read GPX files. Not to worry though; there are several options available to convert these GPX files into file formats that are readable by Tableau. Even better, some of these tools also merge the GPX files for you. If you convert them separately, you can make use of Tableau's union functionality; which will be even more powerful in version 10!
- One converter that works very well in my opinion is GPS Visualizer. This one even allows you to upload a zip file containing multiple GPX files, to merge and convert them all. Even better: you can have GPS Visualizer calculate the speed and distance between points beforehand! Saves you some work in Tableau... The input size is limited to reduce the strain on the server, but in my experience it works well with batches of about 20 files. To play around in Tableau, I'd suggest you just try a few files at first.
- GPS Utility is a program you install locally, which will also allow you to merge a large amount of files. The catch here is, you need to register to import files that are longer than 500 track points.
- You'll find a lot more when Googling "GPX to CSV" or similar terms.
- If you also have access to Alteryx, there's a lot of advanced possibilities there. You combine your data, and use the Geo capabilities to add speed and distance as well. You can even enrich your data with the location you have run based on the coordinates, and more. The choices are endless here.
In this example, we'll use data downloaded from Runkeeper, and converted+enriched (speed, distance) using GPS Visualizer. For easiest results, try the same; feeling adventurous, try it your way!
When importing the data into Tableau, we see there's some tweaking to do:
Fortunately, these problems are easily remediated by changing their data types, and assigning a geographic role to the Latitude and Longitude fields. Perhaps you are not even encountering these problems; on my laptop, these are due to the locale (local settings) used when reading the file. As this is set to Dutch in my case, Tableau expects commas [,] as decimal separators instead of points [.]. Play around with the locale settings and the data types to get the data to look like it should. English (US) is usually a good choice.
This looks better now:
The only remaining problem may be the name column of the run. While we don't necessarily need this, using this row may prove useful to determine the start of a recording.
When creating a first view in Tableau to check the contents of the name column, it becomes apparent that something strange might be going on:
It looks as if there are a ton of rows that don't have a run name (Null). That's normal: we already found out that the name is only shown on the first row of a run. However, for some reason, there's 18 runs with the name "name". Where'd those come from? Let's use View Data to find out:
Notice how just like "name" in the name field, you'll find "desc" in the desc column and "type" in the type field. This indicates that the headers might be repeated within the text file, which may be due to the usage of the GPS Visualizer conversion tool. The quickest way to exclude these rows from all of the views we'll be creating, is through the addition of a Data Source Filter. We can simply exclude "name" to get rid of the repeated headers:
There's just one more "annoying" thing to fix before we can really start playing around with the data: we do not have a field yet, which clearly delimits which records belong to which run. While we have the name field, this one is incomplete as the "name" of a run is not repeated for all of its rows. For me, this is quite easy to fix. I know I run once per day, so I can create a custom date field which will allow me to group each run's measurements.
In case you run more than once per day, this can be a bit more complicated. Let me know (comment) if that's the case and I'll help you come to an alternative! Anyway, the data is almost completely prepared for analysis:
You'll notice there are still Null records in the screenshot above. To get rid of these, we can use the exact same "Data Source Filtering" technique as shown above for the "name issue". Once that's done, it's time to start playing around! I've created the dashboard below in about 5 minutes, but just analyzing your data and using Tableau's great tools to discover insights is just great for this. See what you can do with your data!
And my favorite thing to do... Mapping multiple runs at the same time! Which is super easy as we have the latitude and longitude data in our data, and Tableau can use those to plot our track points on a map! You can then color code per run, based on speed, etc...
A few hints:
- Make sure you use the right aggregation for the right fields. E.g. if you want to see your distance per run, keep in mind that this field is a running total in the data. Hence, use the MAX aggregation when showing this per run!
- Similarly, at a run level, you're not going to want to show the SUM of your speed at each track point. Rather, the AVG would be interesting in this case.
- Still related to the two items above, you can set the default aggregation of a field by right clicking it and selecting "Default Properties" > "Aggregation" > "..."
- The speed and HR lines you see in the dashboard above are "smoothened" by using a Moving Average (right click the pill in the rows > "Quick Table Calculation" > "Moving Average"; right click again "Edit Table Calculation" and customize). Otherwise you'll see a very "spiky" line.