In this blog post I show how to get started with Tableau to produce some simple data visualisations. Using a list of all Summer Olympic Games medal winners since 1896, I used Tableau Public to uncover some interesting facts about the games and about recent history.
So what is Tableau?
Tableau is an interactive data visualization tool-set focused on business intelligence – it allows users to analyse, visualise and share data. Tableau Public is a free version for trial/evaluation use that only supports saving your data onto Tableau’s public website – this is what I used for our Olympics example. Once your business installs Tableau, a Server licence allows users to create reports and dashboards based on the data sources that have been created – these will typically be based on company databases (e.g. the corporate CRM system, billing system, Data Warehouse or similar) or the sources can simply point to an Excel spreadsheet as in our example. However you’ll need a special Desktop licence to create data sources. The end-reports you create can be read (and interacted with if you’ve added controls) by anyone who downloads ‘Tableau Reader’ which is available for free.
Tableau is one of many tools for Business Analytics/Business Intelligence – and sits near the top-right corner of the Gartner Magic Quadrant, as it combines an excellent feature-set with a relatively simple interface and produces reasonably good-looking output. In particular Tableau is recommended for investigative analyses where you may not know the answers to your problems and want the flexibility to chop and change data visualisations easily, rather than to produce the best-looking charts and graphs in the world.
Summer Olympic Games Medals Example
My example uses a Summer Games medal winners list (1896-2008) as a data source for Tableau Public. This was my first time using Tableau so I certainly won’t claim to be an expert – but with a background of using Excel, it was clear that the worksheets in Tableau are not unlike the PivotChart/PivotTable functions in Excel, just with more advanced features and a more powerful data layer that allows you to create calculated metrics (Measures). Once the Excel data is loaded into Tableau (a process similar to File -> Open), a quick entity join is required to link the medal winners to their country names (winners are listed with their IOC country codes) so that we can use a filled map later. The join is a very intuitive drag-and-drop affair, assuming you have a basic understanding of joins (left/right/inner/full etc.). That may explain why creating data sources is a separate function in terms of Tableau licensing: you need to know what you’re doing here to minimise the risk of messing up before you start!
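For readers less familiar with joins, here is a minimal Python sketch of what a left join between the winners list and a country lookup does. This is not how Tableau implements it; the column names (`Athlete`, `NOC`, `Edition`, `Country`) and the sample rows are made up for illustration and may not match the real spreadsheet.

```python
# Made-up sample rows; the real spreadsheet columns may be named differently.
winners = [
    {"Athlete": "Athlete A", "NOC": "USA", "Edition": 1904},
    {"Athlete": "Athlete B", "NOC": "FRA", "Edition": 1900},
    {"Athlete": "Athlete C", "NOC": "URS", "Edition": 1980},
]

# Lookup table: IOC country code -> country name (deliberately incomplete).
country_names = {"USA": "United States", "FRA": "France"}


def left_join(rows, lookup, key="NOC", new_field="Country"):
    """Keep every winner row; attach the country name, or None if the code is unknown."""
    return [{**row, new_field: lookup.get(row[key])} for row in rows]


joined = left_join(winners, country_names)

# Codes that found no match in the lookup; these rows would not appear on a map.
unmatched = sorted({row["NOC"] for row in joined if row["Country"] is None})
print(unmatched)
```

A left join keeps every medal row even when the lookup has no matching code, which is why listing the unmatched codes is a useful sanity check before charting anything.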
A very quick drop of the Events, Year of the games, and (host) City columns betrays a growing number of Olympic events over the years. Straight away we see that Antwerp in 1920 had an unusually high number of events relative to the games directly before and after – a quick look in the Excel file shows Archery, Boxing, Figure Skating, Hockey, Ice Hockey and Vaulting appearing in 1920 but not in 1912 or 1924. A closer look at the Years list reminds us that there were no Olympic winners at all in 1916, 1940 or 1944 as the games were among the casualties of the two World Wars.
Reading between the lines
The raw list of winners counts medals in team sports several times because it contains the names of all individual medal recipients – so I had to create a calculated field to get a view of events won (rather than physical medals handed over). Setting the rows and columns to winner’s Country and Olympic year (Edition), and dropping the total number of events won onto the Color button, we can see the medals-per-country over the years on a highlight table. I dragged the same measure to the Label button to show the number on the plot.
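The idea behind that calculated field can be sketched in plain Python: collapse rows that share the same games, country, sport, event and medal into one, then count. This is only an illustration of the logic, not the Tableau formula I used; the tuple layout and the placeholder rows are assumptions.

```python
from collections import Counter

# Placeholder rows, one per athlete who received a medal:
# (edition, country code, sport, event, medal)
medal_rows = [
    (2008, "AAA", "Team Sport", "team event", "Gold"),    # team-mate 1
    (2008, "AAA", "Team Sport", "team event", "Gold"),    # team-mate 2
    (2008, "AAA", "Solo Sport", "solo event", "Gold"),
    (2008, "BBB", "Solo Sport", "solo event", "Silver"),
]


def events_won(rows):
    """Count distinct (sport, event, medal) combinations per games and country."""
    distinct = set(rows)  # identical rows for team-mates collapse into one entry
    return Counter((edition, country) for edition, country, *_ in distinct)


# Physical medals handed over: every athlete row counts.
physical_medals = Counter((edition, country) for edition, country, *_ in medal_rows)
won = events_won(medal_rows)

print(physical_medals[(2008, "AAA")], won[(2008, "AAA")])
```

The sketch assumes a country can take a given medal only once per event, which is worth checking against the data for sports that award two bronze medals.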
This highlight table plot may not be the prettiest, but it was created in just a few seconds and its shape and colours show up some interesting facts:
- Many of the greenest spots represent home advantage where host countries win more medals than usual at their home games (France 1900, USA 1904, Britain 1908, Belgium 1920, Germany 1936, Italy 1960, China 2008 etc)
- The US-led boycott of the 1980 Moscow games causes an intermittent gap in the 1980 column, and correlates to Russia winning its biggest medal haul ever – in the absence of USA, Japan, Canada, Norway, South Korea etc
- The tit-for-tat boycott of the 1984 games in Los Angeles shows in the next column – with athletes from Russia, Poland, Bulgaria, Cuba etc missing out on any medals, whilst other countries pick up more than they’d usually do in their absence
- Present-day Germany competed under different IOC codes between 1956 and 1988 (e.g. as East Germany (DDR) and West Germany (FRG)) and is missing from the graph in those years (presumably because my IOC code list doesn’t include old codes). Germany’s record haul at Barcelona in 1992 occurred just after re-unification in 1990.
- Many new countries appear on the chart from 1992/1996 due to the break-up of the USSR in 1991 and Yugoslavia in the same period e.g. Belarus, Ukraine, Lithuania, Croatia, Kazakhstan, Georgia etc
- “Russia” won no medals in 1992 as athletes from Russia and some other former USSR states competed as the ‘unified’ team (EUN) that year. (Note that for this plot I rather bluntly assigned all medals won by the USSR from 1952-1988 to ‘Russia’ for all-time comparison purposes with USA)
- China’s almost complete avoidance of the games until 1984 is evident, as are its increasing medal hauls at each games since, until it topped the medal table at the Beijing games.
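The code handling mentioned in the bullets above (assigning USSR medals to Russia, and old codes dropping out of the chart) can be sketched as a small remapping step. The mapping below only mirrors the one blunt assignment described in this post; the list of known codes is a made-up stand-in for my IOC code list, and the function names are hypothetical.

```python
# Assumption: mirrors the blunt USSR -> Russia assignment described above.
LEGACY_TO_CURRENT = {"URS": "RUS"}


def normalise_code(code, mapping=LEGACY_TO_CURRENT):
    """Return the present-day code for a legacy code, or the code unchanged."""
    return mapping.get(code, code)


def codes_missing_from_lookup(codes, known_codes, mapping=LEGACY_TO_CURRENT):
    """List codes that still have no match in the lookup after remapping."""
    return sorted({c for c in codes if normalise_code(c, mapping) not in known_codes})


# Made-up stand-ins for the codes seen in the data and the codes in the lookup.
seen_in_data = ["USA", "URS", "EUN", "FRG", "RUS"]
known_codes = {"USA", "RUS"}

missing = codes_missing_from_lookup(seen_in_data, known_codes)
print(missing)
```

Running a check like this before building the chart makes gaps explicit, rather than leaving them to be discovered as empty cells.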
Tableau is designed to make the most of mapping data and is smart enough to map data containing geo co-ordinates, city names or country names automatically once you drop the right field into your work area and select a map as your graphing type. That said, the standard maps are present-day and don’t work too well for historical data where countries no longer exist – like East and West Germany and Yugoslavia! As this was a quick analysis, this data, and 1992 data for unified Russian states is unfortunately missing from the maps.
Once I right-click on the ‘Country’ dimension and set it to be Geographic, Tableau automatically resolves each country that it can to allow me to map the medal data. For this example I show the same event wins data using a filled map.
Give or take the border politics that have changed the world map over the last 120 years, we see USA and USSR/Russia dominating the map, with a good spread of medals over much of the world, though many African nations are still without Olympic metal. Interestingly, Bolivia is the only South American nation winning no medals in the modern games.
Recent medal trends
We can easily filter our data from Barcelona 1992 to Beijing 2008 to see which countries are cleaning up in which sports in more recent years, by dropping the appropriate field into the Filters area and selecting the last 5 games. Usefully, when you drop the Sports dimension onto the Rows, Tableau will automatically create a series of maps showing the medal distribution for each sport. This uncovers some interesting regional patterns – for example China’s Gymnastics and Table Tennis domination, Brazil’s strength in Volleyball and Australia’s excellent swimming pedigree.
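Outside Tableau, the same filter-then-split-by-sport step might look like the Python sketch below. It assumes the rows have already been reduced to one per event won; the tuple layout, the placeholder country codes and the function name are all hypothetical.

```python
from collections import Counter, defaultdict

# Placeholder rows, one per event won: (edition, country code, sport)
event_wins = [
    (1988, "AAA", "Boxing"),    # outside the filter range, ignored
    (1992, "AAA", "Boxing"),
    (1996, "AAA", "Boxing"),
    (2000, "BBB", "Boxing"),
    (2008, "BBB", "Swimming"),
]


def top_country_by_sport(rows, first_edition=1992, last_edition=2008):
    """Filter to a range of games, then find the leading country in each sport."""
    per_sport = defaultdict(Counter)
    for edition, country, sport in rows:
        if first_edition <= edition <= last_edition:
            per_sport[sport][country] += 1
    # most_common(1) returns a one-item list of (country, count) pairs
    return {sport: counts.most_common(1)[0] for sport, counts in per_sport.items()}


leaders = top_country_by_sport(event_wins)
print(leaders)
```

Each per-sport counter corresponds roughly to one of the small maps Tableau draws when Sports is dropped onto the Rows.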
One general draw-back of geo-mapping in any visualisation package comes down to the shape of our world – smaller countries are just harder to spot on the map! So you might be forgiven for not noticing Jamaica’s substantial athletics success and, even more so, Cuba’s virtual domination of Olympic boxing based on the size of those nations in the above maps. This is where the treemap chart comes in useful; here’s the boxing one for the same data as above:
No missing Cuba on that chart. Tableau will let you click and drill down into both the treemap and geo map charts where the data allows – for example if we had the athletes’ home towns in our data we could continue to drill to see the provinces and towns with the most medal winners in each country.
In general it proved very easy to create interesting charts that showed trends (e.g. home advantage) or raised questions that were subsequently answerable with a bit of research into the 29,900 rows in the spreadsheet (Antwerp events) or a quick internet search (China’s non-participation). I feel I’ve only scratched the surface of Tableau (and the Summer Olympics) in this example, but I must admit I never expected to see this type of information flowing out of a medals spreadsheet. And we didn’t even have time to look at a single athlete. Oh, very well then if you insist, here’s the top-end of the all-time medals table:
Now I just need to find out a little more about Vilhelm Carlberg – Sweden’s sharp-shooting gymnast – and the first person with medals in two different sports on this graph! A drill into his medal details shows he topped the medal table in the 1912 games, followed ably by his brother Eric. Both won medals in several Olympics, but they both won most of their medals at Stockholm. Home advantage, eh?
- Tableau Public edition is available from https://public.tableau.com/s/download
- Olympic medal data was retrieved from the Tableau website resources page at https://public.tableau.com/s/resources
- Tableau Reader can be downloaded from http://www.tableau.com/products/reader
- Gartner’s magic quadrant for Business Intelligence and Analytics Platforms is at https://www.gartner.com/doc/reprints?id=1-2XXET8P&ct=160204&st=sb