What is Data Visualisation?
I like the definition from TechTarget: “Data visualisation is a general term that describes any effort to help people understand the significance of data by placing it in a visual context. Patterns, trends and correlations that might go undetected in text-based data can be exposed and recognized easier with data visualisation software.”
Data Preparation for Data Visualisation
I will start with data preparation. This step is critical and time-consuming (it can take up to 80% of the time, depending on the complexity of the data set), yet it is still very underrated when it comes to data visualisation. If you are a beginner, set clear expectations about how long it will take you to produce accurate data visualisations such as graphs and dashboards.
The main steps in data preparation are:
Data Cleaning
Also called data scrubbing. In this step the idea is to remove or amend errors such as duplicates, incorrect formatting, misspellings, blank fields and incorrect or missing data labels, and to identify anomalies.
Various paid and open-source tools can help with this step. One of the most popular and accessible is Excel: see the best practices in Prepare Data for Visualisation and Analysis, and the Microsoft support page Top Ten Ways to Clean Your Data.
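If you prefer scripting to spreadsheets, the same cleaning ideas can be sketched in a few lines of Python. This is only a minimal illustration, not a recommended workflow: the sample rows, the column names (`city`, `sales`) and the helper `clean_rows` are all made up for the example.

```python
# Hypothetical sample rows; the column names are illustrative only.
raw_rows = [
    {"city": " Sydney ", "sales": "1200"},
    {"city": "sydney", "sales": "1200"},
    {"city": "Melbourne", "sales": ""},
    {"city": "Brisbane", "sales": "950"},
]


def clean_rows(rows):
    """Trim whitespace, normalise casing, drop duplicates and report blank fields."""
    cleaned = []
    seen = set()
    blanks = []
    for row in rows:
        # Fix incorrect formatting: strip stray spaces and use consistent title case.
        tidy = {"city": row["city"].strip().title(), "sales": row["sales"].strip()}
        key = (tidy["city"], tidy["sales"])
        if key in seen:
            continue  # skip rows that are duplicates once normalised
        seen.add(key)
        if tidy["sales"] == "":
            # Flag blank fields for review instead of guessing a value.
            blanks.append(tidy["city"])
        cleaned.append(tidy)
    return cleaned, blanks


cleaned, blanks = clean_rows(raw_rows)
print(cleaned)
print("Rows with blank sales:", blanks)
```

Note that the sketch only flags the blank field. Whether to fill it, drop the row or go back to the data source depends on your data set.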
Data Formatting
This is also an important step in data preparation. Excel provides a quick and easy way to format data as text, date/time, numbers or currencies, and to customise the formatting.
Once the data is in better shape, we can jump into the art of data visualisation!
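For readers who want to see what formatting means outside Excel, here is a small Python sketch. It assumes dates arrive as day/month/year text and amounts as digits with optional thousands separators. The function name `format_record`, the sample values and the dollar sign are my own assumptions for illustration.

```python
from datetime import datetime


def format_record(date_text, amount_text):
    """Convert raw text fields into a typed date, a number and a currency string."""
    # Assumption: dates are written as day/month/year, e.g. "03/11/2016".
    parsed_date = datetime.strptime(date_text, "%d/%m/%Y").date()
    # Assumption: amounts may contain thousands separators, e.g. "1,250.5".
    amount = float(amount_text.replace(",", ""))
    return {
        "date": parsed_date.isoformat(),          # unambiguous ISO date text
        "amount": amount,                         # numeric value for calculations
        "amount_display": f"${amount:,.2f}",      # currency-style text for display
    }


record = format_record("03/11/2016", "1,250.5")
print(record)
```

Keeping the numeric value separate from its display string means charts and totals work on real numbers, while labels still look tidy.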
Data Visualisation Tools
As a beginner, probably the best way to start is with the most readily available tools, such as:
Google Fusion Tables
“Google Fusion Tables is a cloud-based service for data management and integration. Fusion Tables enables users to upload tabular data files (spreadsheets, CSV, KML), currently of up to 100MB. The system provides several ways of visualising the data (e.g., charts, heat-maps, and timelines) and the ability to filter and aggregate the data. It supports the integration of data from multiple sources by performing joins across tables that may belong to different users. Users can keep the data private, share it with a select set of collaborators, or make it public and thus crawlable by search engines. The discussion feature of Fusion Tables allows collaborators to conduct detailed discussions of the data at the level of tables and individual rows, columns, and cells”
See the full document: Data Management, Integration and Collaboration in the Cloud.
Visit my blog post to learn how to create an intensity map using Fusion Tables.
Microsoft Excel
Probably the most widely used tool for data management. With Microsoft Excel you can create spreadsheets, store data, reformat, rearrange, analyse, create graphs, highlight trends and patterns, create heat-maps, etc.
Tableau
Tableau is currently the trendiest data visualisation software. It offers the flexibility of importing data from different sources and formats, and of connecting to SQL servers. It is very easy to use and offers sleek, eye-catching visualisation formats. You should give it a go with Tableau Public. If you want to see a practical example using Tableau Public, do not miss my next blog post, Getting Started with Tableau.