...

Using IBM dashDB with Tableau

by user

on
Category: Documents
12

views

Report

Comments

Transcript

Using IBM dashDB with Tableau
Using IBM dashDB with Tableau
What this tutorial is about
In today's fast-paced world of constant changes, business owners need a reliable and simple
solution that enables them to make sound business decisions in a timely fashion. IBM dashDB
is the latest data warehousing offering from IBM that allows you to find the answers you need
at your fingertips. Combining with the intuitive graphical interface provided by Tableau, the
integrated solution allows you to explore data and draw conclusions within minutes.
What you should be able to do
At the end of this tutorial, you should be able to:
 Load data into dashDB.
 Download DB2 driver from Tableau.
 Connect Tableau to IBM dashDB.
 Specify a dashDB data source in Tableau.
 Visualize the data to gain insights.
What you need before you start
 An IBM Bluemix account
 Tableau
Page 1
Using IBM dashDB with Tableau
Step 1: Obtain the raw data
Download the sample data set from the US Department of Transportation regarding flight
delays in January 2015.
http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time
The entire data set contains around 470k records once the file is downloaded. For the purpose
of this tutorial, only the flights that were delayed are of interest, so the following records are
filtered out:
•
•
on-time flights where the difference between scheduled and actual arrival times are
equal to zero
incomplete data where the cause of the delay is unknown
This leaves just under 100k records in the data set for demonstration with the following fields.
Page 2
Using IBM dashDB with Tableau
Step 2: Load the data into dashDB
1. Once you have obtained the data set, Click Load->Load from Cloud to open up the panel
to load the data into dashDB. Note that there are other ways for loading data, for instance,
you can load from the desktop or from Cloudant. For this tutorial, you will load data from the
cloud.
2. Select the file flight.csv as the data source from Swift file system.
Page 3
Using IBM dashDB with Tableau
3. The target table name is FLIGHT_DELAY. Follow the prompts to finish loading.
Page 4
Using IBM dashDB with Tableau
Page 5
Using IBM dashDB with Tableau
Step 3: Download DB2 driver from Tableau
The DB2 driver file can be found at http://www.tableau.com/support/drivers.
This is required in order to make a connection from Tableau to dashDB.
Step 4: Launch Tableau Desktop and connect to IBM dashDB
Type the dashDB server name, port number, database name, and corresponding credentials.
Page 6
Using IBM dashDB with Tableau
Step 5: Select data source from Tableau
Once a connection is established, you can select the FLIGH_DELAY table as your data source
for analysis.
Step 6: Example 1 – Find out which day of the week has the most incidents of flight
delays
1. Create a new worksheet.
Page 7
Using IBM dashDB with Tableau
2. Select Day of Week as the column and Number of Records as the row from the panel on
the left.
3. You can now see that Friday has the most number of flight delays.
Page 8
Using IBM dashDB with Tableau
Page 9
Using IBM dashDB with Tableau
Step 7: Example 2 – Find out which destination city has the most incidents of delay
caused by weather
1. Create a new worksheet.
2. Select the Weather Delay and Destination City fields from the panel.
3. Since the Weather Delay field represents the delay in minutes, change the quantifier
formula from Sum to Count, as you are interested to see the number of incidents rather
than the total number of minutes of delay time.
4. The data contains actual city names in the US, choose the map option for a better
presentation.
Page 10
Using IBM dashDB with Tableau
5. You can use the color and size options on the left panel to customize the view.
6. From this map, you can see that Chicago is one of the worst destination cities in terms of
weather delays.
Page 11
Using IBM dashDB with Tableau
Page 12
Using IBM dashDB with Tableau
Step 8: Example 3 – Find out which carrier has the most flight delays
1. Create a new worksheet.
2. Select Carrier for the column and Number of Records for the row.
3. In this case, a packed bubble graph provides a better visual comparison.
4. The graph shows that for this particular month, Southwest Airline has more delays
compared with the other airlines.
Page 13
Using IBM dashDB with Tableau
Page 14
Fly UP