Comments
Description
Transcript
Using IBM dashDB with Tableau
Using IBM dashDB with Tableau What this tutorial is about In today's fast-paced world of constant changes, business owners need a reliable and simple solution that enables them to make sound business decisions in a timely fashion. IBM dashDB is the latest data warehousing offering from IBM that allows you to find the answers you need at your fingertips. Combining with the intuitive graphical interface provided by Tableau, the integrated solution allows you to explore data and draw conclusions within minutes. What you should be able to do At the end of this tutorial, you should be able to: Load data into dashDB. Download DB2 driver from Tableau. Connect Tableau to IBM dashDB. Specify a dashDB data source in Tableau. Visualize the data to gain insights. What you need before you start An IBM Bluemix account Tableau Page 1 Using IBM dashDB with Tableau Step 1: Obtain the raw data Download the sample data set from the US Department of Transportation regarding flight delays in January 2015. http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time The entire data set contains around 470k records once the file is downloaded. For the purpose of this tutorial, only the flights that were delayed are of interest, so the following records are filtered out: • • on-time flights where the difference between scheduled and actual arrival times are equal to zero incomplete data where the cause of the delay is unknown This leaves just under 100k records in the data set for demonstration with the following fields. Page 2 Using IBM dashDB with Tableau Step 2: Load the data into dashDB 1. Once you have obtained the data set, Click Load->Load from Cloud to open up the panel to load the data into dashDB. Note that there are other ways for loading data, for instance, you can load from the desktop or from Cloudant. For this tutorial, you will load data from the cloud. 2. Select the file flight.csv as the data source from Swift file system. Page 3 Using IBM dashDB with Tableau 3. The target table name is FLIGHT_DELAY. Follow the prompts to finish loading. Page 4 Using IBM dashDB with Tableau Page 5 Using IBM dashDB with Tableau Step 3: Download DB2 driver from Tableau The DB2 driver file can be found at http://www.tableau.com/support/drivers. This is required in order to make a connection from Tableau to dashDB. Step 4: Launch Tableau Desktop and connect to IBM dashDB Type the dashDB server name, port number, database name, and corresponding credentials. Page 6 Using IBM dashDB with Tableau Step 5: Select data source from Tableau Once a connection is established, you can select the FLIGH_DELAY table as your data source for analysis. Step 6: Example 1 – Find out which day of the week has the most incidents of flight delays 1. Create a new worksheet. Page 7 Using IBM dashDB with Tableau 2. Select Day of Week as the column and Number of Records as the row from the panel on the left. 3. You can now see that Friday has the most number of flight delays. Page 8 Using IBM dashDB with Tableau Page 9 Using IBM dashDB with Tableau Step 7: Example 2 – Find out which destination city has the most incidents of delay caused by weather 1. Create a new worksheet. 2. Select the Weather Delay and Destination City fields from the panel. 3. Since the Weather Delay field represents the delay in minutes, change the quantifier formula from Sum to Count, as you are interested to see the number of incidents rather than the total number of minutes of delay time. 4. The data contains actual city names in the US, choose the map option for a better presentation. Page 10 Using IBM dashDB with Tableau 5. You can use the color and size options on the left panel to customize the view. 6. From this map, you can see that Chicago is one of the worst destination cities in terms of weather delays. Page 11 Using IBM dashDB with Tableau Page 12 Using IBM dashDB with Tableau Step 8: Example 3 – Find out which carrier has the most flight delays 1. Create a new worksheet. 2. Select Carrier for the column and Number of Records for the row. 3. In this case, a packed bubble graph provides a better visual comparison. 4. The graph shows that for this particular month, Southwest Airline has more delays compared with the other airlines. Page 13 Using IBM dashDB with Tableau Page 14