Using IBM dashDB With IBM Embeddable Reporting Service
by user
Comments
Transcript
Using IBM dashDB With IBM Embeddable Reporting Service
Using IBM dashDB With IBM Embeddable Reporting Service What this tutorial is about In today's mobile age, companies have access to a wealth of data, stored in JSON format. Leading edge companies are making key decision based on that data but the challenge for many users is how to perform efficient and insightful analytics or data visualization on this type of data. With game changing technology, IBM has provided the ability to move JSON data loaded into a Cloudant NoSQL database into a highly efficient in-memory columnar database that can perform extremely fast analytics. With data available in a data warehouse, you can make use of IBM Embeddable Reporting Studio which is powered by Cognos Studio to design and generate reports for Cloudant NoSQL databases. What you should be able to do At the end of this tutorial, you should be able to: Create a Cloudant and Embeddable Reporting Service instances in IBM Bluemix. Create an application, and bind services to the application. Access the Cloudant service and load sample data. Create a dashDB warehouse to transform JSON data into relational tables View the warehouse in the dashDB web console. Use Cloudant as a connect source for Embeddable Reporting Service. Use dashDB as a data source for Embeddable Reporting Service. Add visualizations from the IBM Analytics Zone to build reports. Generate reports in the Embeddable Reporting Service. What you need before you start • • • • A Bluemix account Familiarity with JSON documents and cURL REST commands. Familiarity with basic SQL select query commands. Sample data sets which can be downloaded from: SODA API of open data https://data.sfgov.org/resource/tmnfyvry.json?$select=incidntnum,category,descript,dayofweek,date,pddistrict,resolution&$l imit=50000 Page 1 Using IBM dashDB With IBM Embeddable Reporting Service Create Cloudant instance in IBM Bluemix 1. Log in to your Bluemix account, and view the Catalog. In the Data Management category, locate Cloudant NoSQL DB. 2. Select the Cloudant NoSQL DB Service, and provide the following information: For App, select Leave unbound. For Service name, type a name for the instance. For Selected Plan, select the Shared plan. 3. Click Create to create the service. Page 2 Using IBM dashDB With IBM Embeddable Reporting Service Create Embeddable Reporting Service instance in IBM Bluemix 1. Similarly, from the Bluemix Catalog, in the Business Analytics category, locate Embeddable Reporting. 2. Select the Embeddable Reporting service, and provide the following information: For App, select Leave unbound. For the Service name, type a name for the instance. For Selected Plan, select the Shared plan. Click Create to create the Embeddable Reporting service instance. Page 3 Using IBM dashDB With IBM Embeddable Reporting Service Create an application to get user credentials, or VCAP_SERVICES, to access Cloudant and ERS instances 1. From the IBM Bluemix Dashboard, click Create App to create a new application. 2. Click WEB for the kind of application you are creating. 3. Next, select Liberty for Java for the technology you want to start with, and click Continue. 4. Type a unique name for the application you are creating, for example, ERforCloudant and then click Finish. Wait for the application to finish staging. 5. Once the application is created and staged, go back to the Dashboard, and make sure your application is listed under the Applications section. 6. Select your application to open it, and bind the Cloudant and Embeddable Reporting service instances that you created earlier. Page 4 Using IBM dashDB With IBM Embeddable Reporting Service 7. Click BIND A SERVCE OR API, select the Cloudant service, and click ADD. Click RESTAGE to restage the application now. 8. Repeat the previous step to bind the ERS service instance to your application. Page 5 Using IBM dashDB With IBM Embeddable Reporting Service Access Cloudant Service and load sample data sets In this tutorial, you will be leveraging data based on the San Francisco Police Department crime incidents open data to show crime categories and crime trends across years in each district. With JSON data loaded related to incidents, you will be able to perform data analysis using IBM Embeddable Reporting Studio to provide you with insights, for example, what are the top crimes, what are the number of incidents that occurred over the given years, and so on. 1. In the IBM Bluemix Dashboard, in the Services section, select the Cloudant service instance to open the service detail view. 2. In the Cloudant service detail view, click Launch to open the Cloudant dashboard. 3. On the Databases tab, click Add New Database. Type a database name (which should be all lowercase, no special characters), for example, incidents, and click Create. You should see a message Database created successfully. Page 6 Using IBM dashDB With IBM Embeddable Reporting Service 4. Now load some JSON documents into the database. For this tutorial, you can load one of the open public JSON data sets containing San Francisco crime incidents. a) Access the JSON data set from a browser using this URL: https://data.sfgov.org/resource/tmnfyvry.json?$select=incidntnum,category,descript,dayofweek,date,pddistrict,resolution&$l imit=50000 b) Save this page as incidents.json. Below is a sample of the JSON data: { { "category" : "ASSAULT", "pddistrict" : "MISSION", "descript" : "BATTERY", "dayofweek" : "Wednesday", "resolution" : "NONE", "date" : "2005-04-20T00:00:00", "incidntnum" : "050436712" } } c) Prepare to a do a bulk load of the JSON documents into the new Cloudant database using the _bulk_docs API endpoint. Edit the incidents.json file, and add the following to the top of the file: { “docs” : Add } at the end of the file, and save the file. d) You can do this by issuing the following command using cURL. Alternatively, you can use a browser add-on such as RESTClient or POSTMan. curl -k -X POST -H 'Content-Type: application/json' -d @<JSON FILE> -u <Username>:<Password> https://<Cloudant Host>/<Cloudant NoSQL DB>/_bulk_docs Page 7 Using IBM dashDB With IBM Embeddable Reporting Service You can find your username, password, and host values for the Cloudant DB Service by opening your Bluemix application, and viewing the Environment Variables page. The cloudantNoSQLDB credentials are under the VCAP_SERVICES tab as shown below. e) After you run the POST cURL command to load the JSON documents, you see a response similar to the following response. curl -k -X POST -H 'Content-Type: application/json' -d @incidents.json -u 3138462-97ce-38d8-8e4a-bcf984c4eopf-bluemix: d86064e46591b18bcd64d49c0e1212eac4d3f36b0a50d1c9c5ee9731543b24ce https:// 3138462-97ce-38d8-8e4a-bcf984c4eopf Bluemix.cloudant.com/incidents/_bulk_docs {"ok": true,"id":"f159900c9064fef6d162a95ed48838d5","rev":"1a22fde4737cae127696d64c7f6de0cbc"} e) Once the documents are all loaded, you can access the Databases tab in the Cloudant Dashboard to see that the database contains 50,000 documents. Page 8 Using IBM dashDB With IBM Embeddable Reporting Service Create a Warehouse to transform JSON data into relational tables One of the advantages of using IBM Cloudant to store your JSON data is its built-in ability to transform the data into relational tables in IBM dashDB on demand. IBM Cloudant Warehousing allows you to load JSON data into IBM dashDB using its continuous transformation method. Cloudant performs a schema discovery process against a sample size of JSON data loaded to continuously transform data into relational data so you can perform relational data analytics. 1. From the Cloudant Dashboard, navigate to the Warehousing tab, and click New Warehouse. 2. Select the incidents database you created earlier. 3. Next, provide the authentication credentials you use to access your Bluemix account. 4. Click Create Warehouse. Page 9 Using IBM dashDB With IBM Embeddable Reporting Service 5. On this screen, you can see that documents are being copied. You can tell that the numbers are still going up, and the load is not complete yet. A green status means things are working well, and all of the documents are loaded into dashDB from the source database. The warehousing service automatically subscribes to the changes being made in the Cloudant databases so that every update to every document in Cloudant automatically updates the record in dashDB. Page 10 Using IBM dashDB With IBM Embeddable Reporting Service Access Warehouse in new IBM dashDB instance created by Cloudant 1. Click Visit warehouse to open the warehouse in dashDB to check the relational tables for the JSON data loaded. 2. In the IBM dashDB web console, on the Tables tab, look under your schema to access the tables created by Cloudant. IBM dashDB will create tables based on the source database name and the JSON object parent key name. For example, if incidents is the source database name, and incidents is the key name for JSON object values, the table name would be incidents. Page 11 Using IBM dashDB With IBM Embeddable Reporting Service Use Cloudant as connect source for Embeddable Reporting service 1. In order for the Embeddable Reporting Service (ERS) to store its configuration using the Cloudant service, you will need the Cloudant URL information that you previously obtained from your Bluemix application, for example, https://3138462-97ce-38d8-8e4abcf984c4eopfbluemix:d86064e46591b18bcd64d49c0e1212eac4d3f36b0a50d1c9c5ee9731543b24ce@ 3138462-97ce-38d8-8e4a-bcf984c4eopf-Bluemix.cloudant.com. 3. From the Bluemix Dashboard, open your application containing ERS. 4. Click the Embeddable Reporting Service instance. 5. In the Embeddable Reporting service page, for the Repository URI, type the URL for the Cloudant NoSQL DB service. Then click Start. Page 12 Using IBM dashDB With IBM Embeddable Reporting Service Add IBM dashDB Connection details in ERS service instance as data source 1. Now that you have provided the Cloudant URL information, from the ERS Service instance, click New Package to create a package which is a data source. 2. A dialog box displays requesting the datasource information. In this case, you need to provide the IBM dashDB information which is available from the dashDB web console under Connect->Connect Settings. Page 13 Using IBM dashDB With IBM Embeddable Reporting Service 3. Back in the New Package dialog box, provide the following information, and then click Create. Name and Description: type an appropriate name and description for this package. JDBC URL: type the JDBC URL String from the previous step without the user and password details, for example, jdbc:db2://awh-yp-small03.services.dal.Bluemix.net:50000/BLUDB User Name and User Password: type the user name and password details from the previous step. If you want you to use a Secure Connection port for the IBM dashDB data source, you can switch to port 50001 and set the sslConnection option to true in the JDBC URI. For example, jdbc:db2://awh-ypsmall03.services.dal.Bluemix.net:50001/BLUDB:sslConnection=true Page 14 Using IBM dashDB With IBM Embeddable Reporting Service 4. If you’d like to go back and view the data source information at a later time, you can find that information on the Data Sources tab. Page 15 Using IBM dashDB With IBM Embeddable Reporting Service Add Visualizations from IBM Analytics Zone to beautify reports 1. Download the Visualizations from this URL: http://www.ibm.com/web/portal/analytics/analyticszone/downloads/filter/?cV=/AnalyticsZon e_Content/AZCatalog/Visualization/ 2. From the ERS Service instance, click the Visualizations tab, and then click New Visualization to upload the visualizations. 3. Browse to select the visualization package you previously downloaded, and click Open. 4. Type a name for the visualizations, and click Create. Create and generate SQL-based reports using the Embeddable Reporting Studio You can now create SQL-based reports from tabular data in the IBM dashDB instance created and loaded using Cloudant's transformation. Based on the table definition created by the Cloudant transformation, you can now generate reports to check the number of crime incidents being reported across the years, or report the resolution found for crime incidents being reported in each crime category. Page 16 Using IBM dashDB With IBM Embeddable Reporting Service Report 1: Generate a SQL-based column chart for the number of incidents reported in each year for each district 1. To generate new reports, access the Embeddable Reporting Studio. From the Bluemix Dashboard, open your application, and select the Embeddable Reporting Service. 2. Access the data sources, and on the Report definitions tab, click New Report Definition, and you will be redirected to the Embeddable Reporting Studio: https://erservice-beta1.stage1.ng.Bluemix.net/rs/pat/rsapp.htm 3. In the New dialog box, select SQL Blank so that visualizations can be added later to the query, and then click OK. 4. Next, select the data source which is your dashDB connection, and then click OK. 5. In the SQL dialog box, type the following SQL statement: select * from “incidents”, and Page 17 Using IBM dashDB With IBM Embeddable Reporting Service then click Validate to verify your query. Click OK to load the data. 6. Click the Data Items tab in the top left pane of the ER Studio to see the loaded query. 7. From the top menu, click View->Queries, and then select Query1 in the right pane. 8. On the left pane, click the Toolbox tab, and drag Data Item to the list of data items in the right pane to build a new data item for your report. Data items are like new aggregations that might be needed which are not originally available from the data items loaded from query. 9. Change the name of the data item to count_incidents, and type count([SQL1].[IncidntNum]) for the Expression Definition. This will count the total number of crime incidents. Click the green checkmark to validate the expression, and the f(x) tab to learn more about the summarize option being used here. Page 18 Using IBM dashDB With IBM Embeddable Reporting Service 10. Click OK to create the new data item. Here Data items are not necessarily in the data source, but are new calculated items. 11. Similarly, create a new data item called year from the Date defined as substr([SQL1].[Date],7,10) 12. Navigate to the reporting page from the top menu; View->Report Pages. Select Page1 from the list of Report Pages. 13. Click the Toolbox tab, and drag a Visualization on to the right pane. Page 19 Using IBM dashDB With IBM Embeddable Reporting Service 14. Select the Column chart from the available visualizations, select Query1 to map it to the column chart, and then click OK. 15. Now define the report based on the new data items you previously created. Use the count_incidents data item for Y-axis or Values, Year for x-axis or Categories, and PdDistrict for the Series. 16. Now from the toolbar, click the Play button to run the report and generate the graph. The Page 20 Using IBM dashDB With IBM Embeddable Reporting Service graph will show the number of incidents that have occurred in each year for each district in the San Francisco area. Report 2: Generate a Stacked column chart showing number of incidents and resolution found for incidents for each category Now let’s create another report showing the number of incidents and resolutions as a stacked column chart. Follow steps 1-14 as you did to create Report 1, then continue with the following steps. 1. Define the report based on the new data items you previously created. Use count_incidents for Y-axis or Values, Category for the x-axis or Categories, and Resolution for the Series. 2. Select the graph and change the graph properties called Stacked to Yes. 3. Click the Play button to generate stack chart. Page 21 Using IBM dashDB With IBM Embeddable Reporting Service 4. Once the reports are generated, you can save the report by clicking the Save button on the toolbar. 5. You can then accessed the saved reports directly from Embeddable Reporting service. Page 22