The Business Value of XML Data Management – DB2 9.5 pureXML
The Business Value of XML
Data Management – DB2 9.5 pureXML®
Summer/Fall 2008
IBM Software Group, Information Management, Data Management Solutions
© 2008 IBM Corporation

Agenda
Where and why XML data is used
What's driving the need for XML data management?
Industry solutions and application scenarios
DB2 9 early feedback and competitive comparison

You will store XML! Resistance is futile.

XML is Everywhere!
Integration of diverse data sources
Information exchange between applications and organizations
eForms and workflow processing
Content and document management
Message-based transactions, web services, SOA
XML documents as business objects / transaction records (digital signatures, auditing, regulatory compliance)
XML as the better data model (for multi-valued, hierarchical and complex data)

Who uses XML? Everybody!
Financial
  – ACORD: XML for insurance
  – FIXML: Financial Information eXchange protocol
  – FpML: Financial products Markup Language
  – FundsML: Funds Markup Language
  – XBRL: eXtensible Business Reporting Language
Life Sciences
  – AGAVE: Architecture for Genomic Annotation, Visualization and Exchange
  – BSML: Bioinformatic Sequence Markup Language
  – CML: Chemical Markup Language
Publication etc.
  – SportsML: Sports Markup Language
  – NewsML: News Markup Language
  – XBITS: XML Book Industry Transaction Standards
  – XPRL: eXtensible Public Relations Language
Other
  – LandXML: Land Development Markup Language
  – MODA-ML: Middleware tOols and Documents to enhAnce the textile/clothing supply chain through xML
  – MatML: Materials Property Data Markup Language
  – JXDM: Global Justice XML Data Model
  – ebXML: Electronic Business using eXtensible Markup Language
  – ...
Specification links for the standards listed under "Who uses XML?":
  – ACORD: http://www.acord.org/standards/lifexml.aspx
  – FIXML: http://www.fixprotocol.org/cgi-bin/Spec.cgi?menu=4
  – FpML: http://www.fpml.org/spec/index.asp
  – FundsML: http://www.funds-xml.org/html/download.htm
  – XBRL: http://www.xbrl.org/r
  – AGAVE: http://www.lifecde.com/products/agave/
  – BSML: http://www.bsml.org/resources/default.asp
  – CML: http://www.xml-cml.org/
  – SportsML: http://www.sportsml.com/specifications.php
  – NewsML: http://www.newsml.org/pages/spec_main.php
  – XBITS: http://www.xmlbits.org/docs.asp
  – XPRL: http://www.xprl.org/
  – LandXML: http://www.landxml.org/spec.htm
  – MODA-ML: http://www.moda-ml.net/modaml/repository/schema/V20031/default.asp?lingua=en
  – MatML: http://www.matml.org/schema.htm
  – JXDM: http://it.ojp.gov/jxdm/3.0/index.html
  – ebXML: http://www.ebxml.org/specs/
  – ...

XML Example: Financial Data (FIXML)
Buying 1000 shares of IBM stock
Old FIX protocol (tag=value message):
  8=FIX.4.2^9=251^35=D^49=AFUNDMGR^56=ABROKER^34=2^52=20030615-01:14:49^
  11=12345^1=111111^63=0^64=20030621^21=3^110=1000^111=50000^55=IBM^
  48=459200101^22=1^54=1^60=20030615-01:14:49^38=5000^40=1^44=15.75^
  15=USD^59=0^10=127
New FIXML protocol:
  – Extensible
  – Lower application development and maintenance cost

Analyst reports
Gartner Group
  – 4Q07 insurance industry survey:
    • 75% of firms have implemented XML standards
    • 74% of firms believe XML implementations have yielded business and technology benefits
    • More than 50% of firms base XML initiatives wholly or partially on ACORD
  – 1Q08 DITA (Darwin Information Typing Architecture) assessment:
    • "DITA can reduce the time and cost required to create content, while increasing the value of content it identifies. These will drive enterprises to adopt DITA-aware applications."
The Burton Group (3Q07)
  – "XQuery is likely to become for XML content what SQL is for relational data."

Industry standards and mandates
Single Euro Payments Area (SEPA)
  – Major European payments initiative with global implications
  – Mandates XML-based message formats for exchange of payments data between banks in the Euro zone. Based on ISO 20022.
  – Expected to reach "critical mass" by end of 2010.
OTC (over-the-counter) derivatives processing
  – International Swaps and Derivatives Association (ISDA) survey (2005): 75% of large firms responding use FpML (Financial products Markup Language)
  – Operations Management Group (OMG), consisting of buy-side firms and service providers, endorses FpML as a means to support "operational scalability" for trading in credit derivatives (1Q08)

Where is your XML?
In files…: storage not managed and not secure
In LOBs…: content and business value locked up
Shredded to tables: complex and fragile mapping
In an XML-only DB: scalability and integration concerns

Why XML data management?
XML is pervasive
  – Integration and messaging applications
    • OTC derivatives (FpML)
    • Single Euro Payments Area (SEPA, UNIFI / ISO 20022)
    • Securities trading (FIXML)
    • Insurance (ACORD)
  – E-forms, Web applications
  – SOA engagements
XML data is a corporate asset
  – Messages = business artifacts/transactions (e.g., order, trade)
  – Customer profiles, behavior patterns
  – Audit trails for regulatory compliance
  – Archiving
Example document:
  <?xml version="1.0"?>
  <purchaseOrder id="12345" secretKey="4x%$^">
    <customer id="A6789">
      <name>John Smith Co</name>
      <address>
        <street>1234 W. Main St</street>
        <city>Toledo</city>
        <state>OH</state>
        <zip>95141</zip>
      </address>
    </customer>
    <itemList>
      <item>
        <partNo>A54</partNo>
        <quantity>12</quantity>
      </item>
      <item>
        <partNo>985</partNo>
        <quantity>1</quantity>
      </item>
    </itemList>
  </purchaseOrder>
XML must be managed, shared, analyzed and protected like any other important data

XML in the Database
Data that's inherently hierarchical or nested in nature
  – Example: medical data, bill-of-materials, OO and multi-value data
Data sets with sparsely populated attributes
  – Example: FIXML, FpML, customer profiles
Schema evolution
  – Example: frequently changing services/products/processes
Variable schemas, many schemas
  – Example: data integration, consolidation of diverse data sources
Combining structured and unstructured data
  – Example: CM, Life Sciences, News and Media

XML-Enabled Databases: Two Main Options
CLOB/Varchar
  – XML documents are stored in a Varchar or CLOB column
  – Selected elements/attributes are extracted into side tables (regular tables for faster lookup)
Shredding ("decomposition")
  – A shredder applies a fixed mapping to place XML document content into regular relational tables

Shredding: A simple case
  <DEPARTMENT deptid="15" deptname="Sales">
    <EMPLOYEE>
      <EMPNO>10</EMPNO>
      <FIRSTNAME>CHRISTINE</FIRSTNAME>
      <LASTNAME>SMITH</LASTNAME>
      <PHONE>408-463-4963</PHONE>
      <SALARY>52750.00</SALARY>
    </EMPLOYEE>
    <EMPLOYEE>
      <EMPNO>27</EMPNO>
      <FIRSTNAME>MICHAEL</FIRSTNAME>
      <LASTNAME>THOMPSON</LASTNAME>
      <PHONE>406-463-1234</PHONE>
      <SALARY>41250.00</SALARY>
    </EMPLOYEE>
  </DEPARTMENT>

Department
  DEPTID  DEPTNAME
  15      Sales

Employee
  DEPTID  EMPNO  FIRSTNAME  LASTNAME  PHONE         SALARY
  15      27     MICHAEL    THOMPSON  406-463-1234  41250
  15      10     CHRISTINE  SMITH     408-463-4963  52750

Shredding: A schema change…
"Employees are now allowed to have multiple phone numbers…"
  <DEPARTMENT deptid="15" deptname="Sales">
    <EMPLOYEE>
      <EMPNO>10</EMPNO>
      <FIRSTNAME>CHRISTINE</FIRSTNAME>
      <LASTNAME>SMITH</LASTNAME>
      <PHONE>408-463-4963</PHONE>
      <PHONE>415-010-1234</PHONE>
      <SALARY>52750.00</SALARY>
    </EMPLOYEE>
    <EMPLOYEE>
      <EMPNO>27</EMPNO>
      <FIRSTNAME>MICHAEL</FIRSTNAME>
      <LASTNAME>THOMPSON</LASTNAME>
      <PHONE>406-463-1234</PHONE>
      <SALARY>41250.00</SALARY>
    </EMPLOYEE>
  </DEPARTMENT>
Requires:
  • Normalization of existing data!
  • Modification of the mapping
  • Change of applications
Costly!

Phone
  EMPNO  PHONE
  27     406-463-1234
  10     415-010-1234
  10     408-463-4963

Department
  DEPTID  DEPTNAME
  15      Sales

Employee
  DEPTID  EMPNO  FIRSTNAME  LASTNAME  PHONE         SALARY
  15      27     MICHAEL    THOMPSON  406-463-1234  41250
  15      10     CHRISTINE  SMITH     408-463-4963  52750

DB2 9 Technology Leadership
Native XML hierarchical storage
  – No shredding, no CLOBs, no BLOBs required
  – Optimized for XPath and XQuery processing
High performance
  – Superior indexing technology
  – No parsing of XML data at query runtime
Fully integrated XML and relational processing
  – Seamlessly query various types of data at once
  – No internal translation of XQuery into SQL
Schema flexibility
  – Changes don't force unload / reload of data
  – Multiple schemas allowed per XML column

DB2 pureXML Detailed Features List
XML data type for columns
  – Stored as native XML, with options for inlining and compression
Language bindings for XML type in programming languages
  – COBOL, C, Java, etc.
XML indexes
An XML schema/DTD repository
  – Support for multiple schemas, schema validation triggers, check constraints, compatible schema evolution
Support for XQuery as a primary language, as well as:
  – Sub-document update (transform function)
  – Support for SQL within XQuery
  – Support for XQuery with SQL
  – Support for new SQL/XML functions
XSLT support
Performance, scale, and everything else you expect from a DBMS

DB2 pureXML Detailed Features List (continued)
XML Import, Export and Load
XML Runstats (with initial optimizer support)
XML type support in stored procedures
XML type supported by HADR
.NET add-in to support DB2 XML type
JDBC/ODBC support for XQuery; JDBC 4.0 (JSR 221) in 9.5
XML type for CLI, Embedded SQL in C++ and COBOL, PHP, Ruby on Rails and Perl
Queue Replication support
Federation support for XML
and more…

Benefits of managing XML data with DB2 9
Lower Development Costs
  – Reduced code and development complexity
  – Improved developer productivity
  → Speed up solution development and gain cost savings
Greater Business Agility
  – Easily accommodate changes to data and schemas
  – Update applications rapidly and reduce maintenance costs
  → Respond quickly to dynamic conditions and get faster time to value
Improved Business Insight
  – Access to "hidden gems" (data) in unexploited documents
  – Unprecedented application performance
  → Gain competitive advantage through better and quicker information

XML Data Needs Relational Maturity
XML Data Needs Protection
  – Backup and recovery features to ensure continuity
  – Data is protected using database security
Simplified XML Data Access
  – Centrally store and access difficult-to-retrieve data
  – SQL or XQuery can be used to retrieve data
  – Join XML data with its related relational data
Search Speed
  – Search documents quickly and efficiently using the proven search optimization engine of a mature database
Optimize Existing Investments
  – Use existing technology infrastructure and skills to store and manage both relational and XML data

XML Usage Scenarios
1. Industry standards and data exchange applications
2. Web services, SOA data transport and message persistence
3. Business object / transaction record
4. Integration of diverse data sources
5. Forms and workflow processing
6. Document storage and querying
7. XML feeds and Web 2.0 syndication
8. Mapping XML in relational applications
9. Better data model for certain types of data
10. Rapid application prototyping and development
… and many more!

XML - the foundation for SOA and Web Services
XML is the transport for messages and data in SOA
XML DBs can provide SOA data services
SOA messages/data often need to be persisted
  – Temporary cache
  – Audit logs
  – Compliance records
  – Insight
[Diagram: service requestor and service provider exchanging XML messages]

XML Transaction Records / Business Objects
Transactions being conducted as XML
  – Within SOA environments
  – Between value chain members
  → Need to store the transaction record and query it later
Many business objects being represented as XML
  – Purchase orders
  – Invoices
  – Insurance policies
  → Need to store XML business objects intact

Integration of Diverse Data Sources
XML database as integration hub
  – XML schema flexibility → integrate data with differing formats
  – XQuery language → excellent for joining different data sources
Integration using SOA environments
  – Service-Oriented Integration (SOI)
[Diagram: applications, services, employee/customer portals, suppliers, distributors, partners and agencies exchanging XML with DB2 9]

Forms and their processing
Forms exist for virtually all types of goods and services
  – Insurance applications, bank loans, tax filings, …
Paper forms are being replaced by electronic forms in XML format
  – IBM Lotus Forms
Store the entire form (XML document) as a whole in an XML database rather than shredding it into relational columns
[Diagram: application form stored as XML in DB2 9 and used for approval, audit, insight and broker status]

XML Feeds and Syndication
Syndication is the heartbeat of Web 2.0
RSS/ATOM feeds are encapsulated as XML
Use an XML database for serving and storing feeds
E.g. stock ticker feeds, inventory feeds, etc.
[Diagram: ATOM/RSS provider and ATOM/RSS reader exchanging XML through web servers backed by DB2 9]

Mapping XML for relational applications
"Shredding" may be OK if:
  – The data is "simple" and the schema is not complicated
  – The schema is stable, with no evolution
  – XML is merely a transport, i.e. the XML structure is not relevant
  – Existing SQL apps have only relational APIs (e.g. BI apps, reporting tools)
  → DB2 Annotated Schema Shredding
Example:
  <product id="129">
    <name>Acme</name>
    <price>12.99</price>
  </product>

  ID   Name  Price
  129  Acme  12.99
  …    …     …

XML as a better data model
XML provides a better data model for many new apps
  – Flexibility, schema versatility, hierarchical nature
Semi-structured or unstructured data
  – E.g. healthcare records, biological data, contracts, insurance claims, etc.
Inherently hierarchical, nested or complex data
  – E.g. manuals, books, catalogs, bills of materials, land records, etc.
Data with changing or evolving schemas
  – E.g. forms, changing industry standard documents, new product versions, etc.
Data with null, multiple or unknown values
  – E.g. phone numbers (home, office, mobile) in patient records, etc.
→ A pureXML database is a natural choice for XML data

pureXML for Rapid Application Prototyping and Development
Represent multiple elements as a single object, e.g. a purchase order
Relational:
  – Many tables: Customer, Product, Shipping, …
  – Normalization
  – Foreign key relationships
  – Insert involves many columns
  – Complex queries with joins
  – Conform to column definition
XML:
  – Single Purchase Order column
  – Easily access individual elements
→ Write less code with pureXML

A DB2 customer experience…
Profile
  – One of Norway's largest providers of insurance and financial services
  – Early adopter of SOA, Web Services and XML
Challenge
  – Improve cost effectiveness, speed time to market, increase product customization
Benefits

  Task                                                   Before                     With DB2
  Development of search & retrieval business processes   CLOB: 8 hrs; Shred: 2 hrs  30 min.
  Add field to schema                                    1 week                     5 min.
  Relative lines of I/O code (65% reduction)             100                        35
  Queries                                                24 - 36 hrs                20 sec - 10 min
  Search preparation                                     Shred: 1 week              ½ day

"Development time using the XML native store is overall radically improved over shredding. Also, shredding often results in complex mappings, which mean that the developer needs deep competence in constructing SQL."
Thore Thomassen, Senior Enterprise Architect
See "Managing XML for Maximum Return," IBM White Paper, www.ibm.com/db2/xml

Proof-of-concept at North American securities firm
Goal: investigate XML databases to simplify and enable SOA
POC objectives
  – Evaluate the loading and querying speed of DB2 v9
  – Use realistic data and queries to ensure valid results
  – Gain experience with XQuery and SQL/XML
Initial expectations
  – Having evaluated some of the other products in the market, we were not very optimistic as to what kind of performance figures to expect
Source: charts presented by securities firm at IOD Conference, Oct. 2006

Results
Majority of POC objectives met with little or no additional tuning, thanks to DB2 9's new autonomic self-tuning and self-managing capabilities
All tests required less than 16 GB of memory
Requirement to load approximately 500,000 XML documents per day achieved in less than 1 hour on DB2 9
All transactional queries completed sub-second (per record)
  – Retrieve XML doc for any specific trade (by trade number)
  – Retrieve all trades for a counterparty (by counterparty_ptynbr)
  – Retrieve all trades by trade create time
  – Retrieve all trades by maturity date range
  – Retrieve trades for a given acquire day range and trade number range
Source: charts presented by securities firm at IOD Conference, Oct. 2006

DB2 9 Architecture
XML integrated in all facets of DB2!
XML developer: "I see a sophisticated XML repository that also supports SQL."
SQL developer: "I see a sophisticated RDBMS that also supports XML."
  – Familiar programming models
  – Familiar tooling
  – New storage model
  – New indexing, optimization
  – New query support
  – Reliability, scalability, high performance
New XML applications benefit from
  • Ability to seamlessly leverage relational investment
  • Proven infrastructure that provides enterprise-class capabilities

DB2 9 pureXML® Storage vs. the Competition
[Comparison chart: columns labeled Pure XML Hybrid, CLOB, XML db, Industry bundles and Shred; rows labeled Information Fidelity, Integration, Schema Flexibility, Performance/Scale, Programming Models and Manageability; individual ratings not reproduced]
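
The two shredding slides ("Shredding: A simple case" and "Shredding: A schema change…") can be illustrated with a small sketch. This is not DB2 functionality: it uses only Python's standard xml.etree.ElementTree module, and the function names shred_department and phones_from_document, as well as the tuple layout of the rows, are assumptions made for illustration. It shows how a fixed mapping with a single PHONE column silently loses the second phone number, while reading the intact document needs no mapping change.

```python
import xml.etree.ElementTree as ET

# The DEPARTMENT document from the "schema change" slide:
# employee 10 now has two PHONE elements.
DEPT_XML = """
<DEPARTMENT deptid="15" deptname="Sales">
  <EMPLOYEE>
    <EMPNO>10</EMPNO>
    <FIRSTNAME>CHRISTINE</FIRSTNAME>
    <LASTNAME>SMITH</LASTNAME>
    <PHONE>408-463-4963</PHONE>
    <PHONE>415-010-1234</PHONE>
    <SALARY>52750.00</SALARY>
  </EMPLOYEE>
  <EMPLOYEE>
    <EMPNO>27</EMPNO>
    <FIRSTNAME>MICHAEL</FIRSTNAME>
    <LASTNAME>THOMPSON</LASTNAME>
    <PHONE>406-463-1234</PHONE>
    <SALARY>41250.00</SALARY>
  </EMPLOYEE>
</DEPARTMENT>
"""


def shred_department(xml_text):
    """Hypothetical fixed mapping: one Employee row with one PHONE column."""
    dept = ET.fromstring(xml_text)
    department_row = (int(dept.get("deptid")), dept.get("deptname"))
    employee_rows = []
    dropped_phones = []  # values the single-column mapping cannot hold
    for emp in dept.findall("EMPLOYEE"):
        empno = int(emp.findtext("EMPNO"))
        phones = [p.text for p in emp.findall("PHONE")]
        employee_rows.append((
            department_row[0],
            empno,
            emp.findtext("FIRSTNAME"),
            emp.findtext("LASTNAME"),
            phones[0] if phones else None,  # only the first phone fits
            float(emp.findtext("SALARY")),
        ))
        dropped_phones.extend((empno, p) for p in phones[1:])
    return department_row, employee_rows, dropped_phones


def phones_from_document(xml_text, empno):
    """Read phones from the intact document; repeated elements just work."""
    dept = ET.fromstring(xml_text)
    for emp in dept.findall("EMPLOYEE"):
        if emp.findtext("EMPNO") == str(empno):
            return [p.text for p in emp.findall("PHONE")]
    return []


department, employees, dropped = shred_department(DEPT_XML)
print("Department row:", department)
for row in employees:
    print("Employee row:", row)
print("Phones lost by the fixed mapping:", dropped)
print("Phones read from the document:", phones_from_document(DEPT_XML, 10))
```

Keeping the second phone number in the relational layout would mean adding the separate Phone table shown on the slide and changing both the mapping and the applications, which is the cost the slide labels "Costly!".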