Search and update Google Base with PHP
Use PHP to process and integrate data from Google Base with a custom Web application

Skill Level: Intermediate
Vikram Vaswani, Founder, Melonfire
09 Feb 2010
© Copyright IBM Corporation 2010. All rights reserved. developerWorks®, ibm.com/developerWorks

Google Base allows users to store any type of content online in Google's version of a massive online database. Web application developers are able to access and search this content through the Google Base Data API. This article introduces the Google Base Data API and demonstrates it in the context of a PHP application, explaining how to use SimpleXML and the Zend_Gdata module to search, retrieve, add, and edit different types of data on Google Base.

Frequently used acronyms
• API: application programming interface
• CSS: Cascading Style Sheets
• DOM: Document Object Model
• HTML: HyperText Markup Language
• HTTP: Hypertext Transfer Protocol
• REST: Representational state transfer
• URL: Uniform Resource Locator
• XML: Extensible Markup Language

Introduction
Most search engines work by crawling the Web, indexing and filtering the content they find into massive databases, and searching these databases to find results matching a particular search query. This crawling/indexing process is performed without human intervention, both to maximize efficiency and to avoid biases in data collation and categorization.

That's why Google Base is so unusual. Launched in late 2005, Google Base is an online database that allows users to directly upload content, tag it with various descriptive attributes, and make this information searchable through the main Google search engine. The system has a set of pre-defined item types (for example, "jobs", "travel", "events" and so on), and it allows users to describe content either using these pre-defined types or by creating new ones.
Any type of data can be uploaded, including such random bits of information as your best friend's name and birthday, when and where the next local "battle of the bands" competition will take place, and which job opportunities are currently available at your workplace.

Of course, providing online storage for information is just one part of the puzzle; the other part is making it searchable and accessible. Content uploaded to Google Base is automatically indexed and made publicly available through the Google search engine. More importantly, Google Base content is also available through the Google Base Data API, allowing application developers to search and retrieve this user-created content and integrate it into custom applications. This API, which follows the REST model, can be accessed through any XML-capable development toolkit, and already has client libraries for many common programming languages, including PHP, .NET, Python and Java™ technology.

This article will introduce you to the Google Base Data API, showing you how to search Google Base for information in a variety of different categories, and integrate and use search results with a custom PHP application. It includes examples of searching for data using various attributes, and adding, updating, and deleting data in the system. Come on in, and get started!

Understanding Google Base feeds
Before you start to develop applications with Google Base, you need to understand how it works. As with all REST-based services, things get rolling with an HTTP request to a designated resource. This HTTP request contains a query with one or more input parameters; the server replies to the query with an Atom feed, suitable for parsing in any XML-aware client.

To see how this works, try accessing the following URL in your favourite Web browser:

http://www.google.com/base/feeds/snippets?bq=product+manager[itemtype:jobs][location:CA]

This request returns a list of entries from Google Base, in this case a list of Product Manager jobs in the state of California. The raw XML response to this request (which you can view in the source code of the resulting page) contains detailed information on each entry, and might look something like Listing 1:

Listing 1. An example Google Base feed

<?xml version='1.0' encoding='UTF-8'?>
<feed xmlns='http://www.w3.org/2005/Atom'
  xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'
  xmlns:gm='http://base.google.com/ns-metadata/1.0'
  xmlns:g='http://base.google.com/ns/1.0'
  xmlns:batch='http://schemas.google.com/gdata/batch'>
  <id>http://www.google.com/base/feeds/snippets</id>
  <updated>2010-01-14T06:50:03.819Z</updated>
  <title type='text'>Items matching query: product manager[itemtype:jobs][location:CA]</title>
  <link rel='alternate' type='text/html' href='http://base.google.com'/>
  <link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml'
    href='http://www.google.com/base/feeds/snippets'/>
  <link rel='http://schemas.google.com/g/2005#batch' type='application/atom+xml'
    href='http://www.google.com/base/feeds/snippets/batch'/>
  <link rel='self' type='application/atom+xml'
    href='http://www.google.com/base/feeds/snippets?start-index=1&amp;max-results=25&amp;bq=product+manager%5Bitemtype%3Ajobs%5D%5Blocation%3ACA%5D'/>
  <link rel='next' type='application/atom+xml'
    href='http://www.google.com/base/feeds/snippets?start-index=26&amp;max-results=25&amp;bq=product+manager%5Bitemtype%3Ajobs%5D%5Blocation%3ACA%5D'/>
  <author>
    <name>Google Inc.</name>
    <email>[email protected]</email>
  </author>
  <generator version='1.0' uri='http://base.google.com'>GoogleBase</generator>
  <openSearch:totalResults>11299</openSearch:totalResults>
  <openSearch:startIndex>1</openSearch:startIndex>
  <openSearch:itemsPerPage>25</openSearch:itemsPerPage>
  <entry>
    <id>http://www.google.com/base/feeds/snippets/6996651848910375403</id>
    <published>2010-01-08T12:57:46.000Z</published>
    <updated>2010-01-13T12:31:03.000Z</updated>
    <category scheme='http://base.google.com/categories/itemtypes' term='Jobs'/>
    <title type='text'>SR. PRODUCT MANAGER: Internet Video Markets</title>
    <content type='html'>As a SR. PRODUCT MANAGER, you're a take-charge person. A born leader. A people person who sees the big picture, as well as the minute details. You generate powerful ideas and know how to get them implemented. If this describes you, you'll want to ...</content>
    <link rel='alternate' type='text/html'
      href='http://www.net-temps.com/job/37zk/USA_5192/sr_product_manager_internet.html?r=goo'/>
    <link rel='self' type='application/atom+xml'
      href='http://www.google.com/base/feeds/snippets/6996651848910375403'/>
    <author>
      <name>Net-Temps</name>
    </author>
    <g:job_function type='text'>Marketing</g:job_function>
    <g:location type='location'>San Francisco, CA, us</g:location>
    <g:employer type='text'>Manpower</g:employer>
    <g:label type='text'>analysis</g:label>
    <g:label type='text'>business</g:label>
    <g:label type='text'>product</g:label>
    <g:label type='text'>access</g:label>
    <g:label type='text'>management</g:label>
    <g:label type='text'>marketing</g:label>
    <g:label type='text'>manager</g:label>
    <g:label type='text'>temporary job</g:label>
    <g:label type='text'>temp</g:label>
    <g:education type='text'>masters</g:education>
    <g:education type='text'>bachelors</g:education>
    <g:item_language type='text'>EN</g:item_language>
    <g:id type='text'>1237852</g:id>
    <g:job_type type='text'>contract</g:job_type>
    <g:job_type type='text'>contractor</g:job_type>
    <g:target_country type='text'>US</g:target_country>
    <g:expiration_date type='dateTime'>2010-01-20T12:31:03Z</g:expiration_date>
    <g:job_industry type='text'>Marketing</g:job_industry>
    <g:customer_id type='int'>1106811</g:customer_id>
    <g:item_type type='text'>Jobs</g:item_type>
  </entry>
  ...
</feed>

Like other Google APIs, the Google Base Data API responds to REST requests with an Atom feed containing the requested data. There are two main feeds of interest: a public feed (the snippets feed), which contains a list of all Google Base items, and a private feed, which contains a list of items uploaded by a specific user. The former is publicly accessible and searchable; the latter is only available to the feed owner after successful authentication. This article will discuss both feeds.

Listing 1 illustrates an example of the snippets feed, generated in response to a search request for Product Manager jobs in California.
This is a standard Atom feed. The outermost <feed> element contains <link> elements with URLs for the current, next, and previous pages of the result set, and <openSearch:> elements with summary statistics for the search. It also encloses one or more <entry> elements, each representing a result item matching the search query. Each entry contains descriptive metadata, including a title, a block of text or HTML content, and a URL for more information. Each <entry> also contains <link> elements, which provide URL links to related information.

Perhaps most importantly, each entry includes a set of attributes (the <g:> namespaced elements) containing additional information relevant to the item type. Google Base offers a number of pre-defined item types, with recommended attributes for each. These attributes are important for two reasons: from a publisher perspective, they make it possible to tag or mark up each entry with additional descriptive information and, from a user perspective, they can be used to filter search results.

To understand this better, look at the URL used to generate Listing 1. The location and itemtype attributes included in the request URL serve as filters, restricting the result set to only those entries that belong to the jobs item type and are tagged with the state code CA for location.

The information requirements for job listings are, of course, very different from the information requirements for, say, event listings or real estate listings. So it's worth pointing out that the list of attributes available differs for each item type supported by Google Base.
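As a rough illustration of the feed-level elements described above, the following sketch reads the openSearch counts and the paging links with SimpleXML. It works on a small inline sample abridged from Listing 1 rather than a live request, and the variable names ($stats, $links, and so on) are illustrative assumptions, not part of any API.

```php
<?php
// Abridged sample feed modeled on Listing 1 (not a live response).
$xml = <<<'XML'
<?xml version='1.0' encoding='UTF-8'?>
<feed xmlns='http://www.w3.org/2005/Atom'
      xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'>
  <title type='text'>Items matching query: product manager</title>
  <link rel='self' type='application/atom+xml'
        href='http://www.google.com/base/feeds/snippets?start-index=1&amp;max-results=25'/>
  <link rel='next' type='application/atom+xml'
        href='http://www.google.com/base/feeds/snippets?start-index=26&amp;max-results=25'/>
  <openSearch:totalResults>11299</openSearch:totalResults>
  <openSearch:startIndex>1</openSearch:startIndex>
  <openSearch:itemsPerPage>25</openSearch:itemsPerPage>
</feed>
XML;

$feed = simplexml_load_string($xml);

// Summary statistics live in the openSearch namespace.
$stats   = $feed->children('http://a9.com/-/spec/opensearchrss/1.0/');
$total   = (int) $stats->totalResults;
$start   = (int) $stats->startIndex;
$perPage = (int) $stats->itemsPerPage;

// Index the feed-level <link> elements by their rel attribute.
$links = array();
foreach ($feed->link as $link) {
    $links[(string) $link['rel']] = (string) $link['href'];
}

// Work out which slice of the result set this page represents.
$last = min($start + $perPage - 1, $total);
printf("Showing %d-%d of %d\n", $start, $last, $total);
echo 'Next page: ' . (isset($links['next']) ? $links['next'] : 'none') . "\n";
```

A paging control could follow the rel="next" URL as-is instead of recomputing start-index by hand, which keeps the client independent of how the server builds its page links.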
The jobs item type, for example, incorporates attributes such as location, employer, education, and salary, while the recipes item type incorporates attributes such as course, ingredients, number of servings and preparation time. The Google Base Data API Reference Guide includes a discussion of how to obtain a complete list of item types and suggested attributes for each (see Resources for a link).

Parsing Google Base feeds with SimpleXML
With this background information out of the way, look at integrating Google Base data with a PHP application. The simplest (pardon the pun) way to do this is with PHP's SimpleXML extension, which provides an object-oriented API to access data encoded in XML. If you prefer, you can also use PHP's DOM or XMLReader extensions to perform the same task.

To illustrate, consider Listing 2, which uses SimpleXML to parse the XML feed in Listing 1 and convert the data encoded within it to a Web page:

Listing 2. Parsing a Google Base feed with SimpleXML

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Retrieving Google Base snippets</title>
    <style type="text/css">
    .title { font-weight: bolder; }
    .attr { margin-left: 15px; }
    .result { margin-bottom: 5px; float: left; width: 450px; margin-left: 20px; }
    </style>
  </head>
  <body>
    <?php
    // define snippet feed URL
    $url = 'http://www.google.com/base/feeds/snippets?bq=product+manager[itemtype:jobs][location:CA]';

    // get feed XML and load into SimpleXML object
    $sxml = simplexml_load_file($url);

    // get summary counts from opensearch: namespace
    $counts = $sxml->children('http://a9.com/-/spec/opensearchrss/1.0/');
    $total = $counts->totalResults;
    ?>
    <h2><?php echo $sxml->title; ?></h2>
    <div id="summary">
      <?php echo $total; ?> result(s) found.
    </div>
    <br/>
    <div id="results">
      <?php $count = 1; ?>
      <?php foreach ($sxml->entry as $entry): ?>
      <?php
      // iterate over attributes
      // get employer, location, industry, education and type
      $attrs = $entry->children('http://base.google.com/ns/1.0');
      $valsEmployer = array();
      $valsLocation = array();
      $valsIndustry = array();
      $valsEducation = array();
      $valsType = array();
      foreach ($attrs as $key => $value) {
        switch ($key) {
          case 'employer':
            $valsEmployer[] = $value;
            break;
          case 'location':
            $valsLocation[] = $value;
            break;
          case 'job_industry':
            $valsIndustry[] = $value;
            break;
          case 'job_type':
            $valsType[] = $value;
            break;
          case 'education':
            $valsEducation[] = $value;
            break;
        }
      }
      ?>
      <div class="result">
        <div class="title">
          <?php echo $count; ?>. <?php echo $entry->title; ?>
        </div>
        <div class="attr">
          Employer: <?php echo implode(', ', $valsEmployer); ?> <br/>
          Location: <?php echo implode(', ', $valsLocation); ?> <br/>
          Industry: <?php echo implode(', ', $valsIndustry); ?> <br/>
          Type: <?php echo implode(', ', $valsType); ?> <br/>
          Education: <?php echo implode(', ', $valsEducation); ?> <br/>
        </div>
      </div>
      <?php $count++; ?>
      <?php endforeach; ?>
    </div>
  </body>
</html>

Figure 1 demonstrates the output you might see.

Figure 1. The results of a Google Base search using SimpleXML

Listing 2 begins by using the simplexml_load_file() function to send a request to the feed URL and convert the response into a SimpleXML object.
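The namespace handling in Listing 2 can be tried offline. The following is a minimal sketch, assuming a single abridged <entry> copied from Listing 1 as inline XML; it groups the g: attributes by name rather than using a switch statement, and escapes values before output. The $values array and its layout are illustrative choices, not something the API prescribes.

```php
<?php
// One abridged <entry> modeled on Listing 1 (not a live response).
$xml = <<<'XML'
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:g='http://base.google.com/ns/1.0'>
  <title type='text'>SR. PRODUCT MANAGER: Internet Video Markets</title>
  <g:location type='location'>San Francisco, CA, us</g:location>
  <g:employer type='text'>Manpower</g:employer>
  <g:education type='text'>masters</g:education>
  <g:education type='text'>bachelors</g:education>
</entry>
XML;

$entry = simplexml_load_string($xml);

// children() with the g: namespace URI returns only the attribute elements.
$attrs = $entry->children('http://base.google.com/ns/1.0');

// Group values by attribute name, because an attribute may repeat.
$values = array();
foreach ($attrs as $name => $node) {
    $values[$name][] = trim((string) $node);
}

// Escape everything that ends up in the HTML page.
echo htmlspecialchars((string) $entry->title) . "\n";
foreach ($values as $name => $list) {
    echo $name . ': ' . htmlspecialchars(implode(', ', $list)) . "\n";
}
```

Grouping by name keeps the loop unchanged when more attributes are displayed, at the cost of collecting attributes the page may never show.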
The script then iterates over the <entry> elements in the response, processing each one in a foreach() loop. For each entry, the attributes are stored as child elements in the g: namespace. SimpleXML's children() method is called with the g: namespace URI to return these elements as a SimpleXMLElement object named $attrs. It now becomes possible to retrieve individual attributes and values from this object using SimpleXML. For example, the job location, which is stored in the <g:location> element of each entry, is accessible as $attrs->location. This information is then combined into a composite HTML page, with some simple CSS styling to make it usable.

Parsing Google Base feeds with Zend_Gdata
A second technique for parsing Google Base feeds involves dropping manual feed parsing with SimpleXML in favour of the Zend Framework's Zend_Gdata client library, which is designed specifically for developers trying to integrate PHP applications with Google Data APIs. You can download the Zend_Gdata library either as part of the Zend Framework or as a stand-alone package (see Resources for a link). It includes a module specifically for working with the Google Base Data API, providing pre-defined classes and methods to simplify data access and authentication.

Not only does this library provide a solid, community-tested codebase for your application, but using it also allows you to focus on core application functions, rather than on the details of navigating XML trees or handling custom namespaces. Listing 3 illustrates the Zend_Gdata client library in action, using it to produce a result equivalent to that of Listing 2:

Listing 3. Parsing a Google Base feed with Zend_Gdata

<?php
// load Zend Gdata libraries
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Gdata_Gbase');
Zend_Loader::loadClass('Zend_Gdata_ClientLogin');

try {
  // initialize service object
  // no authentication needed for public snippets feed
  $service = new Zend_Gdata_Gbase();

  // prepare and execute search query on snippets feed
  $query = $service->newSnippetQuery();
  $query->setBq('product manager[location:CA][itemtype:jobs]');
  $feed = $service->getGbaseSnippetFeed($query);
} catch (Exception $e) {
  die('ERROR:' . $e->getMessage());
}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Retrieving Google Base snippets</title>
    <style type="text/css">
    .title { font-weight: bolder; }
    .attr { margin-left: 15px; }
    .result { margin-bottom: 5px; float: left; width: 450px; margin-left: 20px; }
    </style>
  </head>
  <body>
    <h2><?php echo $feed->title; ?></h2>
    <div id="summary">
      <?php echo $feed->totalResults; ?> result(s) found.
    </div>
    <br/>
    <div id="results">
      <?php $count = 1; ?>
      <?php foreach ($feed as $entry): ?>
      <?php
      // iterate over attributes
      // get employer, location, industry, education and type
      $valsEmployer = array();
      $valsLocation = array();
      $valsIndustry = array();
      $valsType = array();
      $valsEducation = array();
      foreach ($entry->getGbaseAttributes() as $attr) {
        if ($attr->getText()) {
          switch ($attr->getName()) {
            case 'employer':
              $valsEmployer[] = $attr->getText();
              break;
            case 'location':
              $valsLocation[] = $attr->getText();
              break;
            case 'job_industry':
              $valsIndustry[] = $attr->getText();
              break;
            case 'job_type':
              $valsType[] = $attr->getText();
              break;
            case 'education':
              $valsEducation[] = $attr->getText();
              break;
          }
        }
      }
      ?>
      <div class="result">
        <div class="title">
          <?php echo $count; ?>.
          <a href="<?php echo $entry->getLink('alternate')->getHref(); ?>">
            <?php echo $entry->getTitle(); ?>
          </a>
        </div>
        <div class="attr">
          Employer: <?php echo implode(', ', $valsEmployer); ?> <br/>
          Location: <?php echo implode(', ', $valsLocation); ?> <br/>
          Industry: <?php echo implode(', ', $valsIndustry); ?> <br/>
          Type: <?php echo implode(', ', $valsType); ?> <br/>
          Education: <?php echo implode(', ', $valsEducation); ?> <br/>
        </div>
      </div>
      <?php $count++; ?>
      <?php endforeach; ?>
    </div>
  </body>
</html>

Listing 3 first loads the Zend class libraries, and then initializes an instance of the Zend_Gdata_Gbase service class. This class serves as the control point for all subsequent interactions with the Google Base Data API. Since the current plan is to access only the public snippets feed, the service object needs no authentication credentials; however, this will change once you start to work with the user's private feed (later in this article).
The Zend_Gdata_Gbase method that you're most likely to use with the public snippets feed is the getGbaseSnippetFeed() method, which returns a feed of items matching a search query. This method is passed an instance of a configured Zend_Gdata_Gbase_SnippetQuery object, with the query string set through the setBq() method. The response to the getGbaseSnippetFeed() method is an Atom feed similar to the one displayed in Listing 1; this feed is automatically parsed and converted into a Zend_Gdata_Gbase_SnippetFeed object, which can be iterated as a collection of Zend_Gdata_Gbase_SnippetEntry objects, each representing one <entry> in the feed.

The individual attributes of each entry are represented as Zend_Gdata_Gbase_Extension_BaseAttribute objects, each of which exposes getName() and getText() methods. You can obtain a complete collection of these attributes with the getGbaseAttributes() method of the Zend_Gdata_Gbase_SnippetEntry object. It is now quite simple to iterate over this collection, extract the values needed, and display them in an HTML page. Notice also the getLink() method, which returns the link to the third-party URL of each entry.

Figure 2 shows the output of Listing 3.

Figure 2. The results of a Google Base search using Zend_Gdata

With this basic understanding in place, it's quite easy to modify Listing 3 to make it more interactive. Listing 4 demonstrates, adding a search form that can be used to search different item types for matches to user-supplied keywords:

Listing 4. Searching Google Base with user-supplied criteria

<?php
if (isset($_POST['submit'])) {
  // load Zend Gdata libraries
  require_once 'Zend/Loader.php';
  Zend_Loader::loadClass('Zend_Gdata_Gbase');
  Zend_Loader::loadClass('Zend_Gdata_ClientLogin');

  try {
    // initialize service object
    // no authentication needed for public snippets feed
    $service = new Zend_Gdata_Gbase();

    // prepare and execute search query on snippets feed
    $query = $service->newSnippetQuery();
    $queryStr = $_POST['q'] . '[itemtype:' . $_POST['itemtype'] . ']';
    $query->setBq($queryStr);
    $feed = $service->getGbaseSnippetFeed($query);
  } catch (Exception $e) {
    die('ERROR:' . $e->getMessage());
  }
}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Searching Google Base snippets</title>
    <style type="text/css">
    .title { font-weight: bolder; }
    .attr { border-collapse: collapse; margin-top: 3px; }
    .result { margin-bottom: 5px; float: left; width: 450px; margin-left: 10px; }
    </style>
  </head>
  <body>
    <h2>Search</h2>
    <form method="post">
      Search for: <input type="text" name="q" />
      in section:
      <select name="itemtype">
        <option value="events and activities">Events</option>
        <option value="housing">Housing</option>
        <option value="jobs">Jobs</option>
        <option value="news and articles">News</option>
        <option value="personals">Personals</option>
        <option value="recipes">Recipes</option>
        <option value="reviews">Reviews</option>
        <option value="services">Services</option>
      </select>
      <input type="submit" name="submit" value="Search" />
    </form>
    <?php if (isset($feed)): ?>
    <h2><?php echo $feed->title; ?></h2>
    <div id="summary">
      <?php echo $feed->totalResults; ?> result(s) found.
    </div>
    <br/>
    <div id="results">
      <?php $count = 1; ?>
      <?php foreach ($feed as $entry): ?>
      <div class="result">
        <div class="title">
          <?php echo $count; ?>.
          <a href="<?php echo $entry->getLink('alternate')->getHref(); ?>">
            <?php echo $entry->getTitle(); ?>
          </a>
        </div>
        <table border="1" class="attr">
          <?php /* get and display all attributes */ ?>
          <?php /* convert hyperlinks using preg */ ?>
          <?php foreach ($entry->getGbaseAttributes() as $attr): ?>
          <?php if ($attr->getText()): ?>
          <tr>
            <td><?php echo strtoupper($attr->getName()); ?></td>
            <td><?php echo (preg_match('/^(http(s?)):/', $attr->getText())) ?
              '<a href="' . $attr->getText() . '">click here</a>' :
              substr($attr->getText(), 0, 50); ?></td>
          </tr>
          <?php endif; ?>
          <?php endforeach; ?>
        </table>
      </div>
      <?php $count++; ?>
      <?php endforeach; ?>
    </div>
    <?php endif; ?>
  </body>
</html>

This script looks complicated, but have no fear: it's simpler than it appears. It begins by setting up a form with a search input field and an (abridged) list of Google Base item types. Based on the input entered by the user, it then generates a Zend_Gdata_Gbase_SnippetQuery object and passes this to the getGbaseSnippetFeed() method to obtain a result feed matching the input parameters. Each entry in the result feed is then processed, and the attributes for each entry are retrieved through the getGbaseAttributes() method and displayed. Figure 3 demonstrates an example of the output generated by a search for reviews containing the keyword 'wine':

Figure 3. The results of a Google Base search for reviews containing the keyword 'wine'
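Listing 4 assembles its query string by plain concatenation of form input. The sketch below shows one way that step might be made a little more defensive: the hypothetical helpers stripBrackets() and buildBq() (names invented here for illustration) remove square brackets from user input, since brackets delimit attribute filters, and http_build_query() then encodes the result into a snippets feed URL using the start-index and max-results parameters visible in Listing 1. Whether the service needs exactly this sanitization is an assumption.

```php
<?php
// Hypothetical helper: brackets delimit attribute filters in a bq string,
// so drop them from free-form user input.
function stripBrackets($text)
{
    return trim(str_replace(array('[', ']'), ' ', $text));
}

// Hypothetical helper: combine keywords and name => value filters
// into a single bq query string.
function buildBq($keywords, array $filters)
{
    $bq = stripBrackets($keywords);
    foreach ($filters as $name => $value) {
        $value = stripBrackets($value);
        if ($value !== '') {
            $bq .= '[' . $name . ':' . $value . ']';
        }
    }
    return $bq;
}

$bq = buildBq('product manager', array(
    'itemtype' => 'jobs',
    'location' => 'CA',
    'employer' => '',      // empty filters are skipped
));

// http_build_query() percent-encodes brackets, colons and spaces.
$params = array(
    'bq'          => $bq,
    'start-index' => 1,
    'max-results' => 25,
);
$url = 'http://www.google.com/base/feeds/snippets?'
     . http_build_query($params, '', '&');

echo $bq . "\n";
echo $url . "\n";
```

With Zend_Gdata, the string returned by buildBq() could be passed to setBq() in place of the concatenated $queryStr; the URL form is only needed when requesting the feed directly, as in Listing 2.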
Filtering Google Base search results
As you've seen in previous examples, it's possible to filter the search results returned by Google Base simply by using attributes as query filters. These attributes are enclosed in square brackets, as illustrated in the following examples:

• Product manager jobs in California: http://www.google.com/base/feeds/snippets?bq=product+manager[itemtype:jobs][location:CA]
• Recipes containing bacon: http://www.google.com/base/feeds/snippets?bq=[itemtype:recipes][ingredients:bacon]
• Modern art exhibitions in New York City: http://www.google.com/base/feeds/snippets?bq=modern+art[itemtype:events%20and%20activities][location:new%20york]

Apart from this, you can easily customize the API output by adding some of the following parameters to your REST query:

• The start-index parameter, which specifies the start offset for the entries in a feed
• The max-results parameter, which specifies the number of entries in a feed
• The q parameter, which can be used for full-text searches
• The crowdby parameter, which controls how often items with a specified attribute value are repeated
• The orderby parameter, which specifies how to sort results

Listing 5 has examples of these parameters in action:

Listing 5. Filtering Google Base search results

<?php
if (isset($_POST['submit'])) {
  // load Zend Gdata libraries
  require_once 'Zend/Loader.php';
  Zend_Loader::loadClass('Zend_Gdata_Gbase');
  Zend_Loader::loadClass('Zend_Gdata_ClientLogin');

  try {
    // initialize service object
    // no authentication needed for public snippets feed
    $service = new Zend_Gdata_Gbase();

    // prepare and execute search query on snippets feed
    // attach attribute filters
    $query = $service->newSnippetQuery();
    $queryStr = $_POST['q'] . '[itemtype:events and activities]';
    if (!empty($_POST['location'])) {
      $queryStr .= '[location: ' . $_POST['location'] . ']';
    }
    // the form field for the event type is named "type"
    if (!empty($_POST['type'])) {
      $queryStr .= '[event_type: ' . $_POST['type'] . ']';
    }

    // display 20 results per page
    // crowd by content field
    $query->setBq($queryStr);
    $query->setMaxResults(20);
    $query->setCrowdBy('content:2');
    $feed = $service->getGbaseSnippetFeed($query);
  } catch (Exception $e) {
    die('ERROR:' . $e->getMessage());
  }
}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Searching Google Base snippets</title>
    <style type="text/css">
    .title { font-weight: bolder; }
    .attr { border-collapse: collapse; margin-top: 3px; }
    .result { margin-bottom: 5px; float: left; width: 450px; margin-left: 10px; }
    </style>
  </head>
  <body>
    <h2>Search Events</h2>
    <form method="post" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
      Keywords: <input type="text" name="q" />
      Location: <input type="text" name="location" size="10" />
      Event type: <input type="text" name="type" size="10" />
      <input type="submit" name="submit" value="Search" />
    </form>
    <?php if (isset($feed)): ?>
    <h2><?php echo $feed->title; ?></h2>
    <div id="summary">
      <?php echo $feed->totalResults; ?> result(s) found.
    </div>
    <br/>
    <div id="results">
      <?php $count = 1; ?>
      <?php foreach ($feed as $entry): ?>
      <div class="result">
        <div class="title">
          <?php echo $count; ?>.
          <a href="<?php echo $entry->getLink('alternate')->getHref(); ?>">
            <?php echo $entry->getTitle(); ?>
          </a>
        </div>
        <table border="1" class="attr">
          <?php /* get and display all attributes */ ?>
          <?php /* convert hyperlinks using preg */ ?>
          <?php foreach ($entry->getGbaseAttributes() as $attr): ?>
          <?php if ($attr->getText()): ?>
          <tr>
            <td><?php echo strtoupper($attr->getName()); ?></td>
            <td><?php echo (preg_match('/^(http(s?)):/', $attr->getText())) ?
              '<a href="' . $attr->getText() . '">click here</a>' :
              substr($attr->getText(), 0, 50); ?></td>
          </tr>
          <?php endif; ?>
          <?php endforeach; ?>
        </table>
      </div>
      <?php $count++; ?>
      <?php endforeach; ?>
    </div>
    <?php endif; ?>
  </body>
</html>

Listing 5 generates a simple event search form, with fields for event keywords, location, and type. These inputs are then converted into the corresponding attributes and appended to the Google Base query. The setMaxResults() method is used to control the number of results returned in the feed, while the setCrowdBy() method defines the control attribute and value for repetition (in this case, at most two entries with the same content will be allowed). Figure 4 displays an example of a search for modern art exhibitions in New York:

Figure 4. The results of a filtered Google Base event search

Adding items to Google Base
The previous examples have all operated on the public snippets feed. However, as you might remember, every authenticated user also has access to a private feed, which contains the user's own additions to Google Base. The Google Base Data API provides programmatic access to this feed, allowing users to add, edit, and delete items using API calls. The following sections examine this in detail.

First up, adding new items.
This is actually quite simple: to add a new entry, POST an XML-encoded <entry> block to the private feed URL. Listing 6 has an example of one such block:

Listing 6. An example Google Base entry block

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:title type="text">Chicken Tikka Masala</atom:title>
  <atom:content type="text">Cut the chicken into fine pieces. Fry until golden. Add onions, spices and fry for 4-5 minutes and golden. Add chopped tomatoes, curd and seasoning. Allow to simmer for 10 minutes.</atom:content>
  <item_type xmlns="http://base.google.com/ns/1.0" type="text">recipes</item_type>
  <main_ingredient xmlns="http://base.google.com/ns/1.0" type="text">chicken</main_ingredient>
  <servings xmlns="http://base.google.com/ns/1.0" type="int">4</servings>
  <cooking_time xmlns="http://base.google.com/ns/1.0" type="number">30</cooking_time>
  <author xmlns="http://base.google.com/ns/1.0" type="text">Mr. Fantastic Cook</author>
  <ingredients xmlns="http://base.google.com/ns/1.0" type="text">chicken</ingredients>
  <ingredients xmlns="http://base.google.com/ns/1.0" type="text">onions</ingredients>
  <ingredients xmlns="http://base.google.com/ns/1.0" type="text">tomatoes</ingredients>
  <ingredients xmlns="http://base.google.com/ns/1.0" type="text">turmeric</ingredients>
  <ingredients xmlns="http://base.google.com/ns/1.0" type="text">coriander</ingredients>
  <ingredients xmlns="http://base.google.com/ns/1.0" type="text">curd</ingredients>
  <ingredients xmlns="http://base.google.com/ns/1.0" type="text">mustard seeds</ingredients>
</atom:entry>

Because the private feed is, by definition, private, an operation on its data succeeds only if it is authenticated with the feed owner's username and password using one of the two Google-approved authentication methods: AuthSub or ClientLogin. Performing this type of authentication manually is a fairly messy task that requires a fair bit of code to account for the various scenarios that might crop up during a typical authentication transaction. Fortunately, you don't have to worry too much about this: the Zend_Gdata client library handles all the details for you. Consider Listing 7, which illustrates how to add a new item to the private feed:

Listing 7. Adding entries to Google Base

<?php
// load Zend Gdata libraries
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Gdata_Gbase');
Zend_Loader::loadClass('Zend_Gdata_ClientLogin');

// set credentials for ClientLogin authentication
// (placeholder values: substitute the feed owner's account details)
$user = "user@example.com";
$pass = "secret";

try {
  // perform login
  // initialize service object
  $client = Zend_Gdata_ClientLogin::getHttpClient(
    $user, $pass, 'gbase');
  $service = new Zend_Gdata_Gbase($client);

  // initialize new item
  // set title, content and type
  $item = $service->newItemEntry();
  $item->setItemType('recipes');
  $item->title = new Zend_Gdata_App_Extension_Title('Chicken Tikka Masala');
  $item->content = new Zend_Gdata_App_Extension_Content(
    'Cut the chicken into fine pieces. Fry until golden. Add onions, spices and fry for 4-5 minutes and golden. Add chopped tomatoes, curd and seasoning. Allow to simmer for 10 minutes.');

  // set type attributes
  $item->addGbaseAttribute('main_ingredient', 'chicken', 'text');
  $item->addGbaseAttribute('servings', '4', 'int');
  $item->addGbaseAttribute('cooking_time', '30', 'number');
  $item->addGbaseAttribute('author', 'Mr. Fantastic Cook', 'text');
  $item->addGbaseAttribute('ingredients', 'chicken', 'text');
  $item->addGbaseAttribute('ingredients', 'onions', 'text');
  $item->addGbaseAttribute('ingredients', 'tomatoes', 'text');
  $item->addGbaseAttribute('ingredients', 'turmeric', 'text');
  $item->addGbaseAttribute('ingredients', 'coriander', 'text');
  $item->addGbaseAttribute('ingredients', 'curd', 'text');
  $item->addGbaseAttribute('ingredients', 'mustard seeds', 'text');

  // save to server
  $entry = $service->insertGbaseItem($item);

  // display success message
  echo "Entry added successfully with ID: " . $entry->getId();
} catch (Exception $e) {
  die('ERROR:' . $e->getMessage());
}
?>

Listing 7 begins by loading the Zend class libraries, and then initializing an instance of the Zend_Gdata_Gbase service class. Unlike what you've seen earlier, this class now makes use of a Zend_Http_Client object, which is provided with the necessary user authentication information and used to open an authenticated connection to the Google Base service.
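To make the link between Listing 6 and Listing 7 more concrete, the entry block that the client library sends on your behalf can also be assembled by hand. The following is a minimal sketch using PHP's DOM extension; the function name buildBaseEntry() and its parameter layout are hypothetical and are not part of Zend_Gdata or of the article's listings. It only builds the XML string and does not perform the authenticated POST.

```php
<?php
// Hypothetical helper, for illustration only: builds an Atom <entry>
// block shaped like the one in Listing 6.
// $attributes is a list of array(name, value, type) triples.
function buildBaseEntry($title, $content, $itemType, array $attributes)
{
    $atomNs = 'http://www.w3.org/2005/Atom';
    $baseNs = 'http://base.google.com/ns/1.0';

    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;

    // root element in the Atom namespace
    $entry = $doc->createElementNS($atomNs, 'atom:entry');
    $doc->appendChild($entry);

    // title and content are plain-text Atom elements
    foreach (array('title' => $title, 'content' => $content) as $name => $text) {
        $node = $doc->createElementNS($atomNs, 'atom:' . $name);
        $node->setAttribute('type', 'text');
        $node->appendChild($doc->createTextNode($text));
        $entry->appendChild($node);
    }

    // the item type is just another attribute in the Google Base namespace
    array_unshift($attributes, array('item_type', $itemType, 'text'));
    foreach ($attributes as $attr) {
        list($name, $value, $type) = $attr;
        $node = $doc->createElementNS($baseNs, $name);
        $node->setAttribute('type', $type);
        $node->appendChild($doc->createTextNode($value));
        $entry->appendChild($node);
    }

    return $doc->saveXML();
}

// usage: a shortened version of the recipe from Listing 6
$xml = buildBaseEntry(
    'Chicken Tikka Masala',
    'Cut the chicken into fine pieces. Fry until golden.',
    'recipes',
    array(
        array('servings', '4', 'int'),
        array('ingredients', 'chicken', 'text'),
        array('ingredients', 'onions', 'text'),
    )
);
echo $xml;
```

Repeated attributes such as ingredients are simply listed more than once, which mirrors the repeated addGbaseAttribute() calls in Listing 7. In practice, letting the client library build and send the entry remains the simpler route, because it also takes care of authentication.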
Once an authenticated connection is opened, the service object's newItemEntry() method is used to initialize an instance of the Zend_Gdata_Gbase_ItemEntry class, and the setItemType() method of the entry object is used to define the item type. The title and content of the entry are also set, as instances of the Zend_Gdata_App_Extension_Title and Zend_Gdata_App_Extension_Content classes, and individual attributes are assigned through the entry object's addGbaseAttribute() method. Once the entry is complete, it is posted to the Google Base servers through a call to the service object's insertGbaseItem() method.

Once the entry has been successfully posted, you should be presented with a result page that contains the new entry ID. Figure 5 has an example of one such result page:

Figure 5. The result of successfully adding an item to Google Base

The entry will now also appear in your Google Base account. Figure 6 has an example of what you might see.

Figure 6. The newly-added item, shown in the Google Base interface

Editing and deleting items on Google Base

The Google Base Data API also permits entry editing and deletion. To delete an entry, send a DELETE request to the entry URL, which is the URL specified in the entry's <link rel="self" ...> element. In the Zend library context, you can pass this URL to the service object's getGbaseItemEntry() method to retrieve the entry, and then call the resulting Zend_Gdata_Gbase_ItemEntry object's delete() method, as in Listing 8:

Listing 8. Deleting entries from Google Base

<?php
// load Zend Gdata libraries
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Gdata_Gbase');
Zend_Loader::loadClass('Zend_Gdata_ClientLogin');

// set credentials for ClientLogin authentication
// (placeholder values: substitute the feed owner's account details)
$user = "user@example.com";
$pass = "secret";

try {
  // perform login
  // initialize service object
  $client = Zend_Gdata_ClientLogin::getHttpClient(
    $user, $pass, 'gbase');
  $service = new Zend_Gdata_Gbase($client);

  // get and delete entry
  $id = 'http://www.google.com/base/feeds/items/1256392227904491772';
  $entry = $service->getGbaseItemEntry($id);
  $entry->delete();

  // display success message
  echo "Entry deleted successfully with ID: " . $entry->getId();
} catch (Exception $e) {
  die('ERROR:' . $e->getMessage());
}
?>

In a similar vein, to edit an entry, use the getGbaseItemEntry() method to retrieve the entry using its unique URL, change the values you wish to update, and then save the entry back to the server using the Zend_Gdata_Gbase_ItemEntry object's save() method, which sends a PUT request to the URL specified in the entry's <link rel="self" ...> element. Listing 9 illustrates the process:

Listing 9. Updating entries on Google Base

<?php
// load Zend Gdata libraries
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Gdata_Gbase');
Zend_Loader::loadClass('Zend_Gdata_ClientLogin');

// set credentials for ClientLogin authentication
// (placeholder values: substitute the feed owner's account details)
$user = "user@example.com";
$pass = "secret";

try {
  // perform login
  // initialize service object
  $client = Zend_Gdata_ClientLogin::getHttpClient(
    $user, $pass, 'gbase');
  $service = new Zend_Gdata_Gbase($client);

  // get entry
  $id = 'http://www.google.com/base/feeds/items/11812821492318418471';
  $item = $service->getGbaseItemEntry($id);

  // set new title
  $item->title = new Zend_Gdata_App_Extension_Title('Spaghetti Amatriciana');

  // set new content
  $item->content = new Zend_Gdata_App_Extension_Content('Cut the bacon into thin strips and fry. Chop the onions and chillies and fry. Add the bacon and tomato puree to the mixture. Simmer for 10 minutes, then drain the pasta and mix.');

  // remove existing attributes
  // set new ones
  foreach ($item->getGbaseAttributes() as $attr) {
    $item->removeGbaseAttribute($attr);
  }
  $item->setItemType('recipes');
  $item->addGbaseAttribute('main_ingredient', 'pasta', 'text');
  $item->addGbaseAttribute('servings', '4', 'int');
  $item->addGbaseAttribute('cooking_time', '30', 'number');
  $item->addGbaseAttribute('author', 'Mr. Fantastic Cook', 'text');
  $item->addGbaseAttribute('ingredients', 'bacon', 'text');
  $item->addGbaseAttribute('ingredients', 'onions', 'text');
  $item->addGbaseAttribute('ingredients', 'tomatoes', 'text');
  $item->addGbaseAttribute('ingredients', 'green chillies', 'text');
  $item->addGbaseAttribute('ingredients', 'pasta', 'text');
  $item->addGbaseAttribute('ingredients', 'tomato puree', 'text');

  // save changes to server
  $item->save();

  // display success message
  echo "Entry updated successfully with ID: " . $item->getId();
} catch (Exception $e) {
  die('ERROR:' . $e->getMessage());
}
?>

In Listing 9, the entry is first retrieved using its unique ID, and assigned a new title and content. The existing attributes are removed with the removeGbaseAttribute() method, and a new set of attributes is assigned. The resulting entry object is then saved back to Google Base with the same ID.

A simple application

As the previous listings illustrate, it is quite easy to add, delete, and update items on Google Base using the Google Base Data API.
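Underneath the library calls, the three operations described so far come down to three HTTP requests: a POST to the private feed URL to add an entry, and a PUT or DELETE to the entry's own URL to update or remove it. The sketch below only summarizes that mapping; describeBaseRequest() is a hypothetical helper rather than a Zend_Gdata function, and the feed URL constant is an assumption inferred from the entry IDs in Listings 8 and 9, so check it against the Developer's Guide before relying on it.

```php
<?php
// Assumption: the private items feed URL, inferred from the entry IDs
// used in Listings 8 and 9.
define('BASE_ITEMS_FEED', 'http://www.google.com/base/feeds/items');

// Hypothetical helper: returns the HTTP method and target URL that
// correspond to an add, update, or delete operation.
function describeBaseRequest($operation, $entryUrl = null)
{
    switch ($operation) {
        case 'add':
            // new entries are POSTed to the feed itself
            return array('method' => 'POST', 'url' => BASE_ITEMS_FEED);
        case 'update':
        case 'delete':
            // existing entries are addressed through their own URL
            if (empty($entryUrl)) {
                throw new InvalidArgumentException(
                    "The '$operation' operation needs the entry URL");
            }
            $method = ($operation == 'update') ? 'PUT' : 'DELETE';
            return array('method' => $method, 'url' => $entryUrl);
        default:
            throw new InvalidArgumentException("Unknown operation: $operation");
    }
}

// usage: the request behind Listing 8
$request = describeBaseRequest(
    'delete', BASE_ITEMS_FEED . '/1256392227904491772');
echo $request['method'] . ' ' . $request['url'] . "\n";
```

Every one of these requests must still carry the authentication details obtained through ClientLogin or AuthSub, which is the part the client library handles for you.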
This section builds a simple PHP application around these functions, allowing users to add and delete event listings using the Google Base Data API. Listing 10, which is based on Listing 7, creates a form to add new event listings and then submits the form data for addition to Google Base:

Listing 10. Adding event entries to a user's private Google Base feed

<?php
if (isset($_POST['submit'])) {
  // load Zend Gdata libraries
  require_once 'Zend/Loader.php';
  Zend_Loader::loadClass('Zend_Gdata_Gbase');
  Zend_Loader::loadClass('Zend_Gdata_ClientLogin');

  // set credentials for ClientLogin authentication
  // (placeholder values: substitute the feed owner's account details)
  $user = "user@example.com";
  $pass = "secret";

  try {
    // perform login
    // initialize service object
    $client = Zend_Gdata_ClientLogin::getHttpClient(
      $user, $pass, 'gbase');
    $service = new Zend_Gdata_Gbase($client);

    //
    // perform input validation here
    // omitted for clarity
    //

    // initialize new item
    // set title, content and type
    $item = $service->newItemEntry();
    $item->setItemType('events and activities');
    $item->title = new Zend_Gdata_App_Extension_Title($_POST['title']);
    $item->content = new Zend_Gdata_App_Extension_Content($_POST['content']);

    // set type attributes
    foreach ($_POST['attr'] as $key => $value) {
      $item->addGbaseAttribute($key, $value);
    }

    // set date range
    $start = date('c', strtotime($_POST['start']));
    $item->addGbaseAttribute('event_date_range', $start);

    // save to server
    $entry = $service->insertGbaseItem($item);

    // display success message
    echo "Entry added successfully with ID: " . $entry->getId();
    echo '<br/>';
    echo '<a href="list.php">Back to list</a>';
  } catch (Exception $e) {
    die('ERROR:' . $e->getMessage());
  }
} else {
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Adding events to Google Base</title>
  </head>
  <body>
    <h2>Add Event</h2>
    <form method="post" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
      <p>
        Title: <br/>
        <input type="text" name="title" />
      </p>
      <p>
        Description: <br/>
        <textarea name="content"></textarea>
      </p>
      <p>
        Venue: <br/>
        <input type="text" name="attr[venue]" />
      </p>
      <p>
        Address: <br/>
        <input type="text" name="attr[location]" />
      </p>
      <p>
        Type: <br/>
        <input type="text" name="attr[event_type]" />
      </p>
      <p>
        Start date/time (dd-mm-yyyy hh:mm): <br/>
        <input type="text" name="start" />
      </p>
      <input type="submit" name="submit" value="Submit" />
    </form>
  </body>
</html>
<?php
}
?>

Listing 11 contains the code to query the user's private feed and retrieve all the entries stored within it. Event entries are displayed in a list, together with venue and date information. Each entry also has a Delete link, which points to the script in Listing 12:

Listing 11. Listing event entries in a user's private Google Base feed

<?php
// load Zend Gdata libraries
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Gdata_Gbase');
Zend_Loader::loadClass('Zend_Gdata_ClientLogin');

// set credentials for ClientLogin authentication
// (placeholder values: substitute the feed owner's account details)
$user = "user@example.com";
$pass = "secret";

try {
  // perform login
  // initialize service object
  $client = Zend_Gdata_ClientLogin::getHttpClient(
    $user, $pass, 'gbase');
  $service = new Zend_Gdata_Gbase($client);

  // get item feed
  $feed = $service->getGbaseItemFeed();
} catch (Exception $e) {
  die('ERROR:' . $e->getMessage());
}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Listing events on Google Base</title>
    <style type="text/css">
    .title {
      font-weight: bolder;
    }
    .attr {
      margin-left: 15px;
    }
    .result {
      margin-bottom: 5px;
      float: left;
      width: 450px;
      margin-left: 20px;
    }
    </style>
  </head>
  <body>
    <h2>List Events</h2>
    <a href="add.php">Add a new event</a>
    <br/> <br/>
    <div id="results">
    <?php $count = 1; ?>
    <?php foreach ($feed as $entry): ?>
      <?php if ($entry->getItemType() == 'events and activities'): ?>
      <?php
      // reset per-entry values so that data from the
      // previous entry does not carry over
      $location = $venue = $date = $type = null;

      // iterate over attributes
      // get required data
      foreach ($entry->getGbaseAttributes() as $attr) {
        if ($attr->getText()) {
          switch ($attr->getName()) {
            case 'location':
              $location = $attr->getText();
              break;
            case 'venue':
              $venue = $attr->getText();
              break;
            case 'event_date_range':
              $date = $attr->getText();
              break;
            case 'event_type':
              $type = $attr->getText();
              break;
          }
        }
      }
      ?>
      <div class="result">
        <div class="title">
          <?php echo $count; ?>. <?php echo $entry->getTitle(); ?>
          (<a href="delete.php?url=<?php echo urlencode($entry->getSelfLink()->getHref()); ?>">delete</a>)
        </div>
        <div class="attr">
          Description: <?php echo $entry->getContent(); ?> <br/>
          Venue: <?php echo (!empty($venue)) ? $venue : 'Unspecified'; ?> <br/>
          Address: <?php echo (!empty($location)) ? $location : 'Unspecified'; ?> <br/>
          Type: <?php echo (!empty($type)) ? $type : 'Unspecified'; ?> <br/>
          Date/time: <?php echo (!empty($date)) ? date('d-m-Y H:i', strtotime($date)) : 'Unspecified'; ?> <br/>
        </div>
      </div>
      <?php $count++; ?>
      <?php endif; ?>
    <?php endforeach; ?>
    </div>
  </body>
</html>

Listing 12 accepts an entry's URL and uses the technique shown earlier in Listing 8 to delete it from Google Base:

Listing 12. Deleting event entries from a user's private Google Base feed

<?php
// load Zend Gdata libraries
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Gdata_Gbase');
Zend_Loader::loadClass('Zend_Gdata_ClientLogin');

// set credentials for ClientLogin authentication
// (placeholder values: substitute the feed owner's account details)
$user = "user@example.com";
$pass = "secret";

try {
  // perform login
  // initialize service object
  $client = Zend_Gdata_ClientLogin::getHttpClient(
    $user, $pass, 'gbase');
  $service = new Zend_Gdata_Gbase($client);

  // get and delete entry
  // (the entry URL is passed in by the Delete link in Listing 11)
  $id = $_GET['url'];
  $entry = $service->getGbaseItemEntry($id);
  $entry->delete();

  // display success message
  echo "Entry deleted successfully with ID: " . $entry->getId();
  echo '<br/>';
  echo '<a href="list.php">Back to list</a>';
} catch (Exception $e) {
  die('ERROR:' . $e->getMessage());
}
?>

Conclusion

Over the last few pages, you got a crash course in how to integrate data from the Google Base Data API into a PHP application using a combination of SimpleXML and the Zend client library. The examples in this article:

• Introduced you to the two main Google Base feeds
• Explained how Google Base classifies data using item types and attributes
• Showed you how to retrieve Google Base content using a variety of different filters
• Illustrated how to programmatically add, modify and delete content
• Built a customized interface to a user's Google Base data

As these examples illustrate, the Google Base Data API is a powerful and flexible tool for developers ready to build creative new applications around content aggregation and search. Play with it, and see what you think!
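One loose end from the sample application is worth a short note: Listing 10 marks input validation as omitted for clarity, and Listing 12 uses the url request parameter exactly as received. The sketch below shows one way such checks might look. Both functions are hypothetical helpers, not part of Zend_Gdata, and the URL pattern is an assumption based only on the shape of the entry IDs in Listings 8 and 9 (a fixed prefix followed by a numeric ID).

```php
<?php
// Hypothetical helper: accept only URLs shaped like the entry IDs in
// Listings 8 and 9. Assumption: item IDs are purely numeric.
function isBaseItemUrl($url)
{
    return is_string($url)
        && preg_match('#^http://www\.google\.com/base/feeds/items/\d+\z#', $url) === 1;
}

// Hypothetical helper: build the href for the Delete link in list.php,
// URL-encoding the entry URL and HTML-escaping the result so it is
// safe to place inside an attribute value.
function deleteLinkHref($selfHref)
{
    return htmlspecialchars(
        'delete.php?url=' . urlencode($selfHref), ENT_QUOTES, 'UTF-8');
}

// usage in delete.php: refuse anything that is not an item URL
$candidate = isset($_GET['url']) ? $_GET['url'] : null;
if (!isBaseItemUrl($candidate)) {
    echo "No valid entry URL supplied\n";
}
```

In delete.php, a check like this would run before the call to getGbaseItemEntry(), so that the script only ever contacts URLs that look like Google Base item entries.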
Resources

Learn
• The Developer's Guide and Reference Guide: Learn more about the Google Base Data API.
• Google Data API authentication: Learn more about the two types.
• The Google Base blog: Track Google Base news.
• Google Account: Register for your own account and get started.
• Google Base API Developer's Forum: Read and join discussions about Google Base API development.
• The Zend_Gdata_Gbase library: Read more about the Zend Framework and how to use it with the Google Base Data API.
• More articles by this author (Vikram Vaswani, developerWorks, August 2007-current): Read articles about XML, additional Google APIs and other technologies.
• XML area on developerWorks: Get the resources you need to advance your skills in the XML arena.
• IBM XML certification: Find out how you can become an IBM-Certified Developer in XML and related technologies.
• XML technical library: See the developerWorks XML Zone for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks.
• developerWorks technical events and webcasts: Stay current with technology in these sessions.
• developerWorks podcasts: Listen to interesting interviews and discussions for software developers.

Get products and technologies
• The Zend Gdata Client Library: Download and get everything you need to access Google Base Data APIs.
• IBM product evaluation versions: Download or explore the online trials in the IBM SOA Sandbox and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

Discuss
• XML zone discussion forums: Participate in any of several XML-related discussions.
• developerWorks blogs: Check out these blogs and get involved.

About the author

Vikram Vaswani
Vikram Vaswani is the founder and CEO of Melonfire, a consulting services firm with special expertise in open-source tools and technologies. He is also the author of the books PHP Programming Solutions and PHP: A Beginner's Guide.

Trademarks

IBM, the IBM logo, ibm.com, DB2, developerWorks, Lotus, Rational, Tivoli, and WebSphere are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. See the current list of IBM trademarks. Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others.