From WorldCat Developers' Network

WorldCat Identities Widget

Background

At the OCLC Mashathon held in May 2009 in Amsterdam, a group of developers discussed ways in which WorldCat APIs (also known as OCLC Grid Services) can be used to extend access from library catalog interfaces. We quickly focused on the use of the WorldCat Identities API to find related authors, books by and about, subjects, and other facets of information associated with an author and a work.

The group was also interested in developing an application whose output could be easily incorporated into a library catalog user interface. We found that typically only a portion of the XML returned for an Identities record would be needed (just the works by an author, or just the subjects associated with the author), and that some other presentation of the XML (HTML formatted for the UI, or JSON for rendering by local JavaScript) would be important.

So the group set out to build a working prototype of a "widget" application: one that could be called with a few values known to the local catalog system and that would return an easily incorporated bit of additional information.

A quick way to find an Identities record is to send a request that includes a last name and the OCLC number of a record with which that person is associated. Library catalogs can easily generate last names, but not all will have an OCLC number handy for a record for that person in WorldCat. Some of the participants in the group had PPNs instead of OCLC numbers. So we began by building a widget that, given a PPN, calls a web service (the PICA ppn2ext service used in the code below) that can return a related OCLC number.

Next we extended the widget so that, given a PPN and a last name, it would get the related OCLC number, formulate a request to the WorldCat Identities API, and return a matching record's XML, if found. Since Identities records can include several separate "facets" of information, each in its own XML element, we added a widget URL parameter so that a particular facet could be extracted and returned.
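As a rough illustration of the Identities request described above, the sketch below builds the request URL from a family name and an OCLC number. The base URL and the query index names (local.FamilyName, local.OCLCNumber) are taken from the prototype code on this page; the helper name identities_url is hypothetical and is not part of the prototype.

```php
<?php
// Illustrative helper (hypothetical name, not part of the prototype):
// build an Identities request URL from a family name and an OCLC number,
// using the base URL and query form that appear in lookup.php.
function identities_url($last, $oclc) {
    $base = "http://worldcat.org/identities/search/Identities?";
    $query = 'local.FamilyName = "' . $last . '" and local.OCLCNumber = "' . $oclc . '"';
    // urlencode turns spaces into "+", "=" into "%3D", and quotes into "%22"
    return $base . "query=" . urlencode($query);
}

echo identities_url("marx", "49699580"), "\n";
```

Encoding the whole query string in one step, as here, is a design choice for the sketch; the prototype instead encodes the last name separately and writes the remaining escapes by hand.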

We then added an extension to the widget so that it could accept an OCLC number directly, for those with a number handy. And we further extended it to work with systems that had neither a PPN nor an OCLC number. If given a title string along with the last name, the widget formulates a WorldCat Search API URL (including a key provided by the widget implementer) to issue an author/title search in WorldCat, and if there are matches it extracts the OCLC identifier of the first matching work to use in an Identities API request.
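The order in which the widget falls back from one identifier to the next can be summarized in a few lines. This is only a sketch of the decision described above, assuming empty strings stand for parameters that were not supplied; the function name choose_lookup is hypothetical.

```php
<?php
// Illustrative helper (hypothetical name): report which identifier would be
// used to obtain an OCLC number, in the order described in the text above.
function choose_lookup($oclc, $ppn, $title) {
    if ($oclc !== "") {
        return "oclc";   // an OCLC number was supplied, use it directly
    }
    if ($ppn !== "") {
        return "ppn";    // translate the PPN to an OCLC number
    }
    if ($title !== "") {
        return "title";  // author/title search via the WorldCat Search API
    }
    return null;         // no work identifier was supplied
}

var_dump(choose_lookup("", "304032387", ""));
```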

Prototype Widget

The prototype built during the Mashathon is available for evaluation. This implementation is not intended for high-level use in a production environment.

Sample links for the prototype widget:

Installation Notes

The PHP code used in the working prototype is shown below, along with installation notes.

To use the following PHP code, install it in a directory on your PHP server. Also download and install the xml2json PHP application, and modify the "require_once" statement in the PHP code below as needed for your installation. As written, the code assumes that xml2json.php is installed in the same directory as the prototype PHP code, with a subdirectory named "json" that includes the xml2json JSON.php code and license.

The PHP code accesses API keys on the PHP server through a scheme that is unique to this installation. We'd expect an implementer to replace the key lookup in this code with a similar method in use in the implementer's own environment.
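One possible replacement for the key lookup, shown here only as a sketch, is to read the key from an environment variable set on the server. The function name get_appid_from_env and the variable naming scheme (for example WORLDCAT_WSKEY) are assumptions made for this example, not part of the prototype.

```php
<?php
// Illustrative alternative to get_appid (hypothetical name and naming scheme):
// look up a web service key in an environment variable such as WORLDCAT_WSKEY.
function get_appid_from_env($type) {
    $name = strtoupper($type) . "_WSKEY"; // e.g. "worldcat" -> "WORLDCAT_WSKEY"
    $value = getenv($name);               // returns false when the variable is unset
    return ($value === false) ? "" : $value;
}

// Usage in place of the config file lookup:
$worldcatKey = get_appid_from_env("worldcat");
```

Like get_appid, the sketch returns an empty string when no key is found, so the rest of the script would not need to change.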

Widget URL Parameters

  • ppn -- A PPN number to identify a bibliographic record (required if title or oclc not provided)
  • oclc -- An OCLC number to identify a bibliographic record (required if ppn or title not provided)
  • last -- An author's family name (required)
  • facet -- A WorldCat Identities facet. If not supplied, the complete Identities record is returned. Parameter values include:
    • about -- citations of works about the identity
    • audLevel -- OCLC audience level calculations for works by the identity
    • authorityInfo -- name authority file entries for the identity
    • biogSHs -- biographical subject headings for the identity
    • by -- citations of works by the identity
    • fastHeadings -- FAST subject headings for works by the identity
    • genres -- genre terms for works by the identity
  • title -- A title of a bibliographic record (required if ppn or oclc is not provided)
  • wrapper -- An XML node name to enclose data returned from Identities (optional; if not provided, "facets" is used)
  • api -- Response format (optional; specify "js" to return a JSON object assigned to the JavaScript variable _WCIinfo; the default is XML)
  • callback -- A callback function name (optional; if provided, JSON is returned as the argument of a call to that function)
  • xslt -- A URL for an XSLT stylesheet to embed within XML responses (optional; e.g., to reformat as HTML)
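The parameters above can be combined into a request URL along the following lines. This is a minimal sketch: the host in $widgetBase is a placeholder for wherever lookup.php is installed, and build_widget_url is a hypothetical helper rather than part of the prototype.

```php
<?php
// Placeholder location; replace with the URL where lookup.php is installed.
$widgetBase = "http://example.org/widgets/identities/lookup.php";

// Illustrative helper (hypothetical name): assemble a widget request URL
// from an array of parameter names and values.
function build_widget_url($base, array $params) {
    // drop empty values so only supplied parameters appear in the query string
    $params = array_filter($params, function ($v) {
        return $v !== null && $v !== "";
    });
    // http_build_query URL-encodes each value and joins the pairs with "&"
    return $base . "?" . http_build_query($params);
}

echo build_widget_url($widgetBase, array(
    "oclc"  => "49699580",
    "last"  => "marx",
    "facet" => "fastHeadings",
)), "\n";
```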

PHP Code

<?php

// lookup.php
//
// Given a ppn or oclc accession number and a last name, or an author/title, and given an identities "facet", 
// return a matching facet in XML, XML with an XSLT stylesheet, or as JSON in a callback function
//
// Examples:
//
// ppn and last name returning associatedNames facet in XML: 
// lookup.php?ppn=304032387&last=telemans&facet=associatedNames
//
// ppn and last name returning associatedNames facet in JSON using "ws_result" as the callback function name: 
// lookup.php?ppn=304032387&last=telemans&facet=associatedNames&api=js&callback=ws_result
//
// oclc number and last name returning fastHeadings facet in XML: 
// lookup.php?oclc=49699580&last=marx&facet=fastHeadings
//
// title and last name returning about facet in XML with an embedded XSLT stylesheet to convert to HTML: 
// lookup.php?title=communist+manifesto&last=marx&facet=about&xslt=http://worldcatdoor.org/widgets/identities/citation.xsl


// EXTERNAL PHP SCRIPT REQUIREMENTS
//
// xml2json.php:
// converts XML to JSON, available for download from http://www.ibm.com/developerworks/xml/library/x-xml2jsonphp/
require_once("xml2json.php");


// SCRIPT FUNCTIONS
//
// function get_url 
// Given a URL and a timeout, use PHP cUrl to get the URL contents and return them (if provided), 
// the HTTP status, and an error code and message (if provided)
// Function results are returned in an array with the element names contents, status, error and errortext
// e.g., get_url("http://ppwww.pica.nl/psi_ppn2ext/ppn2ext.php?PPN=304032387","8")
function get_url($url,$timeout) {
	$contents = "";
	$c = curl_init();
	curl_setopt ($c, CURLOPT_URL, $url);
	curl_setopt ($c, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt ($c, CURLOPT_FOLLOWLOCATION, true);
	curl_setopt ($c, CURLOPT_CONNECTTIMEOUT, $timeout);
	$contents = curl_exec($c);
	$status = curl_getinfo($c,CURLINFO_HTTP_CODE);
	$error = curl_errno($c);
	$errortext = curl_error($c);
	$arr = array('contents'=>$contents,'status'=>$status,'error'=>$error,'errortext'=>$errortext);
	return $arr;
} // end get_url

// function get_appid
// Given the name of a webservice, look up its application ID (aka Key) from a file elsewhere on the host
// The config file for this function is expected to take the following form.  For an alternate form, modify the function.
// <settings>
//    <keys>
//    	<worldcat>ugUu6hJ5Dxwem6wlGj6YJB773XpKjqtwZBfMP8vnoW1sGDyELVPKv79huxNdHITQu4mR39V9HzXOrXwV</worldcat>
//    	<yahoo>3H2L2sJV34WR5BXeMVm8yHVE5sWe2drk4TK0YGmdR3CTvDr7f1DErqHwxwhcYHWA1A--</yahoo>
//    </keys>
// </settings>
// e.g., for a worldcat key in the file config.xml in the path /props: get_appid("worldcat","props","config.xml")
function get_appid($type,$path,$config) {
	$id = ""; // set the return key to an empty string
	$home_path = $_SERVER['DOCUMENT_ROOT'];  // get the web home path
	$filename = $home_path."/".$path."/".$config; // set the config file path and name
	$config =  simplexml_load_file($filename); // load the config file
	$children =  $config->children();
	$keys = $children->keys;
	foreach ($keys as $key) { // check all keys, set the id to value of an element matching the key type
	  $details = $key->children();
	  $id = $details->$type; 
	}	
	return $id;
} // end get_appid

// function get_param
// given a script parameter name, get it from $_GET, and process it to handle html entities and leading/trailing whitespace
function get_param($param) {
	$val = "";
	if (!empty($_GET[$param])) { // !empty() avoids an undefined-index notice when the parameter is absent
		$val = trim($_GET[$param]);
		$val = htmlentities($val);
	}
	return $val;
} // end get_param

// function error_report
// report errors for API calls and other processes within the script
function error_report($msg,$status,$error,$errortext,$api,$callback) {
	if ($callback) {
		// return an empty JSON object in a callback function
		header ('Content-type: text/plain; charset=utf-8');
		echo $callback."({});";
	} else if ($api == "js") {
		// return an empty JSON object assigned to the JavaScript variable _WCIinfo
		header ('Content-type: text/plain; charset=utf-8');
		echo "var _WCIinfo = {};\n";
	} else {
		header ('Content-type: text/plain; charset=utf-8');
		echo $msg."\n";
		if ($status) {
			echo "HTTP status: ".$status."\n";
		}
		if ($error) {
			echo "Error code: ".$error."\n";
		}
		if ($errortext) {
			echo "Error text: ".$errortext."\n";
		}
	}
	exit;
} // end error_report


// GLOBAL SCRIPT VARIABLES

// defaults
$timeout = "8"; 			// default timeout for cUrl requests
$api = "xml"; 				// default api response is xml
$wrapper = "facets"; 	// an outer XML node name to contain XML returned from WorldCat Identities

// get cleaned up request parameters
$ppn = get_param("ppn");						// ppn number
$oclc = get_param("oclc");					// oclc accession number
$last = get_param("last");					// author last name
$title = get_param("title");				// title
$facet = get_param("facet");				// identities facet
$callback = get_param("callback");	// json target callback function
if ($callback) {
	$api = "js";
}
$xslt = get_param("xslt");					// xslt stylesheet URL
if (get_param("wrapper")) {
	$wrapper = get_param("wrapper");
}
if (get_param("api")) {
	$api = get_param("api");
}

// modify last name and title for use in identities web service URLs
$last = urlencode($last);
$title = urlencode($title);

// lookup API keys
$worldcatKey = get_appid("worldcat","props","config.xml");

// set API root URLs
$ppn2ext = "http://ppwww.pica.nl/psi_ppn2ext/ppn2ext.php?"; // web service to convert PPNs to OCLC numbers
$worldcatSearchAPI = "http://worldcat.org/webservices/catalog/search/sru?"; // WorldCat Search API
$worldcatIdentities = "http://worldcat.org/identities/search/Identities?"; // WorldCat Identities API

// MAIN 

// Obtain an OCLC number from a PPN or an author/title, if one is not provided
if (!$oclc) {
	// no oclc parameter
	if ($ppn) {
		// a ppn was provided so translate it to an oclc number
		$ppn2oclc = $ppn2ext."PPN=".$ppn;
		$xml = get_url($ppn2oclc,$timeout);
		if ($xml['status'] == "200") {
			// PICA ppn2ext web service response reports success
			$x = new SimpleXMLElement($xml['contents']);
			$oclc = $x->oclc;
		} else {
			// PICA ppn2ext web service response reports an error
			error_report("Pica PPN2EXT web service error",$xml['status'],$xml['error'],$xml['errortext'],$api,$callback);
		}
	} else {
		// a ppn was not provided so look up an oclc number by a WorldCat Search API author/title search
		if ($title) {
			$apisearch = $worldcatSearchAPI."query=srw.au+all+%22".$last."%22+and+srw.ti+exact+%22".$title."%22&version=1.1&operation=searchRetrieve&wskey=".$worldcatKey."&recordSchema=info%3Asrw%2Fschema%2F1%2Fdc&maximumRecords=10&startRecord=1&recordPacking=xml&servicelevel=default&sortKeys=relevance&resultSetTTL=300&recordXPath=";
			$xml = get_url($apisearch,$timeout);
			if ($xml['status'] == "200") {
				// OCLC WorldCat Search API response reports success
				// Look inside first record for an oclcterms:recordIdentifier value without an attribute, that's the OCLC record ID
				$x = simplexml_load_string($xml['contents']);	
				foreach ($x->records->record as $record) {
				  $oclcterms = $record->recordData->oclcdcs->children('http://purl.org/oclc/terms/');
					foreach ($oclcterms->recordIdentifier as $id) {
						$attrs = $id->attributes('http://www.w3.org/2001/XMLSchema-instance');
						if (strlen((string) $attrs) == 0) { // look for oclc identifiers, which do not have an xsi:type attribute (lccns do)
							$oclc = (string) $id;
							break; 
						} // end if
					}	// end foreach
					break;
				}	// end foreach	
				$oclc = trim($oclc);
				if (strlen($oclc) == 0) {
					// OCLC number wasn't returned by WorldCat Search API request
					error_report("The WorldCat Search API did not return an author/title match with an OCLC number","","","",$api,$callback);
				} // end if
			} else {
				// OCLC WorldCat Search API response reports an error
				error_report("WorldCat Search API service error",$xml['status'],$xml['error'],$xml['errortext'],$api,$callback);
			} // end if else
		} else {
			// No work identifiers provided
			error_report("An OCLC number, PPN, or title is required","","","",$api,$callback);
		} // end if else
	}	// end if else	
}// end if

// Call WorldCat Identities with a last name and OCLC number
$identities = $worldcatIdentities."query=local.FamilyName+%3D+%22".$last."%22+and+local.OCLCNumber+%3D+%22".$oclc."%22";
$xml = get_url($identities,$timeout);

// Process the WorldCat Identities response
if ($xml['status'] == "200") {
	// the Identities service response reports success
	$x = new SimpleXMLElement($xml['contents']);
	$numberOfRecords = $x->numberOfRecords;
	if ($numberOfRecords > 0) {
		$facets = "";
		if ($facet) {
			if (strpos($xml['contents'],"<".$facet) > 0) {
				$facets = substr($xml['contents'],strpos($xml['contents'],"<".$facet)+1);
				$facets = substr($facets,strpos($facets,">")+1);
				$facets = substr($facets,0,strpos($facets,"</".$facet));
			}
			$facets = "<".$wrapper.">".$facets."</".$wrapper.">";
		} else {
			$facets = substr($xml['contents'],strpos($xml['contents'],"<Identity"));
			$facets = substr($facets,0,strrpos($facets,"</Identity>")+11);
		} // end if else
		if ($callback) {
			$jsonContents = xml2json::transformXmlStringToJson($facets);
			header ('Content-type: text/plain; charset=utf-8');
			echo $callback."(";
			echo $jsonContents;
			echo ");";
		} else if ($api == "js") {
			$jsonContents = xml2json::transformXmlStringToJson($facets);
			header ('Content-type: text/plain; charset=utf-8');
			echo "var _WCIinfo = ";
			echo $jsonContents;
			echo ";";				
		} else {
			header ('Content-type: text/xml; charset=utf-8');
			if ($xslt) {
				// an xslt stylesheet URL was provided so embed it in the XML response
				echo "<?xml-stylesheet type=\"text/xsl\" href=\"".$xslt."\"?>";
			}
			echo $facets;
		} // end if else if else
	} else {
		// a matching Identities record wasn't found
		error_report("An Identities record was not found for OCLC number ".$oclc." and family name ".$last,"","","",$api,$callback);
	}	// end if else
} else {
	// the Identities service response was not successful 
	error_report("Identities web service error",$xml['status'],$xml['error'],$xml['errortext'],$api,$callback);
} // end if else
	

// END MAIN

?>
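The facet extraction step in the script works by slicing the response string rather than parsing it. The sketch below isolates that step in a standalone function so it can be tried on its own. The function name extract_facet is hypothetical, and the sample XML is a made-up fragment for illustration, not actual WorldCat Identities output.

```php
<?php
// Illustrative helper (hypothetical name) that mirrors the string slicing in
// lookup.php: return the contents of the first <$facet ...> element, enclosed
// in a wrapper element. If the facet is absent, the wrapper is returned empty.
function extract_facet($xml, $facet, $wrapper = "facets") {
    $inner = "";
    $start = strpos($xml, "<" . $facet);
    if ($start !== false) {
        $rest = substr($xml, $start + 1);              // skip past the "<"
        $rest = substr($rest, strpos($rest, ">") + 1); // skip past the end of the opening tag
        $end = strpos($rest, "</" . $facet);           // find the closing tag
        if ($end !== false) {
            $inner = substr($rest, 0, $end);
        }
    }
    return "<" . $wrapper . ">" . $inner . "</" . $wrapper . ">";
}

// Made-up sample, not real Identities output
$sample = '<Identity><by count="1"><citation>A</citation></by>'
        . '<about><citation>B</citation></about></Identity>';

echo extract_facet($sample, "by"), "\n";
```

Because the match is on the text "<" followed by the facet name, a facet whose name is a prefix of another element name could match the wrong element; an implementation that needs to be robust might parse the XML instead.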