May/June 1999 No. 239
OCLC CORC Project
Feature: OCLC CORC Project
The genesis and development of CORC as an OCLC Office of Research project
by Thomas B. Hickey, Eric Childress, and Bradley C. Watson
One of the critical value-added services just about every library in the world provides is a catalog to assist users in discovering and gaining access to resources available through the library. For centuries this has usually meant a physical catalog (in any of many forms--the card catalog being the most widely used type of library catalog during the last century) that contained information about physical objects owned by the library. And the library traditionally has served its users as the premier and most accessible point of access to information-about-information (also known as metadata) in addition to being a primary delivery point for information resources, whether held locally and delivered quickly, or held remotely and delivered after some delay via interlibrary loan or other means.
Rapid changes in technology in the latter part of the 20th century have redefined the boundaries of just about every link in the information delivery chain: acquisition, storage, description and access. For libraries, the World Wide Web represents a tremendous opportunity to expand the breadth and reach of library collections and services. Users are increasingly comfortable with--and oriented toward--seeking and using information on the Web.
It is not surprising, then, that the rise of the digital era has led to a wide array of library digitization and Web-selection/access projects, both formal and informal, done either in isolation or in coordination with small groups of libraries. While many of the products of these efforts have been and will continue to be useful, no global, cooperative, standardized registry or catalog of library-selected Web resources exists. And even if one did, much of the metadata created to date for these resources is not in forms that can be widely shared, let alone cooperatively maintained (arguably a critical feature, since many of these resources change frequently).
Providing users with improved ways to discover and sort through Web resources has captured the attention of more than just librarians, of course. There are major efforts under way worldwide by many communities, agencies and commercial concerns to help develop a number of the emerging metadata standards and technologies. Notably, efforts such as the Dublin Core Metadata Initiative (DC) and the development of standards such as HyperText Markup Language (HTML), Resource Description Framework (RDF) and Extensible Markup Language (XML) by the World Wide Web Consortium (W3C)--of which OCLC is a member--show great promise.
Taking all the above into account, OCLC is partnering with a self-selected group of libraries to pioneer a new, cooperative approach to building a useful catalog of library-selected Web resources--the CORC research project. CORC is a response to the continuing need for libraries to leverage their expertise and services through cooperation. OCLC believes--in light of the evolution of several promising, key standards to the point of stability and the availability of technologies developed by the OCLC Office of Research and others--that now is the appropriate time to target the universe of electronic resources and their natural home, the World Wide Web.
CORC Genesis
CORC as a project did not spontaneously emerge full-grown from the OCLC Office of Research garden of research initiatives. Instead, it grew from a seed created through the intertwining of two processes--one long term, the other short term. The long-term development took place as various research projects emphasizing metadata over the last several years came to bear fruit. When viewed across the whole, these projects had overlapping, interconnected consequences. The short-term development occurred in a series of mini-retreats over a six-month period beginning in September 1997.
No one has ever satisfactorily explained, in explicit scientific terms, the serendipity effect that is often found to operate in the realm of scientific research. It is a phenomenon that just is. A good researcher learns to take advantage of any happy accident that happens to occur in his or her laboratory. Certainly, the research scientists in the OCLC Office of Research did not sit down in the early 1990s to plan a group of research projects whose ultimate goal was to produce results that could be collectively applied in a large, overarching research project. The effect we were seeing by late 1997 was just that--a group of independent projects whose research results clearly converged, supported and overlapped each other. But in just what larger, containing form these results should be shaped, nobody knew.
In an apparently unrelated province in September 1997, Terry Noreault, now vice president, OCLC Office of Research, called the research staff together to consider the future of libraries. This is an exercise the Office of Research undertakes from time to time. It offers a means for focusing our thinking about what we should be doing as researchers, individually and collectively. Normally, we do this as part of a two-day retreat, but that year we chose a different format where we met for an hour or so at a time over a few weeks to discuss the issue. Finally, we decided that each of us would write a one-page ‘vision’ paper that would elucidate each person’s view of what libraries would look like in 2003 A.D. We agreed to reconvene at the first of the year to discuss the papers.
This activity resulted in 12 papers, ranging from 300 words to nearly 2,000. If one reads the papers together, a common theme emerges of changes occurring due to continuing tightness of library budgets and the advent of the World Wide Web.
When the research staff met to discuss what we each had written, we all could see this common theme, but there was no clear vision as to what it meant for the Office of Research or OCLC in their role of supporting libraries in their mission to support their users. To help us see more clearly, we decided to develop a list of drivers that we had linked to the various changes we had predicted, and then have one or two of us write a position paper regarding each driver. Our goal was to see which drivers were going to be most evident in the changes to come, and which of those the Office of Research and OCLC could most usefully work with to support libraries in their mission.
The drivers we found in our papers were: economics; technology; advent of consortia; consolidation in the information industry; communication media changes, particularly publishing patterns affected by the Web; changing social behaviors of library users; demographic trends in the overall population; evolving role of archiving and preservation activities; and distance learning technologies and uses.
Looking back on these now, with the 20/20 vision of hindsight, it seems that it should have been obvious from the start what the Office of Research and OCLC had to offer libraries headed into the world of 2003. Given the changes we knew were coming due to these drivers, and the collective, converging results we had in hand from the research projects we have pursued this decade, the project we needed was not difficult to see. Clearly, the most effective service OCLC could offer would be one that leveraged the collective capabilities of our members to organize the emerging world of online resources. Each of the nine elements that were driving the changes we saw coming demanded a capability in the library community to jointly address the problems and solutions that these drivers were presenting.
And what OCLC has been and is about is providing a platform based on the latest technological innovations to support the cooperative building of useful information-organizing and access capabilities by our member libraries. And what the OCLC Office of Research has been about is forging foundational technologies and capabilities to support OCLC’s ability to pursue its mission of supporting libraries.
After six months of discussion, what we discovered in early 1998 was exactly what Frederick G. Kilgour discovered 30 years ago when he founded OCLC--a great way to support libraries in their mission is to provide a centrally located and administered computer-based system that allows for the cooperative creation of information access paths. That means something slightly different than it did in 1967. For, not only do we need to support the creation of individual resource descriptions--known as catalog cards for the last few centuries, known now as metadata records--but we can and need to support the creation and use of actual interactive pathfinders, which lead a user directly to needed resources.
The time is right to re-invent the online union catalog in the image it needs to carry it into the 21st century. Given the OCLC Office of Research’s converging, mutually supportive research projects that address electronic media, and the related technologies and standards that the world, with help from OCLC, is developing, we can build a new, more useful cooperative cataloging and information access tool. That is what we decided to do.
CORC Development
Over the past decade, OCLC has researched and developed a number of promising technologies and built a wealth of experience and expertise that could be applied to assisting libraries in their efforts to provide organized access to library-selected Internet resources.
Once a cooperative catalog model was identified as the approach of choice for CORC, OCLC developed a project plan to identify and incorporate the right mix of objectives and features for a successful project. These include: cooperative cataloging of Web resources; accommodating both local and shared metadata; supporting a catalog of metadata for physical and digital items; authority control for access points; RDF/XML import/export; Pathfinder (i.e. portal page) import/export; integration of Dublin Core and MARC in a single system; flexible harvesting of resources; Unicode support; assisted classification and subject heading assignment; automatic keyword extraction; automated data extraction; link maintenance; reference access--Z39.50 browsing interfaces. The use of Dublin Core was almost a given, as was the use of Mantis, a toolkit for building Web-based cataloging systems, as a software platform. For Web resources, Dublin Core has gained wide interest, and Mantis offered us a remarkable system to build on. With the goal of developing a service that would be useful to libraries for efficiently describing and accessing Internet resources, in August 1998, we presented the CORC concept to the OCLC Research Advisory Committee, a group of outside advisors that helps guide research at OCLC. Our approach was to develop and run a prototype system within the OCLC Office of Research, inviting libraries to participate by using the system as we refined it. Although the Mantis system was already up and running, the rather extensive list of features we wanted to supply, and in particular our desire to offer a system mixing both Dublin Core and MARC cataloging, made a target release of CORC an unusual research project in several ways--one being simply the size and scope of the project. We are undertaking the early development of what could become a major new system for OCLC, running with dozens of actual users. 
It was apparent from the start that the Office of Research could not undertake the project alone, both in terms of manpower needs and because of the expertise and guidance we would need from the rest of OCLC. This need resulted in the assignment of people from Marketing, Development, Quality Assurance and Documentation to the project.
We started development in August 1998 with the intention of bringing real users on in January 1999. The Office of Research would then run the system for at least the calendar year, adding and refining features as they became available. During this time, OCLC would work on business plans with the goal of starting work on turning CORC into an OCLC service in mid-1999.
We are running very close to this schedule. Work on the plan is well under way; CORC went ‘live’ on Jan. 15, when it first became available to libraries participating in the research project. Currently CORC has more than 80 libraries signed up. We have been able to keep to the schedule by being flexible in the features and level of features implemented. An example of this is that we are still working through some of the more difficult mapping issues relating Dublin Core and MARC, and the implementation of the authority control features moved to a March introduction rather than the February release we had hoped for.
Beyond the goal for system features, participation of a substantial number of libraries is critical to CORC’s success. We knew how to build a metadata system. We had already built a prototype and had component technologies that could be assembled into any number of different systems. The trick was to build a system that satisfies the needs of libraries, and it is impossible to do that without the active participation of libraries applying the system to real tasks.
Through the University of Michigan, we made contact with the Committee on Institutional Cooperation (CIC), the academic consortium of the Big Ten universities and the University of Chicago--13 campuses in all. Several CORC staff attended the CIC’s metadata conference in November 1998. The conference gave us a chance to meet CIC staff and many of the leaders in the application of metadata in library settings. The CIC libraries decided to participate in CORC. We regarded this as a very strong validation that our direction, at the very least, was of interest to librarians.
Since the initial call for participation in late 1998, CORC has earned a high level of interest from the library community and beyond. To date, CORC has attracted a remarkably wide range of types and sizes of institutions as project participants. Large academic libraries are the largest single category of participants, but we also have public libraries, community colleges, small-to-medium-sized academic libraries, a museum, several U.S. federal agency special libraries, a medical library, and other types. Outside the United States, CORC has enrolled libraries from Canada, Europe, the Middle East, Australia and parts of Asia.
We think the OCLC CORC project has accomplished a lot. The CORC system and database are, of course, the focus of our effort. Just organizing a project this size with dozens of participating libraries, running a team assembled from all over OCLC, planning and conducting meetings, and maintaining a Web site and an electronic discussion list to administer and respond to takes a fair amount of time and energy.
Research projects always encounter unforeseen obstacles, and CORC is no exception. We have been able to overcome them so far, and we think the resulting system is going to be more than just another service from OCLC. We think it will be a major factor in how libraries use the Web and electronic resources.
--Thomas B. Hickey is chief scientist, OCLC Office of Research; Eric Childress is senior product support specialist, OCLC Library Resources; and Bradley C. Watson is consulting systems analyst, OCLC Office of Research.
OCLC Newsletter No. 239