RLG DigiNews
August 15, 2002, Volume 6, Number 4
ISSN 1093-5371

Table of Contents

Feature Article 1
Why the Archives Introduced Digitisation on Demand, by Ted Ling

Feature Article 2
Digitizing Historic Newspapers: Progress and Prospects, by Marilyn Deegan

Highlighted Web Site
Fedora: Developing an Open-Source Digital Repository Management System

FAQ
New Web Domain Names, by Peter Botticelli

Special Focus
Book Scanners and Cradles: Links to Products and Reviews, by Stephen Chapman

Calendar of Events

Announcements


feature article one


Why the Archives Introduced Digitisation on Demand

Ted Ling
Director, Legislative and Accessibility Projects
National Archives of Australia
tedl@naa.gov.au

Introduction

All cultural institutions today are faced with the challenge of how to promote wider access to, and greater public use of, their collections. For the National Archives of Australia this challenge is complicated by the:

  • size of our collection - about 270 kilometres, or 40 million items, created by Commonwealth Government agencies

  • value and unique nature, and in some cases the fragility, of the collection

  • wide geographical distribution of Australia's population, which prevents many people from gaining direct access to the collection.

This paper describes the Archives' attempts to meet these challenges through an initiative known as digitisation on demand. I will explain how this initiative was first planned and implemented and the lessons we have learned since implementation. I will also mention how we see this initiative proceeding in the future.

The Tyranny of Distance and the Needs of Researchers

The National Archives of Australia has a head office, reading room, and galleries in Canberra, as well as reading rooms in each State and Territorial capital city. There are eight public facilities throughout the country.

Such a network is of limited benefit to researchers who find it difficult to visit our reading rooms. It is important to remember that the Archives does not move records from one city to another. Researchers must go to where the records are located. Once there, they can view the records free of charge. Alternatively, researchers can have a search agent examine the records on their behalf for a fee, or they can have photocopies made and sent to them, also for a fee.

Why should researchers be penalised because they are unable to visit our reading rooms, while other researchers who are able to visit us can access our collection at no cost? The Archives was unable to adequately address this inequity in the traditional reference service environment.

Computer technology has enabled the Archives to provide access to information about our publications, standards, and policies through our Web site to anyone with access to the Internet, regardless of where they live or work. Importantly, for those who require access to our collection it has provided us with the means of presenting information about the collection, and the government agencies that created these records. This has been achieved through RecordSearch, our online database. RecordSearch has given our dispersed researcher audience the ability to identify records that may be relevant to their studies through a keyword search facility. However, until we introduced digitisation on demand, we were unable to fulfill all the informational needs of our researchers because they could not access the actual records online.

Digitisation Trials

Throughout 2000 the Archives tested a number of methods to digitise its collection, including various types of scanners, microfilm-to-digital copiers and digital cameras. The aim of these trials was to find a cost-effective means of making our collection more accessible. It was not about preserving the collection. At the end of the trials it was clear that low-resolution copying using overhead cameras was the most efficient and cost-effective way to proceed. We decided to initiate a digitisation on demand service that would allow researchers to select records from our collection and then request that digital copies of those records be loaded onto RecordSearch where they would be available to all researchers. We also decided to proactively identify certain high use records for digitising.

Copyright

One issue that could severely curtail the effectiveness of this new service was copyright. Most records in the collection consist of unpublished manuscript material, in which the Commonwealth Government holds copyright. However, in many cases copyright is held privately or by parties other than the Commonwealth—individuals, businesses, and foreign and state governments.

The approach adopted by the Archives was to look realistically at the nature of the material in question, and to look at the overriding purpose for which the Archives was planning to digitise and publish this material online. Generally the material in which copyright is held privately is of no commercial value. The records are mostly between 30 and 150 years old and, because of the passage of time, current copyright holders often cannot be identified or traced. The principal objective of our digitisation initiative is to fulfill our statutory function of encouraging and facilitating the use of archival material, and to that extent we have determined that it is in keeping with the spirit of the "fair use" provisions of our Copyright Act. On these grounds, the Archives felt that the public interest lay overwhelmingly in favour of proceeding with the initiative.

Users of the Archives Web site who identify themselves as the copyright owner of a record that has been digitised and added to our Web site, and who object to its continued availability online, are asked to contact us. In the 12 months since the service began, and with over one million images published online, no one has yet approached the Archives expressing such concern, something that we believe vindicates our decision not to take a narrow interpretation of the Copyright Act.

Privacy

There was a final issue to be considered before we introduced our online digital service—privacy.

Australian legislation regulates access to the Archives' collection and requires that we withhold sensitive personal information from public access. We sought legal advice to determine if there was a distinction between releasing records to the public in a reading room, or in photocopy form, and loading digital copies onto a Web site where they can be viewed by anyone with Internet access. We were advised that there was no difference. It is important to note that we only digitise records suitable for public release.

Introducing the Digitisation on Demand Initiative

We began our digitisation on demand service on 11 April 2001. The service was not publicised widely, as we did not know how the processes tested in an artificial environment might translate into actual service. Nor did we have an appreciation of the volume of requests that would be handled by an initiative that was very much in an embryonic stage.

As a first step we decided the service would be offered for records located in Canberra only. This gave us time to refine our procedures, gauge the volume of requests, and establish the appropriate infrastructure needed to provide a national service. The service was extended to Sydney on 5 April 2002 and in time will be extended to other state offices.

Digitisation on Demand - How it Works

The Archives' process of creating digital copies has three components: capturing images using digital cameras, processing those images, and loading them into RecordSearch. Processing and loading use two programs developed in-house: ImageStore and ImageLoader.

Capturing the Images

Capturing digital images is a simple task for the operators. The hardware consists of a digital camera (Canon PowerShot G2) mounted on an adjustable stand for overhead alignment and a computer for processing the captured digital images.

The procedure requires the operator to place the record under the camera, aligned in a pre-set position, capture the image by releasing the camera shutter, turn to the next folio, and continue until the whole record is digitised. Operators are required to digitise from the top of the record down and to avoid dismantling the record unless it is necessary for legibility.

Operators work in five-hour shifts, with short breaks each hour. Capture rates average 500 images per operator per five-hour shift, a rate that is easily achievable for regularly formatted records (i.e., where no dismantling of records, removal of pins or plastic sleeves, unfolding of maps, etc., is required).

Processing the Images

ImageStore rotates and crops the captured images without human intervention. It allows an on-screen review of documents copied and the replacement or redoing of single images if necessary. The program saves a large and a small copy of each raw image produced during the capture stage. The small image is the default image and is loaded for viewing. Researchers using RecordSearch can select the larger image for print purposes, or if there are legibility problems with the small version.

The image processing rate is about 20 images per minute. However, in practice, the processing rate is constrained by the rate of capture.

Loading the Images onto RecordSearch

ImageLoader is the conduit for loading the digital images onto RecordSearch. This program will also load images that have been captured by processes other than the digital camera/ImageStore mechanisms. It has the facility to replace and delete images or whole records. A summary of the Archives' specifications is in the appendix.

How Researchers Request Online Digital Records

To request an online digital copy, researchers select a button that appears on the record description screen on RecordSearch [1].

The researcher lodges an online request for a digital copy [2] and in return receives an electronic acknowledgment [3].

When the digital copy has been made and is available for viewing, an icon appears on the record description screen [4]. We do not contact researchers and advise them when their record is available. It is their responsibility to check the Web site from time to time.

When researchers open the digital copy they see a navigational tool at the top of the page [5]. It allows them to advance through the record page by page, or jump ahead to any page they require.

There are also version selection icons at the top left of the screen. By default the "small" digital image (e.g., 66 KB) appears. In practice, we have found that the "small" image usually provides a very legible copy, both on screen and in print.

As part of our digitisation on demand service, we undertake to provide our researchers with:

  • legible copies in colour

  • each page copied in its entirety (i.e., no information is missing because of poor framing, etc.) and

  • a copy of the entire record or, if not, the researcher is told why a full copy cannot be provided. For example, it may be that parts of the record are less than 30 years old and so cannot be made public.

We do not promise total quality control, as we generally do not check the images. If we are advised that an image is poor we will simply re-scan it. Nor do we promise high quality images. There will be some pixellation. We use standard energy saving fluorescent lighting, not studio lighting, so some glossy surfaces do present problems with reflection and the lighting of pages is not always evenly distributed. Some of these deficiencies can be resolved quite easily, but this requires more individual attention, is thus more time consuming, and reduces output. Digitisation on demand is also limited to formats of A3 size (297 × 420 mm) or smaller.

In essence, we believe that the primary measure of the success of our digitisation on demand service is legibility, not the cosmetic appearance of the images.

It is important to remember that digitisation on demand is all about providing accessibility to the Archives collection, not the preservation of the collection. It is therefore about providing low-resolution digital images, not high resolution ones.

Digitisation on Demand - One Year's Experience

Digitisation on demand had its first birthday on 11 April 2002. Our researchers are delighted with the service. This is what three of them had to say:

"I feel that this service has the potential to revolutionise the study of history for those of us undertaking postgraduate study at regional universities (in my case a Ph.D. in history at the University of Newcastle)."

"Sincere thanks to you and your staff for a great job. You have provided us with detailed information on our family war heroes, information that was previously very difficult to access. In our case, we were able to establish details, including photos, which were a great joy to a sister of those heroes."

"You are providing a Rolls Royce service in digitising."

We have received many similar bouquets. The service is outstandingly popular.

Managing the Demand

We have been overwhelmed by the interest generated by this initiative. As far as we know, we are the only archives that allow the public to choose which records will be digitised, and we provide this service at no cost. Even though there has been little publicity the demand was instantaneous and it has shown no sign of abating.

Between 11 April 2001, when the service began, and 31 March 2002 we digitised and loaded onto RecordSearch a total of 1,090,934 images. We initially promised our researchers a 30-day turnaround time. However, the high volume of requests has meant delays of up to 90 days. We now simply tell researchers on what date requests currently being digitised were received. To help manage the demand we have introduced longer shifts. We have a team of six operators working shifts between 8:00 a.m. and 6:00 p.m. (Monday to Friday). Yet the demand is still rising. We have limited the number of records a researcher can request to five each year. However, this has not stemmed the flow.
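The volume quoted above can be turned into an average rate with a little arithmetic. The following is only a rough sketch: the dates and the image total come from the paragraph above, while the length of an "average month" is an assumption introduced here for illustration.

```python
from datetime import date

# Figures quoted in the article.
SERVICE_START = date(2001, 4, 11)
PERIOD_END = date(2002, 3, 31)
IMAGES_LOADED = 1_090_934

# Assumption (not from the article): an average month is 365.25 / 12 days.
DAYS_PER_MONTH = 365.25 / 12

# Calendar days between the start of the service and the end of the period.
elapsed_days = (PERIOD_END - SERVICE_START).days

# Average throughput per calendar day and per average month.
images_per_day = IMAGES_LOADED / elapsed_days
images_per_month = images_per_day * DAYS_PER_MONTH

print(f"{elapsed_days} days elapsed")
print(f"about {images_per_day:,.0f} images per calendar day")
print(f"about {images_per_month:,.0f} images per month")
```

On these assumptions the average comes out a little below the 100,000 images per month given in the appendix, which is what one would expect for a service that was still building up during its first year.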

The service is currently free. Our view is why should someone have to pay for a digital copy that is then loaded onto our Web site for the entire world to see for free? Furthermore, the service we are now providing is intended to promote equal opportunity for those researchers who cannot visit our reading rooms, where they could access the records at no charge. We could introduce a fee in return for a fast tracking service, but we believe that this would only create a disparate level of service whereby those who can pay receive one standard of service, and those who cannot pay receive a lesser standard.

We could adopt the same policy as the National Archives of Canada and consult with various user groups (family historians, academics, etc.) to ascertain which are our most valued records and then digitise them, rather than digitise individual items on request. But if we followed the Canadian model we would undoubtedly be digitising some records that are of no interest to many researchers.

The reality is that through our digitisation on demand service we are giving our researchers exactly what they want. They are telling us precisely which records are of value to them and we are doing our best to meet that demand. Our current policy is to develop a combination of proactive and reactive digitisation services. Proactively, like the Canadians, we will identify certain high demand records and have them digitised by external contractors. Reactively, we will continue to digitise records on demand in-house. The delays are likely to continue and we will advise our researchers accordingly. If they are prepared to wait we will digitise the records they want at no charge. If they cannot wait they have the option of obtaining a photocopy (for a fee) or visiting our reading rooms to see the records personally (at no cost). So far, the evidence is that most researchers appreciate the service and are prepared to wait.

Extending Digitisation into the Future

It is clear that we have introduced a service that our researchers value and that the demand for this service will only continue to grow. We know that many institutions are watching with interest to see how we manage the service.

In the past year we have worked with a number of organisations to increase accessibility to our collection through the Internet. The digital system that we have established allows external sites to link to digital images in RecordSearch. This has a multiplier effect in that some researchers who come to RecordSearch from other sites may not have had access to these records if it had not been for the link provided from their original search site. A few examples will illustrate this point.

During the digitisation trials, the University of Newcastle approached us. They wanted to make digital copies of archival documents available to their students for research course work. A number of records were digitised and have subsequently been made available online, both for students and for anyone else interested in foreign relations. This group of records covers aspects of Australia's foreign relations with Japan, Indonesia, Portuguese Timor, and China. We have since developed a number of subject-based icons on our Web site so that researchers have the option of locating records grouped by subjects such as Foreign Relations. Researchers can access records by their control numbers, or they can simply search the Foreign Relations icon. While there is only one digital copy of each record, each can be accessed through different points on our Web site.

We have developed an alliance with the Hellenic Studies Centre at La Trobe University in Victoria to help them gather together records that document Greek migration and other aspects of life in Australia for Hellenic people. Rather than requesting photocopies of relevant records, the Centre now selects records and we digitise them. The Centre then provides links from their online collection to our records on RecordSearch. The result is that a significant group of records are available through the Web sites of both organisations. A similar alliance is now in place with the John Curtin Prime Ministerial Library and it is anticipated that another alliance will be established with Deakin University.

Annual Cabinet Release

At the beginning of each year, Cabinet records that are 30 years old are publicly released. A media launch takes place in early December before the public release. At the moment we provide journalists with a bound volume of selected highlights in photocopy form (which we call a "brick"). The journalists take the volume away with them and use it to write their stories. We now package these records in a digital form, so journalists can access the digital copies from their home or office.

Committees of Inquiry

In recent years there have been a number of committees of inquiry, e.g., Aboriginal deaths in custody, the separation of Aboriginal and Torres Strait Islander children from their families, and child migration from the United Kingdom and Malta. Such committees have often indicated how important records are to people's lives and their identities. We now have the potential to provide online copies of key records identified by these committees and referred to in their reports. For the child migration inquiry we have already begun linking relevant records to the committee's report.

Fact Sheets and Reference Guides

Like many archival institutions, we produce an array of fact sheets and detailed subject-based reference guides. These products are located on our Web site. We can now link digital copies of records to the fact sheet or guide in which they are listed. This provides researchers with an opportunity to view not only the information about a record, but a digital copy of each record as well.

Digitising Records in Many Formats

Digitisation on demand is not just about copying files and documents. We can digitise photographs, plans and many other formats. Here are a few examples:


Photograph of a young woman wearing a veil.



Colour plan of Smoky Cape lighthouse.



Uniform worn by Stanley Melbourne Bruce.



World War I medal.


Conclusion

Over the past five years we have witnessed how new and emerging technology has changed people's lives. The Internet has become a central part of our communication, business, and entertainment industries. According to the Australian Bureau of Statistics, in 1997 7.5% of households had access to the Internet. The following year access increased to 19%, followed by 25% in 1999 and 37% in 2000. The impact of the Internet was in fact recognised by the Bureau when—for the first time—its usage was included as part of the questions asked of all Australians in the 2001 census. In 1995 the Archives grasped the opportunity that the Internet provided to make our research services more widely accessible. It was this technological foundation that enabled the transition to an online digital service that began in April 2001.

If we are to continue to provide accessibility to collections and services that are relevant to our ever-changing environment, we cannot afford to ignore new technologies or the wants and needs of our researchers. Our digitisation on demand service is just the beginning. There is much more that we can do and the only limitations are technology and the resources available to us.

Appendix: Image Capture Output Specifications and Statistics

Digital camera: Canon PowerShot G2
Image resolution: 72 dpi
Image format: Progressive jpeg

Document dimensions
         Width (cm)   Height (cm)
Raw      56.44        42.33
Large    42.33        56.44
Small    10.16        15.24

Pixel dimensions
         Width (px)   Height (px)
Raw      1,600        1,200
Large    1,200        1,600
Small    720          1,080

Average file sizes
Raw      182 KB
Large    182 KB
Small    66 KB
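The document dimensions and pixel dimensions above are linked by the image resolution. The short sketch below shows the conversion; the helper names are illustrative only and are not part of ImageStore or ImageLoader. Run on the figures as published, it reproduces 72 dpi for the raw and large versions, while the small version's figures work out to about 180 dpi. The article does not explain the difference, so the numbers are left as given.

```python
CM_PER_INCH = 2.54


def pixels_to_cm(pixels: int, dpi: float) -> float:
    """Physical size in centimetres of a pixel count at a given resolution."""
    return pixels / dpi * CM_PER_INCH


def implied_dpi(pixels: int, cm: float) -> float:
    """Resolution implied by a pixel count and a physical size in cm."""
    return pixels / (cm / CM_PER_INCH)


# Published figures: (width px, height px, width cm, height cm).
VERSIONS = {
    "Raw": (1600, 1200, 56.44, 42.33),
    "Large": (1200, 1600, 42.33, 56.44),
    "Small": (720, 1080, 10.16, 15.24),
}

for name, (w_px, h_px, w_cm, h_cm) in VERSIONS.items():
    # Report the resolution implied by each axis, rounded to whole dpi.
    print(name, round(implied_dpi(w_px, w_cm)), round(implied_dpi(h_px, h_cm)))
```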

Capture Rate: 100,000 images per month, based on an average of 500 images per operator per five-hour shift. In practice, each shift includes a total of 20 minutes of breaks and approximately 55 minutes of processing time, so the effective time available for capture per shift is about 3 hours 45 minutes. As far as possible, breaks are taken during the processing of large files. The 500-image rate is easily achievable for records in regular formats (i.e., where no dismantling of records, removal of pins or plastic sleeves, unfolding of maps, etc., is required), and can increase to over 1,100 images per shift for such records. The capture rate can quickly fall if this sort of manual preparation is needed.
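The shift figures above can be checked with a short calculation. This is a sketch of the arithmetic only; every input is taken from the capture rate paragraph, and the variable names are illustrative.

```python
# Figures from the capture rate paragraph.
SHIFT_MINUTES = 5 * 60
BREAK_MINUTES = 20
PROCESSING_MINUTES = 55
IMAGES_PER_SHIFT = 500
IMAGES_PER_MONTH = 100_000

# Time actually left for capturing once breaks and processing are removed.
capture_minutes = SHIFT_MINUTES - BREAK_MINUTES - PROCESSING_MINUTES

# Average time available to position, capture and turn each folio.
seconds_per_image = capture_minutes * 60 / IMAGES_PER_SHIFT

# Operator shifts needed each month to reach the monthly figure.
shifts_per_month = IMAGES_PER_MONTH / IMAGES_PER_SHIFT

print(f"{capture_minutes} minutes of capture per shift")
print(f"{seconds_per_image:.0f} seconds per image")
print(f"{shifts_per_month:.0f} operator shifts per month")
```

In other words, the quoted average leaves an operator a little under half a minute per folio, and the monthly figure corresponds to some 200 operator shifts.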

Processing Time: Approximately 20 images per minute per operator, less if editing is required. Approximately 12 minutes of each hour is spent processing the images captured. As mentioned above, where possible breaks are taken during the processing of large files to maximise productivity. Productivity falls dramatically with smaller files, because they are processed so quickly the operator has to be present, which means they cannot utilise processing time for their hourly break. Processing time is also used for reassembling records that have to be taken apart for capturing.

Storage of Captured Data: Captured data is housed on a single server with a capacity of 2 TB, of which just over 1,300 GB is free. The database is growing at the rate of 40 GB per month. We can, however, add disk storage to the current machine or add servers as the database grows. There is no practical limit to the amount of disk space that the application design can address, as it is designed to span multiple machines.
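As a rough illustration of what these storage figures imply, the sketch below estimates how long the free space would last at the stated growth rate, and what average footprint per image that growth rate suggests. It assumes decimal units (1 GB = 1,000,000 KB) and a constant growth rate, neither of which is stated in the article. Whether the raw, large and small versions are all retained on the server is likewise an assumption made here only for comparison.

```python
# Figures from the appendix.
FREE_GB = 1_300
GROWTH_GB_PER_MONTH = 40
IMAGES_PER_MONTH = 100_000
RAW_KB, LARGE_KB, SMALL_KB = 182, 182, 66

# Assumption (not from the article): decimal units.
KB_PER_GB = 1_000_000

# Months before the free space is used up, if growth stays constant.
months_until_full = FREE_GB / GROWTH_GB_PER_MONTH

# Average storage consumed per captured image, implied by the growth rate.
kb_per_image = GROWTH_GB_PER_MONTH * KB_PER_GB / IMAGES_PER_MONTH

# For comparison: combined average size if all three versions were kept.
combined_kb = RAW_KB + LARGE_KB + SMALL_KB

print(f"{months_until_full} months of headroom")
print(f"{kb_per_image:.0f} KB stored per image (implied)")
print(f"{combined_kb} KB per image if raw, large and small are all kept")
```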


feature article two


Digitizing Historic Newspapers: Progress and Prospects

Marilyn Deegan
Forced Migration Online
University of Oxford
marilyn.deegan@qeh.ox.ac.uk

Emil Steinvel, Olive Software
Emil@olivesoftware.com

Edmund King, British Library
ed.king@bl.uk

Introduction

In the last issue of DigiNews, Richard Entlich (1) presented a fascinating update of some projects to digitize newspaper content from microfilm, which had last been reported upon some five years previously (2). Entlich concluded the piece by stating that "newspapers continue to push the limits of current digital capture, image processing, OCR, and Web delivery technologies." One of the projects originally surveyed in 1997, the Caribbean Newspaper Imaging Project of Cuban and Haitian newspapers at the University of Florida, was unable to provide updated information for the Entlich piece, but in a comprehensive and extremely useful report on the digitization of newspaper content from microfilm made available in 2001, the project concluded that, for such materials "there is still no good, cost effective means of providing the researcher with full text or connecting story lines broken by column and page breaks."

Although it is true that newspaper content is extremely challenging for many different reasons, the cost-effective creation of usable and searchable digital content that offers users a realistic experience of the richness of newspapers is perhaps closer than we have hitherto thought. A number of libraries and academic institutions, together with Olive Software and OCLC, have made significant volumes of newspaper content available for full-text searching over the Internet, using automated processes developed by Olive Software.

The Importance of Newspapers

The desire to be informed, and to be informed speedily, about local and remote happenings seems to be a basic human need. Methods of disseminating news rapidly have been developed at all periods and in all cultures both literate and oral. In 490 BC, legend has it, Phidippides, the first ever Marathon runner, completed his 26 mile run from Marathon to Athens in around 3 hours to announce victory over the Persians and promptly died from exhaustion. News passed through oral transmission is only as durable as the memories of those who transmit or hear it, but news recorded on some kind of medium has a life beyond the immediate purpose of imparting information rapidly: it becomes part of the historic record. And there is no other medium in our history that records every aspect of human life over the last 300 years—on a daily basis—like newspapers. The information contained in newspapers is, however, considered by its creators as essentially ephemeral—important today, discarded tomorrow—and so they print it on paper which is produced with cheapness in mind, rather than survival. As newspapers have developed over the last three to four centuries they have become increasingly complex. The desire to be informed creates a huge market for those who want to inform, so, as well as news, newspapers now also have pictures, comments, reviews, advertisements, listings, recipes, and increasing numbers of supplements, each of which has its own complexity. All of this is important, for there is a huge social history recorded in even the smallest of articles or advertisements. For instance, a search in the British Library Newspaper Pilot (see below) for the word "cigarette" will reveal a very different attitude to smoking than that which prevails today. The Weekly Dispatch of 1 July 1917 appeals to the British public to help keep the hospitals at the front well supplied with tobacco, for "No wounded man ought to ask for a smoke in vain. It is our privilege to keep him supplied."

Newspapers and Libraries

The huge value of newspapers as part of the historic record has always been recognised by libraries, of course, and there are millions of miles of newsprint stored in libraries all over the world. But newspapers present huge problems of preservation and access: they are large in format, prolific in output, and there has been grave concern for decades about the survival potential of historic newspapers, given that many of them were printed on acid paper. Major libraries such as the Library of Congress in the USA and the British Library in the UK have been microfilming newspapers for many decades in order to preserve the historical record as well as, or instead of, preserving the objects. But there is also concern about the preservation status of microfilm not produced and stored according to standards. Libraries have come under fire for microfilming some titles and then disposing of the originals, but, given the continually increasing problems of storage and funding, what are librarians to do?

The fate of newspapers has leapt into prominence over the last two years with the controversies caused by Nicholson Baker and others about selection and retention policies in the UK and the US. (See References) Never has there been a better time to think about some new ways of preserving and delivering newspaper content to traditional and new audiences.

It takes dedicated researchers to handle broadsheet-sized bound volumes of crumbling paper, or miles of microfilm, especially when most newspapers are minimally indexed. What makes newspapers such a unique resource is what also makes them so difficult to manage. Extracting content from the text of newspapers without presenting all the information around it, as well as the layout and typographical arrangement, is an impoverishing exercise, and clippings without context are bound to lose some meaning. In historical perspective, too, those aspects of newspapers that are often ignored day-to-day—such as advertising—become a huge source of social, economic, political, and cultural information. But researching newspapers requires diligence and often serendipity, and many scholars and others have spent years in libraries searching through unindexed bound volumes and microfilms. Given the importance of newspapers to our daily lives, finding some way of unlocking the content could create an interest in their historic value for many new audiences, including students, school-children, and anyone interested in the multifarious facts, opinions, products, and stories contained therein.

Digitization of Newspapers

The capture of newspaper content as image files is now possible with modern technologies. In particular, digitization from microfilm has been shown to be fast and cheap, giving relatively good results. It is also possible to create acceptable content from compromised originals, with the resultant files being digitally "cleaned" for better readability. Creating searchable content is a much more difficult process, given the complexity of the newspaper page and the mixed media formats, with text, images, advertisements juxtaposed and interspersed in order that maximum content can be accommodated in the minimum space. Stories run across widely separated pages, too. The complex structure of newspapers also changes over time and between titles. Early attempts at Optical Character Recognition (OCR) failed because the quality achieved was too poor for adequate retrieval (and correction too costly) and because the OCR engines operated on linear text, not individual content objects. The structural unit of the page was recognised, not the logical unit of the item. Other problems for OCR (especially with microfilmed content) include curved or rotated lines due to tight bindings, and "noise" or garbage elements, which can be caused by microfilm deterioration, dirt on the scanner, or imperfections in the original, including broken lines, scratches from overuse of the microfilm, and broken characters.

Manual indexing and rekeying offer much better possibilities than conventional OCR, but are too costly for libraries, and are probably too costly even for most of the newspaper publishers themselves, though some are producing digital newspaper archives using manual methods (3).

The Olive Software Approach

The problems outlined above are severe, and no technology can compensate for them fully, particularly since scanned microfilm pages often suffer from many of these problems at the same time. Olive Software's PIPEX™ digitization technology and ActivePaper Archive™ offer a new approach to the digitization of newspapers, using specially-developed algorithms capable of dealing with most combinations of the above-mentioned OCR problems, and also using new technologies for the "zoning" of content into logical as well as structural components. The Olive process, developed in partnership with OCLC for libraries wishing to create online newspaper archives, recognises that for the best end results, every step in the capture and delivery process has to be carefully controlled, and therefore offers an "end-to-end" solution which utilizes library standards of digitization and metadata creation.

The Olive Software solution has been developed over the last seven years, starting life as the smart image software, "Newsware," developed by IOTA, Inc., in the mid-1990s for the Palestine Post project. Early versions of the software were based upon the recognition of words within the structural page units, and the "smart image" component of the software allowed the hits to be highlighted on the page by mapping the co-ordinates of each word and storing this information alongside the OCR information. However, as Dr Ronald Zweig, Director of the Palestine Post project, points out,

"Page-based newspaper retrieval systems create as many problems as they solve. For the dream of digital newspapers to work, they must be based on the automatic segmentation of newspaper pages into discrete articles. And these articles must retain the hierarchical structure inherent in newspapers as a print medium (4)."  

Zoning, OCR, and the Creation of an XML Repository

ActivePaper Archive has now been used to build a number of newspaper archives. The examples used here are taken from the British Library Newspaper Pilot carried out by the British Library, Olive Software, and OCLC in 2001. The goal of the project was to allow for online accessibility to the British Library's historic content. Such accessibility was to be made possible through a process of digitization, divided into two main parts, precision scanning and image processing, with a digital newspaper archive being generated from the processed images.

OCLC Preservation Resources carried out the precision scanning of microfilmed pages of British Library newspapers (18 reels of duplicate negative microfilm) to TIFF format at its Conversion Plant in Bethlehem, Pennsylvania. These files were then shipped to Olive Software's microfilm digitization production facility in Israel for processing with the PIPEX system, which produced the digital archive from the TIFF files. A large team was involved at all stages of the project, including British Library, OCLC, and Olive Software staff. In addition, the Malibu hybrid library project staff at King's College London and Oxford University were involved in the inception of the project, and in its design, implementation, evaluation, and promotion (5).

The Digitization Concept

The project's digitization concept, which is at the core of PIPEX and ActivePaper Archive software, was developed with two primary aims: to make digitization practical (significantly reducing time and cost as a result) and to enable high quality access to historic materials. One PIPEX machine uses a 96-CPU computer architecture to perform advanced parallel image processing on each scanned page. One PIPEX-based production line has a monthly capacity of 1.2 million pages, delivering approximately 12 million segmented and tagged articles and photos per month. Olive/OCLC currently runs two PIPEX production lines, with a third scheduled to come on line in late 2002. With three lines, Olive and OCLC's year-end capacity will be about 3.6 million pages a month, delivering 36 million individual tagged items.
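The capacity figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the constant and function names are illustrative, and the assumption that every line performs identically is ours.

```python
# Figures quoted in the text; names are illustrative only.
PAGES_PER_LINE_PER_MONTH = 1_200_000
ITEMS_PER_LINE_PER_MONTH = 12_000_000


def capacity(lines: int) -> tuple[int, int]:
    """Return (pages, tagged items) per month for a number of production lines.

    Assumes every line performs exactly like the one described in the text.
    """
    return lines * PAGES_PER_LINE_PER_MONTH, lines * ITEMS_PER_LINE_PER_MONTH


# Average number of segmented items (articles, photos) found on one page.
items_per_page = ITEMS_PER_LINE_PER_MONTH / PAGES_PER_LINE_PER_MONTH

current = capacity(2)   # the two lines running now
year_end = capacity(3)  # after the third line comes on line

print(f"items per page: {items_per_page:.0f}")
print(f"current:  {current[0]:,} pages, {current[1]:,} items per month")
print(f"year end: {year_end[0]:,} pages, {year_end[1]:,} items per month")
```

On these figures each page yields about ten tagged items, and the year-end totals correspond to three lines running side by side.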

Separating "Readability" from "Searchability"

"Readability," defined as the user's capacity to view and comprehend historic text, and "searchability," defined as the user's capacity to reach relevant content through provision of search criteria, can be said to be the two components of "accessibility," or the user's capacity to retrieve and read relevant content. Both readability and searchability are key goals of any digitization effort.

In the past, it was thought that text generated by OCR (Optical Character Recognition) could provide both readability and searchability. Due to the difficulty of extracting high-quality text from historic scans, this approach is now considered to be impractical. ActivePaper Archive is among the first technologies based on, and enabling, separation of readability from searchability.

Readability in ActivePaper Archive™

ActivePaper Archive achieves readability by enabling the user to read directly from images instead of from the OCR-generated text. The task of comprehending the degraded text is performed by the human eye and brain, the best possible OCR engine. This is an effective solution to the problem, albeit one that is not simple to achieve.

In newspaper material, in particular, it is not practical to provide the online user with readable page images. These would have to be large, high resolution image files, prohibitive both in terms of screen real-estate and download time. This suggests the use of smaller images—of articles, and the elements comprising them—to deliver content to the user.

To provide this capacity cost-effectively, ActivePaper Archive uses an image processing technique called "segmentation," which breaks the page down into its smaller information units (articles, pictures and ads, and their components), identifies them, and infers the relationships between them. Using artificial intelligence and a patented bitmap indexing and image search technology, the software attempts to overcome the formidable obstacles of poor image quality and complex page layout, both very common features in historic microfilmed newspapers.

Searchability in ActivePaper Archive™

For searchability, ActivePaper Archive relies on OCR-generated word patterns, stored in XML format. The software uses APFS™—Adaptive Probability Fuzzy Search (patent pending), a fuzzy logic search technology—to compensate for text inaccuracies by applying fuzzy logic according to the probability for error in each word-pattern. Blindly applying fuzzy logic to an entire archive of corrupted text results in large numbers of irrelevant results. This is not recommended in a microfilm setting. APFS applies fuzzy logic only when needed, providing highly relevant results users would not otherwise get. To support the APFS engine, ActivePaper Archive employs special OCR techniques developed to solve the specific problems of microfilmed historic materials. These enable reasonable OCR accuracy, even in very degraded pages, to further enhance the searchability factor. In addition, the technology produces "word patterns" instead of simple ASCII text conversion. These word patterns include the actual characters making up a word, graphic characteristics of the word, and an encoded error probability parameter.
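The idea of applying fuzzy matching only where the OCR is doubtful can be illustrated with a small sketch. This is not Olive's APFS algorithm, whose details are not given here; it is a minimal stand-in that uses Python's standard `difflib` similarity ratio, and the `WordPattern` class, the thresholds, and the sample words are all assumptions made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class WordPattern:
    text: str          # characters recognised by OCR
    error_prob: float  # assumed estimate that the OCR text is wrong (0..1)


def similarity(a: str, b: str) -> float:
    """Similarity ratio between two words, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def gated_match(query: str, pattern: WordPattern,
                trust_below: float = 0.2, min_similarity: float = 0.75) -> bool:
    """Exact match for trusted words; fuzzy match only for doubtful ones."""
    if query.lower() == pattern.text.lower():
        return True
    if pattern.error_prob < trust_below:
        # The OCR is probably right, so a near miss is simply a different word.
        return False
    return similarity(query, pattern.text) >= min_similarity


words = [
    WordPattern("house", 0.02),
    WordPattern("horse", 0.02),  # clean OCR of a different word
    WordPattern("hou5e", 0.70),  # degraded OCR of "house"
]

gated = [w.text for w in words if gated_match("house", w)]
blind = [w.text for w in words if similarity("house", w.text) >= 0.75]

print(gated)  # the degraded word is found, the unrelated word is not
print(blind)  # blind fuzzy matching also returns the unrelated word
```

In this toy data "horse" and "hou5e" are equally similar to the query, so only the error probability separates the relevant hit from the irrelevant one.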

Bridging Layout and Structure, Images and Text

The link between the searchability factor and accurate readability is provided by a patented technique called Bitmap Indexing™. This technique allows for indexing of each meaningful group of pixels (containing a page element like an article title, a body text word-pattern, or a picture) on the page image. Having a digital index that points to these valuable image elements enables direct access to, and sophisticated manipulation of, image "clips" instead of cumbersome page images. Bitmap Indexing results in meaningful end-user features. For example, search hits can be highlighted in an article image, and search results pages can display either scaled images of article titles, rather than corrupted OCR text, or the first paragraph of body text, which offers readable results for true searchability.
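A rough sketch of an index from words to pixel regions may make this concrete. It is not the patented Bitmap Indexing technique; it only shows, under our own assumptions, how storing a bounding box per word allows hits to be highlighted and an article "clip" to be computed. The `Box` class, the function names, and the coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """A rectangle on the page image, in pixels."""
    left: int
    top: int
    right: int
    bottom: int

    def union(self, other: "Box") -> "Box":
        """Smallest rectangle that contains both boxes."""
        return Box(min(self.left, other.left), min(self.top, other.top),
                   max(self.right, other.right), max(self.bottom, other.bottom))


# word -> list of (article id, box of that word on the page image)
index: dict[str, list[tuple[str, Box]]] = {}


def add_word(word: str, article_id: str, box: Box) -> None:
    index.setdefault(word.lower(), []).append((article_id, box))


def highlights(word: str) -> list[tuple[str, Box]]:
    """Regions to highlight for a search hit."""
    return index.get(word.lower(), [])


def article_clip(article_id: str) -> Optional[Box]:
    """Bounding box of every indexed word in one article, or None if absent."""
    clip = None
    for entries in index.values():
        for aid, box in entries:
            if aid == article_id:
                clip = box if clip is None else clip.union(box)
    return clip


add_word("Railway", "a1", Box(100, 50, 220, 80))
add_word("accident", "a1", Box(230, 50, 360, 80))
add_word("railway", "a2", Box(900, 400, 1010, 425))

print(highlights("railway"))
print(article_clip("a1"))
```

A delivery layer could then crop the page image to the returned box and send only that clip to the browser.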

The Digital Archive

In ActivePaper Archive, newspapers and documents run through the PIPEX image processing stage are converted to ActivePaper XML. Traditionally, XML holds text and its structure, but ActivePaper goes further by tying the XML to images. The product uses three XML layers - one based on the NewsML/NITF standards, one on the Dublin Core, and a third on PRML, or Preservation Markup Language. PRML maps the newspaper's layout, recording coordinates for each piece of text and each page object (6). The first two layers, containing industry-standard tags, make certain that the archive is based on an open, integrative platform, while unique PRML tags lay the basis for Bitmap Indexing and APFS. Work is currently being done to make the DTDs interoperable with library standard XML DTDs such as METS, TEI, and EAD. The archive functions as a dynamic XML repository. The results of image processing (XML files and images) are organized in a logical file-system hierarchy. This provides great flexibility, as the archive can very simply be distributed over multiple hard-drives or storage media. It also avoids the use of database systems, which do not fare well when faced with the volume and complexity of digital newspaper archives. But most importantly, the XML repository can be accessed directly by a Web browser, using XML style sheet technology.

Potential for True Online Accessibility

As evidenced by the results of the project, this conceptual and technological shift from previous visions of digitization means that, for the first time, the technology provides the potential for true online accessibility to large quantities of historic materials with complex content like newspapers.

Building the British Library Demonstrator

1. Microfilm reel scanning to TIFF

In the first stage, microfilm reels were scanned to 300 dpi TIFF images by OCLC Preservation Resources. The images were shipped to Olive's processing facility on CD.

 

2. TIFF image pre-processing and binding

Next, TIFFs were named according to their page number, issue date, and publication name, and the images were optimized. Since generic algorithms may damage microfilmed images, much research has gone into Olive's automatic image cleanup and alignment procedure: a microfilm frame may contain one or two openings, or may have overlap of a fragment of a neighbouring opening (as in the image above). The Olive system (patent pending) automatically separates out the individual page, and deskewing and cleanup of each page is then performed. Different cleanup methods are used for text, images, and margins.

3. Page zoning

Here, the page image has been analysed to find horizontal and vertical lines, text strings, and picture regions. Then, working like a human eye that views a newspaper page from a distance, the zoning engine uses these lines and shapes to analyse the geometry of the page. It builds a net of image objects, examining alignment, size, brightness, and other characteristics of groups of elements on the grid. The result is a rough page structure definition, which includes text regions, classified as body text or titles.

4. OCR

OCR was performed on each of the text regions detected in image analysis. The results of OCR were written into a PDF, overlaid on page images, together with detailed information about word coordinates, font, and size.

5. Segmentation

In this stage, all the information gathered in image analysis, layout analysis and OCR is put to use. The segmentation engine analyses textual objects and their optically-recognized text to find page objects like articles, pictures, and ads, their components, and the relationships between them. This structural information is also written into the PDF.

6. Output to ActivePaper XML

In the final stage, the newspaper issues are output to the XML repository, as ActivePaper XML and page object image clips. Each newspaper issue is a self-contained unit within the larger structure of the repository. Following is a table illustrating the components of each XML-preserved newspaper.

Component | Description | Format
Table of Contents XML | Stores general newspaper metadata, including section and page details. | ASCII (XML)
Page XML | Stores rectangle coordinates for each entity on the page. | ASCII (XML)
Entity XML | Stores full entity text together with detailed information about styles used and original coordinates. | ASCII (XML)
Page Snapshot | Snapshot of a rectangular page image, which comprises part of an entity. | GIF / PNG / JPEG
Page Thumbnail | Low resolution, reduced page-image. Titles are readable, allowing for page-based navigation. | GIF / PNG / JPEG

7. Building the Demonstrator

Having scanned the images and processed them to create the XML repository, an experimental Web site was built. This Web site links to the opening "portal" of the repository, which physically resides on an ActivePaper Archive server installed at King's College, London.

A powerful and flexible search engine embedded in the Olive system allows users to perform Boolean searches on the entire repository of more than 200,000 items. Searches can also be restricted by date or newspaper title, and can be further refined by exploiting the XML structure of the repository by searching only within articles, advertisements, or pictures. Further precision can be obtained by searching for individual elements ("title," "byline," etc.) within items. Search results can thus display "snippets" of the newspaper page: article titles, the first few lines of text, image captions or advertisements, so that the results are meaningful at a glance. Clicking on the snippet opens a window displaying the whole item, and from there the user can navigate to the item's position on the newspaper page. It is also possible to navigate the archive by newspaper title and date, just as in a traditional archive.
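The way an XML structure lets a search be narrowed to one kind of item, or to one element within items, can be sketched with Python's standard `xml.etree.ElementTree`. The element and attribute names below are invented for illustration and are not the ActivePaper XML schema; the sample issue is made-up data.

```python
import xml.etree.ElementTree as ET

# Invented sample data in a hypothetical schema (not ActivePaper XML).
SAMPLE = """
<issue title="Example Gazette" date="1851-05-01">
  <item id="a1" type="article">
    <title>Opening of the Great Exhibition</title>
    <byline>From our correspondent</byline>
    <body>Crowds gathered in Hyde Park early in the morning.</body>
  </item>
  <item id="ad1" type="advertisement">
    <title>Exhibition umbrellas</title>
    <body>Best quality, sold in the Strand.</body>
  </item>
</issue>
""".strip()


def search(root, term, item_type=None, element=None):
    """Return ids of items containing `term`.

    item_type restricts the search to articles, advertisements, and so on;
    element restricts it to one child element such as "title" or "byline".
    """
    term = term.lower()
    hits = []
    for item in root.iter("item"):
        if item_type and item.get("type") != item_type:
            continue
        fields = item.findall(element) if element else list(item)
        if any(term in (field.text or "").lower() for field in fields):
            hits.append(item.get("id"))
    return hits


root = ET.fromstring(SAMPLE)
print(search(root, "exhibition"))                       # any item, any element
print(search(root, "exhibition", item_type="article"))  # articles only
print(search(root, "strand", element="title"))          # titles only
```

The same pattern extends to date or title restrictions by filtering on attributes of the enclosing issue.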

Conclusion

With Olive Software's technology, the dream of low-cost, fully automated digitization and delivery of historic newspaper content has been achieved, offering libraries new possibilities for increasing access to a greater range and number of potential users. The technology can also be used for the development of searchable archives of other kinds of documents, as for instance has been shown by the development of the Forced Migration Online Digital Library, which contains some 3,000 items (c.70,000 pages) of grey (unpublished) literature on all aspects of refugee studies.

Acknowledgements
The authors would like to thank Judy Cobb, OCLC, and Yoni Stern, Olive Software, for input into this article.

Footnotes

(1)     Richard Entlich, FAQ: Where are they now? Digitizing Microfilmed Newspapers, RLG DigiNews, June 15, 2002, Volume 6, Number 3.

(2)     Alan Howell, Film Scanning of Newspaper Collections: International Initiatives, RLG DigiNews, August 15, 1997, Volume 1, Number 2.

(3)     See, for instance, http://www.bellhowell.infolearning.com/proquest/histdemo/

(4)     Ronald W. Zweig, Retrieving Text from Digital Images: Lessons from the Palestine Post Project, http://kipp.tau.ac.il/lessons.htm, and Solving the Problem of Access – Only to Drown in the Details: Problems in Newspaper Retrieval Systems, http://kipp.tau.ac.il/update.htm

(5)     There are further details about the British Library Newspaper Pilot at www.uk.olivesoftware.com/conference.

(6)     PRML was developed by Olive Software. OCLC is working with Olive to standardize PRML. Olive will provide a copy of the draft specification upon request. Contact Emil Steinvel for further details.

References

Baker, N. (2000) Deadline: the Author's Desperate Bid to Save America's Past, The New Yorker (24 July).

Baker, N. (2001) Double Fold: Libraries and the Assault on Paper, Random House Trade.

Cox, R. J. (2000) The Great Newspaper Caper: Backlash in the Digital Age, First Monday, 5 (12), http://firstmonday.org/issues/issue5_12/cox/index.html.

Pearson, D. (2000) Letter, Times Literary Supplement (8 September).

Highlighted Web Site

Fedora

In September 2001, the University of Virginia received a grant of $1,000,000 from The Andrew W. Mellon Foundation to enable the Library, in collaboration with the Digital Library Research Group at Cornell University, to build a digital object repository system based on the Flexible Extensible Digital Object and Repository Architecture (Fedora). The new Fedora open-source system offers the opportunity to deploy interoperable digital libraries using the latest Web technologies.

The project's Web site is the primary source for publications by members of the Fedora development team, as well as for technical and training documentation. A summary of the Fedora project can be found in the paper by Sandy Payette and Thorny Staples to be presented at the European Digital Library Conference in September 2002. The complete technical specifications for the Fedora software are available at: http://www.fedora.info/techdoc.shtml.

Fedora's sub-systems are described using the Web Services Description Language (WSDL), as are all auxiliary services included in the architecture. The system communicates over HTTP and supports the Simple Object Access Protocol (SOAP). Additionally, the project has adopted the Metadata Encoding and Transmission Standard (METS) as the means to encode and store digital objects as XML entities.
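To give a feel for what encoding a digital object as METS involves, here is a minimal sketch that builds a bare METS skeleton (a file section plus a structural map) with Python's standard library. It is a generic illustration and not Fedora's actual object format; the object identifier, file identifiers, and URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)


def m(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS}}}{tag}"


def build_object(obj_id, files):
    """Build a skeletal METS document.

    files is a list of (file id, MIME type, URL) tuples; all values used in
    the example below are made up for illustration.
    """
    mets = ET.Element(m("mets"), {"OBJID": obj_id})
    file_grp = ET.SubElement(ET.SubElement(mets, m("fileSec")), m("fileGrp"))
    struct_div = ET.SubElement(ET.SubElement(mets, m("structMap")), m("div"))
    for file_id, mime, url in files:
        file_el = ET.SubElement(file_grp, m("file"),
                                {"ID": file_id, "MIMETYPE": mime})
        ET.SubElement(file_el, m("FLocat"),
                      {"LOCTYPE": "URL", f"{{{XLINK}}}href": url})
        # The structural map points back at each file by its ID.
        ET.SubElement(struct_div, m("fptr"), {"FILEID": file_id})
    return mets


obj = build_object("demo:1", [
    ("F1", "image/tiff", "http://example.org/demo/page1.tif"),
    ("F2", "text/xml", "http://example.org/demo/page1.xml"),
])
xml_text = ET.tostring(obj, encoding="unicode")
print(xml_text)
```

A real repository object would also carry descriptive and administrative metadata sections, which are omitted here for brevity.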

FAQ

“I’ve begun to see Web sites with some unusual domain name extensions. Why were these names introduced, and who, if anyone, regulates their use?”

Since the Internet’s Domain Name System (DNS) was created in the mid-1980s, it has provided a framework for naming host domains (i.e., Web sites) as well as for managing the huge databases, or “registries,” used to locate particular hosts. At its highest level, the universe of Internet hosts, of which there are now over 25 million, has been organized into several Top-level Domains (TLDs). These include three generic TLDs (.com, .org, and .net) and a handful of restricted TLDs, including .edu (limited to educational institutions) and .gov (limited to U.S. government agencies). The original TLDs (.com, .org, .edu, .mil, .gov, plus country domains matching the two-letter ISO standard country codes, e.g., .uk, .au) were established in 1984, as part of the original design process for the DNS. There are now more than 240 country-specific TLDs that are regulated at the national level. The use of TLDs as host name extensions was intended to help users navigate the Internet, by classifying hosts according to the type of institution they represent. At the same time, organizing the Internet by TLD has enabled decentralization of the database registries, a necessary arrangement given that the Internet now logs more than 12 billion DNS lookups every day. The table below indicates the dramatic growth in the number of registered domain names in recent years:

Internet Domain Names

       | Jan ’02    | Jan ’01    | Jan ’00    | Jan ’99
Total: | 29,227,627 | 27,480,324 | 10,008,475 | 4,037,875
.com   | 22,746,754 | 21,023,720 | 8,006,100  | 3,425,625
.net   | 3,988,975  | 3,960,363  | 1,216,750  | 261,375
.org   | 2,484,886  | 2,489,924  | 779,950    | 347,550
.edu   | 7,012      | 6,317      | 5,675      | 4,194
.info  | 687,473    |            |            |
.biz   | 499,410    |            |            |

Source: Zooknic (http://www.zooknic.com/)

Since the late 1990s, the Web’s exponential growth has made it clear that more generic TLDs will be needed to help users find information and to maintain the stability of the DNS itself. The .com domain, in particular, has become so popular that it now accounts for roughly 80 percent of all domain names. The ubiquity of .com names has raised two specific problems. First, .com has come to be used by a wide range of organizations and not just businesses, as was originally intended. Second, as the number of registered hosts has grown, organizations have found it harder and harder to devise meaningful hostnames for their sites, especially since it has become a common practice for individuals and organizations to register multiple hostnames, often in the hope of selling the rights to others, a practice known as “cybersquatting.”
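The "roughly 80 percent" figure can be checked against the totals in the table above. The short calculation below uses only the Total and .com rows as given there; the variable names are ours.

```python
# Total and .com rows from the Zooknic table above, keyed by January of each year.
TOTAL = {2002: 29_227_627, 2001: 27_480_324, 2000: 10_008_475, 1999: 4_037_875}
COM = {2002: 22_746_754, 2001: 21_023_720, 2000: 8_006_100, 1999: 3_425_625}

# Share of all counted names that end in .com, per year.
com_share = {year: COM[year] / TOTAL[year] for year in TOTAL}

# Overall growth in registered names between January 1999 and January 2002.
growth = TOTAL[2002] / TOTAL[1999]

for year in sorted(com_share):
    print(f"Jan {year}: .com share {com_share[year]:.1%}")
print(f"Total names grew about {growth:.1f}-fold from Jan 1999 to Jan 2002")
```

On these figures the .com share stays between about 76 and 85 percent in every year shown, while the total number of names grows roughly sevenfold over three years.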

In spite of the consensus that new TLDs are needed, the expansion process has been neither straightforward nor without controversy. At present, the DNS is primarily governed by ICANN, the Internet Corporation for Assigned Names and Numbers, a private non-profit organization formed in 1998, and funded in large part by the U.S. government.  After ICANN announced its intention to expand the DNS, it received applications from 44 different organizations hoping to win contracts to operate the registry database for each new TLD.  Over a hundred new TLDs were proposed.

From ICANN’s perspective, the choice of new TLDs depended to a large extent on the business plans and technical expertise of the prospective registry operators. The selection process was contentious, however. Some prospective registry operators have charged ICANN with undue secrecy and with setting arbitrary criteria for the choice of new domains. Many expressed puzzlement as to why some names were chosen over others. The name .web, for instance, was rejected, in spite of its obvious appeal as a generic TLD. Recently, ICANN has been the subject of calls for reform from within the technology community and from members of the U.S. Congress.

Still, in November 2000, ICANN formally approved seven new TLDs, with more expected to follow.  The new TLDs approved thus far are:  .biz, .info, .name, .pro, .aero, .coop, and .museum.  All are now operational except .pro, for which negotiations are still underway. For information on the current status of the new TLDs, see http://www.internic.net/faqs/new-tlds.html.

The seven new TLDs fall into two basic categories:  “unsponsored” and “sponsored.”  The unsponsored domains, .biz, .name, .info, and .pro, are intended for broad use and are managed according to global policies, set by ICANN, in much the same way as the older TLDs.  However, unlike the old TLDs, ICANN has decided to place some limits on the use of the new TLDs, to ensure that .biz, for instance, is used only by private businesses.  Likewise, the .pro domain will require proof of professional credentials before a host name can be registered.  (A debate has been underway as to which groups should be entitled to call themselves “professionals.”  Doctors, lawyers, and accountants are likely to be accepted, but how about plumbers, musicians, and horse trainers?)

The .pro and .name domains represent a further departure from current practice, insofar as hosts will only be able to register third-level domain names instead of second-level names, as is the case with existing TLDs. For example, if I decide to name my Web site “jmw.turner.name,” I can only register “jmw” as the unique portion of my hostname. This policy was adopted to discourage cybersquatting, in which, in this case, someone might register turner.name and thereby prevent everyone else with this last name from using these characters in their hostname. ICANN has also sought to combat cybersquatting in the new TLDs by calling for procedures whereby trademark holders can register their own trademarks as domain names, before the new domains are opened to the general public.
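The difference between second-level and third-level registration can be shown by splitting a hostname into its labels. The sketch below encodes only the policy as described in this FAQ; the function name and the set of third-level TLDs are assumptions for illustration, and real registry rules may differ.

```python
# TLDs where, per the policy described above, only third-level names
# can be registered. Illustrative only; actual registry rules may differ.
THIRD_LEVEL_ONLY = {"name", "pro"}


def registrable_label(hostname: str) -> str:
    """Return the label a registrant actually chooses within a hostname."""
    labels = hostname.lower().rstrip(".").split(".")
    tld = labels[-1]
    depth = 3 if tld in THIRD_LEVEL_ONLY else 2
    if len(labels) < depth:
        raise ValueError(f"{hostname!r} is too short to register under .{tld}")
    # Count labels from the right: TLD is 1, second level is 2, and so on.
    return labels[-depth]


print(registrable_label("jmw.turner.name"))  # third-level label
print(registrable_label("www.example.com"))  # second-level label

try:
    registrable_label("turner.name")
except ValueError as err:
    print(err)  # the whole surname cannot be claimed by one registrant
```

Under this rule "turner.name" itself is never registered by an individual, which is what leaves the surname open to everyone who shares it.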

As for the “sponsored” TLDs, .museum, .coop, and .aero, it was intended from the start to restrict these domains to relatively small numbers of institutions, representing particular communities (museums, non-profit cooperatives, and the aviation industry, in these cases).  For each of these domains, ICANN has designated an official Sponsor organization (see http://www.internic.net/faqs/new-tlds.html) that has been empowered to set policies governing who can register hostnames.

In general, the introduction of new Top-level Domains has been part of an ongoing effort to better regulate the Internet as well as to expand and improve its infrastructure.  Nonetheless, it is possible that the Internet might continue to evolve in its historically decentralized and often chaotic manner, in spite of ICANN’s efforts to the contrary.  For example, in 2000, the .tv Corporation, a subsidiary of VeriSign, Inc., acquired the rights to the .tv domain from the Pacific island nation of Tuvalu.  Since ICANN’s authority does not extend to country-specific TLDs, the .tv Corporation thus has a TLD for which it can set its own policies.  In the coming years we can expect the Internet to remain a dynamic frontier, with new territories constantly opening up and new groups of settlers moving in to stake their claim.

-- pkb

Special Focus

Book Scanners and Cradles: Links to Products and Reviews

Stephen Chapman
Library Preservation
Harvard University
stephen_chapman@harvard.edu

Bookscanners and book cradles for digital cameras are of tremendous interest to the preservation community, which has a longstanding commitment to balance materials handling concerns against quality and production cost requirements. Given the tremendous variety among binding structures, sizes, and conditions of books; quality requirements for reproductions; and project budgets, it is unlikely that a one-size-fits-all solution will emerge. Thus, Harvard's Weissman Preservation Center has posted pages on its Web site to define functional requirements for book copying systems (whether analog or digital) and to monitor the commercial and custom-developed products that have proven viable when neither flatbed scanning nor disbinding is an option. The following table is reprinted with permission. Harvard welcomes comments.

Book Scanners | Cradle design | Selected projects | Notes
4DigitalBooks™ Digitizing Line | automatically turns pages | unknown (as of 2/02) | advertised throughput of ca. 800 pages per hour
i2s digiBook | various configurations with and without glass platen (like Zeutschel) | Ransom Center (Gutenberg Bible), Library of Bordeaux, Library of Nantes | configured in seven different models offering grayscale and RGB outputs; contact the U.S. reseller, IImage Retrieval, Inc. in Dallas, TX for more information
IBM Research Pro/3000 Scanner | customized Linhof book easel (see below) | Vatican Library, Library of Congress Federal Theatre Project collection | Mintzer report, which includes image of easel (cradle); reported throughput of 80 images per day; Pro/3000 used in other projects; see articles at IBM site
BookEye | book must open to 180°, no glass | METAe (evaluation) | Muhlberger report notes, "Scanning bound books demonstrates painfully that books are not made for being opened 180°," and that in practice use of these cradles "can lead to broken bindings"
Minolta PS3000 | book must open to 180°, held in place by operator's hands | Internet Library of Early Journals (ILEJ), 1997-99 | ILEJ achieved scanning throughput of 80-100 pages per hour (final report, p. 27)
Minolta PS7000 | book must open to 180°, held in place by operator's hands | METAe (evaluation) | Muhlberger report notes, "Scanning bound books demonstrates painfully that books are not made for being opened 180°," and that in practice use of these cradles "can lead to broken bindings"
PARC Bookscanner | 90° book cradle and wedge platen | UC Berkeley Digital Library project | Steve Ready, et al., "A Bookscanner for Fragile Books," Final Program and Proceedings, IS&T's PICS Conference, 2001, 172-176.
Zeutschel Omniscan x000 Book scanners | various configurations with and without glass platen | GDZ, HEDS | See Tanner, et al., Higher Education Digitisation Service: access in the future, preserving the past - the UK perspective, p. 4.

Cradles | Design notes | Selected projects
exhibit-style cradles or foam supports | | Used by Octavo Digital Imaging Laboratory to photograph rare books; see Octavo collections.
Hand-crafted cherry book cradle | constructed by John Riser | The University of Virginia Library Special Collections Department uses several cradles in their studio. See Early American Fiction, "Equipment and Vendors" page.
Linhof Book Copying Easel | | Vatican Library (see IBM Pro/3000 above). AIA/Bassant modified Linhof Easel also available.
Manfred Mayer cradle | pages held flat by a perforated vacuum bar | Gutenberg Digital. See Lossau and Liebetruth, "Conservation Issues in Digital Imaging," Spectra, Fall 2000.
Preservation Book Cradle | designed by Alan Buchanan | British Library, Oxford University Library, Lund University Library.

Calendar of Events

2002 Museum Computer Network Annual Conference
September 4-7, 2002
Toronto, Canada
In partnership with the Canadian Heritage Information Network (CHIN), the MCN annual conference theme this year is: In It for the Long Haul - Technology Programs That Go the Distance. Topics include building infrastructure, strategic use of membership/development systems, and current collection management systems for imaging. 

Copyright Town Meetings 2002: Museum IP Policy in a Digital World
September 7, 2002
Toronto, Canada
The 19th National Initiative for a Networked Cultural Heritage (NINCH) Copyright Town Meeting is free of charge and will be held in conjunction with the Museum Computer Network (MCN) conference.

Sixth European Conference on Research and Advanced Technology for Digital Libraries
September 16-18, 2002
Rome, Italy
ECDL has become the major European forum focusing on digital libraries and associated technical, practical, and social issues. Integration of methods, services, systems and interoperability across different data structures, metadata and components are the key issues that will be addressed. 

School for Scanning: Creating, Managing, and Preserving Digital Assets
October 16-18, 2002
The Hague, The Netherlands
Presented by the Northeast Document Conservation Center, this conference provides current and essential information for collection managers who are seeking to create, manage, and preserve digital assets. Conference content will include: envisioning our digital future, quality control and costs, copyright, content selection for digitization, and digital longevity and preservation.

Announcements

The State of Digital Preservation: An International Perspective
Now available are the proceedings from the conference held in Washington, D.C., in April 2002. The contents include: the changing preservation landscape, an overview of technological approaches to digital preservation, and understanding digital preservation.

Minerva
Funded by the European Commission, Minerva is coordinated by the Italian Ministry of Culture and members include Italy, Spain, Sweden, Finland, France, Belgium, and the United Kingdom. The goal is to discuss and bring together activities carried out in the national programs concerned with the digitization of cultural and scientific content. The plan is to create a European common platform, recommendations, and guidelines about digitization, metadata, long-term accessibility, and preservation. For further information contact: Rosella Caffo, Project Manager.

New Version of Online Archive of California (OAC) Available
The Online Archive of California (OAC) describes and provides access to over 6,000 collections of primary source materials such as manuscripts, photographs, and works of art held in libraries, museums, archives, and other institutions across California. The new OAC homepage simplifies browsing and searching the finding aids of the collections and, in many cases, digital versions of the photographs, manuscripts, and other objects themselves. The new interface is based upon software from the University of Michigan Library's Digital Library eXtension Service (DLXS) for the provision of EAD-encoded finding aids.

Metadata Object Description Schema (MODS) Available for Trial Use
The Library of Congress' Network Development and MARC Standards Office, with interested experts, has developed the Metadata Object Description Schema (MODS), which is a bibliographic element set that may be used for a variety of purposes, particularly for library applications. As an XML schema it is intended to be able to carry selected data from existing MARC 21 records as well as to enable the creation of original resource description records. It includes a subset of MARC fields and uses language-based tags rather than numeric ones, in some cases regrouping elements from the MARC 21 bibliographic format.

The University of Michigan Libraries Digital Library Production Service Announces OAIster
Using 274,046 records from fifty-five institutions, this new product has created a wide-ranging collection of free, useful, previously difficult-to-access digital resources that are easily searchable by anyone.

Publishing Information

RLG DigiNews (ISSN 1093-5371) is a newsletter conceived by the members of the Research Libraries Group's preservation community. Funded in part by the Council on Library and Information Resources (CLIR) 1998-2000, it is available internationally via the RLG preservation Web site. It will be published six times in 2002. Materials contained in RLG DigiNews are subject to copyright and other proprietary rights. Permission is hereby given for the material in RLG DigiNews to be used for research purposes or private study. RLG asks that you observe the following conditions: Please cite the individual author and RLG DigiNews (please cite URL of the article) when using the material; please contact Jennifer Hartzell, RLG Corporate Communications, when citing RLG DigiNews.

Any use other than for research or private study of these materials requires prior written authorization from RLG, Inc. and/or the author of the article.

RLG DigiNews is produced for the Research Libraries Group, Inc. (RLG) by the staff of the Department of Preservation and Conservation, Cornell University Library. Co-Editors, Anne R. Kenney and Nancy Y. McGovern; Production Editor, Barbara Berger Eden; Associate Editor, Robin Dale (RLG); Technical Researchers, Richard Entlich and Peter Botticelli; Technical Coordinator, Carla DeMello; Technical Assistant, Kimberly Gazzo.

All links in this issue were confirmed accurate as of August 9, 2002.

Please send your comments and questions to RLG DigiNews Editorial Staff.
