August 15, 2002, Volume 6, Number 4
ISSN 1093-5371
Why the Archives Introduced Digitisation on Demand

Ted Ling

Introduction

All cultural institutions today are faced with the challenge of how to promote wider access to, and greater public use of, their collections. For the National Archives of Australia this challenge is complicated by the:
This paper describes the Archives' attempts to meet these challenges through an initiative known as digitisation on demand. I will explain how this initiative was first planned and implemented and the lessons we have learned since implementation. I will also mention how we see this initiative proceeding in the future.

The Tyranny of Distance and the Needs of Researchers

The National Archives of Australia has a head office, reading room, and galleries in Canberra, as well as reading rooms in each State and Territorial capital city. There are eight public facilities throughout the country. Such a network is of limited benefit to researchers who find it difficult to visit our reading rooms. It is important to remember that the Archives does not move records from one city to another. Researchers must go to where the records are located. Once there, they can view the records free of charge. Alternatively, researchers can have a search agent examine the records on their behalf for a fee, or they can have photocopies made and sent to them, also for a fee.
Why should researchers be penalised because they are unable to visit our reading rooms, while other researchers who are able to visit us can access our collection at no cost? The Archives was unable to adequately address this inequity in the traditional reference service environment.

Computer technology has enabled the Archives to provide access to information about our publications, standards, and policies through our Web site to anyone with access to the Internet, regardless of where they live or work. Importantly, for those who require access to our collection it has provided us with the means of presenting information about the collection, and the government agencies that created these records. This has been achieved through RecordSearch, our online database. RecordSearch has given our dispersed researcher audience the ability to identify records that may be relevant to their studies through a keyword search facility. However, until we introduced digitisation on demand, we were unable to fulfill all the informational needs of our researchers because they could not access the actual records online.

Digitisation Trials

Throughout 2000 the Archives tested a number of methods to digitise its collection, including various types of scanners, microfilm-to-digital copiers and digital cameras. The aim of these trials was to find a cost-effective means of making our collection more accessible. It was not about preserving the collection. At the end of the trials it was clear that low-resolution copying using overhead cameras was the most efficient and cost-effective way to proceed. We decided to initiate a digitisation on demand service that would allow researchers to select records from our collection and then request that digital copies of those records be loaded onto RecordSearch where they would be available to all researchers. We also decided to proactively identify certain high-use records for digitising.
Copyright

One issue that could severely curtail the effectiveness of this new service was copyright. Most records in the collection consist of unpublished manuscript material, in which the Commonwealth Government holds copyright. However, in many cases copyright is held privately or by parties other than the Commonwealth: individuals, businesses, and foreign and state governments.

The approach adopted by the Archives was to look realistically at the nature of the material in question, and to look at the overriding purpose for which the Archives was planning to digitise and publish this material online. Generally the material in which copyright is held privately is of no commercial value. The records are mostly between 30 and 150 years old and, because of the passage of time, current copyright holders often cannot be identified or traced. The principal objective of our digitisation initiative is to fulfill our statutory function of encouraging and facilitating the use of archival material, and to that extent we have determined that it is in keeping with the spirit of the "fair use" provisions of our Copyright Act. On these grounds, the Archives felt that the public interest lay overwhelmingly in favour of proceeding with the initiative.

Users of the Archives Web site who identify themselves as the copyright owner of a record that has been digitised and added to our Web site, and who object to its continued availability online, are asked to contact us. In the 12 months since the service began, and with over one million images published online, no one has yet approached the Archives expressing such concern, something that we believe vindicates our decision not to take a narrow interpretation of the Copyright Act.

Privacy

There was a final issue to be considered before we introduced our online digital service: privacy. Australian legislation regulates access to the Archives' collection and requires that we withhold sensitive personal information from public access.
We sought legal advice to determine if there was a distinction between releasing records to the public in a reading room, or in photocopy form, and loading digital copies onto a Web site where they can be viewed by anyone with Internet access. We were advised that there was no difference. It is important to note that we only digitise records suitable for public release.

Introducing the Digitisation on Demand Initiative

We began our digitisation on demand service on 11 April 2001. The service was not publicised widely, as we did not know how the processes tested in an artificial environment might translate into actual service. Nor did we have an appreciation of the volume of requests that would be handled by an initiative that was very much in an embryonic stage. As a first step we decided the service would be offered for records located in Canberra only. This gave us time to refine our procedures, gauge the volume of requests, and establish the appropriate infrastructure needed to provide a national service. The service was extended to Sydney on 5 April 2002 and in time will be extended to other state offices.
Digitisation on Demand - How it Works

The Archives' process of creating digital copies has three components: capturing images using digital cameras, and then processing and loading these images into RecordSearch using software developed in-house: ImageStore and ImageLoader.

Capturing the Images

Capturing digital images is a simple task for the operators. The hardware consists of a digital camera (Canon PowerShot G2) mounted on an adjustable stand for overhead alignment and a computer for processing the captured digital images. The procedure requires the operator to place the record under the camera, aligned in a pre-set position, capture the image by releasing the camera shutter, turn to the next folio, and continue until the whole record is digitised. Operators are required to digitise from the top of the record down and to avoid dismantling the record unless it is necessary for legibility. Operators work in five-hour shifts, with short breaks each hour. Capture rates are averaging 500 images per operator per five-hour shift. The average capture rate is easily achievable for regularly formatted records (i.e., where no dismantling of records, removal of pins, plastic sleeves, unfolding of maps, etc., is required).

Processing the Images

ImageStore rotates and crops the captured images without human intervention. It allows an on-screen review of documents copied and the replacement or redoing of single images if necessary. The program saves a large and a small copy of each raw image produced during the capture stage. The small image is the default image and is loaded for viewing. Researchers using RecordSearch can select the larger image for print purposes, or if there are legibility problems with the small version. The image processing rate is about 20 images per minute. However, in practice, the processing rate is constrained by the rate of capture.

Loading the Images onto RecordSearch

ImageLoader is the conduit for loading the digital images onto RecordSearch.
This program will also load images that have been captured by processes other than the digital camera/ImageStore mechanisms. It has the facility to replace and delete images or whole records. A summary of the Archives' specifications is in the appendix.

How Researchers Request Online Digital Records

To request an online digital copy, researchers select a button that appears on the record description screen on RecordSearch [1].
The researcher lodges an online request for a digital copy [2] and in return receives an electronic acknowledgment [3].
When the digital copy has been made and is available for viewing, an icon appears on the record description screen [4]. We do not contact researchers and advise them when their record is available. It is their responsibility to check the Web site from time to time.
When researchers open the digital copy they see a navigational tool at the top of the page [5]. It allows them to advance through the record page by page, or jump ahead to any page they require.
There are also version selection icons that appear at the top left side of the screen. By default the "small" digital image (e.g., 66 KB) will appear, which is usually adequate for on-screen viewing and printing. In practice, we have found that the "small" image usually provides a very legible on-screen and printed copy.
As part of our digitisation on demand service, we undertake to provide our researchers with:
We do not promise total quality control, as we generally do not check the images. If we are advised that an image is poor we will simply rescan it. Nor do we promise high quality images. There will be some pixellation. We use standard energy saving fluorescent lighting, not studio lighting, so some glossy surfaces do present problems with reflection and the lighting of pages is not always evenly distributed. Some of these deficiencies can be resolved quite easily, but this requires more individual attention, is thus more time consuming, and reduces output. Digitisation on demand is also limited to formats of A3 size (297 × 420 mm) or smaller.

In essence, we believe that the primary measure of the success of our digitisation on demand service is legibility, not the cosmetic appearance of the images. It is important to remember that digitisation on demand is all about providing accessibility to the Archives collection, not the preservation of the collection. It is therefore about providing low-resolution digital images, not high resolution ones.

Digitisation on Demand - One Year's Experience

Digitisation on demand had its first birthday on 11 April 2002. Our researchers are delighted with the service. This is what three of them had to say:
We have received many similar bouquets. The service is outstandingly popular.
Managing the Demand

We have been overwhelmed by the interest generated by this initiative. As far as we know, we are the only archives that allow the public to choose which records will be digitised, and we provide this service at no cost. Even though there has been little publicity the demand was instantaneous and it has shown no sign of abating. Between 11 April 2001, when the service began, and 31 March 2002 we digitised and loaded onto RecordSearch a total of 1,090,934 images.

We initially promised our researchers a 30-day turnaround time. However, the high volume of requests has meant delays of up to 90 days. We now simply tell researchers on what date requests currently being digitised were received. To help manage the demand we have introduced longer shifts. We have a team of six operators working shifts between 8:00 a.m. and 6:00 p.m. (Monday to Friday). Yet the demand is still rising. We have limited the number of records a researcher can request to five each year. However, this has not stemmed the flow.

The service is currently free. Our view is why should someone have to pay for a digital copy that is then loaded onto our Web site for the entire world to see for free? Furthermore, the service we are now providing is intended to promote equal opportunity for those researchers who cannot visit our reading rooms, where they could access the records at no charge. We could introduce a fee in return for a fast tracking service, but we believe that this would only create a disparate level of service whereby those who can pay receive one standard of service, and those who cannot pay receive a lesser standard.

We could adopt the same policy as the National Archives of Canada and consult with various user groups (family historians, academics, etc.) to ascertain which are our most valued records and then digitise them, rather than digitise individual items on request.
But if we followed the Canadian model we would undoubtedly be digitising some records that are of no interest to many researchers. The reality is that through our digitisation on demand service we are giving our researchers exactly what they want. They are telling us precisely which records are of value to them and we are doing our best to meet that demand.

Our current policy is to develop a combination of proactive and reactive digitisation services. Proactively, like the Canadians, we will identify certain high demand records and have them digitised by external contractors. Reactively, we will continue to digitise records on demand in-house. The delays are likely to continue and we will advise our researchers accordingly. If they are prepared to wait we will digitise the records they want at no charge. If they cannot wait they have the option of obtaining a photocopy (for a fee) or visiting our reading rooms to see the records personally (at no cost). So far, the evidence is that most researchers appreciate the service and are prepared to wait.

Extending Digitisation into the Future

It is clear that we have introduced a service that our researchers value and that the demand for this service will only continue to grow. We know that many institutions are watching with interest to see how we manage the service. In the past year we have worked with a number of organisations to increase accessibility to our collection through the Internet. The digital system that we have established allows external sites to link to digital images in RecordSearch. This has a multiplier effect in that some researchers who come to RecordSearch from other sites may not have had access to these records if it had not been for the link provided from their original search site. A few examples will illustrate this point.

During the digitisation trials, the University of Newcastle approached us.
They wanted to make digital copies of archival documents available to their students for research course work. A number of records were digitised and have subsequently been made available online, both for students and for anyone else interested in foreign relations. This group of records covers aspects of Australia's foreign relations with Japan, Indonesia, Portuguese Timor, and China. We have since developed a number of subject-based icons on our Web site so that researchers have the option of locating records grouped by subjects such as Foreign Relations. Researchers can access records by their control numbers, or they can simply search the Foreign Relations icon. While there is only one digital copy of each record, each can be accessed through different points on our Web site.

We have developed an alliance with the Hellenic Studies Centre at La Trobe University in Victoria to help them gather together records that document Greek migration and other aspects of life in Australia for Hellenic people. Rather than requesting photocopies of relevant records, the Centre now selects records and we digitise them. The Centre then provides links from their online collection to our records on RecordSearch. The result is that a significant group of records is available through the Web sites of both organisations. A similar alliance is now in place with the John Curtin Prime Ministerial Library and it is anticipated that another alliance will be established with Deakin University.

Annual Cabinet Release

At the beginning of each year, Cabinet records that are 30 years old are publicly released. A media launch takes place in early December before the public release. At the moment we provide journalists with a bound volume of selected highlights in photocopy form (which we call a "brick"). The journalists take the volume away with them and use it to write their stories.
We now package these records in a digital form, so journalists can access the digital copies from their home or office.

Committees of Inquiry

In recent years there have been a number of committees of inquiry, e.g., Aboriginal deaths in custody, the separation of Aboriginal and Torres Strait Islander children from their families, and child migration from the United Kingdom and Malta. Such committees have often indicated how important records are to people's lives and their identities. We now have the potential to provide online copies of key records identified by these committees and referred to in their reports. For the child migration inquiry we have already begun linking relevant records to the committee's report.

Fact Sheets and Reference Guides

Like many archival institutions, we produce an array of fact sheets and detailed subject-based reference guides. These products are located on our Web site. We can now link digital copies of records to the fact sheet or guide in which they are listed. This provides researchers with an opportunity to view not only the information about a record, but a digital copy of each record as well.

Digitising Records in Many Formats

Digitisation on demand is not just about copying files and documents. We can digitise photographs, plans and many other formats. Here are a few examples:
Over the past five years we have witnessed how new and emerging technology has changed people's lives. The Internet has become a central part of our communication, business, and entertainment industries. According to the Australian Bureau of Statistics, in 1997 7.5% of households had access to the Internet. The following year access increased to 19%, followed by 25% in 1999 and 37% in 2000. The impact of the Internet was in fact recognised by the Bureau when, for the first time, its usage was included as part of the questions asked of all Australians in the 2001 census.

In 1995 the Archives grasped the opportunity that the Internet provided to make our research services more widely accessible. It was this technological foundation that enabled the transition to an online digital service that began in April 2001. If we are to continue to provide accessibility to collections and services that are relevant to our ever-changing environment, we cannot afford to ignore new technologies or the wants and needs of our researchers. Our digitisation on demand service is just the beginning. There is much more that we can do and the only limitations are technology and the resources available to us.

Appendix: Image Capture Output Specifications and Statistics

Digital camera: Canon PowerShot G2

Document dimensions
Pixel dimensions
Average file sizes
Capture rate: 100,000 images per month, based on an average of 500 images per operator per five-hour shift. In practice, each shift includes a total of 20 minutes of breaks and approximately 55 minutes of processing time, so the effective time available for capture per shift is about 3 hours 45 minutes. As far as possible, breaks are taken during the processing of large files. This rate is easily achievable for records in regular formats (i.e., where no dismantling of records, removal of pins, plastic sleeves, unfolding of maps, etc., is required). The capture rate can quickly fall if this sort of manual preparation is needed. By contrast, the rate can increase to over 1,100 images per shift for regularly formatted records.

Processing time: Approximately 20 images per minute per operator, less if editing is required. Approximately 12 minutes of each hour is spent processing the images captured. As mentioned above, where possible breaks are taken during the processing of large files to maximise productivity. Productivity falls dramatically with smaller files: they are processed so quickly that the operator has to be present and cannot use the processing time for the hourly break. Processing time is also used for reassembling records that have to be taken apart for capturing.

Storage of captured data: Captured data is housed on a single server with a capacity of 2 TB, of which just over 1,300 GB is free. The database is growing at the rate of 40 GB per month. We can, however, add disk storage to the current machine or add servers as the database grows. There is no practical limit to the amount of disk space that the application design can address, as it is designed to span multiple machines.
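
The appendix figures can be cross-checked with some simple arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, the free storage is rounded down to 1,300 GB, and the storage projection assumes that growth stays constant at 40 GB per month and that no disk is added, neither of which the Archives states.

```python
# Figures quoted in the appendix; names are illustrative, not the Archives' own.
SHIFT_MINUTES = 5 * 60        # five-hour shift
BREAK_MINUTES = 20            # total breaks per shift
PROCESSING_MINUTES = 55       # approximate processing time per shift
IMAGES_PER_SHIFT = 500        # average capture rate per operator per shift
IMAGES_PER_MONTH = 100_000    # quoted monthly capture rate
FREE_STORAGE_GB = 1_300       # "just over 1,300 GB", rounded down (assumption)
GROWTH_GB_PER_MONTH = 40      # assumed to stay constant


def effective_capture_minutes() -> int:
    """Minutes per shift left for capture after breaks and processing."""
    return SHIFT_MINUTES - BREAK_MINUTES - PROCESSING_MINUTES


def seconds_per_image() -> float:
    """Average capture time per image at the quoted shift rate."""
    return effective_capture_minutes() * 60 / IMAGES_PER_SHIFT


def operator_shifts_per_month() -> float:
    """Operator shifts implied by the quoted monthly capture rate."""
    return IMAGES_PER_MONTH / IMAGES_PER_SHIFT


def months_until_full() -> float:
    """Months before the free space is used, with no disk added."""
    return FREE_STORAGE_GB / GROWTH_GB_PER_MONTH


if __name__ == "__main__":
    print(f"Effective capture time: {effective_capture_minutes()} minutes per shift")
    print(f"Average time per image: {seconds_per_image():.0f} seconds")
    print(f"Operator shifts per month: {operator_shifts_per_month():.0f}")
    print(f"Months until storage is full: {months_until_full():.1f}")
```

On these assumptions, the 3 hours 45 minutes of capture time works out to roughly 27 seconds per image. The monthly rate implies about 200 operator shifts, and the free space would last a little under three years without expansion.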