Editor's Note
A Fond Farewell

As Jim and Lorcan have noted, this is the last issue of RLG DigiNews in its current incarnation. It really hit me last week as I sat in my living room chair early one morning, drinking coffee and reviewing the feature articles, that the job I’ve been doing for a decade has come to an end. Devoting this last issue under the editorship of Cornell University Library to reflecting on a decade of change has allowed my colleagues and me to reach closure more easily. Ten years ago, Google was neither a household name nor a verb. Mass digitization was imagined in the thousands of images rather than the millions and billions. And “digital preservation” was commonly used interchangeably with digitization. There was no OAIS, no PREMIS, no JHOVE; MTF wasn’t a term that tripped off the tongue easily. Certification referred to methylene blue tests, not trustworthy digital repositories.
Our feature articles highlight some of the changes in the two key areas consistently covered by RLG DigiNews over the years: digital imaging and digital preservation. It’s gratifying to see the progress that has occurred in our understanding of the issues, particularly as they have been informed by practical experience at a range of cultural institutions. The FAQ continues a long tradition of probing assumptions about what is and what isn’t—this time focusing on legal impediments to digital preservation and the role of Open Archives. And, if we can be forgiven for being self-referential, it’s only fitting to showcase RLG DigiNews as the last Highlighted Web Site.
Over the past decade, I’ve had the privilege of working with wonderful colleagues both at RLG and Cornell. Robin Dale served as Associate Editor from the very beginning, providing invaluable advice, support, and content along the way. Other RLG contributors included Nancy Elkington, Jennifer Hartzell, and Jane Moss. A total of seventeen staff at Cornell helped produce RLG DigiNews over the years. I’d particularly like to acknowledge the contributions of Oya Rieger (co-editor from 1997 to 2001), who co-developed the newsletter’s focus, and Nancy McGovern (co-editor from 2002 to 2006), who brought a deep understanding of digital preservation. Rich Entlich served as the FAQ editor, gaining well-deserved kudos for his thoughtful insights into technical dimensions of digital imaging and preservation. Peter Hirtle served as Advisor and frequent contributor, lending his expertise in intellectual property issues. Barbara Berger Eden edited the announcements and calendar of events for a number of years in her capacity as Production Editor. Carla DeMello brought her considerable design skills to bear on the look and feel of the newsletter. Others involved in production and editing included Ellie Buckley, Peter Botticelli, Jenn Colt-Demaree, Martha Crowe, John Dean, Kimberly Gazzo, Robert Glase, Valerie Jacoski, Erica Olsen, and Allen Quirk.
I can safely speak for all of my colleagues in expressing our gratitude to RLG for this decade of collaboration, to the many authors who contributed feature articles, FAQs, editor’s interviews and conference reports, and to the readers for their interest and timely feedback. We’ll continue to support RLG DigiNews as eager consumers of its newly conceived focus and direction.
Anne R. Kenney, Editor, RLG DigiNews
Feature Article 1
 Digital Imaging - How Far Have We Come and What Still Needs to be Done?
Authors: Steven Puglia - US National Archives and Records Administration (steven.puglia@nara.gov), Erin Rhodes - US National Archives and Records Administration (Erin.Rhodes@nara.gov)

Introduction
Libraries, archives, and museums have been engaged in the digitization of their collections for well over a decade now. As we look back over the past ten years, what is the best way to assess how far we have come and what work on defining digital imaging approaches still needs to be done?
This article attempts to provide a brief overview of the conceptual and technical influences that have defined digital imaging in cultural institutions during the last decade—by looking at goals and objectives for digital reformatting and how they have changed, by looking at specifications and imaging guidelines and how they have evolved, and by identifying areas that still merit further investigation. The focus is on digitization used to create raster images, as this type of work represents a very high percentage of the digitization that has been done to date.
In general, digitization has moved past the experimental, startup, standalone operation phase. Unfortunately, in many organizations, digitization projects still have not been fully realized as “mainstreamed programs.” Even if digital imaging activities are not quite institutionalized, digitization has found its place within a larger context and is directly related to work being done in the following areas:
- archival and preservation issues and activities
- managed repositories
- IT infrastructure (networks, databases, storage)
- on-going collection and digital project management and policy issues
- Web and online access; metadata and cataloging; and other digital library activities
In many institutions, digitization programs have forged relationships with allied departments, such as faculty media labs, academic computing centers, campus museums, faculty research projects, and with IT systems, such as bibliographic catalogs, collection management systems, digital asset management systems, etc. As we move towards supporting large scale digital imaging programs organizationally, technically, and with the dedication of more resources, there is a growing and sobering understanding of the significant investment that will be required to carry out effective digital imaging initiatives. This is especially true in the areas of staff expertise, IT systems and infrastructure, digitization and metadata specifications and standards, and the costs to create digital resources and manage them over the long-term.
As digital imaging activities move beyond special collections in libraries and archives to include large-scale commercial partnerships for the mass digitization of more general collections, the use and nature of these collections are being transformed as well. Fixed, discrete, unique collections are becoming resources that will provide the groundwork for networked information, research, and services that we had not previously envisioned.
From some perspectives, a great deal of progress has been made in our understanding of digital imaging as a technology and how to use it within cultural institutions for reformatting collections and making them more accessible. Conversely, some, like Nicholson Baker, have argued that prior approaches to “preservation reformatting,” such as microfilming, were conceptually and technically flawed, and we are only carrying forward similar problems into the digitization environment.
It is undeniably true that the more we learn about digital imaging as a technology, and about digitization as an institutionalized program, the more we realize there is much more to learn. In reviewing efforts during the last decade to define digital imaging approaches, we conclude that in some areas, not a lot of progress has been made.
Goals of Digital Reformatting
“Technology” as a term has come to be synonymous with computers and information technology, particularly the term “high technology.” The dictionary definition of technology is “the practical application of knowledge especially in a particular area” (Merriam-Webster online, 2007). Technology is never “THE answer” to our problems in cultural heritage institutions (and in many ways the same can be said for life in general): technology is just the tools we have available to us to address problems. Over time, the nature and types of tools change, and so does our understanding of problems and how best to address problems. The goal is to become sophisticated users of all appropriate tools, by selecting and using them wisely. We need to acknowledge both the advantages and disadvantages of our tools, and our corresponding technological choices for solving specific problems. What are our goals and assumptions for digital imaging in the context of cultural heritage institutions?
Early digital imaging efforts focused markedly on the technology itself: on the technological feasibility of scanning, on defining work processes, and on accomplishing the actual conversion. They paid less attention to the bigger-picture questions of how best to use digitization in an institutional, preservation, or other particular context. The complete range of issues needing to be addressed in order for digitization to be an effective tool within our institutions was not initially tackled. From an imaging perspective, addressing these issues entails asking ourselves: what are the essential characteristics of originals that we want to replicate and carry forward in the digital copy?
Essential characteristics will inform future users about the original resources. The definition of and selection of essential characteristics is informed primarily by curatorial/archival and preservation decisions. Often, they are defined by a variety of physical and qualitative properties (for photographs, these include generation, size, quality, condition, and intended use). Also, they will be unique to the collection/record/media type, and at times are likely to be institution-specific. In all cases, they should be well defined and appropriate to the original resources. In the context of using digitization for preservation reformatting, the ability to define the essential characteristics of originals at a very high level allows us to determine whether the digital copy could truly “stand in” for the original.
The identification and definition of essential characteristics for collection materials in cultural institutions is not new. Approaches to analog preservation reformatting have included specific conceptual rationales regarding essential characteristics and corresponding imaging approaches to reproduce those characteristics. For example, industry standards relating to microfilming documents and preservation microfilming guidelines (http://www.loc.gov/preserv/usnpguidelines.html, http://www.oclc.org/preservation/microfilming/standards/default.htm, and http://www.archives.gov/about/regulations/part-1230.html#partc) focus on the essential characteristic of text legibility. Micrographics standards and guidelines outline specific imaging approaches to maintain this characteristic on the microfilm.
Early approaches to digital imaging came primarily from two perspectives: jump in and just do it, or conduct a pilot project to learn about the technology and the process and then build a program around these experiences. Each of these approaches has fostered different cultures of digital imaging programs within libraries, archives, and museums. For the preservation community in particular, initial forays into digitization were in many ways an extension of brittle book reformatting projects: should we scan rather than microfilm? Early on, the community recognized that digitization increases and enhances access, but does not guarantee preservation in the same way as microfilm.
Because the community came from a long tradition of microfilming, there was much initial interest in digitizing text. Early digitization approaches in the library community focused on defining essential characteristics for text-based originals and corresponding approaches to digitizing to match these characteristics. This includes work done at Cornell and Yale universities in the mid-1990s. In researching the scanning of text-based originals directly, the scanning of microfilm, and the feasibility of hybrid approaches (film first and then scan, compared to scan first with the subsequent creation of computer output microfilm or COM), these projects focused on two essential characteristics (legibility of the smallest significant character and accurate rendering of typefaces or fonts) as metrics for evaluating the appropriateness of specific digital imaging parameters.
While these are appropriate characteristics for high-contrast, text-based information, other types of originals have other or additional characteristics. Generally, early digitization guidelines adopted the approaches developed for high-contrast text. Many institutions simply accepted specifications developed from these projects, without much consideration of the particular originals or formats that were being digitized. In this phase of digital imaging in libraries, in many ways we were asking only one question: what is the digital equivalent of microfilm? Given the other technological and economic limitations everyone was wrestling with to implement large-scale digitization initiatives, even trying to achieve the digital equivalent of microfilm seemed to be a major challenge.
Cornell University Library’s workshops on digitization in the 1990s provided the community with an excellent technical foundation for digitizing library collections. A major component of the workshops was an approach to defining essential characteristics called benchmarking. The benchmarking concept was applied initially to text, then to graphic illustrations, and finally to photographs. The process focused on identifying the smallest significant character or feature as the primary metric for determining sampling frequency or spatial resolution. The benchmarking approach has had a significant influence on the development of digitizing guidelines over the last 10 years. The Digital Library Federation adopted recommendations for spatial resolution and bit depth as the Benchmark for Faithful Digital Reproductions of Monographs and Serials based on the extensive work conducted by Cornell and other institutions.
Unfortunately, over the last ten years the digital equivalent of microfilm has not held up as an entirely acceptable model for digitizing text-based originals. Users are demanding that other essential characteristics be carried forward in the digital versions, such as color and the ability to see fine detail. Approaches to digitization are moving towards the consideration of both user expectations and the characteristics of the original resources, rather than being tied to an approach that replicates an earlier technology.
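As a rough illustration of the benchmarking idea, the sketch below estimates a scanning resolution from the height of the smallest significant character. It assumes the digital Quality Index relationship commonly associated with the Cornell benchmarking work, QI = (dpi × 0.039 × h) / 3 for bitonal capture and a divisor of 2 for grayscale or color, where h is the character height in millimeters. Neither that formula nor the target QI of 8 is stated in this article, and the function name is hypothetical, so the values should be checked against the original benchmarking report before being relied upon.

```python
def required_dpi(char_height_mm, quality_index=8.0, bitonal=True):
    """Estimate scanning resolution from the smallest significant character.

    Assumes QI = (dpi * 0.039 * h) / k, with k = 3 for bitonal capture
    and k = 2 for grayscale or color capture (an assumption, see lead-in).
    """
    if char_height_mm <= 0:
        raise ValueError("character height must be positive")
    k = 3.0 if bitonal else 2.0
    return k * quality_index / (0.039 * char_height_mm)


if __name__ == "__main__":
    for height in (1.0, 2.0):
        bitonal_dpi = round(required_dpi(height))
        gray_dpi = round(required_dpi(height, bitonal=False))
        print(f"{height} mm character: {bitonal_dpi} dpi bitonal, {gray_dpi} dpi grayscale")
```

Under these assumptions a 1 mm character calls for roughly 600 dpi in bitonal capture, and the requirement falls as the smallest significant character gets larger.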
Another example of an analog reformatting approach that replicates essential characteristics is the National Archives and Records Administration’s (NARA) photographic duplication specifications (developed jointly with the Library of Congress over a period of several years with input from commercial vendors – available at http://www.archives.gov/preservation/formats/bw-copying-specs.pdf) for historic negatives. The NARA specifications and other approaches to duplicating still photographic negatives focus on the creation of duplicate negatives that have the same photographic properties as the original negatives. So that the photographic duplicates can be used and printed in the darkroom just like the originals, certain photographic properties are considered essential characteristics.
As described in the duplication specifications, we adopted a new approach to the tone reproduction for the duplicates, called shadow normalization. Traditional duplication approaches were set up to create duplicate negatives that matched the original negatives in terms of overall density, density range, and the relationship between the tones of the image. In order to optimize the duplication process for original negatives with large density ranges and to provide a means of objective assessment of the duplicates (using statistical process control), we opted to adjust the exposure for each negative and place the shadow density at a specified aimpoint on the duplicates. We concluded that the benefits of this approach outweighed losing one of the essential characteristics: the duplicates no longer had the same overall density as the originals (unless by coincidence the shadow density of the original was close to the aimpoint density).
We have certainly moved beyond the phase of limiting ourselves to deficiencies inherent in old technologies and approaching imaging as an extension of, or equivalent to, microfilm. As we consider digitization as a means for preservation reformatting, we will have to weigh similar considerations as we define approaches to digital imaging. Defining characteristics for each type or class of digital object will most likely result in approaches that are not as consistent or standard across different resources, and may also be more difficult to implement.
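The trade-off described above can be sketched numerically. The example below is a deliberately simplified model, not the NARA procedure: it assumes a pure exposure change shifts every tone by the same density offset (a duplication contrast of 1.0), and the aimpoint of 0.30 and the density readings are hypothetical values chosen only for illustration.

```python
def shadow_normalize(densities, aimpoint=0.30):
    """Shift densities so the shadow (lowest) density lands on the aimpoint.

    Simplified model: every tone moves by the same density offset.
    The default aimpoint is a hypothetical value, not a published one.
    """
    offset = aimpoint - min(densities)
    return [round(d + offset, 2) for d in densities]


def density_range(densities):
    """Difference between the highest and lowest density readings."""
    return round(max(densities) - min(densities), 2)


if __name__ == "__main__":
    original = [0.55, 0.90, 1.40, 2.05]  # hypothetical shadow-to-highlight readings
    duplicate = shadow_normalize(original)
    print("duplicate densities:", duplicate)
    print("density range kept:", density_range(original) == density_range(duplicate))
```

In this model the density range and the relationship between tones survive, while the overall density changes unless the original shadow density already sits at the aimpoint.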
Imaging Specifications and Guidelines
The following represents a chronology of digital imaging specifications and guidelines, and other articles and publications that have been influential. The list is fairly comprehensive but is not intended to be the “definitive” list; we apologize for leaving off any other significant documents.
1995
- Digital Resolution Requirements for Replacing Text-Based Material: Methods for Benchmarking Image Quality. CLIR Report pub53. By Anne R. Kenney and Stephen Chapman. 1995. http://www.clir.org/pubs/reports/reports.html
1996
- Conversion of Microfilm to Digital Images. Request for Proposal. Library of Congress. February 1996. http://memory.loc.gov/ammem/prpsal5/rfp5.pdf or http://memory.loc.gov/ammem/prpsal5/coverpag.html
- Requirements and Options for the Digitization of the Illustration Collections of the National Museum of Natural History. National Museum of Natural History’s Collections and Research Information System. By Donald D’Amato and Rex Klopfenstein. March 1996. http://www.nmnh.si.edu/cris/techrpts/imagopts/
- Recommendations for the Evaluation of Digital Images Produced from Photographic, Microphotographic, and Various Paper Formats. By Franziska Frey and James Reilly. May 1996. http://lcweb2.loc.gov/ammem/Ipireprt.pdf
- Digital Imaging for Libraries and Archives. Cornell University Library. By Anne R. Kenney and Stephen Chapman. June 1996. http://www.library.cornell.edu/preservation/dila.html
- Digital Images from Original Documents – Text Conversion and SGML-Encoding. Request for Proposal. Library of Congress. June 1996. http://memory.loc.gov/ammem/prpsal/rfp18.pdf or http://memory.loc.gov/ammem/prpsal/coverpag.html
- Digital Conversion of Research Library Materials – A Case for Full Information Capture. D-Lib. By Stephen Chapman and Anne R. Kenney. October 1996. http://www.dlib.org/dlib/october96/cornell/10chapman.html
1997
- Digital to Microfilm Conversion: A Demonstration Project 1994-1996. Final Report to the National Endowment for the Humanities. By Anne R. Kenney. 1997. http://www.library.cornell.edu/preservation/com/comfin.html
- Conversion of Pictorial Materials to Digital Images. Request for Proposal. Library of Congress. May 1997. http://memory.loc.gov/ammem/prpsal9/rfp9.pdf or http://memory.loc.gov/ammem/prpsal9/coverpag.html
1998
- Guidelines for Digitizing Archival Materials for Electronic Access. U.S. National Archives and Records Administration. By Steven Puglia and Barry Roginski. January 1998. Guidelines - http://www.archives.gov/preservation/technical/guidelines-1998.pdf Matrix - http://www.archives.gov/preservation/technical/guidelines-matrix.pdf
- What is an MTF…and Why Should You Care? RLG DigiNews. By Don Williams. February 1998. http://www.rlg.org/preserv/diginews/diginews21.html#technical
- Digital Formats for Content Reproductions. Library of Congress. By Carl Fleischhauer. August 1996 - http://memory.loc.gov/ammem/formatold.html July 1998 - http://memory.loc.gov/ammem/formatold.html
- Guidelines for Image Capture. Joint RLG and NPO Preservation Conference – Guidelines for Digital Imaging. By Stephen Chapman. September 1998. http://www.rlg.org/preserv/joint/chapman.html
- Manuscript Digitization Demonstration Project. By Louis Sharpe and Michael Ott. For Library of Congress. October 1998. http://memory.loc.gov/ammem/pictel/pictel.pdf or http://memory.loc.gov/ammem/pictel/index.html
1999
- Digital Imaging and Preservation Microfilm: The Future of the Hybrid Approach for Preservation of Brittle Books. RLG DigiNews. By Stephen Chapman, Paul Conway, and Anne R. Kenney. February 1999. http://www.rlg.org/legacy/preserv/diginews/diginews3-1.html#feature1
- Imaging Pictorial Collections at the Library of Congress. RLG DigiNews. By John Stokes. April 1999. http://www.rlg.org/legacy/preserv/diginews/diginews3-2.html#feature
- Illustrated Book Study: Digital Conversion Requirements of Printed Illustration. By Anne R. Kenney and Louis Sharpe. For the Library of Congress. July 1999. http://memory.loc.gov/ammem/techdocs/ibs.pdf or http://www.loc.gov/preserv/rt/illbk/ibs.htm
- Digital Imaging for Photographic Collections – Foundations for Technical Standards. Image Permanence Institute. By Franziska Frey and James Reilly. December 1997 article - http://www.rlg.org/preserv/diginews/diginews3.html#com 1999 - http://www.imagepermanenceinstitute.org/shtml_sub/digibook.pdf
2000
- Image Quality Metrics. RLG DigiNews. By Don Williams. August 2000. http://www.rlg.org/legacy/preserv/diginews/diginews4-4.html#technical1
- Digital Imaging Production Services at the Harvard College Library. RLG DigiNews. By Stephen Chapman and William Comstock. December 2000. http://www.rlg.org/legacy/preserv/diginews/diginews4-6.html#feature1
2001
- Report of Imaging Practitioners Meeting on 30 March 2001 to Consider How the Quality of Digital Imaging Systems and Digital Images May be Fairly Evaluated. Digital Library Federation. By Stephen Chapman. May 2001. http://www.diglib.org/standards/imqualrep.htm
- Digital Reproduction Quality: Benchmark Recommendations. RLG DigiNews. By Daniel Greenstein and Gerald George. August 2001. http://www.rlg.org/legacy/preserv/diginews/diginews5-4.html#featured
- Guidelines for Digital Imaging Projects. University of Illinois at Urbana-Champaign. December 2001. http://images.library.uiuc.edu/resources/digitalguidev3.pdf
2002
- Benchmark for Faithful Digital Reproductions of Monographs and Serials. Digital Library Federation. December 2002. http://www.diglib.org/standards/bmarkfin.pdf or http://www.diglib.org/standards/bmarkfin.htm
2003
- Western States Digital Imaging Best Practices. Collaborative Digitization Program (formerly the Colorado Digitization Program). January 2003. http://www.cdpheritage.org/digital/scanning/documents/WSDIBP_v1.pdf
- Debunking of Specsmanship. RLG DigiNews. By Don Williams. February 2003. http://www.rlg.org/legacy/preserv/diginews/diginews7-1.html#feature1
2004
- Technical Guidelines for Digitizing Archival Materials for Electronic Access: Production Master Files – Raster Images. U.S. National Archives and Records Administration. By Steven Puglia, Jeffrey Reed, and Erin Rhodes. June 2004. http://www.archives.gov/preservation/technical/guidelines.pdf
- Digital Master Images – Sample Technical Specifications for Photograph Collections. Library of Congress, Prints and Photographs Division. Compiled by Kit Peterson. June 2004. http://www.loc.gov/rr/print/tp/DgtlMastersSamplSpecsSelctdRcmndFinal7_2004.pdf
- Standards Related to Digital Imaging of Pictorial Materials. Library of Congress, Prints and Photographs Division. Compiled by Kit Peterson. September 2004. http://www.loc.gov/rr/print/tp/DigitizationStandardsPictorial.pdf
2005
- Introduction to Basic Measures of a Digital Image for Pictorial Collections. Library of Congress, Prints and Photographs Division. By Kit Peterson. June 2005. http://www.loc.gov/rr/print/tp/IntroDgtlImage.pdf
- FDsys Specifications for Converted Content – Digitization Specifications and Operating Procedures for Archiving Materials: Creation of Preservation Master Files. U.S. Government Printing Office. June 2005. http://www.gpoaccess.gov/legacy/FDsys_ccspecs.pdf
- CDL Guidelines for Digital Images. California Digital Library. July 2001 - http://chnm.gmu.edu/digitalhistory/links/pdf/chapter3/3.29b.pdf November 2005 - http://www.cdlib.org/inside/diglib/guidelines/bpgimages/cdl_gdi_v2.pdf or http://www.cdlib.org/inside/diglib/guidelines/bpgimages/
- Digitization for Preservation Reformatting of Photographs. DLF Fall Forum, BOF session. Presented by Erin Rhodes. November 2005. http://www.diglib.org/forums/fall2005/presentations/rhodes-2005-11.pdf
2006
- Technical Standards for Digital Conversion of Text and Graphic Materials. Library of Congress. December 2006. http://memory.loc.gov/ammem/about/techStandards122106.pdf
General Trends in Digital Imaging
Looking at the above chronology, we can conclude the following:
In general, the trends for digital imaging have been:
- From lower minimal spatial resolution to higher spatial resolution
- From 1-bit scanning, to grayscale scanning, and finally to color scanning
- From low-bit (8 bits per channel) to high-bit (16 bits per channel) for grayscale and color images
- From scanning for a specific purpose to digitizing in a “use neutral” manner
Digitization has been limited by the technology:
- People did as little (as low a resolution and/or as low a bit depth) as they could get by with to facilitate only access.
- Minimum specifications (primarily for textual materials digitization) have been a big cost driver—scanning at lower resolutions means you can do twice as much.
- Digital storage was and is expensive. While less expensive today, high-capacity storage area networks (SAN) with automated tape libraries for backups and off-site mirroring (good IT practices for risk mitigation) all remain beyond the financial means of most cultural institutions.
- We are still wrestling with limitations of the science and technology—digital preservation repository infrastructure remains expensive and, to a large degree, undefined.
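To make the link between specifications and storage cost concrete, the sketch below computes uncompressed raster sizes for a letter-size page at several resolutions and bit depths. It is a back-of-the-envelope calculation only: the function name and the example settings are hypothetical, and file headers, row padding, and compression are ignored.

```python
def uncompressed_bytes(width_in, height_in, ppi, channels, bits_per_channel):
    """Uncompressed raster size in bytes for an original of the given size."""
    pixels = round(width_in * ppi) * round(height_in * ppi)
    return pixels * channels * bits_per_channel // 8


if __name__ == "__main__":
    page = (8.5, 11.0)  # letter-size original, in inches
    cases = [
        ("300 ppi, 1-bit", 300, 1, 1),
        ("300 ppi, 8-bit gray", 300, 1, 8),
        ("400 ppi, 24-bit RGB", 400, 3, 8),
        ("600 ppi, 48-bit RGB", 600, 3, 16),
    ]
    for label, ppi, channels, bits in cases:
        size = uncompressed_bytes(*page, ppi, channels, bits)
        print(f"{label}: {size / 1_000_000:.1f} MB")
```

Because pixel count grows with the square of the resolution, doubling the resolution quadruples the raw data, and moving from 8 to 16 bits per channel doubles it again.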
Early digitization replicated capabilities of the prior technology in the digital capture:
- Many of the early digitizing efforts matched digitization to microfilm.
- Early digitization emphasized scanning existing intermediates – a major problem with this approach is carrying forward the limitations of the previous technology (inaccurate tone reproduction and film grain), as well as carrying forward any defects in the intermediate (photographic and/or physical).
- Early guidelines were based on concepts like the Quality Index (QI) that came from the micrographic industry.
- It was realized early on that not all approaches used to assess microfilm quality worked for digital imaging, which prompted a move toward spatial frequency response (SFR) and modulation transfer function (MTF) measurements and away from resolution charts.
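The idea behind an MTF measurement can be sketched with a toy model: compare the modulation of a sinusoidal target before and after it passes through an imaging system, at several spatial frequencies. In the example below a five-pixel moving average stands in for the combined blur of optics and sensor. That blur model, the target parameters, and all function names are assumptions made for illustration; practical SFR measurements typically rely on standardized targets and analysis software rather than anything this simple.

```python
import math


def modulation(samples):
    """Michelson modulation: (max - min) / (max + min)."""
    hi, lo = max(samples), min(samples)
    return (hi - lo) / (hi + lo)


def sine_target(period_px, length=128, mean=0.5, amplitude=0.4):
    """An ideal sinusoidal test pattern with the given period in pixels."""
    return [mean + amplitude * math.sin(2 * math.pi * n / period_px)
            for n in range(length)]


def box_blur(samples, width=5):
    """Hypothetical system blur: moving average over interior samples only."""
    half = width // 2
    return [sum(samples[i - half:i + half + 1]) / width
            for i in range(half, len(samples) - half)]


def modulation_transfer(period_px):
    """Ratio of output modulation to input modulation at one frequency."""
    target = sine_target(period_px)
    return modulation(box_blur(target)) / modulation(target)


if __name__ == "__main__":
    for period in (32, 16, 8):
        print(f"period {period} px: transfer {modulation_transfer(period):.3f}")
```

The transfer ratio drops as the pattern gets finer, which is the behavior a single number read from a resolution chart cannot describe.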
In general, items in many projects were, and still are, scanned at specifications below the recommendations cited in the DLF benchmark. This approach reflects the perspective of “building a critical mass” of resources. Often, large projects look towards scanning homogeneous materials that are easy to scan both technically and legally, which correlates to a large amount of data created. This trend has accelerated in the last few years with large-scale digitization efforts by Google, the Open Content Alliance, the Million Book Project, and the like.
The trend has been toward the adoption of fixed approaches, rather than defining the process to achieve a specific result; for example, scanning at a fixed high spatial resolution for all originals, rather than assessing the characteristics of specific groups of originals and adjusting the digital imaging requirements to match the group. Many assumptions in the field have become truisms, such as the belief that fixed high spatial resolution and bit depth are a good thing, but these settings offer no guarantee of quality.
More recently, the focus has been on high spatial resolution and high-bit sampling, but there has been minimal effort put into defining other quality parameters. In the end, spatial resolution by itself is not a defining factor for digitization requirements, nor does it guarantee quality. It represents the maximum spatial detail or acuity a device is capable of achieving, if designed well. Bit-depth only indicates the maximum range of tones a device is capable of differentiating, but also is not a guarantee of quality. There needs to be more emphasis on ensuring the quality of the pixels, and this is still problematic for the field. Tools have not improved and imaging is still at a point where it cannot be done well without experienced people.
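One simple way to see that nominal bit depth is only an upper bound is to count the distinct code values a file actually contains. The sketch below uses a hypothetical case, an 8-bit gray ramp rescaled into a 16-bit container, and the function names are illustrative rather than drawn from any guideline.

```python
def nominal_levels(bits_per_channel):
    """Maximum number of tonal levels a given bit depth can encode."""
    return 2 ** bits_per_channel


def levels_used(samples):
    """Number of distinct code values actually present in the data."""
    return len(set(samples))


if __name__ == "__main__":
    # Hypothetical case: 8-bit data rescaled to fill a 16-bit range.
    ramp_8bit = list(range(256))
    ramp_in_16bit = [value * 257 for value in ramp_8bit]  # 255 * 257 == 65535
    print("nominal 16-bit levels:", nominal_levels(16))
    print("levels actually used:", levels_used(ramp_in_16bit))
```

The file is nominally high-bit, yet it carries no more tonal information than the 8-bit data it was made from, so a check of this kind is a weak but useful complement to reading the bit depth from the file header.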
One conceptual approach for information capture is to regard digital imaging along a spectrum. As you move from one end of the spectrum to the other, the amount of information and the accuracy of information that is captured increases.

At one end of the spectrum is a very defined imaging environment. Capture is done to at least minimal specifications, spatial resolution is based on formats and sizes of originals, images are encoded in RGB, and images are processed in a manner to facilitate a specific output (i.e., images are adjusted for generic monitor display or for printing). In imaging science, this may be called an “output-referred” approach to the image state for the digital images. Both NARA’s 1998 and 2004 technical guidelines are based on an output-referred approach and recommend bringing all images to a common rendition that is based on generic monitor display.
At the other end of the spectrum, the imaging environment is less defined and minimal image processing is done. At this end of the spectrum, the image state may be either “original-referred” or “input-referred.” Imaging may be done in a manner that relates the image more closely back to the original (although this is also possible with an input-referred image state); images may exist in optimized three-channel or multi-channel color encodings; spatial resolution is uniformly high or based on assessment of the original; and images are less processed for any particular use. Original-referred and input-referred images will need to be adjusted in order to be used, so that the display or output will look like the original. We are not entirely sure today just how to define digital image capture at this end of the spectrum. These are emerging approaches that warrant further investigation.
The unprocessed end of the spectrum will place a bigger burden on making these resources usable in the future. The more defined the output, the less work there is to do. The more open, more raw the resource, the more work it takes to make it usable. The approach of bringing images to a common rendering solves some of the usability problem.
Although there is the potential for having more functionality at the less-defined end of the imaging spectrum, we still want to ensure we have captured the appropriate essential characteristics that tell us about the original resource. From a preservation reformatting perspective, we would prefer to use an approach that is original-referred.
We feel it is technically feasible to define approaches to digitization that will produce very accurate visual surrogates (for many originals, accurate visual representation is a major aspect of carrying forward the essential characteristics) and create a “good data-set,” as Carl Fleischhauer of the Library of Congress describes it. We are treading a fine line between a traditional approach that defines a specific visual representation and moving forward to one that is more “use neutral” and accommodates other future (but undefined and unknown) uses.
Comments on Other Aspects of Digitization Guidelines
Scanner and digital camera assessment
- Still in progress, with not as much progress as we would like.
- Early guidelines were not based on the capabilities of the equipment; assumption was that the equipment performed at an appropriate level—even though no one was really measuring the performance of the equipment.
- NARA’s 2004 Guidelines were the first to define capture device performance parameters.
- Only have limits for noise and channel registration
- Higher limit for noise level for text docs—lower maximum density
- Lower limit for noise level for photographs—higher maximum density
- We picked limits based on actual equipment—we ran the tests on a range of scanners and digital cameras and picked numbers that were reasonable
- Other parameters used as a guide for determining the suitability of a particular capture device for a particular original
Viewing environment
- There has been acceptance of the standardized viewing environment defined by the graphic arts industry, if not wide adoption, implementation, and use.
- As we move toward preservation digitization, this becomes even more critical – particularly monitor calibration, if the monitor is used for a basic visual assessment compared to an analysis of the capture device performance.
Color management and ICC compliant workflows
- People are trying to implement color managed workflows.
- The current ICC color management process is not always useful for our work – a new CIE committee on Archival Color has been established and hopes to address the specific needs of our community.
- We still believe in doing the imaging/encoding/image state in a way that would allow us to ignore the ICC profiles – we want the option of interpreting the numerical values literally and still have reasonably accurate color and tone reproduction.
- Rendering intents – when performing color space transformations using the current ICC color management process – relative colorimetric intent is most appropriate for near neutral originals like old documents, and perceptual intent is most appropriate for photographic images.
- Color spaces – assumes RGB encoding (emerging practices may use other encodings)
- NARA’s 1998 guidelines – suggest using sRGB (by assigning), which we still think is appropriate for text documents (they tend to have a smaller color gamut and less saturated colors)
- NARA’s 2004 guidelines – moved to a recommendation of the larger-gamut AdobeRGB 1998 color space
- The future - assume in some cases an even larger gamut color space will be desirable, achievable only with high-bit sampling
Reference targets
- General targets and multiple test targets have been used, but usage and implementation has varied.
- Suggestion of using targets specific to the types of originals we are scanning – for example, aged albumen target for old albumen prints – but this would be very difficult to do.
- Don Williams has worked on an integrated target – the “Golden Thread” target that integrates multiple targets so all aspects can be evaluated.
- Current work at Library of Congress, overseen by Michael Stelmach, with Don Williams and Peter Burns – capture device assessment target, image characterization target (scanned with original), and software for automated analysis – being called the Digital Image Conformance Evaluation (DICE).
Tone and color reproduction aimpoints
- NARA’s aimpoints geared toward generic monitor display and an output referred environment
- Others geared toward prepress work - including the Government Printing Office’s guidelines (http://www.gpoaccess.gov/legacy/FDsys_ccspecs.pdf)
- Still a big question about the variability of digitization
- Currently the Library of Congress is trying to address this
Image processing workflows
- NARA’s illustrative sample workflow intended to minimize doing anything “bad” to the image quality versus leaving the images in a less defined, less processed, “raw” state
- When people have described their image processing – it is almost always specific to their local process – as presented at the IS&T Archiving Conference panel on imaging workflows, in Washington, DC, 2005
- Sharpening – do it or not? Still a question.
Image quality defects
Document Types
- Define text by the characteristics of the information and type
- 1-bit for printed high-contrast text
- 8-bit for low contrast, diffuse characters, staining, faded, mixed content, etc.
- 24-bit for cases where color is important to the interpretation of the information
- Define photographs by
- Transmissive camera originals – negatives, slides, transparencies
- Reflective positives – prints
- Pixel array tied to format and dimensions of the originals – a major departure from earlier guidelines – acknowledging the amount of information in originals varies
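The relationship between bit depth, pixel array, and storage described in the list above can be sketched with simple arithmetic. This is only an illustration: the category names, the 300 ppi figure, and the letter-size page are assumptions chosen for the example, not values taken from any guideline, and the calculation ignores file headers, row padding, and compression.

```python
import math

# Bit depths named in the outline above; the dictionary keys are
# hypothetical labels invented for this sketch.
BITS_PER_PIXEL = {
    "high_contrast_printed_text": 1,   # bitonal
    "low_contrast_or_mixed": 8,        # grayscale
    "color_significant": 24,           # three 8-bit channels
}


def pixel_array(width_in, height_in, ppi):
    """Pixel dimensions for an original of the given size, scanned at ppi."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Raw image data size in bytes (no header, padding, or compression)."""
    return math.ceil(width_px * height_px * bits_per_pixel / 8)


if __name__ == "__main__":
    # Assumed example: an 8.5 x 11 inch page at an assumed 300 ppi.
    width_px, height_px = pixel_array(8.5, 11, 300)
    for category, bits in BITS_PER_PIXEL.items():
        size = uncompressed_bytes(width_px, height_px, bits)
        print(f"{category}: {width_px} x {height_px} px, {bits}-bit, {size:,} bytes")
```

Because the pixel array follows from the dimensions of the original, larger originals scanned at the same sampling rate produce proportionally larger files, and moving from 1-bit to 24-bit multiplies the raw data size by 24.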
Quality Control
- Not standardized and no community-wide standards
Derivatives
- Fixed size vs. dynamic creation
- In general, moving toward JPEG2000 – although implementation in a high-demand environment is still problematic
- Dynamic creation is not limited to the JPEG2000 format; on-the-fly creation of derivatives from traditional raster image formats like TIFF has been available for many years
File Formats for master files
- TIFF – still the de facto standard, advocated by people who like to “keep it simple”
- JPEG2000 being considered more seriously now
- Resiliency to corruption due to data redundancy
- On the fly conversion of derivative to any size
- Difficult to implement
- Limited choices for software toolkits to support JPEG2000 within IT infrastructure
- High demand on infrastructure when trying to create derivatives on-the-fly
Where are We Headed and What Still Needs to be Done?
A great deal of progress has been made in some areas, and in other areas not as quickly as desired, but much has been learned over the past decade. As we move towards better definition of what digital imaging means in a preservation context, there is still work to be done. It is a little humbling to look back and admit that we are still asking many of the difficult questions that we were asking over a decade ago – particularly about the relationship of digitization to preservation and agreement on approaches that are appropriate for preservation reformatting using digitization.
For the most part, we have accepted that digitization can meet current needs for facilitating access, and by doing so, also fulfill basic preservation needs by limiting handling of originals. However, digitizing is not yet completely synonymous with preservation – see Appendix A of NARA’s 2004 Technical Guidelines. Beyond the benchmarking concept, which has been well established for text-based materials through the work done by Cornell and later by the Digital Library Federation’s Benchmarks for the Reformatting of Monographs and Serials, only recently has there been community-wide discussion regarding requirements for preservation reformatting, particularly for non-text original formats. We are moving closer to a better understanding of what is needed for digitization as a preservation reformatting approach, and in many ways we have already defined some of the requirements that we would be willing to accept as preservation requirements – see “Digitization for Preservation Reformatting of Photographs.”
The move from creating a digital copy primarily for access purposes to one that is more focused on the quality of the digital copy—one that is worth sustaining over time—takes into account not only the properties of the original that are deemed important to carry forward on one hand, but also changing user expectations, the capabilities of the technology at the time, and the purpose or use of the digital image on the other.

Only at the highest technical quality level do you get a digital resource that matches the original, or even the analog preservation copy, if this is the intent. As Stephen Chapman from Harvard University notes, sustainability is a key attribute of good digital collections, and the best time to build in sustainability is at the point of creation. Although the concept of sustainability applies to both use and content, from a digital imaging perspective, what approaches do we follow to create digital images that are worth sustaining over time? How do we start to define a technical approach that could also serve as a preservation approach that takes into account some or all of the factors in the illustration above?

This article has discussed two broad concepts, information capture and essential characteristics, that we think address sustainability in an imaging context. Information capture addresses the concept of producing a good reproduction. Conceptually the community is moving toward capturing information to produce “good data sets.” These representations may not look like the originals we are copying, but can serve as-yet-unknown needs, such as scientific analysis and research.
Defining essential characteristics of originals helps us to move beyond the limitations of current technology to designing specifications based on these properties. These can be based on many factors, including physical and chemical attributes of the original, condition, quality, defects, date of production, generation (photos), curatorial or financial value, etc. We need further investigation into the essential characteristics of different classes of digital objects/files, how to tie these properties to digitizing approaches, and how to determine the best approaches to digitizing classes of originals with similar properties and characteristics. Essential characteristics can be identified via the capture process or in metadata about the image. Metadata should include information about the original and the digital resource, and should document information about characteristics that are not inherent in the digital version.
A focus on essential characteristics does not preclude valid reasons for digitizing collections based on concepts of intended use, affordability, and sustainability over time. “Fitness to purpose” has been a driver for digital imaging for some time; not every program will have the same goals of fidelity to the original, longevity, or preservation. Institutions should take into account the context and reasons for digitization in their individual cases. In a preservation context, however, there may be a higher risk of not achieving preservation goals at this end of the spectrum.
Even as we move toward less-defined imaging approaches, we still need to create images in consistent ways that will allow us to automate ingest into digital repositories - including characterization and validation of the digital objects and data formats, automated transformation of digital objects, and automated creation of reference and/or use copies.
There are gaps in specific technical areas that should be addressed by the larger preservation and imaging community in order for digitizing for preservation reformatting to be fully scoped and defined. In order to consider using digitization as a method of preservation reformatting, it will be necessary to specify more about the characteristics and quality of the digital images beyond specifying minimal and optimal levels of spatial and signal resolution. High-bit, high-resolution imaging has been the focus of imaging specifications, but there has been minimal effort put into defining other quality parameters, such as tone reproduction, color reproduction, color mode, capture device performance, assessment of source, and image state.
As mentioned above, one area that still needs a lot of work is capture device performance. Pixel resolution is a good marketing device, but the internal processing of the scanner or camera has a big influence on image quality. Besides tests for spatial frequency response and dynamic range, evaluation of the capture device might include tests to measure noise levels, uniformity in tone and color reproduction, channel registration, dimensional accuracy, etc. More importantly, can we define trustworthy pass/fail limits for each of these tests so that scanner and camera performance is more easily measured and documented? Although much effort has gone into quantifying the performance of scanners and digital cameras in an objective manner, there has not been enough progress to date in making device performance assessment readily understood, usable, and easily integrated into imaging workflows. Comprehensive guidelines with sophisticated approaches to device performance assessment have yet to be developed. Simple approaches, or simply taking the manufacturer’s specifications at face value, come at the expense of quality. We assume that the capabilities and performance of capture devices will need to improve to accommodate high levels of information capture. Currently, the Office of Strategic Initiatives at the Library of Congress is working with consultants on developing better test targets and software for automated evaluation of capture device performance.
We certainly need consensus on applying scanner performance test limits that are acceptable to the imaging community. The digital library community has been relatively silent on some of these technical issues, and there has been a heavy reliance on scanner manufacturers and the digital camera industry to define the criteria. Imaging practitioners should work to define both assessment criteria and pass/fail limits. To a certain extent, the imaging science community can assist us in this process. For particular applications, certain performance criteria will be more critical than others and will need to exceed established minimum limits, such as level of dimensional accuracy for aerial photography scanning.
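To make the idea of pass/fail limits concrete, the following minimal sketch checks a set of device measurements against per-parameter maximums. Everything specific in it is an assumption for illustration: the `Limit` class, the function names, the units, and above all the numeric limits are hypothetical and are not the values from NARA's guidelines or any other published specification. The only thing borrowed from the discussion above is the shape of the scheme: limits for noise and channel registration, with a more tolerant noise limit for text documents than for photographs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    """Upper bound for one measured capture-device parameter."""
    parameter: str
    maximum: float
    unit: str


# HYPOTHETICAL limits, invented for illustration only. Real limits would
# have to come from community consensus and actual equipment testing.
EXAMPLE_LIMITS = {
    "text": (
        Limit("noise", 4.0, "std. dev. in 8-bit count levels"),
        Limit("channel_misregistration", 0.5, "pixels"),
    ),
    "photograph": (
        Limit("noise", 2.0, "std. dev. in 8-bit count levels"),
        Limit("channel_misregistration", 0.5, "pixels"),
    ),
}


def assess_device(measurements, original_type, limits=EXAMPLE_LIMITS):
    """Return {parameter: 'pass' | 'fail' | 'not measured'} for one device."""
    report = {}
    for limit in limits[original_type]:
        value = measurements.get(limit.parameter)
        if value is None:
            report[limit.parameter] = "not measured"
        elif value <= limit.maximum:
            report[limit.parameter] = "pass"
        else:
            report[limit.parameter] = "fail"
    return report


def device_is_suitable(measurements, original_type, limits=EXAMPLE_LIMITS):
    """Suitable only if every limited parameter was measured and passed."""
    report = assess_device(measurements, original_type, limits)
    return all(result == "pass" for result in report.values())


if __name__ == "__main__":
    measured = {"noise": 3.0, "channel_misregistration": 0.3}  # made-up values
    for kind in ("text", "photograph"):
        print(kind, assess_device(measured, kind), device_is_suitable(measured, kind))
```

Treating an unmeasured parameter as a failure to qualify, rather than silently passing it, reflects the concern raised above about accepting manufacturers' specifications at face value.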
We need to be sophisticated users of the technology and the tools. We should continue to acknowledge that doing imaging well is difficult and requires expertise. While it is a versatile tool, it does not accomplish specific functions well without a certain amount of operator expertise. In many cases it still does not work as well as we would like. People have been more than willing to accept very limited, undefined digital imaging guidelines as acceptable for preservation reformatting. We should look to designing and endorsing more comprehensive and sophisticated approaches to imaging, especially as evident in imaging specifications, guidelines, and best practices, which should include an articulation of the entire digitization approach, not just specifications for imaging “at the scanner.” Guidelines should take into consideration a wider range of technical parameters, assessments of the original on a more granular basis, and an acknowledgement that there will be different approaches depending on the role or purpose of imaging within a particular context.
Feature Article 2
 A Digital Decade: Where Have We Been and Where Are We Going in Digital Preservation?
Author: Nancy Y. McGovern - ICPSR (nancymcg@umich.edu)

There has been measurable progress in the digital preservation community since the seminal work Preserving Digital Information: Final Report and Recommendations was published by the Commission on Preservation and Access and RLG more than a decade ago. Those concerned about digital preservation in 1996 did not have the Open Archival Information System (OAIS) standard to frame the development and discussion of digital preservation developments; or a set of attributes of a trusted digital repository to delineate the organizational context for digital preservation; or a data dictionary for preservation metadata; or the concept of institutional repositories made real by a range of software options. All of these developments have emerged within the past decade. Today, we have conferences that are entirely devoted to digital preservation (e.g., the International Preservation (iPres) conference) and peer-reviewed journals for digital preservation (e.g., The International Journal of Digital Curation). One can follow the maturation of the digital preservation community in a decade of RLG DigiNews articles.
RLG DigiNews originally focused on “the converging fields of preservation and digitization”; the first article to specifically address digital preservation appeared in 1998. In 2000, the RLG DigiNews editorial staff significantly expanded the coverage of digital preservation, highlighting articles with the now familiar symbol, which added digital to the established infinity notation from print preservation. The cumulative contribution by RLG DigiNews to the digital preservation literature over the past decade includes more than fifty feature articles plus a sequence of highlighted websites and FAQs. These articles and other features stressed practical steps in digital preservation with an emphasis on the development and evaluation of relevant strategies, applications of research results, the integration and use of tools, and national and community-level agendas.
This tenth anniversary review of digital preservation developments takes an informal gap analysis approach, measuring where we are (the “as is”) against where we might like to be (the “to be”). This gap analysis has three components reflecting the core aspects of digital preservation: organizational infrastructure, technological infrastructure, and requisite resources.
 Figure 1. Three-Legged Stool for Digital Preservation.
These three components make up the three-legged stool for digital preservation (Figure 1), a concept developed at Cornell for the Digital Preservation Management (DPM) Workshop series, which was funded by the National Endowment for the Humanities from 2003 to 2006. The workshop curriculum uses the three-legged stool as a means for an organization to assess its development within the context of a maturity model comprising five sequential stages: acknowledge, act, consolidate, institutionalize, and externalize.[1] This review takes a more basic step by considering the status of the three legs of the stool within the community from the “as is” and “to be” perspectives.
The Organizational Leg
The organizational leg determines the “what” of digital preservation (the mandate, the scope, the objectives, and the staffing of an organization) for engaging in digital preservation. Ten years ago, the organizational leg was arguably the weakest leg, as evidenced by the general absence of explicit mission statements that referenced digital preservation, of policies that specifically addressed the preservation of digital assets, and of sustained digital preservation programs within organizations.
The “As Is”
There have been several important developments for the organizational leg over the past ten years, including the development and promulgation of the RLG/OCLC report on the Attributes of a Trusted Digital Repository (TDR), an increase in the development of digital preservation policies by organizations, and an acknowledgement of the central role of procedural accountability for audit and certification.
Trusted digital repositories
TDR represents the best expression of the organizational leg for digital preservation and has become a de facto standard for the digital preservation community since its release in 2002. Prior to the development of TDR, the community had no formal expression of the organizational context for digital preservation.
 Figure 2. The Cornell Model for Trusted Digital Repository Attributes.
The Trusted Digital Repositories document defines seven attributes of a conformant organization: OAIS compliance, administrative responsibility, organizational viability, financial sustainability, technological and procedural suitability, system security, and procedural accountability. The relationships between the TDR attributes are portrayed in the Cornell model (Figure 2), developed to support the DPM workshop series. OAIS compliance is implicit in the diagram. TDR stresses the importance of the organizational context and places technology within that context. This placement recognizes that technology should be suited to the scope and requirements of each digital preservation program. The Cornell model for TDR added a “digital archives border” to the TDR attributes because one organization might maintain more than one repository instance, in which case the outer layers might be coordinated across the organization, and a group of organizations might come together to manage one repository (e.g., in a consortial effort).
Digital preservation policy development
Policies and other documentation of decisions and actions represent one of the best indicators of the development of the organizational leg. At the 2006 Best Practices Exchange in North Carolina “participants stressed again and again that a successful digital preservation program requires a strong foundation…Participants identified four essential elements for building a strong foundation for a digital preservation program: support and buy-in from stakeholders; “good enough” practices implemented now; collaborations and partnerships; and documentation for policies, procedures, and standards.”[2]
This brief list of digital preservation policies is suggestive of the increase in policy development within the digital preservation community worldwide.
The advent of the World Wide Web, which was also in its nascent stage in 1996, has made possible more effective and global exchange of information about policies and practices. More work is underway on developing policies. For example, the nestor policy project in Germany is working on a profile for a national long-term preservation policy.
Providing the evidence for audit and certification
“A well-written policy should serve as historical proof of an institution’s commitment to digital preservation now and long into the future.” This conclusion from the 2006 Best Practices Exchange reflects an implicit principle that underlies the evidence requirements for the audit and certification of digital archives. The October 2005 issue of RLG DigiNews featured articles on the major digital archive audit initiatives in the US, the UK, and Germany. The Center for Research Libraries (CRL) conducted a series of test audits of digital archives, with funding from The Andrew W. Mellon Foundation, and hosted a meeting with the UK and German audit projects that produced a set of common audit principles. CRL released the “Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC)” in March 2007 and should be releasing the principles and test audit report soon. TRAC is a revised version of the RLG/NARA document, Audit Checklist for Certifying Digital Repositories, that was released for public comment in January 2006. An ISO standard development effort is underway that will build on the work of these initiatives and integrate the relevant requirements from the information technology and security domains. In considering the basis and means for digital archive certification, these initiatives have shifted their focus towards the benefits and tools needed for self-assessment and third-party audits. The results so far have demonstrated that self-assessments and audits effectively identify the strengths and weaknesses of digital preservation programs and define a development plan for organizations to incrementally address the full set of criteria defined for trusted digital repositories.
The “To Be”
Though the “as is” perspective on the organizational leg has improved substantively over the past decade, there are at least two notable areas of development for the “to be” view: the need to integrate the organizational policies for digital preservation into technological implementations and the need to develop and evolve digital preservation skills. 
Integrating policies into action
The organizational leg (the “what”) and the technological leg (the “how”) of the digital preservation stool need to be coordinated to develop compliant and feasible digital preservation strategies. The theory is in place. The OAIS Reference Model, for example, identifies specific documents that are needed, including submission agreements, format standards, documentation standards, physical access control, database administration, storage management, disaster recovery, system evolution, migration standards, and procedures regarding most of these areas. In practice, the organizational leg, represented by policies, and the technological leg, represented by digital repositories, may develop separately and not always in parallel. There are ongoing developments to watch in bridging this gap between the organizational and technological legs. The EU-funded project, PLANETS, promises technology-based preservation planning and tools that reflect organizational policies. The PLEDGE project, a collaborative initiative by the Massachusetts Institute of Technology and University of California at San Diego Libraries and the San Diego Supercomputer Center, has developed a promising policy engine prototype. Integrating the organizational and technological legs represents a tangible intersection of theory (what should be done) and practice (what is done).
Developing requisite skills
As technology evolves, digital preservation skills need to evolve. Preservation metadata provides an illustrative example of this often unmet requirement. In 2005, the OCLC-RLG Preservation Metadata Implementation Strategies (PREMIS) Working Group released the first version of the preservation metadata data dictionary and is continuing to revise and enhance its results. RLG DigiNews featured PREMIS updates in the October 2004 and December 2004 issues. PREMIS has become a de facto standard that may transform into a formal standard for preservation metadata in the future. Yet practitioners continue to struggle with implementing preservation metadata, as participants at Cornell’s DPM workshops confirmed. One aspect of this struggle is that there are digital preservation specialists, who are able to devise digital preservation policies and strategies, and there are metadata specialists, who are versed in metadata standards, schemas, and tools. A useful hybrid skillset would be a digital preservation metadata specialist who is able to bring the best of both together and to apply the policies and requirements at high and low levels of granularity. As digital preservation strategies emerge and evolve, similar hybrid roles that combine organizational and technical skillsets may be needed for specific types of digital content, such as digital preservation workflow management and archival storage management. In the long term, the digital preservation community will have developed a comprehensive set of specialized roles and skills for digital curators. The Digital Curation Centre in the UK and the Digital Curation Curriculum project at UNC Chapel Hill are examples of initiatives to watch in this area.
The Technological Leg
The technology leg addresses the “how” of digital preservation – the specific digital preservation strategies, staff, tools, equipment, and other means for achieving digital preservation objectives. The technology leg combines hardware, software, formats, storage media, networks, security measures, workflows, procedures, protocols, documentation, and skills, both technical and archival. A decade ago, the hope of a “silver bullet” for digital preservation, typically in the form of a technology-only solution, was still strong and served as an inhibitor to the development of organizational responsibility for digital preservation.
The “As Is”
Arguably, technology has been viewed as both the problem and solution for digital preservation. The lessons from the past decade have demonstrated to the community that a balanced three-legged stool with a sturdy technology leg will be more effective in establishing a sustainable digital preservation program than a technology pogo stick. Certainly, there have been notable technology leg developments, including the OAIS Reference Model and open source repository software and tools.
OAIS Reference Model
The OAIS Reference Model, whose development began more than a decade ago, reflects the work of an international group of experts. It is intended for use in any context in which digital preservation occurs and represents the most formal and comprehensive expression of the archival process that is available to the community. The stages of development for OAIS can be traced on the OAIS website.
 Figure 3. The high-level diagram for the OAIS Reference Model.
The high-level OAIS diagram depicted in Figure 3 has become ubiquitous in digital preservation presentations. OAIS provides a common language and a set of functions for use in community-wide discussions and in mapping organizational developments. Cal Lee at UNC Chapel Hill wrote an evaluation of the OAIS development for his dissertation, “Defining Digital Preservation Work: A Case Study of the Development of the Reference Model for an Open Archival Information System (OAIS).”
Repository software and tools
Examples of repository software developed over the past 10 years include DSpace, the Flexible Extensible Digital Object and Repository Architecture (Fedora), Greenstone digital library software, the Berkeley Electronic Press (bepress), and the Dark Archive In The Sunshine State (DAITSS). Even with these examples of available repository software, organizations need to decide how to select an appropriate repository option by considering the capabilities and limitations of each and the extent to which the repository software meets archival requirements and suits the digital content to be preserved. Organizations may opt to build their own repository, as the National Library of the Netherlands has done, or to subscribe to a digital preservation service provider, such as bepress or the OCLC Digital Archive. None of these options was available to organizations a decade ago.
Repository software may integrate digital preservation tools (or equivalent functionality) or an organization may define for itself a digital preservation workflow that integrates tools at appropriate points in the process. Recent examples of tools used for digital preservation include those that identify and evaluate file formats (e.g., JHOVE, DROID), that normalize files to preservable formats (e.g., XENA), that generate and capture metadata (e.g., the NLNZ metadata extractor), and that produce a unique identifier and aid in detecting changes to files (e.g., checksums). The October 2006 RLG DigiNews FAQ reviewed the NLNZ Metadata Extractor and several other tools. These developments represent progress, but the community has some way to go before digital preservation is fully automated and fully compliant digital preservation systems are available.
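As a small illustration of how checksums aid in detecting changes to files, the sketch below computes a digest when a file is first recorded and compares it against a fresh digest later. It uses only Python's standard `hashlib` module; the function names and the choice of SHA-256 are assumptions made for this example and do not describe how any of the tools named above work internally.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_checksum, algorithm="sha256"):
    """True if the file still matches the checksum recorded earlier."""
    return file_checksum(path, algorithm) == recorded_checksum.lower()
```

In a workflow, the recorded checksum would typically be stored with the object's metadata at ingest and rechecked on a schedule; a mismatch signals that the file has changed but does not say how or why.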
The “To Be”
There is significant research and development work underway that is targeting the development, enhancement, and scalability of tools and repository software. RLG DigiNews highlighted 10 promising digital preservation research programs in August 2005. The “to be” category for the technology leg could be characterized as making it possible to do more through automation and providing the means to integrate audit requirements and measures into digital preservation management.
Scalable capabilities
Scaling repository software to the increasing size of digital content containers (e.g., digital video files) and the extent of digital content to be preserved is a capacity and capability issue that largely remains in the “to be” category. The past decade has also seen the publication of recommendations from the National Science Foundation (NSF) in the US and the Community Research & Development Information Service (CORDIS) in the EU about the infrastructure that will harness the potential of technology developments to support and enable research. These programs provide a framework for development.
Workflows and suites of tools
Still on the wish list for the digital preservation community is the capability to easily define, customize, change, and extend a digital preservation workflow that is modular to allow for the easy integration of tools. There have been developments in generating or extracting metadata for submissions, but this work is still in its infancy. It is also not always possible to easily incorporate tools into a workflow. Moving from individual tools to suites of tools and workflows that can be shared and exchanged between organizations seems to be a natural path for development.
 Figure 5. The Integrated Digital Preservation Matrix.
As more and more organizations develop trusted digital repositories that are based upon sound and continuous workflows, the potential exists for leveraging the capacity and capabilities across repositories and across the community to realize cost-savings, more effective results through collaboration, and community-wide action, as envisioned in the integrated digital preservation matrix (Figure 5, developed for the Cornell DPM workshop series).
Audit capabilities As institutions begin to rely upon each other, there is the need to develop trust through verification. It is not enough to provide assurances about performance and reliability in digital preservation; it is necessary to demonstrate effective and sustained action. With the development of audit and certification for digital preservation, organizations will require the means to conduct self-assessments and participate in external audits. Incorporating these tools into digital preservation repositories would lighten the burden of preparing for audits and make it easier – and less costly – for organizations to meet audit requirements. The audit and certification initiatives have provided tools for self-assessment and are increasingly providing examples for audit; organizations need to step up to contribute local examples and lessons learned.
The Resources Leg
The resources leg addresses the “how much”: the human, technological, and financial resources needed to produce desired digital preservation outcomes. A decade ago, the question “How much does digital preservation cost?” was enough to bring a digital preservation discussion to a shuddering stop. At that time, the resources component of digital preservation had not been explicitly separated from the organizational component. As a distinct component, the resources required for a digital preservation program can be identified, quantified, and measured comprehensively and objectively – although for the most part this potential has yet to be achieved.
The “As Is”
Unlike the organizational leg that is embodied in the TDR document and the technological leg that is defined in the OAIS Reference Model, the resources leg of digital preservation has no community document that expresses its scope and requirements. The inclusion of financial sustainability as an attribute of a TDR signifies an important development for digital preservation because it was the first time that addressing the cost of digital preservation was explicitly acknowledged as an organizational requirement. Additional indicators of progress towards the development of a sound resources leg include the designation of digital preservation funding by organizations (e.g., DSpace at MIT); digital preservation programs that are lasting longer than digital preservation projects, as evidenced by organizations such as those that have developed digital preservation policies; and research funding for digital preservation that is ongoing if not permanent (e.g., JISC, NEH, NSF, NHPRC programs). In addition to these indicators, the digital preservation community has a growing base of literature that addresses digital preservation costs, including Brian Lavoie’s proposed economic models for digital preservation; Shelby Sanett’s research on cost models and cost frameworks; and the approach developed in the Netherlands (Oltmans and Kol) that provides a tool to compare the costs of migration and emulation over time. The most comprehensive cost formula for digital preservation was proposed by the LIFE project in 2006. These examples have contributed to a deeper understanding of digital preservation costs within the community, but do not equate to a comprehensive community document for the resources leg. Nor are organizations systematically collecting and sharing resource information.
 Figure 4. Integrating the organizational and technological legs of digital preservation.
The resources perspective considers the “what” and the “how” of digital preservation to determine the “how much” (represented by financial sustainability in Figure 4, developed for Cornell’s DPM workshop series). The resources leg is informed by the organizational context and tied to the technological implementation for an organization’s digital preservation program. Figure 4 illustrates the technological implementation expressed by OAIS within the organizational context expressed by TDR and the separation of financial sustainability within the organizational context for digital preservation.
The “To Be”
There has been progress in developing the resources leg, though two areas seem ripe for further development: the designation of funding by organizations for digital preservation and the definition of a community document that addresses resources.
Designating digital preservation funding Organizations are still struggling to secure resources for digital preservation. One of the research library directors interviewed for the recent Metes and Bounds report on e-journal archiving observed that digital preservation is a “just-in-case scenario, and this is very much a just-in-time operation.” (p. 11) Respondents to Cornell’s DPM workshop institutional readiness survey identified insufficient resources for digital preservation as the second highest threat to digital content after insufficient policies or plans. Survey respondents also identified a complicating factor in designating resources for digital preservation. It has been common practice for an organization to establish a digital preservation initiative by assigning a percentage of the digital preservation responsibility to several staff often located across an organization, making it difficult to consolidate or coordinate resources. The digital preservation community also needs a means for being transparent about resources, recognizing that specific details may include confidential or internal-only information.
Defining a community document for resources The “as is” examples of resource-related writings and developments for digital preservation (e.g., Lavoie, Sanett, Oltmans and Kol, and LIFE examples presented above) provide a starting point for defining a community document for resources. Common elements in TDR and OAIS include the definition of core concepts, the definition of roles and responsibilities, descriptions of the components and attributes, and the discussion of implementation issues with examples and/or recommendations. A productive first step for the community might be to consolidate and rationalize the resource issues and elements presented in the resource examples, then apply a gap analysis process to fill in missing elements. So far, the community has offered few responses to these contributions toward strengthening the resources leg of digital preservation.
Stabilizing the Three-legged Stool
Taking the three legs of the stool together, there are a number of indicators that the digital preservation community is coalescing and maturing. Communities by nature share common interests and objectives. Indicators of the development of the digital preservation community include accepted standards and practice and an increasingly effective communication network.
Standards and practice A decade ago there were no formal shared standards or practice for digital preservation. Today, we have OAIS, TDR, and PREMIS, for example. The sustainable formats website at the Library of Congress and PRONOM are contributing to the development of preservation strategies for classes of digital content. RLG DigiNews featured articles about PRONOM developments in the October 2003 and April 2005 issues. These examples reflect community practice as defined by representatives of archives, libraries, museums, and other cultural heritage institutions. Domain-specific developments, such as the Canadian Heritage Information Network (CHIN) report on digital preservation for museums, have also contributed to the development of community-wide practice. In addition, the standards of our community are regularly supplemented by standards developments in other communities, including information technology, information security, telecommunications, and the Internet. We are moving towards more comprehensive codification of accepted practice, the promulgation of standards and practice through community channels, and the means to develop and maintain policies and procedures as needed.
Communication network A challenge for organizations that are engaged in digital preservation is to balance the time and resources devoted to developing the repository internally against monitoring the external environment for relevant developments, updates, standards, and warnings. The difficulty in keeping up with digital preservation developments is exemplified by a quick review of the RLG DigiNews August 2005 list of ten “watch this space” digital preservation research projects. Three of the project websites had updates and current information about the project that were fairly easy to locate. The current status of three of the project websites was unclear and the projects seemed to be stalled or abandoned based on obvious locations for updates and news on the websites. Three of the project websites had few or no updates since August 2005. It was possible to find results or presentations about the projects by searching, but it was difficult to confidently determine the current status. The URLs for two of the projects have changed and could not be easily found by searching. Of course, there are several possible explanations for that and the projects could be alive and thriving somewhere. One project website required logging-in. Requiring a log-in is not a bad thing, but logging in requires time and a bit more effort. If an organization is trying to track and follow a number of digital preservation developments, these examples represent potential barriers. The PADI website has provided an excellent information service to the digital preservation community for the past decade and other services contribute as well, but there is currently no “one stop shopping” for keeping up with digital preservation research and development. Keeping up takes effort, but it is worthwhile. The digital preservation community is active and offers many opportunities for organizations to participate, contribute, and learn.
“One participant [in the 2006 Best Practices Exchange] characterized a ‘community of practice’ as a flock of birds. Each bird may ultimately have a different end destination, but since they are flying in the same general direction, it is more efficient to fly together as a flock.”[2] A fitting close to this anniversary review of the migration patterns of a community over the past decade. How far will we have gotten towards the “to be” by 2012 or 2017? Stay tuned…
Author's Addendum (7 May 2007): An alert reader contacted me about my list of digital preservation policy examples, questioning the dates of some and the inclusion of another. I am submitting this brief response to correct and clarify my list. The reader wondered if I should have cited earlier dates for the National Library of Australia (NLA), the UK Data Archive (UKDA), and the Arts and Humanities Data Service (AHDS). After checking, I can report that 2001 is the correct date for the NLA digital preservation policy and 2004 is the date for version 1.0 of the AHDS digital preservation policy. Both of these institutions have been major contributors to digital preservation progress. An important caveat for the AHDS is that 2004 was the date of their first policy to address the preservation of the digital collections within their care; however, the AHDS developed an early strategic policy framework document (http://ahds.ac.uk/strategic.doc) in 1997 that reported the results of a study they conducted, including recommendations to the community on developing digital preservation policies. I should have cited the date for version 1.0 of the UKDA policy as 2003 and the date for the British Library policy as 2001. I included the Digital Library Sunsite policy because it is both a collection development and a preservation policy. It is an important early example of the definition of preservation levels for digital content and of a preservation policy that addresses Web content. An interesting thing about digital preservation policies is that even institutions that have been early adopters and pioneers in digital preservation often took a while to develop formal digital preservation policies. We should have many more policy examples that are readily available, though we should also be pleased with the progress we have made and continue to make. Thank you to the diligent reader and my apologies to the British Library and the UKDA for misdating their policies.
Notes
[1] Anne R. Kenney and Nancy Y. McGovern, “The Five Organizational Stages of Digital Preservation,” in Digital Libraries: A Vision for the Twenty First Century, a festschrift to honor Wendy Lougee, 2003.
[2] Christy E. Allen, “Foundations for a Successful Digital Preservation Program: Discussions from Digital Preservation in State Government: Best Practices Exchange 2006,” RLG DigiNews, June 2006, Vol 10, No 3.
Highlighted Web Site
 RLG DigiNews
Author: Richard Entlich - Cornell University (rge1@cornell.edu)

For the past 10 years, the Highlighted Web Site (HWS) has provided a place to feature one or more websites of potential interest to our readers. In this, the last issue of RLG DigiNews under its original editorial mission, we are using this space to take a brief look back at RLG DigiNews itself.
Feature Articles
The two feature articles in this issue, ten-year retrospectives on digital imaging and digital preservation, provide good overviews of the evolution of subject coverage in RLG DigiNews’ feature articles. The authors of those articles have represented a variety of institutions, including national libraries, archives, and museums, national and international consortia, universities from around the globe, commercial entities, and independent non-profit organizations. Although published in English, RLG DigiNews has always attempted to reflect the international nature of the research and development efforts it has explored. Feature articles have included 12 from the UK, 5 from Australia, 4 from the Netherlands, 2 each from Austria, Germany, and Norway, and one each from Canada, Japan, Finland, and Denmark.
Editor’s Interviews
The first editor's interview, an in-depth discussion with Kevin Guthrie about JSTOR, appeared in volume four of RLG DigiNews. Ten others followed, including noteworthy discussions with Brewster Kahle of The Internet Archive, Clifford Lynch of the Coalition for Networked Information, and Victoria Reich of the LOCKSS Program.
FAQs
This feature has appeared in nearly every issue of RLG DigiNews (54 out of 57 issues) and over time the content has moved toward more in-depth coverage. With this final issue produced by the editorial staff at Cornell, we can now come clean and acknowledge that many of the “questions” posed in the FAQ column were not only not “frequently asked,” but were often well out of the mainstream. That’s because they were chosen by editorial staff members and not submitted by RLG DigiNews’ readers.
Some of the esoteric topics covered included a piece on handwriting recognition (OCR for manuscripts) and an extended discussion of the science and engineering behind accelerated aging tests of removable storage media. A few topics have had broader appeal, including a detailed comparison of Web-based image search engines and an early thought piece on blog preservation.
Overall Trends
Issue length: Reflecting either a broadening focus on digital imaging and digital preservation over the years, the declining cost of digital storage, or the increasing verbosity of its editorial staff, the size of the average issue of RLG DigiNews nearly doubled over its first ten years. In 1997, the average issue contained about 60,000 characters of text. This increased to 110,000 characters/issue by 2006.
Appearance: As noted in the fifth anniversary issue of RLG DigiNews, the masthead has undergone several design changes over the years. However, the basic appearance of RLG DigiNews remained unchanged for most of its first five years. With the first issue of volume six, RLG DigiNews received a makeover featuring improved readability thanks to a change from serif to sans serif type. At the same time, improved layout and graphical content resulted from the addition of production staff with graphic design skills. Another major change in appearance occurred at the start of volume eight when production was shifted from Macromedia Dreamweaver to a CSS (Cascading Style Sheets) based content management system (CMS).
Topical emphasis: Most feature articles in RLG DigiNews were written by practicing professionals actively engaged in research on or application of digital imaging and digital preservation. Although the editorial staff invited articles on certain specific topics, the vocabulary employed by authors still provides some clues about topical trends over time.
As a crude quantitative measure of trends, we divided the text of the first ten years of RLG DigiNews into three chunks, each containing a roughly equal group of issues (1997-2000, 21 issues; 2001-2003, 18 issues; and 2004-2006, 18 issues). We then conducted frequency counts on a number of terms and phrases. The charts below indicate some selected terms that showed significant change, up and down, in frequency of occurrence within RLG DigiNews over ten years of publication. Whether the content of RLG DigiNews was merely reflecting existing trends or actually helping to influence those trends is for others to determine.

Notes: Term count for LOCKSS also includes CLOCKSS; term count for digitization also includes digitisation; term count for blog includes weblog(s), blogger, blogging, etc.
Highlighted Web Sites
It seems fitting to end this HWS with a brief look back at past selections and a look forward to the future of this one. The subject matter of the sites featured in the HWS has mirrored the migration seen in other parts of RLG DigiNews, from an emphasis on the mechanics of digitization (tools, techniques, standards, file formats, resources) as well as exemplary collections of digital images, to coverage of larger digital content issues, especially digital preservation, intellectual property, repository development, and open source. A total of 68 sites have been included over the course of 57 issues.
Some of these have stood the test of time very well. The Library of Congress’ American Memory Project website was cited in our very first issue as “the key contribution of the Library of Congress to the National Digital Library.” Today, American Memory (which can still be accessed through the original URL via a redirect) reports that “The National Digital Library exceeded its goal of making 5 million items available online by 2000” and that “American Memory will continue to expand online historical content as an integral component of the Library of Congress’ commitment to harnessing new technology ...”
On the other end of the longevity spectrum is a quite recent HWS, the Digitize Everything blog, which we highlighted in the April 2006 issue along with a group of other blogs, just a few months after its first post. However, on August 30, 2006, Digitize Everything announced that it was closing up shop, and both it and the parent digiwik.org site are now gone. Overall, 20 of the 68 (nearly 30%) Highlighted Web Sites are either gone or at least no longer accessible via the original URL.
The Future of Our Past
Given that digital preservation has been a major focus of RLG DigiNews during its first decade, it seems appropriate to ask how well the HWS for this issue (i.e., the existing archive of RLG DigiNews itself) is likely to fare over the long term. Thus far, the back run of RLG DigiNews has been maintained on the RLG website and archived through the Internet Archive’s Archive-It program, as well as the OCLC Digital Archive. A number of individual articles that have been deemed “high quality” and that meet other selection criteria have been given the “Safekept” designation by the National Library of Australia’s PADI (Preserving Access to Digital Information) program.
These steps may or may not prove adequate to ensure the long-term survival of the contents of RLG DigiNews. Only time will tell. What we are sure of is that digital preservation remains a vibrant and vital area of concern for libraries, archives, and all who value the historical record. Whether or not our efforts will ultimately be found to have been of enduring value, we are proud to have made a small contribution to this important area of endeavor, during a key period of its development.
FAQ
 Copyright Keeps Open Archives and Digital Preservation Separate
Author: Peter B. Hirtle - Cornell University (pbh6@cornell.edu)

I have read that if I publish with a “green” publisher or use one of the author’s addenda, my articles can be preserved in an open access digital repository. Is this true?
The short answer: probably not.
There has been growing interest in the development of open access repositories for scientific literature during the past decade. Open access literature – defined by Peter Suber, one of its primary proponents, as “digital, online, free of charge, and free of most copyright and licensing restrictions” – holds out the promise of fostering the communication and exchange of ideas that lies at the heart of the scientific endeavor. Open access repositories are primarily built through what is called self-archiving: the authors of papers deposit in a repository either a version of the paper prior to refereeing (a “pre-print”) and/or the version that includes the changes made in the refereeing process (the “post-print”). A publisher that will allow pre-print and post-print archiving by authors has been designated as “green” by the Sherpa RoMEO project on publishers’ copyright and self-archiving policies.
Open access and self-archiving are important tools in enhancing access to current research – enough so that some funding agencies and institutions are now requiring that publications from all funded research be made freely available after a brief period of time. In the U.K., the Wellcome Trust and the Medical Research Council (MRC) have ordered that the final copies of all research they fund be made freely available no later than six months after the journal publisher's official date of final publication, and the Biotechnology & Biological Sciences Research Council (BBSRC) has mandated that publications from research it funds after 1 October 2006 be deposited in “an appropriate e-print repository.” Research Councils U.K. (RCUK) has encouraged the other United Kingdom Research Councils to consider deposit of funded research in an open access repository. In the U.S., two efforts in 2006 attempted to mandate open access for some government-funded research. The proposed appropriations bill for NIH was modified in committee to mandate the deposit of copies of all NIH-funded research in an open access repository within twelve months of publication. In addition, Senators John Cornyn (R-TX) and Joe Lieberman (I-CT) introduced the Federal Research Public Access Act of 2006 (FRPAA), which would have required that peer-reviewed research funded by the largest federal research agencies be deposited and made openly accessible in digital repositories within six months of publication. One can anticipate that both of these initiatives will be reintroduced in 2007; a petition urging the reintroduction of FRPAA is available online.
Given that it would appear that more and more funded research is going to find its way into open access digital repositories, an obvious question is whether libraries can rely on those repositories to preserve that information. Unfortunately, they cannot, for at least two reasons.
First, as has long been recognized, open “archives” are primarily concerned with providing open access to current information – and not the long-term preservation of the contents. Most lack the technical, organizational, and financial support required for a true digital preservation program. In its draft position statement on access to research outputs, Research Councils UK noted the distinction:
RCUK recognises the distinction between (a) making published material quickly and easily available, free of charge to users at the point of use (which is the main purpose of open access repositories), and (b) long-term preservation and curation, which need not necessarily be in such repositories…. [I]t should not be presumed that every e-print repository through which published material is made available in the short or medium term should also take upon itself the responsibility for long-term preservation.
Similarly, the Cornyn/Lieberman bill did not assume that institutional or subject-based repositories would be able to preserve research articles. Instead, it required that long-term preservation of the research articles be done either in a “stable digital repository maintained by a Federal agency” or in a third-party repository that meets agency requirements for “free public access, interoperability, and long-term preservation” (with the implicit recognition that not all third-party repositories would meet the requirements for long-term preservation).
Second, and more troubling, is that the agreements that make it possible for authors to deposit articles in an open access repository do not necessarily also convey the rights needed by the repository to preserve and make available digital information over time.
Digital preservation, by its very nature, must impinge upon the rights of the copyright owner.[1] In order to be kept alive and usable, digital files need to be copied and recopied; this potentially infringes on the copyright owner’s exclusive right of reproduction. In addition, as software and hardware changes, files will have to be migrated into new formats or new versions; this may infringe on the copyright owner’s exclusive right to make derivative versions of the original work.[2] As most RLG DigiNews readers know, there is no general preservation exemption in US copyright law. Preservation copying and reformatting activities undertaken without the explicit permission of the copyright owner can only be done in very limited situations.
The model DSpace distribution license signed by authors recognizes that permission of the copyright owner is needed to preserve material over time. It stipulates the following:
- You agree that MIT may, without changing the content, translate the submission to any medium or format for the purpose of preservation.
- You also agree that MIT may keep more than one copy of this submission for purposes of security, back-up, and preservation.
- You represent that the submission is your original work, and that you have the right to grant the rights contained in this license (emphasis added).
As the emphasized text above notes, the self-archiver must have the right to authorize DSpace (or other repositories) to make copies and reformat submissions. Prior to submission to a journal, an author would have that right. When copyright is transferred to a publisher, the publisher must then authorize the author/self-archiver to grant those rights. Yet in the typical copyright transfer agreement of even a “green” publisher, the explicit right to license preservation activities to DSpace is sorely lacking. Elsevier, for example, will allow you to keep a copy of a preprint on an institutional server “indefinitely,” but is silent on whether that version can be modified. In general, an author in the Elsevier agreement does not have the ability to grant third parties the right to copy or modify the work. The American Institute of Physics allows the author the right to “post and update the Article on free-access e-print servers,” but there is nothing in the agreement with the author that suggests he or she can grant the right to update an article as formats become obsolete to a third party (such as an organization managing the e-print server). Depending on the precise terms in the agreement, an author granting the rights required by deposit in DSpace may actually be a violation of the copyright transfer agreement with the publisher and consequently put the author of the article at risk of a suit for contract infringement.
In sum, most if not all of the “green” publishers only authorize the primary purpose of self-archiving: current and immediate access to article literature. None of the agreements I have examined explicitly authorize the rights necessary to ensure long-term continued access to the deposited literature. Authors can license those rights prior to copyright transfer to publishers, but they must ensure that previous grants of rights are not in conflict with the transfer agreement and that they are explicitly authorized to grant needed rights for post-prints.
Are authors who attach an author’s addendum to their copyright transfer agreement any better able to grant the needed permissions to the repository?[3] In some cases, the answer is yes. For example, in both the SPARC Author’s Addendum and in the Scholar's Copyright OpenAccess-CreativeCommons 1.0 Addendum the author retains the right to authorize third parties (such as an open-access repository) to make limited non-commercial use of the article. Both agreements guarantee that the author will have the necessary authority to grant the rights required in the DSpace agreement. The Scholar’s Copyright addendum has the added benefit of protecting authors against any publishing clauses that restrict or forbid previous grants of rights (such as when a pre-print is posted, prior to a copyright transfer agreement).
Authors who use either the SPARC or Scholar’s Copyright addendum are likely to be able to grant to the open access repository the rights it needs in order to be able to preserve the digital files over time. (Whether the repository will technically, organizationally, or financially be able to do so is another matter.) Authors who submit material based on self-archiving provisions in publisher contracts are unlikely to be able to grant the rights needed. Self-archiving for other than immediate access may actually place the author and the open access repository at legal risk. How much risk is involved is difficult to say. Recently, it was revealed that the Association of American Publishers (AAP) has hired a very aggressive public relations firm to lead a campaign against open access. Furthermore, the AAP has left open the possibility of legal action against a university library on a different matter. It is not inconceivable that a legal attack upon long-term preservation in self-archives may be part of future anti-open access campaigns.
Open access archives can be a valuable tool in making information immediately available. With time, the license terms that permit self-archiving may mature to explicitly permit digital preservation of the files as well as third party use of the archived material (the other great lacuna in the current agreements).[4] For now, however, libraries will need to rely on the published journal literature for the long-term preservation of scholarly information. And, as library directors concluded in our recent report, E-Journal Archiving Metes and Bounds: A Survey of the Landscape, only journals that are part of formal third party journal archiving programs can be said to be effectively preserved. In sum, libraries cannot yet rely upon open archives for long-term access to the journal literature.
Notes
[1] On copyright issues in digital preservation, see Peter Hirtle, “Digital Preservation and Copyright”, Catherine Ayre and Adrienne Muir, “The Right to Preserve: The Rights Issues of Digital Preservation,” D-Lib Magazine 10:3 (March 2004), and Adrienne Muir, “Digital Preservation: Awareness, Responsibility and Rights Issues,” Journal of Information Science, 30:1 (2004): 73-92.
[2] In theory, displaying the files or performing audiovisual files could also be infringements, but it seems likely that these actions would be covered by the implicit license to self-archive.
[3] An author’s addendum is a standardized legal instrument that modifies the publisher’s agreement and allows the author to keep key rights. For more on the available addenda, see Peter Hirtle, “Author Addenda: An Examination of Five Alternatives,” D-Lib Magazine 12:11 (November 2006).
[4] Except for the Scholar’s Copyright addendum, which explicitly gives authors the right to grant a Creative Commons license to use the material, the self-archiving agreements are silent on third party use. In the absence of explicit user permissions, it could be argued that a faculty member would not legally be able to include a link in a course syllabus to a published article in an open access repository. Such linking could be seen to be systematic, and systematic copying is normally not allowed under fair use. The issue of expressing user permissions in open access archives is discussed in Elizabeth Gadd, Charles Oppenheim, and Steve Probets, “The Intellectual Property Rights Issues Facing Self-archiving: Key Findings of the RoMEO Project,” D-Lib Magazine 9:9 (September 2003).
Calendar of Events

School for Scanning May 1 - 3, 2007 Minneapolis, Minnesota
Presented by the Northeast Document Conservation Center (NEDCC) and co-sponsored by the Midwest Art Conservation Center, the three-day School for Scanning examines digitization from theory into practice. The curriculum is geared for participants with a beginning or intermediate level of digital knowledge and provides those with experience in digitization an up-to-date briefing. School for Scanning is celebrating its eleventh year. Registration has been extended to April 23.
Document Imaging and Document Management May 4 - 6, 2007 Los Angeles, California
This course will be offered by the UCLA Extension. Topics to be covered include scanning technology; a wide range of image and document formats; system design issues in hardware, software, ergonomics, and workflow; and Internet, intranets, and extranet links.
MetaArchive Distributed Digital Preservation Workshop May 30 - June 1, 2007 Atlanta, Georgia
This workshop will provide information and training for institutions that seek to build or join distributed digital preservation networks based on the LOCKSS software.
International Web Archiving Workshop June 23, 2007 Vancouver, Canada
The seventh International Web Archiving Workshop will be held in conjunction with the ACM IEEE Joint Conference on Digital Libraries. The Call for Papers is out and the deadline for submission is May 1, 2007.
iPRES2007 October 11 - 12, 2007 Beijing, China
iPRES 2007 will be organized by the National Science and Technology Library of China and hosted by the National Science Library, Chinese Academy of Sciences. The theme for this year’s conference is “Digital Preservation: Sustainable Programs and Best Practices” and programming will focus on management, operation, and new directions in digital preservation. Abstracts for conference presentations will be accepted through June 15.
A Race Against Time: Preserving Our Audiovisual Media October 24 – 25, 2007 Cleveland, Ohio
The Conservation Center for Art and Historic Artifacts “offers educational programs throughout the year to provide training in a variety of collections care activities to support a preservation program for cultural collections.” This workshop will include lectures, discussions, and hands-on experiences aimed to expose participants to the basic principles for managing historical audiovisual collections on a variety of media such as videotapes, audiotapes, motion picture film, film strips, LPs, 78s, magnetic tape, wax cylinders, and audiocassettes.
Announcements

Trustworthy Repositories Audit & Certification (TRAC)
CRL and RLG-OCLC have released a revised and expanded version of the Audit Checklist for the Certification of Trusted Digital Repositories. The revised version is entitled Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC) and is available online.
Around the World in Two Billion Pages
With grant funding from the Mellon Foundation, the Internet Archive will be attempting its largest Web crawl this summer: a two billion page global snapshot of the Web. Internet Archive staff are currently asking libraries, archives, and other cultural and memory institutions to submit URLs to include in the crawl. They are specifically seeking international Web content from a large variety of countries, geographic regions, and language bases.
DRAMBORA Released
The Digital Curation Centre (DCC) and DigitalPreservationEurope (DPE) have announced the release of the Digital Repository Audit Method Based on Risk Assessment (DRAMBORA) toolkit and tutorials. The toolkit provides digital repository managers “…a means to assess their capabilities, identify their weaknesses, and recognise their strengths.”
Call for Centers of Digital Curation and Preservation
DigitalPreservationEurope (DPE) is compiling a list of “international competence centres for digital curation and preservation activity and expertise” for a report that will be used to inform the European Commission of the current landscape. DPE is calling for submissions of projects and institutions to include in the report. Organizations that are interested in joining the DPE list are asked to complete an online form on the DPE website.
CDP Merges into BCR
The Collaborative Digitization Program (CDP) and the Bibliographical Center for Research (BCR) have merged their efforts to become CDP@BCR. The CDP is a nationally recognized digitization program and BCR is a multistate library cooperative serving more than 1,100 member libraries. Both organizations are based in the western US. Together they aim to “build new service programs in the area of digitization and reach out to cultural heritage organizations in addition to bringing CDP-based training, best practices and guidelines, and consulting services to member libraries.”
Implementing Persistent Identifiers
A new report, “Implementing Persistent Identifiers: Overview of Concepts, Guidelines and Recommendations,” has been released by the Research and Development Department of the Göttingen State and University Library. The report explains the concepts of persistent identifiers including Handles, Digital Object Identifiers (DOIs), Archival Resource Keys (ARKs), Persistent Uniform Resource Locators (PURLs), Uniform Resource Names (URNs), National Bibliography Numbers (NBNs), and the OpenURL.
CLOCKSS is Awarded Outstanding Collaboration Citation
The Association for Library Collections & Technical Services (ALCTS) has announced that the CLOCKSS initiative is the inaugural winner of the ALCTS Outstanding Collaboration Citation.
New Digital Preservation Mailing List
The American Library Association (ALA) has established a new mailing list: Digital Preservation or digipres for short (digipres@ala.org). The list has been active for a little over a month and boasts over 800 subscribers. You can view the archives or subscribe to the list at the link above.
Seeking Comments: ALA’s Draft Principles for Digitized Content
The Task Force on Digitization Policy of the American Library Association’s Office for Information Technology Policy has posted for review and comment its “Draft Principles for Digitized Content.” The goal of this document is to “succinctly voice the primary policy areas that can guide libraries as they make decisions regarding digitization.” Comments can be posted on the project blog until May 1, 2007. The Task Force plans to present the document for approval by ALA Council at the 2007 annual conference in Washington, DC.
RLG News
Marking a Significant Past and New Opportunities
Author: Robin L. Dale

This issue does not represent the end of an era, but certainly marks some significant bits worth noting:
- It represents the end of a very successful, collaborative relationship with Cornell University’s IRIS Department and the fantastic people who have worked with us over the last ten years. I’m honored to have worked with Anne R. Kenney, Nancy McGovern, Oya Y. Rieger, Richard Entlich, Ellie Buckley, Peter Hirtle, and a host of others who have played key roles over the years. They are such a dedicated, talented, and intelligent group of people and it's been a privilege.
- It represents ten years of relevant and [relatively] easy-to-digest articles that served to keep the community informed of developments, events, and announcements. To us, it’s always been interesting, informative, and occasionally entertaining.
- It represents an advancement of the field. The original impetus was to document projects, developments, and research about digitization, providing a space for news and continuing education for “managers of digital initiatives.” Over the years, DigiNews documented advances in theory and in practice; in year four, for example, we expanded the focus to include digital preservation. From a humble beginning with an issue that talked about new tools that would enable us to digitize books without disbinding them(!), we’ve moved to an environment where digital cameras are busily scanning thousands of books through several mass digitization projects, and this final issue recaps advances in both digitization and digital preservation.
- It represents an opportunity…an opportunity to use our skills in a variety of new projects that leverage our acquired knowledge and previous investments in digital collections. We have the knowledge to contribute to the future work in mass digitization, as well as the digitization (“mass” or otherwise) of our incredibly special and unique materials. We need to contribute to the new collections, as well as think about how we can utilize aggregations of existing digital objects. And we need to continue assisting researchers and life-long learners with the personal research collections they are rapidly building.
In the coming months, OCLC Programs and Research will begin to address many of these issues through our new work agenda. We will also work to create and provide an environment to document and convey information about new areas that are increasingly important for libraries, archives, and museums. This new information resource is likely to cover key collaborative work falling into the following areas: Curating the Collective Collection; New Modes of Research, Teaching and Learning; Renovating Descriptive and Organizing Practice; and New Service Infrastructures. Stay tuned.
Again, we want to acknowledge the strong readership and support of this publication. DigiNews was successful because of your contributions and readership. Your ideas and suggestions for the new publication, as always, would be welcome. Thank you.