
Introduction
In 2003, the U.S. National Endowment for the Humanities (NEH) awarded Cornell University Library a grant to develop a digital preservation management training program, consisting of intensive weeklong workshops and the creation of an online tutorial. In 2005, NEH provided a second grant to continue the program through 2006. Seven workshops have been held, reaching 164 individuals from 110 institutions. Prior to each workshop, participants were asked to complete an online survey designed to assess their institution’s readiness to develop and maintain a digital preservation program. This article reports on the aggregate findings from this survey and offers comparisons where possible to RLG’s 1998 survey on the status of digital archiving in its member institutions. A comparison of findings from these surveys provides some means for assessing change in institutional response to digital preservation over an eight-year span.
Much has occurred in the interim to raise the visibility of digital preservation concerns at the institutional level:
- dominance of digital production for almost all forms of communication including scholarly content and institutional records
- ubiquity of the Internet and broadband communication—and the attendant vulnerability due to security concerns and the ephemeral nature of Web-based content
- increasing shift to e-only access
- massive digitization efforts to bring analog materials onto the Internet
- development of key research, standards, open source software, and legal requirements to support digital preservation
- mind-numbing changes in the hardware, software, media, and formats used to create and deliver information
Have these factors affected institutional responses to digital preservation needs? By no means definitive, the results from the Cornell survey do offer a picture of the growing trend by institutions to address these needs, as well as point to some of the areas requiring additional work.
The Cornell Survey
The survey of institutional readiness was designed to prepare participants for the Digital Preservation Management workshop by encouraging them to consider their organization’s digital preservation efforts in terms of scope, priorities, resources, and overall readiness. The survey itself was divided into three main sections, reflecting our approach to digital preservation training: organizational infrastructure, technological infrastructure, and resources framework.
Not all participants completed the survey, but most did. Over the past seven workshops, we received surveys from individuals representing 100 distinct institutions, for an overall response rate of about 90% among the institutions that sent staff to one of the workshops. Fourteen institutions sent participants to more than one workshop. We included these survey responses as separate data points in the results to reflect the level of progress that had been made in the interim. The tally of institutions included in the survey results, therefore, is 114, greater than the total number of distinct institutions. The number of institutions counted in this report represents 82 respondents to the 2003-2004 survey and 32 to the 2005 survey.
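The tally in the paragraph above can be checked with a short sketch. The figures come from the text (the count of 110 attending institutions is taken from the Introduction); the variable names are illustrative, and the sketch assumes each of the fourteen returning institutions contributed exactly one additional data point.

```python
# Figures quoted in the article; variable names are illustrative only.
institutions_attending = 110  # institutions represented at the seven workshops
distinct_respondents = 100    # distinct institutions that returned a survey
repeat_institutions = 14      # assumed to add one extra data point each

# Share of attending institutions that returned at least one survey.
response_rate = distinct_respondents / institutions_attending

# Data points reported in the results, counting repeat institutions twice.
data_points = distinct_respondents + repeat_institutions

# Split of data points by survey period, as stated in the text.
by_period = {"2003-2004": 82, "2005": 32}

print(f"response rate: {response_rate:.1%}")
print(f"data points: {data_points}")
print(f"periods sum to data points: {sum(by_period.values()) == data_points}")
```

Under these assumptions the two ways of counting agree on 114 data points, and the response rate works out to just under 91%, consistent with the rounded figure of 90%.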
Participants came from across the United States as well as eight other countries. Half came from academic institutions and over a quarter represented government institutions (international, national, state, and local), including five national libraries. Table 1 provides an institutional breakdown. “Other” includes consortial, corporate, foundation, public school, and religious entities.
| Type of Institution | Number | Percentage |
|---|---|---|
| Academic Library | 57 | 50% |
| Government | 31 | 27% |
| Institute/Museum | 6 | 5% |
| Public Library | 4 | 4% |
| Other | 16 | 14% |
| Total | 114 | 100% |
Table 1. Survey respondents by institutional type.
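As a cross-check on Table 1, the following minimal sketch computes the percentage share implied by each count. The counts are those given in the table; the dictionary and variable names are illustrative, and rounding to whole percentages is an assumption about how the table was prepared.

```python
# Counts per institutional type, as given in Table 1.
counts = {
    "Academic Library": 57,
    "Government": 31,
    "Institute/Museum": 6,
    "Public Library": 4,
    "Other": 16,
}

total = sum(counts.values())

# Share of each type, rounded to the nearest whole percentage (assumed convention).
shares = {kind: round(100 * n / total) for kind, n in counts.items()}

for kind, n in counts.items():
    print(f"{kind:<18} {n:>4} {shares[kind]:>4}%")
print(f"{'Total':<18} {total:>4} {sum(shares.values()):>4}%")
```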
Survey caveats:
Although many institutions are represented in this survey, there are some caveats to consider in reviewing the findings. First, although the participants represented those charged with some level of digital preservation responsibility at their respective institutions, individual assessments do not necessarily represent official responses. For instance, we offered participants the option to reply “don’t know” (or leave blank) to most of the questions. Presumably a formal institutional response would not include that option. It is also worth mentioning again that participants completed the surveys before coming to Cornell for the workshop, and it’s quite possible that some of their responses would have changed if they had responded after the workshop. Second, for the first three workshops, all participants were asked to complete the survey, including those coming from the same institution. Because the results showed some disparity in responses from participants from the same institution, we limited the survey submissions to one response per institution, beginning with the fourth workshop, held in July 2004. We asked participants from the same institutions to prepare a consensus response. Third, the survey instrument was not intended to capture all information reflecting an institution’s commitment to digital preservation, only indicators of efforts in key areas. Fourth, we used the same survey form for the first five workshops held in 2003-2004 (August and October 2003; May, July, and November 2004). In 2005 we slightly modified the survey form before each of the workshops (May and July) to capture additional information in response to new developments and to provide greater clarity to some of the questions asked. The current version and the 2003-2004 version are posted on the workshop website. Fifth, institutions that sent participants to the workshop were presumably fairly motivated by their concerns over digital preservation. The results therefore may reflect a self-selection bias and should not be construed as a representative sample of any institutional type as a whole.
Survey Findings
Participants submitted their institutional readiness surveys online. Cornell entered the responses into a spreadsheet and analyzed them before each workshop. Some data recoding was necessary in the case of invalid responses and in order to combine multiple responses from the same institution. Responses from the five 2003-2004 workshops (82 institutions) were aggregated and collectively compared. Similarly, responses from the two 2005 workshops (32 institutions) were aggregated for comparison. Due to changes in the survey form itself, it is difficult to compare several questions from the 2003-2004 data to the 2005 data, and some information is only available for 2005. This discrepancy accounts for some variation in years spanned for the reported findings below. It should also be noted that not all of the survey questions are presented in this article.
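The rule used to combine multiple responses from one institution (described in note ii: a “yes” or “no” outranks “don’t know,” and “yes” outranks “no”) can be sketched as follows. This is an illustration only, not the procedure Cornell actually ran in its spreadsheet; the function name is hypothetical, and treating blank answers as “don’t know” is an assumption based on the survey caveats.

```python
def consolidate(answers):
    """Combine several answers from one institution into a single value.

    Precedence follows note ii: "yes" beats "no", and either beats
    "don't know". Blank answers are treated as "don't know" (an assumption).
    """
    # Normalize case and whitespace; drop blank or missing answers.
    normalized = [a.strip().lower() for a in answers if a and a.strip()]
    if "yes" in normalized:
        return "yes"
    if "no" in normalized:
        return "no"
    return "don't know"


# Hypothetical responses from three institutions to one yes/no question.
examples = {
    "Institution A": ["yes", "no"],
    "Institution B": ["don't know", "no"],
    "Institution C": ["", "don't know"],
}
for name, answers in examples.items():
    print(name, "->", consolidate(answers))
```

Because an affirmative answer always wins, this rule can only raise the share of “yes” responses relative to counting each individual separately, which is worth keeping in mind when reading the 2003-2004 figures.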
Organizational Infrastructure
Insufficient attention has been paid to the institutional context in which digital preservation programs must be developed. One measure of organizational readiness is the development, adoption, and implementation of policies that address digital preservation commitments and decisions. Too often, an organization undertakes responsibility for digital stewardship without first ensuring that the necessary policies and controls are in place or that the institution itself views digital preservation as a core mandate. In the 2003-2004 survey, only 26% of institutions reported that their organizational mission statement explicitly committed them to the long-term preservation of valuable digital materials that they had acquired or created. In 2005, the question was rephrased to ask whether institutional mission statements could be interpreted as supporting long-term preservation: 63% of the 32 institutions reported in the affirmative, 28% said no, and 9% didn’t know. The first challenge for those responding “no” will be to ensure that a strong institutional case for digital preservation can be made: without it the chances for maintaining a program over time will be low.
Asked whether the institutions had any policies and practices that covered long-term access to digital content (as opposed to those covering digitization), 52% indicated that they did, 36% did not, and the remaining 12% did not know. This finding is consistent with that of the RLG Survey of 1998, when half of the institutions with digital holdings reported they had policies for managing them. However, in the RLG survey, these policies included guidelines for acquisition and digital conversion activities. It would be inaccurate, therefore, to conclude that little progress has been made in the development of policies specifically related to digital preservation. Indeed, a Digital Library Federation survey in 2001 of its member institutions reported that only 33% of those responding (7 out of 21) had a formal preservation policy. A 2002 preservation survey, by the Council on Library and Information Resources of libraries at leading liberal arts colleges, land grant institutions, and mid-sized universities, indicated that only 6% of respondents (4 out of 67) had developed a preservation plan for digital resources.
Developing policies is a good first step, but for a digital preservation program to develop effectively, the policies must be vetted and approved at the senior management level and then implemented. Figure 1 indicates that only about one-third of the surveyed institutions have completed all three steps.
Figure 1. Comparing the Availability of Policies to the Percent Vetted and Implemented, 2003-2005
In the 2005 survey, we included questions on the development of specific policies covering the following:
- stakeholder roles and responsibilities
- selection, de-selection, and acquisition
- quality creation requirements and procedures
- deposit guidelines
- transfer requirements
- preservation strategies
Figure 2 presents the responses to questions on these topics from the 13 institutions in the 2005 survey that indicated they had policies in place. Over half of these have developed policies relating to stakeholders, selection, and quality requirements. Fifty-four percent have defined digital preservation strategies, compared to 39% reporting guidelines for preservation action in the RLG survey. However, over 40% responded that they did not have formal policies covering deposit guidelines or transfer requirements. These responses indicate key areas needing attention at the institutional level.
Figure 2. Comprehensiveness of Digital Preservation Policies in 13 Institutions, 2005
Technology Infrastructure
Organizations tend to rely on or create digital content first and address long-term access issues later. The technology section of the institutional readiness survey addressed digital content, data management practices, storage, preservation actions, and depository development.
Figure 3 reflects the diversity and pervasiveness of content types managed by the institutions surveyed. Of the 11 digital object types mentioned in both surveys, only one, GIS files, was held by under half of the institutions (45%). Over ninety percent of institutions held websites, digital images, and PDFs; over 85% held word processing files and databases/spreadsheets; over 70% held audiovisual digital content. This represents a significant increase in the prevalence of content types over those reported in the RLG survey. In that survey only 36 of 54 institutions reported digital holdings. Of those, 55.6% reported word processing files, 50% reported audio files, 38.9% video, and 38.9% spreadsheets. Both our survey and the RLG study also looked at the number of formats that an institution needs to maintain. RLG found that the majority of institutions maintained at least six different formats. In the Cornell survey, on average, each institution affirmed responsibility for maintaining 9 of the 11 formats mentioned.
Figure 3. Common Digital Object Types Managed by Institutions, 2003-2005
Digital storage is a significant factor in digital preservation programs. We asked institutions to report the kinds of storage they were using, and Figure 4 reflects the breakdown. Nearly 90% of all institutions reported using optical/magneto-optical disks and 85% reported using online storage. In 2005, we asked participants to distinguish between access copies, master files, and backup in terms of storage options. Figure 5 provides those results. We expected to find that access copies were stored online. What surprised us was the percentage of masters and even backup copies that were maintained online. Similarly, we were intrigued to note that 63% of institutions in 2005 still maintain access copies on a removable storage device (CD, DVD, etc.).
Figure 4. Storage Media Used, 2003-2005
Figure 5. Storage Options Used by Function, 2005
The next set of questions in the survey covered various aspects of the file management program that would support digital preservation, such as the use of high-quality media, backups, security, and the like. Responses indicated that most institutions practice good storage management processes in terms of redundancy, media used, and storage location. Figure 6 compares the responses from 2003-2004 to those from 2005, indicating a trend toward better file management practices. For instance, in 2005, 75% reported that they provided an environmentally controlled location for digital content compared to 43% in 2003-2004, a 32 percentage point increase. Fifty-nine percent had disaster recovery plans in 2005 compared with 28% in 2003-2004, a 31 percentage point increase. Neither group, however, has done much to develop a media testing or refreshing/migration program. This is clearly an issue that institutions should address as part of their digital preservation program.
Figure 6. File Management Storage Practices, 2003-2005
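The increases quoted for Figure 6 are percentage point differences, which are easy to confuse with relative (percent) change. The sketch below, using the four percentages given in the text and a hypothetical helper function, computes both for comparison. Note that the two periods cover different groups of institutions (82 and 32 respondents), so these are differences between cohorts rather than changes within the same institutions.

```python
def change(earlier_pct, later_pct):
    """Return (percentage point difference, relative change) between two rates."""
    points = later_pct - earlier_pct
    relative = points / earlier_pct
    return points, relative


# (2003-2004 percentage, 2005 percentage), as quoted in the text.
practices = {
    "environmentally controlled location": (43, 75),
    "disaster recovery plan": (28, 59),
}

for name, (earlier, later) in practices.items():
    points, relative = change(earlier, later)
    print(f"{name}: +{points} percentage points ({relative:.0%} relative increase)")
```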
One of the greatest threats to continuing access to digital content is the rate of obsolescence of file formats, storage media, and the supporting hardware/software to access and use digital objects. In the 1998 RLG study, 41.7% of the institutions reported that they lacked the operational/technical capacity to mount, read, and access some digital materials in their holdings. In 2005, 44% of institutions participating in the Cornell survey reported that same difficulty, although the actual figure may be higher: 28% of respondents left the question blank or indicated they didn’t know.
Responding effectively to obsolescence is an absolute imperative for a digital preservation program. But how many institutions have already risen to that challenge? In the Cornell survey, over 50% of all institutions (2003-2005) had already undertaken actions to extend the life of threatened digital content. Figure 7 breaks this down in terms of specifics for the 2005 respondents: the most common action reported was keeping pace with the changes in storage media.
Figure 7. Actions Taken to Extend the Life of Digital Content, 2005
Digital repository development is a relatively new phenomenon, and repositories that specifically address digital preservation are of very recent vintage. Two key documents have spurred this activity. The first is the final version of the Reference Model for an Open Archival Information System (OAIS), published in January 2002. Development of OAIS began in 1995 and OAIS became an international standard, ISO 14721, in 2003. The second is Trusted Digital Repositories: Attributes and Responsibilities, an RLG-OCLC report published in May 2002. Prior to 2002, the digital preservation community lacked foundation documents like these to serve as the basis for institutional programs and to enable the effective exchange of information and developments between institutions.
We asked institutions to report whether they had established any kind of depository arrangements for managing their digital collections and, if so, how. Over one-third of all institutions reported that they had. Nearly forty percent indicated that their institution was committed to the development or use of one that was OAIS compliant, although another 39% did not know or left the question blank. Figure 8 indicates the type of arrangements for depository development these institutions have chosen. Most are developing in-house solutions (creating their own software or relying on such developments as DSpace and Fedora) rather than contracting with third-party services or making consortial arrangements.
Figure 8. Choices made by institutions establishing depository arrangements, 2003-2005
Resources Framework
Once the need to establish a digital preservation program is recognized and there is the will to do so, institutions must be ready to build and sustain the program. This requires the ongoing commitment of resources: financial, human, technical and other.
The survey form used in 2003-2004 asked respondents whether their institutions currently had set aside funding or made an ongoing commitment to the long-term maintenance of digital collections. In 2005, the question was rephrased to ask whether there was funding dedicated for the long-term maintenance of digital collections. A little over one-third of the institutions reported that they did. Many more wrote of relying on one-time monies or grant funds to support the program. An unsettling number of respondents did not know whether there was any ongoing support for the program.

Figure 9. Sustainable Funding for Digital Preservation, 2005
In the 2005 survey we included questions on human resource commitment to digital preservation. First we asked whether there were staff members specifically charged with digital preservation responsibility. Fifty-nine percent responded that there were, 38% replied there were not, and the remaining 3% did not know. In addition to staff, organizational and technical expertise is needed to build and sustain digital preservation programs. Impressively, the majority of institutions were confident that they possessed both. This is in sharp contrast to the findings of the 1998 RLG Survey, where the lack of staff expertise was a common problem: under 25% of institutions holding digital content ranked their staff as expert. Organizational expertise seems to be in shorter supply than technical expertise, as highlighted in Figure 10. Expertise can be gained through training, and 44% of institutions reported that training was adequately supported. It can also be supplied by outside experts: 34% of institutions participating in the 2005 workshops indicated that they were currently using outside experts.
Dedicated staff and the requisite level of expertise are critical human resources. Neither will turn a burgeoning initiative into an ongoing effort without strong administrative support at the top. Close to half of the institutions reported that their senior management views digital preservation as a key priority. Figure 10 highlights the human resource commitment of institutions participating in the 2005 survey.
Figure 10. Human Resource Commitment, 2005
Overall, survey respondents seemed optimistic about the resources dedicated to their technological infrastructure. Fifty-nine percent felt that their organization had sufficient hardware and software to build and/or sustain a digital preservation program, with requisite upgrades and enhancements over time. Only 12% did not feel that the infrastructure was adequate (Figure 11).

Figure 11. Adequacy of Current Technological Infrastructure, 2003-2005
Conclusion
There is increasing evidence that cultural institutions are taking seriously the need to safeguard digital heritage materials, in large measure because they can no longer avoid the problem. The correlation between the acquisition of digital content and digital preservation practice was noted in the RLG survey. The prevalence of digital content and concerns over its continuing accessibility were documented in the Cornell survey as well, as were the actions taken to protect these assets. We’ve noted the development of policies, the growing awareness among senior managers, the commitment of resources, the adequacy of technical infrastructure, and direct practical experience with some preservation activities. Probably the biggest difference between the RLG survey and the Cornell survey was the shift in focus from technology concerns to organizational ones. In 1998, RLG member institutions ranked technology obsolescence as the greatest threat to the loss of digital materials. In 2005, respondents to the Cornell survey ranked it fourth in a list of five major concerns. Nearly twice as many of them cited insufficient policies and plans as the greatest threat. Figure 12 lists the threats to digital content as ranked by twenty-two institutions.
Figure 12. Threats to Digital Materials, 2005
Cornell will continue to collect survey data through 2006. And, as announced in this issue, the Digital Preservation Coalition has launched the UK Digital Preservation Needs Assessment survey. Surveys such as these will help build a picture of the current state of digital preservation activity and identify the most pressing needs. This baseline information will help institutions and consortia alike in planning concrete steps to meet the greatest preservation challenge facing society today. At the end of each workshop, we asked participants what would increase their confidence levels in addressing digital preservation. The most common response was practical experience and hearing from others about what works and what doesn't.
i. In 1998, at the recommendation of the 1997-98 RLG Preservation Working Group on Digital Archiving, RLG funded a study of the status of digital preservation practices and needs in RLG institutions. Research by Margaret Hedstrom and Sheon Montgomery at the University of Michigan mapped concerns for and obstacles to digital preservation in our member libraries, archives, and museums. It outlined their current policies and practices for preserving digital materials, and captured their expectations and priorities for RLG.
ii. Seven institutions submitted two individual responses to the survey and one institution submitted three responses. When different responses were received for questions that could be answered as “yes,” “no,” or “don’t know,” Cornell counted the “yes” or “no” responses rather than the “don’t know” responses. In cases where two participants responded with a “yes” and a “no” respectively, the affirmative response was kept.
iii. In 2005, the question was rephrased to ask whether the institution had written policies and procedures that addressed specific aspects of long-term access.
