Maggie Jones, The Arts and Humanities Data Service
Background
Digital preservation has been given increasing prominence and priority for some time. In 1996, a specially commissioned U.S. Taskforce on Digital Archiving [1] published the final report of its work. The impact of the work of the Taskforce has been felt world-wide. In the UK, it was a key influence in a workshop sponsored by the Joint Information Systems Committee (JISC) and the British Library [2]. A series of research reports was commissioned by JISC and the National Preservation Office (NPO), which served to highlight various aspects of digital preservation [3]. The reports provided a broad overview of the issues and two in particular (Beagrie and Greenstein [4] and Hendley [5]) recommended that further research be undertaken to explore the issues they raised in more detail. Both the JISC/NPO studies and a second workshop in digital preservation organised by the JISC and the British Library in 1999 at Warwick identified the need to improve guidance on digital preservation. At about the same time, a survey commissioned by RLG [6] investigated the needs of member institutions. In addition, Government-led initiatives such as the People's Network have resulted in substantial funding to provide high quality content via the New Opportunities Fund (NOF) digitisation programme [7]. The e-government strategy provides further evidence of the determination to fully exploit the benefits of technology for more efficient and effective services [8].
A clear picture emerged from these activities of a complex and rapidly changing environment in which those creating and/or acquiring digital resources would require guidance on how to manage those resources most effectively. Given the increasing reliance on digital access and rapidly accelerating digital resource creation, there is a sense of urgency in providing a tool aimed at fast-tracking knowledge and understanding of the longer term implications of the technology.
There is a rapidly increasing volume of information which exists in digital form. Whether created as a result of digitising non-digital collections, created as a digital publication, or created as part of the day-to-day business of an organisation, more and more information is being created digitally and the pace at which it is being created is accelerating. This activity is occurring in an environment in which there is a growing awareness of the significant challenges associated with ensuring continued access to these resources, even in the short term. There is also increasing acknowledgement of the crucial role of the creator of those resources in facilitating later preservation effort by others. At the same time a significant body of experience is emerging from research projects into digital preservation and from established data archives in the sciences and social sciences.
Given this conjunction of factors, it seemed timely to embark on a study which aimed to identify and promote good practice in creating and managing digital resources in order to maximise the potential of their initial investment.
In 1999 the Arts and Humanities Data Service (AHDS) submitted a proposal to the Library and Information Commission (which later became Re:source) as part of their Preservation of and Access to the Recorded Heritage Research Programme. The proposal aimed to build on work which had already taken place in identifying the broad issues and challenges associated with digital preservation, and to provide more detailed guidance to all those creating and/or acquiring digital resources. The AHDS has considerable experience in collecting and managing digital resources and has been active in providing guidance in creating digital resources for the arts and humanities. Many of the challenges associated with ensuring continued access to digital resources are identical regardless of how or where they are created, so it made sense to build on this practical experience and to aim at a wider audience. The project was awarded funding of £33,561 from the Library and Information Commission in June 1999, with contributions in kind from the AHDS, the Joint Information Systems Committee (JISC), the Advisory Group, and participating case studies. The work was undertaken between July 1999 and September 2000.
An Advisory Group consisting of experts in the field of digital preservation was formed, all of whom have first hand knowledge of the range of complex issues involved. An early decision was that a workbook would be the most appropriate mechanism to provide the range of advice and guidance required for such a diverse audience. Research to compile the workbook combined traditional desktop research, utilising the World Wide Web as a source of freely available current information, as well as subscription-based print and electronic journals, supplemented with case studies and specialist interviews. Three very different case studies were selected to help develop the practical nature of the workbook and to ensure that it addressed key issues currently being faced by diverse organisations. Through structured interviews with selected specialists, workshops and conference presentations, and the case studies, it was possible to assess the overall level of awareness and understanding of digital preservation and to transfer that information to the development of the workbook.
In general, the study found that the level of awareness of and interest in digital preservation is gradually increasing but is not keeping pace with the level of digital resource creation. In particular, institutions that have not played a role in preserving traditional collections do not have a strong sense of playing a role in preserving digital resources. Individual researchers were keen to "do the right thing" but frequently lacked the clear guidance and institutional backing to enable them to feel confident of what they should be doing. The difficulty of allocating responsibilities for preservation and maintenance in an environment in which digital resource creation is frequently a by-product of collaborative projects, which may well be funded by yet another external agency, was also mentioned. Overall, it appears that there is still a need to raise the level of awareness of digital preservation, particularly among funding agencies and senior administrators with responsibility for the strategic direction of an institution. This needs to be combined with more detailed guidance and training at the operational level. Moreover, the guidance needs to be able to accommodate people with varying levels of awareness and understanding of digital preservation, in a wide range of institutional settings, all of whom have significant constraints on their time.
With these external influences, the workbook developed as a resource which could be used either in discrete sections or as a whole. It would bring together key existing sources of advice and guidance in one convenient package, as well as providing a focus for issues not yet resolved. The structure of the workbook has attempted to reflect different organisational hierarchies. The need to tailor the workbook to the needs of individual institutions, including those where digital preservation may be outsourced and those where digital preservation may only be short-term, means that the workbook needs to be seen as acting as a catalyst for further concerted action within and between institutions. A further project, this time funded by the British Library Co-operation and Partnership Programme, will provide the opportunity to test how the workbook translates into practice, within specific institutional settings.
Audience
Digital preservation has many parallels with traditional preservation in matters of broad principle but differs markedly at the operational level and never more so than in the wide range of decision makers who play a crucial role at various stages in the lifecycle of a digital resource. Consequently, the workbook is aiming at a very broad audience. In the first instance it is intended to provide guidance to institutions at national, regional and local levels who are involved in or contemplating creation and/or acquisition of digital materials. Within those institutions, the workbook is aiming at both administrators and practitioners and is accordingly structured to include a mix of high level strategic overviews and detailed guidance. In addition, the workbook is aimed at service providers who may be in a position to provide all or part of the services needed to preserve digital materials. It is also relevant to funding agencies who will need to be aware of the implications of creation of digital materials. Finally, it will be of interest to data creators whose involvement in the preservation of their digital publications is still crucial, despite being restricted by the overarching business needs of their organisation.
A project funded by the British Library Co-operation and Partnership Programme will take place between October 2000 and March 2001. This will continue the development of the workbook and involve collaboration between the British Library, the Universities of Oxford and Cambridge, and the National Preservation Office.
Major objectives of the Workbook
- Awareness raising
As mentioned earlier, awareness raising was regarded as still being very necessary, largely because digital preservation requires a very different approach from traditional preservation. The very word preservation tends not to resonate with people, who often assume the traditional model, in which preservation is a somewhat mysterious activity undertaken by highly specialised staff at a much later stage, rather than the more organic approach required for digital preservation. So this workbook will play a role in trying to reach a large number of people who may not to date have considered that digital preservation has anything to do with them. It is aiming at a very wide and diverse audience, basically all institutions engaged in creating and/or acquiring digital resources. Within those institutions, it is aiming at both senior administrators, who have influence over the development of corporate policies and the allocation of resources, and middle level managers, those at the coalface who actually need to deal with this material.
- Collating existing relevant advice and guidance
There is a great deal of helpful information out there, especially in terms of creating digital resources, not least of course, the AHDS Guides to Good Practice series. But these resources are not necessarily readily accessible, especially to those people who are relatively new to this area. The sheer volume of information can be quite bewildering so an important aim is to help people navigate their way through the mass of existing information, making it simpler to find the resource which best suits their particular purpose.
- Providing advice and guidance
In addition to providing links to guidance documents which already exist, the workbook aims to provide guidance in the form of a decision tree, checklists, and tables which go through the stages in creating and acquiring digital resources, drawing attention to issues which may not necessarily have been thought of. A major theme running through the workbook is that it is much better to take decisions as early as possible, preferably at creation, to avoid the risk of losing access to the digital material at a relatively early stage.
- Empowering organisations to take action
This final aim is an important one, given the complexity of the issues and the speed of developments. It would be easy to simply defer developing any corporate policies and strategies relating to digital preservation until the whole scene has settled down and the results of research are known. This is an area where this workbook meshes quite well with the Cedars project, which is conducting critically important research into preservation methods. In particular its work will be crucial in helping to identify the most appropriate long-term digital preservation strategies for particular categories of digital materials. However it is important for institutions to know that they can and should take some action now; they don’t need to wait until everything is resolved. Indeed there is likely to continue to be a state of uncertainty and rapid change for the foreseeable future, but that needn’t inhibit institutions developing an approach to creating and acquiring digital materials based on sound principles and policies. This approach will help to provide those materials with a significantly improved chance of survival.
Feedback on the workbook
A consultation period for peer review and assessment was provided between 8 August and 4 September 2000. A total of sixty-seven individuals, including the advisory group, case study interviewees, and a number of experts either in digital preservation or a related area of expertise, were invited to comment on the workbook. The invited respondents represented a very diverse geographic and sectoral constituency, in keeping with the broad audience the workbook is aiming at. A total of twenty-seven responses were received by the end of September 2000, representing a response rate of around 40%. The overall quality of the responses was very high, with several extremely detailed commentaries. The value of this feedback, in terms of testing the effectiveness of the workbook and providing constructive suggestions for improvements, would be difficult to overestimate. We disagreed with a few comments while we felt others were valid but not feasible, certainly within the tight timeframe we needed to work to at the time. Even when we disagreed with comments, though, they still stimulated us to clarify what we were trying to convey, as we realised the message wasn't always as clear as we had thought. For the most part, however, we did agree with the suggestions, which ranged from quite straightforward stylistic comments to substantive proposals involving a fairly major re-assessment. Many of the suggestions were incorporated in time for the PDF version of the workbook (which at the time of writing is expected to be available on the AHDS website from November 2000). Some presentational suggestions were deferred to the hard copy publication, due in June 2001, while other comments were put on a temporary back burner, awaiting further development work at a later stage.
Major lessons from Feedback
- Definitions.
Digital preservation is an emerging discipline so unambiguous communication is difficult. People use the same terms differently within and between sectors. We accepted that the definitions provided in the workbook were unlikely to be universally accepted but we did at least ensure that they were consistently applied throughout the workbook.
- Guidelines vs guidance.
Some people wanted more concrete recommendations especially for formats. We did provide sources of detailed guidance on formats for specific subject areas or classes of digital objects but had deliberately avoided making global recommendations as:
a) it is difficult to be prescriptive in such a rapidly changing environment; and
b) there are many different scenarios and, depending on what's needed by the specific institution and their clientele, there may be perfectly reasonable pragmatic decisions made which are not necessarily the perfect preservation format (even assuming such a thing can be said to exist).
Nevertheless, we accept that many institutions would find such advice useful and have taken the suggestion on board for possible future development. The approach we have in mind is developing a tool which will help to identify the most appropriate format for a particular purpose, as opposed to rigid rules.
- Too repetitive.
Some felt there were the same issues coming up repeatedly. This is a valid comment, but the original idea was that the workbook could be used in discrete sections and would appeal to a wide audience not all of whom would want to, or even need to, read through the whole text. The same issues therefore do tend to occur albeit in slightly different contexts. While some repetition was removed in the pre-PDF final editing, there are a number of key messages, which are reinforced throughout the document, which some may regard as repetitive. As the workbook develops, particularly as an interactive web publication, this issue may be revisited.
- Too scholarly!
Several pointed out that its large volume of text and comprehensive nature made it quite inaccessible at times. Some suggested more graphics, though it was difficult to condense many of the issues into attractive but also comprehensible pictures. Others suggested good ideas for making the large amounts of text more easily digestible through better layout and using simple but effective devices such as footers to help keep track of where you are in the workbook. We adapted some of these ideas for the PDF version of the workbook and additional formatting will be incorporated into the print publication. Despite adding to the overall length, we feel it is important to include more practical examples to help illustrate key points in the workbook. These will be added to later versions of the workbook. A later project (the JCEI project discussed in more detail below) will develop a nested set of guides with Cedars which will be integrated with the workbook. These are expected to run to two sides of A4, compared with the 150 or so pages of the workbook, so will be appropriate for novices.
- Not enough information on costs.
Some felt this was the weakest part of the workbook, even while acknowledging the difficulties of finding reliable data on cost models. It is however the part most organisations feel they need help with, as their ability to plan effectively is perceived to be largely dependent on accurate predictions of costs. We were unable to provide more than an indication of the cost elements that need to be taken into account. This was partly due to a dearth of empirical data on long-term costs associated with the full range of digital objects organisations might reasonably expect to need to manage. We also tried to draw a parallel with traditional preservation, where precise costs are also often difficult to pin down, though this has not necessarily proved to be an impediment to the development of effective preservation programmes. With digital materials, where the costs of providing current access often overlap with preservation costs, and future access demands are not always possible to forecast, it is unrealistic to fix exact costs. The workbook has been designed primarily as a training tool and it may be that this is an area which lends itself particularly well to using the workbook in conjunction with a workshop.
- UK Focus.
Overseas respondents drew attention to the fact that we had slipped into using specifically UK examples at times and we needed to be conscious of explicitly stating when we were doing this. Overall the issues are of course global in nature and we tried to reflect this more clearly in the final draft.
- Emphasis on digitisation.
This was a slightly contentious issue. A few respondents felt disappointed that we had placed so much emphasis on digitisation in general, and digital imaging in particular. Some implied this was something of a "cop-out" as they felt we had concentrated on "the easy stuff". One even suggested we had contradicted ourselves, as we had taken some pains to clarify in the introduction that the emphasis in the workbook was on the preservation of digital objects however they come into being, and not on the potential use of digitisation as a preservation reformatting tool. We tried to clarify this issue in the final draft. Our major concern was to draw attention to the need to take account of digital preservation issues when creating digital surrogates. Since this is a major source of current activity, and one of the driving forces behind the decision to make the proposal in the first place, we felt we needed to give it prominence in the workbook too. In addition, there are excellent sources of existing guidance in this area, particularly for digital imaging, and we wanted to raise awareness of those too. Another point to bear in mind is that the workbook cannot be, and does not claim to be, cutting edge in its approach to technology. It is to a large extent reflecting what's currently happening but attempting to encourage good, if not best, practice.
- Overall message.
By this I mean the extent to which respondents took away the main message we wanted to deliver. We trod a fairly fine line between the need to acknowledge the very real challenges that still exist in digital preservation and the desire to encourage action by institutions in developing policy and procedures. One commentator felt we had focussed too much on the difficulties and not enough on the considerable progress which has already been made. Another felt that the sheer size of the workbook might in itself be alarming and therefore off-putting to some readers. In general the majority of respondents seemed to feel the overall tone was encouraging and genuinely helpful but we took care to try to correct any impression that digital preservation is either brand new or frightening. We believe the workbook will be most effective when it is used in conjunction with other training materials and workshops and this will help to clarify any potential mixed messages.
- Need for maintenance.
Many respondents mentioned the need to maintain the currency of the workbook, particularly important given the speed of developments. This had already been the focus of much attention by the advisory group who agreed that the workbook needed to have a continuing life beyond the end of the specific project which brought it into being. The recently established JISC Digital Preservation Focus has as a major priority the establishment of a digital preservation coalition. This will be an ideal mechanism to continue development of the workbook, both to further develop its usefulness as a training tool as well as to ensure its currency.
In general, responses to the workbook have been positive. The favourites have been the decision tree and the number of resources included in the workbook. Several respondents mentioned having discovered new resources this way, yet another indication of the burgeoning volume of material dealing with digital preservation issues.
Where To From Here?
In the meantime, a six-month project funded by the British Library Co-operation and Partnership Programme will put the workbook into practice. A major challenge of the workbook was in appealing to a very wide and diverse audience. A challenge of this project will be whether the workbook can successfully be applied in more specific institutional contexts. The project will involve deposit libraries, specifically the British Library and the Universities of Oxford and Cambridge, so there are many similarities, though there will also be significant differences between them. The NPO is also actively involved in this project and continuity with the earlier project will also be provided through the advisory group, some of whom were also represented on the advisory group for the Re:source funded project. This will ensure input from relevant organisations such as RLG, HATII, ULCC, and Re:source, as well as the participating institutions. As the workbook emphasises, collaboration is crucially important in the digital world and it is fully intended to put that advice into practice through this and subsequent projects.
A further project, this time funded by the JISC Committee for Electronic Information (JCEI), has been approved which will further develop the workbook. This proposal includes a number of elements but the one of most relevance to this paper is the promotion of good practice, awareness raising, and training in digital preservation, in this instance, within the HE and FE sectors. The development of workshops and guides, in collaboration with the AHDS and Cedars, will endeavour to bridge a gap between the high-density information contained in the workbook and simple overviews. The guides should hopefully take on board some of the concerns expressed at the length of the workbook, and will aim to make some of the information contained in it more easily digestible. The JCEI project is scheduled to commence in April 2001.
Conclusion
The workbook joins a growing number of resources and initiatives focussing on digital preservation. The increasing prominence given to digital preservation world-wide is an indication of how seriously it is being treated. There is unlikely ever to be a single definitive solution and certainly nothing will preclude the need for individual institutions to commit time and effort to addressing their specific requirements. However there is now much to build on and an increasing number of practical examples to provide both inspiration and guidance. The overall theme of the workbook is that while the issues are complex and much remains to be clarified (and may never be definitively resolved), there is nevertheless much that can still be undertaken immediately. This activity will help to protect the initial investment in digital resource creation and offer considerably improved prospects for the long term.
References
1. Task Force on Archiving of Digital Information, Garrett, John and Waters, Donald (chairs). Preserving Digital Information: Report of the Task Force on Archiving of Digital Information. Commission on Preservation and Access and the Research Libraries Group. 20 May 1996.
2. Long Term Preservation of Electronic Materials. A Report of a Workshop Organised by JISC/British Library, held at the University of Warwick on 27-28 November 1995. British Library R & D Report 6238.
3. The seven commissioned reports are available at the UKOLN website.
4. Beagrie, N. & Greenstein, D. (1998). A Strategic Policy Framework for Creating and Preserving Digital Collections. Version 4.0 (Final Draft). ELib Supporting Study P3. Library Information and Technology Centre, South Bank University, London.
5. Hendley, T. (1998). Comparison of Methods and Costs of Digital Preservation. British Library Research and Innovation Report 109. London: the British Library.
6. Hedstrom, M. & Montgomery, S. (1998). Digital Preservation Needs and Requirements in RLG Member Institutions. Mountain View CA: RLG.
7. NOF-digitise programme: stage two support.
8. Central IT Unit (CITU). Information Age Government Champions.