Margaret Byrnes, National Library of Medicine (U.S.)
The need to ensure the long-term survival of information that is issued only in electronic form has been a topic of discussion at the National Library of Medicine for several years. NLM has a formal mandate to collect and preserve the record of biomedicine. We understand that mandate to include "born digital" biomedical materials as well as other formats. This issue has been a recurring topic at meetings of the NLM Board of Regents and has emerged as one of the goals in our long-range plan for 2000–2005. Under the objective "Acquire, Organize, and Preserve Biomedical Information" the plan includes the following goals:
- Take a leadership role in ensuring permanent access to important digital materials in health and biomedicine, including electronic journals, databases, documents published on the Web, and new kinds of scholarly communication and documentation of knowledge, using NLM’s own electronic output and services as initial test beds
- Work with national libraries and other appropriate organizations to develop, test, and implement standards and strategies for permanent access to electronic information.
To achieve these goals, NLM decided that the first step should be to look at our own electronic publications and decide how best to ensure access to those we consider to be of lasting value. We thought that having a model for our own publications would prove useful in future discussions of permanence issues with publishers of other electronic biomedical research information. We also hoped that by developing a system for communicating NLM’s commitment to permanence for some of our electronic publications, we could contribute to other efforts in the library community to develop preservation metadata and archiving strategies.
The Working Group on Permanence of NLM’s Electronic Information (WGP) was asked to address a user’s need to know whether a resource he or she creates, uses, or cites will remain available, unchanged, and in the same location the next time it is needed. The WGP’s specific charge was:
To examine the range of electronic information produced by NLM and develop recommendations for:
- levels of permanence suitable for different categories of NLM information
- methods of recording and communicating permanence levels
- procedures for ensuring that permanence levels are implemented.
After a year of deliberations, the WGP proposed the following rating system:
Identifier Validity (IV)
- Transient
- Guaranteed
Resource Availability (RA)
- No guarantee
- Permanently Available
Content Invariance (CI)
- Dynamic
- Stable
- Unchanging
The three main concepts underlying this system are:
Identifier Validity: the extent to which a resource’s identifier will remain the same over time and retrieve the same resource
Resource Availability: the degree to which users can be assured that a given resource will be there the next time it is needed. A rating of Permanently Available communicates a commitment that NLM will archive the resource.
Content Invariance: the degree to which the content of an electronic resource could change. Under this category there are three possibilities:
Dynamic content can change at any time. It could be heavily revised or completely replaced.
Stable content means that the user can expect only minor additions or corrections to the content.
Unchanging content means that content will remain the same over time.
Growing and Closed are terms used for aggregate resources such as databases or newsletters to indicate that new items are still being added or that the resource is no longer growing.
Under the initially proposed system, ratings were coded. For example, NLM’s commitment to archive a resource would be expressed by a rating of:
IV:2 (Identifier Validity Guaranteed)
RA:2 (Resource Permanently Available)
CI:3 (Content Unchanging)
Because it was thought difficult for NLM staff who create the resources to remember the codes and for users of the resources to understand them, the following system of simplified ratings was developed to express the concepts of the initial system in a condensed way:
Permanent: Unchanging Content
Permanent: Stable Content
Permanent: Dynamic Content
Permanence Not Guaranteed
The first three ratings are for permanent resources. This means that their identifiers will not change and the resources will always remain available. For example, a piece of correspondence in NLM’s Profiles in Science digital library collection would be rated Permanent: Unchanging Content because it is a scanned image of an original paper document and its content will not change. A bibliographic record in NLM’s MEDLINE database would be rated Permanent: Stable Content because it is subject only to correction and minor additions. The NLM Home Page would be rated Permanent: Dynamic Content because it is a permanent resource with content that changes frequently. Permanence Not Guaranteed would be assigned to resources such as training manuals, which could disappear from the Web or be given different identifiers over time.
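The relationship between the coded ratings and the simplified ratings can be sketched in a few lines of Python. This is only an illustration, not part of the WGP proposal. The text gives the codes IV:2, RA:2, and CI:3; the remaining numeric codes are assumed here from the order in which the values are listed, and the function name `simplified_rating` is hypothetical. The sketch also assumes that any resource lacking either a guaranteed identifier or permanent availability falls under Permanence Not Guaranteed.

```python
from enum import IntEnum


class IdentifierValidity(IntEnum):
    TRANSIENT = 1   # assumed code; only IV:2 appears in the text
    GUARANTEED = 2


class ResourceAvailability(IntEnum):
    NO_GUARANTEE = 1   # assumed code; only RA:2 appears in the text
    PERMANENTLY_AVAILABLE = 2


class ContentInvariance(IntEnum):
    DYNAMIC = 1   # assumed code
    STABLE = 2    # assumed code
    UNCHANGING = 3


_CONTENT_LABELS = {
    ContentInvariance.DYNAMIC: "Dynamic Content",
    ContentInvariance.STABLE: "Stable Content",
    ContentInvariance.UNCHANGING: "Unchanging Content",
}


def simplified_rating(iv, ra, ci):
    """Collapse the three coded ratings into one simplified rating."""
    permanent = (
        iv == IdentifierValidity.GUARANTEED
        and ra == ResourceAvailability.PERMANENTLY_AVAILABLE
    )
    if not permanent:
        # Assumption: without both guarantees, content invariance is not reported.
        return "Permanence Not Guaranteed"
    return f"Permanent: {_CONTENT_LABELS[ContentInvariance(ci)]}"


# The archived-resource example from the text: IV:2, RA:2, CI:3.
print(simplified_rating(2, 2, 3))  # prints "Permanent: Unchanging Content"
```

Under these assumptions the four simplified ratings cover every combination of the coded values, which is what allows the condensed labels to stand in for the codes.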
The Working Group on Permanence also looked at the major resources that NLM has available on the Web and developed a list of resource categories. We wanted to see whether it would be possible to assign default ratings to categories of resources so that it would not be necessary to assign ratings to each resource individually. We found that this could be done for many categories. In categories such as databases and digital library collections, however, the variations among the resources are such that they will need to be rated one by one. The following are a few examples from the list:
| Resource Category | Default Rating |
| --- | --- |
| Announcements | Permanence Not Guaranteed |
| Application Forms | Permanence Not Guaranteed |
| Bibliographies | [No Default Rating] |
| Clinical Alerts | Permanent: Unchanging Content |
| Databases | [No Default Rating] |
| Database Records | Permanent: Stable Content |
| Digital Library Images | Permanent: Unchanging Content |
| Digital Library Collections | [No Default Rating] |
| Exhibitions | Permanent: Stable Content |
NLM staff who create the resources would select resource categories from a drop-down menu based on the list on the left. Default ratings from the list on the right would be assigned automatically for many resource categories. Default ratings can be changed. Major online exhibitions, for example, would automatically be given a default rating of Permanent: Stable Content, but that rating could be overridden if the Library had no intention of retaining a particular exhibition for the long term. To help staff use the rating system, the Working Group drafted guidelines for assigning ratings to each of the resource categories.
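The default-and-override behavior described above might look like the following minimal sketch. The category names and default ratings are taken from the example list; the function `assign_rating`, its `override` parameter, and the choice to raise an error for categories with no default are assumptions made for illustration and do not describe an actual NLM system.

```python
# Default ratings from the example list; None marks "[No Default Rating]".
DEFAULT_RATINGS = {
    "Announcements": "Permanence Not Guaranteed",
    "Application Forms": "Permanence Not Guaranteed",
    "Bibliographies": None,
    "Clinical Alerts": "Permanent: Unchanging Content",
    "Databases": None,
    "Database Records": "Permanent: Stable Content",
    "Digital Library Images": "Permanent: Unchanging Content",
    "Digital Library Collections": None,
    "Exhibitions": "Permanent: Stable Content",
}


def assign_rating(category, override=None):
    """Return the rating for a resource in the given category.

    An explicit override always wins. Otherwise the category default is
    used, and categories without a default must be rated one by one.
    """
    if category not in DEFAULT_RATINGS:
        raise KeyError(f"unknown resource category: {category}")
    if override is not None:
        return override
    default = DEFAULT_RATINGS[category]
    if default is None:
        raise ValueError(f"{category} has no default rating; rate individually")
    return default


# An exhibition receives the default unless staff override it.
print(assign_rating("Exhibitions"))
print(assign_rating("Exhibitions", override="Permanence Not Guaranteed"))
```

Raising an error for categories such as Databases is one way to make sure those resources are rated individually rather than silently left unrated.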
The following is an example of a preliminary metadata record for an NLM electronic resource:
| Title: | Breath of Life |
| Subject.Keyword: | Asthma |
| Publisher: | National Library of Medicine |
| Date.Issued: | 1999-03 |
| Identifier.URL: | http://www.nlm.nih.gov/hmd/breath/breathhome.html |
| Identifier.NLMID: | 06148 |
| Permanence.Level: | Permanent.StableContent/Closed |
| Permanence.Guarantor: | National Library of Medicine |
| Contact.E-mail: | custserv@nlm.nih.gov |
| Contact.Affiliation: | History of Medicine Division |
| Language: | EN |
| Type: | Exhibitions |
| Rights: | Public Domain |
Some of the fields (e.g., unique identifier, publisher, date) would be entered automatically. Others would be entered by the creators of the resources. In this example, Permanence Status: Provisional indicates that the record has not yet been reviewed and authorized by NLM. The element Permanence.Guarantor is important because the agency responsible for archiving a resource could be different from the agency listed as publisher. It is essential that NLM be able to notify users and other libraries of its commitment to providing permanent access to specific resources.
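To show how software might read the Permanence.Level element, the sketch below parses the one value shown in the example record, `Permanent.StableContent/Closed`. The general form assumed here (a rating, optionally followed by a slash and Growing or Closed) is inferred from that single example and is not a published specification. The class `PermanenceLevel` and the function `parse_permanence_level` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

AGGREGATE_STATES = ("Growing", "Closed")


@dataclass(frozen=True)
class PermanenceLevel:
    permanent: bool                  # True when the value starts with "Permanent"
    content: Optional[str]           # e.g. "StableContent"; None if absent
    aggregate_state: Optional[str]   # "Growing", "Closed", or None


def parse_permanence_level(value: str) -> PermanenceLevel:
    """Split a Permanence.Level value into its assumed components."""
    rating, _, state = value.strip().partition("/")
    if state and state not in AGGREGATE_STATES:
        raise ValueError(f"unexpected aggregate state: {state!r}")
    head, _, content = rating.partition(".")
    return PermanenceLevel(
        permanent=(head == "Permanent"),
        content=content or None,
        aggregate_state=state or None,
    )


level = parse_permanence_level("Permanent.StableContent/Closed")
print(level.permanent, level.content, level.aggregate_state)
# prints "True StableContent Closed"
```

How a rating of Permanence Not Guaranteed would be encoded in this element is not shown in the example record, so the sketch simply reports such values as not permanent.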
For resources that are rated Permanent, preliminary records will be upgraded by NLM’s Cataloging Section staff and will appear in the online catalog as MARC records.
To ensure that the rating system is implemented consistently across the Library and that it continues to operate over time, the WGP recommended that permanence coordinators be assigned from each major program area that issues electronic resources on the Web. The coordinators would keep staff informed about the rating system and any changes that occur. They would review ratings that have been assigned in their respective program areas to make sure that staff are using the system properly. As needed they would meet with permanence coordinators from other program areas to resolve questions that arise, suggest changes to the system, and generally ensure that ratings are being applied consistently across the Library. Periodically the permanence coordinators would issue reports to the NLM Director on how well the system is working. The WGP also recommended that the Coordinator’s Group always include a representative from the NLM Archive so that due consideration would be given to the importance of each resource to the history of NLM.
The WGP’s recommendations for further action concern the systems work needed to implement permanence ratings. These include:
- specifications for the format and location of the ratings and identifiers that will be used for all resources that have been rated Permanent
- a set of applications to assist in recording and maintaining the ratings and for linking to each resource from its unique identifier regardless of where the resource is located
- a prototype system
- guidelines for managing all of the NLM servers on which permanent resources are stored.
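The second item above, linking to a resource from its unique identifier regardless of where the resource is located, amounts to keeping a lookup table between stable identifiers and current locations. The following is a minimal sketch of that idea only. The class `IdentifierResolver` and its methods are hypothetical, and the relocated URL is invented purely for the example; the identifier and original URL come from the sample metadata record.

```python
class IdentifierResolver:
    """Maps permanent identifiers to the current location of each resource."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, url):
        # An identifier is assigned once and never reused for another resource.
        if identifier in self._locations:
            raise ValueError(f"identifier already registered: {identifier}")
        self._locations[identifier] = url

    def relocate(self, identifier, new_url):
        # Moving a resource changes only its location, never its identifier.
        if identifier not in self._locations:
            raise KeyError(f"unknown identifier: {identifier}")
        self._locations[identifier] = new_url

    def resolve(self, identifier):
        return self._locations[identifier]


resolver = IdentifierResolver()
resolver.register("06148", "http://www.nlm.nih.gov/hmd/breath/breathhome.html")

# Hypothetical move to another server; the new URL is illustrative only.
resolver.relocate("06148", "http://archive.example.org/hmd/breath/breathhome.html")
print(resolver.resolve("06148"))
```

Because citations would carry the identifier rather than the URL, a move between servers would require only one update to the table.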
Work on the systems development phase of the project will begin in January 2001.
Information on future developments may be obtained by contacting:
Margaret Byrnes
Head, Preservation and Collection Management Section
National Library of Medicine
Margaret_Byrnes@nlm.nih.gov