Contents of: Volume 10, Number 2 ISSN 1093-5371
  Feature Article 1: Six Lessons Learned: An (Early) ARTstor Retrospective  
  Feature Article 2: MIC (Moving Image Collections)  
  Highlighted Web Site: Five Blogs  
  FAQ: You've Got Mail—Now What? Regulatory & Policy Dilemmas in Email Management
Part I: US Federal Environment
 
  Calendar of Events  
  Announcements  
  RLG News: RLG Launches Web Archiving Program  
  Publishing Information  
Feature Article 1

Six Lessons Learned: An (Early) ARTstor Retrospective

Author: Max Marmor - ARTstor (Max.Marmor@artstor.org)


ARTstor is a digital library of images intended for educational and scholarly use. Founded by The Andrew W. Mellon Foundation, ARTstor became an independent not-for-profit organization in January 2004. ARTstor started licensing its library to US institutions of higher education in April of that year. We welcome this opportunity to look back at ARTstor’s early development as we mark two major milestones: the passage of ARTstor’s second birthday and the number of institutions participating in ARTstor having recently surpassed 500. We are especially grateful to RLG for this invitation to step back and reflect a bit since, like others active in this arena, we are usually fully occupied navigating the swirling waters—and the occasional rapids—of this complicated and swiftly changing landscape.

In this “ARTstor Retrospective,” we look back at what we have learned, focusing on six lessons we believe will be of particular interest to others engaged in similar work. We think these key lessons fall into the following categories:

  1. the feasibility and importance of building a campus-wide resource that engages users across a range of disciplines without being balkanized into narrow, discipline-specific collections;
  2. the importance, when it comes to digital images, of providing tools for teaching and research;
  3. the ramifications of such a resource for “buy vs. build” decisions on the part of libraries and other campus entities;
  4. the trade-offs entailed by building valued, “user-driven” collections while also striving to accommodate a strong interest in interoperability with other collections and services;
  5. the (perhaps unique) complexities surrounding contemporary art;
  6. the challenge represented by the lack of appropriate assessment metrics for online resources that support both research and classroom teaching.

Six Lessons Learned

Lesson One: The feasibility—and importance—of building a campus-wide resource that engages users across a range of disciplines without being balkanized into narrow discipline-specific collections

The “ART” in “ARTstor” requires some unpacking if our name is not to mislead. For despite its name, ARTstor was conceived from the outset as what participating libraries have encouraged us to call a “campus-wide” resource. ARTstor is intended to provide for teachers, students, and scholars throughout the arts, humanities, and social sciences (and indeed even beyond their boundaries) the kind of image library, and associated services, that have traditionally existed only in academic slide libraries serving departments of art history and related programs. We know from long experience that most academic programs outside the arts have needed, but notoriously lacked, such substantial image resources, and ARTstor is intended to redress that imbalance and to break down the organizational barriers that created and continue to sustain it. We have recently learned, for example, that two structural engineering faculty at Johns Hopkins University are using ARTstor to teach about bridge construction. This strikes us as opening the door to exciting possibilities that traditional visual resources collections have rarely had reason or opportunity to explore before locking the doors at night.[1]

At the same time, ARTstor is born out of the conviction that the image needs of scholars, teachers, and students across the arts and humanities converge and overlap in important ways. Consequently, while the needs of individuals in any given discipline may sometimes be unique to that field, we believe there is a vast underlying body of art and visual culture—in short, of images—that engages the interest of humanists of all stamps, and indeed of scientists like the engineers just cited. ARTstor’s goal has been to begin to define these shared needs, to address them in developing digital collections, and simultaneously to begin building out from this shared core of content to respond to the more specialized needs of both art historians and scholars in other fields, who are engaged with images in their teaching and research.

We believe the value of this approach to building the ARTstor Digital Library has been affirmed over the first two years of ARTstor’s existence, especially as we learn how many users from different fields have come to regard ARTstor as an essential source of images for teaching and learning.

Lesson Two: The importance, when it comes to digital images, of providing tools for teaching and research

Users want to do things with digital images. “Read only” is not enough. They want to assemble images, often in huge numbers, to shuffle and re-shuffle them into unpredictable and unanticipated permutations, to sift and filter them in sometimes indiscernible ways, and then to actively use them in teaching, learning, and research. All of these activities demand tailored, “bespoke” software tools. At ARTstor, we concluded early on that we ourselves needed to provide such tools for our users, both to support appropriate uses and simultaneously to help address the concerns of content owners about potentially inappropriate uses. And we concluded rather reluctantly that we had to build these tools ourselves, since we felt that those available elsewhere were either too elementary to support this range of activities or not intuitive and “usable” enough to engage a wide range of users, particularly as many of those users were unfamiliar with or even averse to learning new ways of doing familiar things. But we also believed that it was in the best interest of our users if we developed our own tools, since only in that way could ARTstor be maximally responsive to user needs in an ongoing way.

A case in point is our “offline image viewer.” Conceived as an alternative to PowerPoint and other commercial presentation software, it has been designed specifically to support the needs of image users in higher education and museums. The “OIV” in its latest version (2.5) has proved so successful that we will shortly release a basic freeware version to the larger community.

And so a second lesson we have learned is that digital image users do indeed need appropriate, full-featured but intuitive software tools. And if one is to be responsive to user needs on this front as on others, it is best to manage software development autonomously—and where feasible and appropriate, to make one’s tools available for use with other resources as well.

Lesson Three: The ramifications for “buy vs. build” decisions on the part of libraries and other campus entities

The research library’s instinct is—appropriately—to build and to steward local collections. Despite decades of efforts in collaborative collection development and innovative initiatives in the area of interlibrary loan services—and despite the degree to which libraries now invest in, and their users depend fundamentally upon, licensed electronic resources—the urge to “own” rather than merely to “access” resides deep within this community. Unsurprisingly, that instinct is also in evidence when it comes to addressing the increasing need for digital image resources. This natural bent of the research library is reinforced by the traditional approach in visual resources collections—slide libraries and photographic archives in institutions large and small. Slide curators and the faculty they serve are accustomed to (literally or at least meaningfully) owning images, with all that implies for security, trust, and ease of ongoing access. Visual resources collections and visual resources professionals are now making the same transition from “ownership” to “access” that libraries have made so successfully in recent decades and with the same coupling of anxiety and excitement.

ARTstor is intended to enable and foster that transition by providing both a very large core body of digital images capable of supporting shared curricula in the arts and humanities and, increasingly, the kind of deep “special collections” of primary resources that alone can turn an instructional resource into an online information resource capable of supporting and fostering advanced research and scholarship. This transition clearly has implications for libraries as well as visual resources curators and image users. What is the library’s role in this development? When should image collections be developed and managed locally and when should they be licensed? Should libraries or other campus entities redundantly invest in sustaining digital image archives? Is it more cost-effective and scalable to invest in local infrastructure (and staffing) for digitizing, cataloging, archiving, and supporting the use of digital images—or to depend upon a trusted third party for many of these activities and services? These are all questions that ARTstor poses, sometimes implicitly (simply by building collections meant to be of broad value) and sometimes explicitly (as through our nascent hosting service through which we “host” image collections on behalf of participating institutions). These questions implicate libraries—and library budgets—in the areas of collections, technology, and services. And indeed, since ARTstor is both a library resource and an educational technology service, these questions have larger implications for academic programs and those who plan and administer services to them.   

And so our third lesson is that libraries and librarians—and other campus officers as well as end users—should and in fact do take into account these challenging questions in assessing the value of a service like ARTstor. At the same time, we recognize that the entire community is still seeking to define the right balance between locally supported and remotely licensed content, tools, and services. We are working to evolve with the community that we serve, and that is one of the many important benefits of undertaking this pioneering journey as a non-profit institution. We have no wish to convince the community of something so that we can cash in on financial rewards; we are here to serve the community as myriad individual institutions find their way through this complicated and shifting terrain.

Lesson Four: The trade-offs entailed by building valued, “user-driven” collections while also striving to accommodate a strong interest in interoperability with other collections and services

ARTstor means to be a bridge between the international community of content owners (archives, libraries, museums, photographers, et al.) and the community of end users, in higher education especially, but also in museums and other cultural organizations. That these two communities do not always see eye to eye when it comes to the educational and scholarly use of digital resources is well known and was amply demonstrated by the CONFU process some years ago.[2] Our goal has been to work with the community of content owners, on an international scale from Berkeley to Beijing, to build a digital image library that will be highly valued by scholars and teachers and, indeed, shaped fundamentally by the needs of scholars and teachers, while also advancing the missions of collecting institutions.

This has entailed compromises at both ends of this spectrum. Put metaphorically, we have concluded that this essential “bridge” must in important respects remain a “covered bridge”—at least for now. By that we mean that in order to balance the concerns, interests, and needs of content owners with those of end users, we have felt obliged to create a secure network on the Internet, within which digital content can be used in appropriate ways by educators and scholars, and without, for the most part, allowing that content to be removed from the digital library for use in other environments. We have, in short, wrapped ARTstor content in the ARTstor software. And we have thereby placed real limits on our ability to interoperate with other systems and services. We have taken this approach for two reasons: First, we believe that this is the only way we can build the kind of valued collections our users say they most want from a service like ARTstor; and second, we believe it is important to keep these two communities in dialogue—a mission-driven goal we would jeopardize if we fully accommodated the interoperability interest some institutions and individuals have expressed.

Having said this, we have also learned that there is much we can do to address these conflicting interests in appropriate ways. Part of our rationale in building software tools and in developing a hosting service and a personal collections tool, for example, has been to enable the individual end user to easily integrate their own images (whether personal or institutional) with ARTstor images, both offline and online. Similarly, we are now offering a growing range of interoperability solutions, beginning with federated searching via a recently released ARTstor XML Gateway.[3]

While content owners and end users will continue to face points where they might not agree completely, we are pleased that the “covered bridge” we have erected is now providing for some early passage—and that it is becoming a two-way street. As we continue to establish relationships with archives, libraries, museums, and other collection development partners here and abroad, we are beginning to ease some of the restrictions on content in ARTstor. We expect, for example, to allow for significantly larger downloads of some ARTstor images later this year, especially for use in teaching. We also anticipate working with supportive content owners to enable fuller use of ARTstor images in the larger context of scholarly communications.

Lesson Five: The (perhaps unique) complexities surrounding contemporary art

Digital images rarely travel unaccompanied by rights issues, issues that are frequently ambiguous, invariably complex, often contested, and always exigent. These issues cannot be, and will not be, ignored. And these issues are shape-shifters, appearing differently from one context to the next and one country to the next.

This already complicated picture becomes still more ambiguous, complex, and contested when the underlying work of art is still under copyright. This is, of course, the case with contemporary art and much of 20th century art as well.

ARTstor is making great efforts to provide its users with a substantial body of modern and contemporary art images; we are, for example, just launching a project to digitize more than 100,000 images of works of contemporary art shown in galleries throughout New York City in the last third of the 20th century. In some instances we pursue such projects relying in part on fair use, an exception in US copyright law. But we are also making great efforts to reach out to artists, estates, and artists’ rights organizations. And because we are committed to making the ARTstor Digital Library available internationally (it is now available in the US and Canada, and pilots are underway in the UK and Australia/New Zealand), we want to secure a firm foundation for providing access even in countries where there are no traditions comparable to fair use.

What lessons have we learned from these explorations? The obvious lesson is that this is, first and foremost, a profoundly complicated terrain, especially in the international arena, with many risks to be managed on the part of all concerned. We have also learned that artists and those who represent them care about the noncommercial, educational, and scholarly use of digital art images and that these individuals and institutions will often lend their support to the effort to facilitate such uses. And finally, we have learned that understandings can be reached that address mutual concerns without jeopardizing or compromising fair use—for ARTstor and its users or for others active in this community. The perspective we have adopted on this set of issues is a long-term one. We believe that the community is determining today whether the effort to define and enable educational use will be a collaborative effort or a confrontational one, and we are doing what we can to try to keep the exploration collaborative.

Lesson Six: The challenge represented by the lack of appropriate assessment metrics for online resources that support both research and teaching

Finally, as ARTstor prepares to enter its third year as a “live” digital library with expanding collections and services, we often ask ourselves how well we are performing. Are we building the “right” kind of collections? Do our software tools lend themselves to the uses we hope to support? How is ARTstor being used, by whom, and in what ways? And of course libraries must ask themselves—and us—these questions as well. One thing we have discovered is that conventional metrics for measuring the use and value of online resources are less helpful than one might hope or expect. We attribute this to the fact that ARTstor is not a typical online information resource. It is both a reference/research tool and a tool that enables course support via shared image groups and classroom presentations. It is an educational technology, providing tools supporting classroom applications (lecture preparation and presentation), as well as student study (there are definite spikes in usage during midterms and finals!), and other pedagogical uses. It lends itself to integration with learning management systems due to the provision of stable URLs for all ARTstor images as well as for all “image groups” created by users. And above all, ARTstor exists both as an online resource among many others and as a personalized resource that has both online and offline versions. Thus ARTstor users perform a range of activities that are difficult to track and difficult to assess and interpret and that—above all—bear little resemblance to the ways in which the vast majority of online information resources are used.

And thus our final “lesson learned” is that we—as a community—need better ways to assess the use and value of electronic resources that are conceived to pioneer new paradigms for teaching and learning.

We are still learning further lessons, both by getting some things right and by making mistakes, and we look forward to hearing from, and learning with, all those who have a stake in how this fascinating and important exploration plays out.

Notes:
[1] For further discussion of this see B. Rockenbach and M. Marmor, “ARTstor’s Digital Landscape,” Library Journal, July 15, 2005.
[2] See http://www.arl.org/info/frn/copy/confu.html.
[3] See B. Rockenbach and W. Ying, “ARTstor: Enabling Cross-Resource Communication,” Library Hi Tech News 22/9 (2005): 21-23.


Feature Article 2

MIC (Moving Image Collections)

Author: Jane D. Johnson - Library of Congress (jjohnson@loc.gov)

An Innovative Partnership

MIC (Moving Image Collections, pronounced “mike”) is the product of an innovative partnership between the Library of Congress (LC) and the Association of Moving Image Archivists (AMIA).[1] Emerging from the National Moving Image Preservation Plans,[2] MIC began as a preservation initiative. MIC has successfully evolved to demonstrate that the practical requirements of preserving analog artifacts can provide the foundation for an advanced R&D platform. MIC reaches an audience beyond archivists by exploiting the most current developments in non-textual indexing, digital rights management, and educational use, while still meeting the daily needs of archivists with informational resources and support for collaborative preservation, access, digitization, exhibition, and metadata initiatives. MIC has served as a model and building block for similar initiatives, including the Women Artists Archives National Directory (WAAND)[3] and the New Jersey Digital Highway.[4] Grace Agnew, Associate University Librarian for Digital Library Systems for Rutgers University Libraries, is MIC’s architect. Rutgers University Libraries is the lead developer, working with the University of Washington and Georgia Institute of Technology. Project management is provided by the Library of Congress, which will serve as the permanent host site.

The MIC website integrates a Union Catalog, an Archive Directory, and informational resources within a portal structure that delivers customized information on archival moving images, their preservation, and the images themselves. The Archive Directory was originally envisioned as a complement to the Union Catalog, providing collection descriptions at the repository level. Within the Union Catalog, users are able to limit searches to digital video that can be downloaded or streamed. A custom mapping utility facilitates participation in MIC’s Union Catalog by allowing other archives to map their own local metadata schema into the MIC core registry for import. A cataloging utility is in the works that will enable archives to input records directly into the Union Catalog—records incorporating not just descriptive metadata, but all the types of metadata required to manage a resource through its entire lifecycle.
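The mapping step described above can be pictured as a field-by-field crosswalk from a local schema into a shared core record. The following is only a minimal sketch under stated assumptions: the field names, the CROSSWALK table, and the map_record function are hypothetical illustrations and are not taken from MIC's actual mapping utility or core registry.

```python
# Hypothetical crosswalk: local field name -> core field name.
# None of these names come from MIC itself; they only illustrate the idea.
CROSSWALK = {
    "film_title": "title",
    "made_in": "date",
    "reel_format": "format",
    "held_by": "repository",
}


def map_record(local_record, crosswalk=CROSSWALK):
    """Map one local record into core fields.

    Returns a (core, unmapped) pair. Core values are lists so that several
    local fields may feed the same core field. Fields with no mapping are
    returned separately for manual review instead of being dropped.
    """
    core, unmapped = {}, {}
    for field, value in local_record.items():
        target = crosswalk.get(field)
        if target is None:
            unmapped[field] = value
        else:
            core.setdefault(target, []).append(value)
    return core, unmapped


local = {"film_title": "Harbor Newsreel", "made_in": "1948", "shelf": "B-12"}
core, unmapped = map_record(local)
print(core)
print(unmapped)
```

Keeping unmapped fields visible, rather than discarding them, is one plausible design choice for an archive that wants to refine its mapping over time.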

A Tool for Preservation

Beginning in 1994, the Library of Congress published two documents detailing the crisis in film preservation. Redefining Film Preservation (1994), mandated by the US Congress as part of the National Film Preservation Act of 1992, was the first national moving image preservation plan. Three years later it was complemented by a follow-up plan entitled Television and Video Preservation 1997. MIC began when the Library of Congress turned to AMIA for assistance in developing strategies to implement the numerous recommendations included in the two preservation plans. AMIA identified the first and most crucial step in any preservation solution: a standardized way to identify holdings, particularly unique titles, so that strategic planning and collaborative decision making could occur.

Through MIC’s Union Catalog, archivists can identify past preservation work and emerging critical need, thereby reducing duplication of effort and preventing loss through deterioration, while ensuring that titles are preserved from the best surviving footage. On the public side, MIC seeks to raise awareness about preservation issues and risks to our film, television, and video heritage by enlightening readers about methods to care for home collections, the role of archives, and the preservation process. MIC’s informational resources, numbering in the hundreds, have been created and gathered by experts within AMIA, working in accordance with that organization’s educational mission. The Library of Congress’ role is to provide the management and technical infrastructure for MIC’s long-term maintenance and ongoing development. This distinctive partnership between a professional association and the National Library maximizes the strengths of both organizations to make a total contribution to the field beyond the sum of its parts.

MIC is committed to helping underfunded archives with tools and standards. Any organization holding archival moving images can participate in MIC. What are “archival moving images”? Moving images can be film, video, or digital files. Audio recordings associated with moving images, such as soundtracks, are also within MIC’s scope. Archival moving images are defined as those intended to be kept for future generations, regardless of their age at the time of acquisition. MIC is documenting moving images all over the world, whether held by organizations or individuals, and wherever they reside: in corporations, museums, historical societies, and motion picture studios.

That said, MIC was meant to be extensible. In fact, the acronym was chosen to also stand for “Media In Collections,” anticipating the future inclusion of sound recordings of all kinds. MIC’s architecture has also been carefully designed to allow expansion from a catalog and directory to a cataloging utility with full asset and digital rights management implementations as well as low-level indexing capability. It can be viewed as a prototype for all types of materials.

A Tool for Education

MIC was never envisioned solely as a tool for archivists. Early on, we knew that access for the public and for educators was the key to a sustainable preservation strategy. MIC’s mission is to immerse moving images into the education mainstream, recognizing that what society uses, it values; what it values, it preserves. This mission is addressed in three ways: by providing access to more moving images, by enabling greater processing of those images, and by facilitating digital rights management to expand use of moving images, including those outside the public domain.

Very recently Google announced its plans to partner with the National Archives and Records Administration (NARA) to make NARA’s public domain film holdings available on the Web for free.[5] For moving images, this ready browsability alone is a huge advance over traditional means of access. Nonetheless, some have criticized Google for its non-disclosure policy and speculated that Google is starting, at least, with the “low-hanging fruit” of video (as opposed to digitized film).[6] Often Google relies on extant text (such as closed captions) for retrieval.[7] MIC also allows users to search across numerous archives and limit results to digital video. In contrast to Google, however, MIC exploits the metadata that libraries are already creating for their own patrons, for all materials. This substantially increases search precision and leverages the labor that libraries will continue to employ. MIC allows users to search all moving images in its catalog, whether digitized or not, putting these resources at the fingertips of educators and students. By aggregating records from many moving image collections and by providing access to moving images exclusively, MIC provides services beyond those traditionally provided by libraries and archives.

As Don Waters has pointed out, digital materials can provide greater functionality for teaching and research than legacy analog resources.[8] “What unites our interest in digitization and open access in a digital world is that the material becomes ‘processable,’ or subject to computational processing.”[9] The Union Catalog’s MPEG-7 export capability enables segmenting and low-level indexing to facilitate this type of processing for indexing and creation of learning objects.

MIC can also advance educational use of moving images by making full digital rights management capability possible. A digital rights management implementation would allow teachers and educators to incorporate moving images and sound recordings into the classroom and electronic scholarly publications. Development of standards and practices that support broad, non-infringing use of digital multimedia requires incorporation of rights metadata into MIC’s planned cataloging utility and an expansion of MIC’s LDAP directory architecture to authoritatively identify rights holders. The LDAP directory is designed to interact with both the international authority work spearheaded by the Library of Congress and International Federation of Library Associations (IFLA) and the directory-based authentication and access initiatives spearheaded by Internet2, such as Shibboleth and the emerging eduPerson directory standard. A community collaboration that engages both nonprofit and for-profit moving image archives is important for demonstrating that both communities can come together to promote non-infringing use of digital resources for education.

Finally, it’s not just teachers that license footage for educational purposes. News and documentary production, for better or for worse, is largely driven by the availability of topical footage. Increased availability of moving images and expanded licensing opportunities promise to expose the general public to new ideas and previously unknown or underreported events via televised documentaries and theatrical releases. As an added bonus, historically underfunded repositories could use this licensing mechanism to generate revenue to better support their organizational missions and preservation activities.

Many Routes to Multi-Functional Components

To provide this multi-faceted functionality, we have developed MIC incrementally. The Archive Directory was built first, followed by the Union Catalog and then the mapping utility; a cataloging utility is next. MIC is designed to be a flexible, extensible, and interoperable tool. It relies on open-source software and is easy to maintain and use. MIC offers a number of services to archives and features components with multiple functions and multiple avenues for archives to participate and collaborate.

For example, while Web directories typically serve merely as pointers to participating organizations’ websites,[10] the MIC directory can actually take users to the organization’s records in the MIC Union Catalog, or directly to the organization’s own catalog, or to its website. A feature of MIC is the integration of Archive Directory and Union Catalog databases. This integration allows end users to limit Union Catalog searches to a single archive or a selection of archives. Once a Union Catalog record is retrieved, the system queries the LDAP directory database and displays information about obtaining the resource with the bibliographic record itself. This information, specific to the organization holding the title, is pulled from the Archive Directory entry. This functionality was crucial for obtaining buy-in from the archival moving image community. Much of what is held by moving image archives is inaccessible. Film may be on negative or other pre-print stock, or it may be unique. Video materials frequently reside on obsolete formats for which viewing equipment is no longer available. In field-wide planning interviews conducted by Grace Agnew, archives rightly insisted that organization-specific access policies be prominently displayed with each record.
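The integration described above, in which a catalog search can be limited to selected archives and each retrieved record is shown together with the holding archive's access information, can be sketched roughly as follows. This is an illustration only: plain in-memory dictionaries stand in for the LDAP directory and the catalog database, and every identifier, record, and policy text here is hypothetical.

```python
# In-memory stand-ins for the Archive Directory and the Union Catalog.
# All identifiers, titles, and policy texts are invented for illustration.
DIRECTORY = {
    "arch-001": {
        "name": "Example Film Archive",
        "access_policy": "Viewing by appointment; no loans.",
    },
    "arch-002": {
        "name": "Sample Television Library",
        "access_policy": "Reference copies available on site.",
    },
}

CATALOG = [
    {"title": "Harbor Newsreel", "archive_id": "arch-001"},
    {"title": "County Fair Newsreel", "archive_id": "arch-002"},
    {"title": "Home Movies, 1962", "archive_id": "arch-001"},
]


def search(term, archive_ids=None):
    """Case-insensitive title search, optionally limited to chosen archives."""
    term = term.lower()
    hits = []
    for record in CATALOG:
        if archive_ids is not None and record["archive_id"] not in archive_ids:
            continue
        if term in record["title"].lower():
            hits.append(record)
    return hits


def display(record):
    """Return the record enriched with the holding archive's directory entry."""
    entry = DIRECTORY.get(record["archive_id"], {})
    return {
        **record,
        "held_by": entry.get("name", "unknown"),
        "access_policy": entry.get("access_policy", "Contact the holding archive."),
    }


for hit in search("newsreel", archive_ids={"arch-001"}):
    print(display(hit))
```

Looking up the access policy at display time, rather than copying it into each catalog record, means a change to an archive's directory entry is reflected in all of its records at once.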

Finally, the Archive Directory is a tool for collaboration and community building. Through the Directory input form, information is systematically gathered about the collections, services, and cataloging and preservation activities of a wide array of archives. These detailed descriptions give archivists the information they need to evaluate archival activities in similar repositories and to identify organizations with common interests. They can then utilize MIC’s portal structure to build communities for collaborative projects. The Directory also enables the Library of Congress and AMIA to identify community needs, potential collaborations, and emerging trends, in order to focus community training and support.

End users can enjoy any of several modes of access to the MIC Union Catalog. First, of course, there is the Web search from any of several MIC portals. The database can also be accessed remotely using MIC’s Z39.50 capability, and Dublin Core export supports OAI harvesting. Even within a Web search, there are multiple avenues of access, since MIC includes six separate portals, and different portals yield different results. For example, the Science Educators portal, dubbed “Science Goes to the Movies,” retrieves only moving images related to science. And while the MIC records display is consistent across portals, Archivists Portal users have the option to display records in any of a number of different schemas, including MIC XML, MARC HTML, MPEG-7 XML, and Dublin Core XML. Similarly, informational resources vary by portal, as do Archive Directory displays. The Archivists Portal, for example, is the only place where Directory entries include information about cataloging and preservation activities.
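As a rough illustration of the OAI harvesting route mentioned above, the following Python sketch builds a standard OAI-PMH ListRecords request for Dublin Core records. The base URL is a placeholder (the article does not give MIC's OAI endpoint), and the function name is invented for this example.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the article does not state MIC's actual OAI base URL.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    'verb', 'metadataPrefix', 'from', and 'set' are standard OAI-PMH
    request arguments; 'oai_dc' is the prefix for unqualified Dublin Core.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    if set_spec:
        params["set"] = set_spec    # optional selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


url = list_records_url(BASE_URL, from_date="2006-01-01")
print(url)
```

A harvester would fetch this URL and parse the XML response; that step is omitted here because it depends on the repository being harvested.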

By the same token, archives are offered several means by which they can contribute their records to the catalog. MARC records can be loaded and mapped automatically. For local metadata schemas, organizations can use the mapping utility to map and ingest records into the catalog. Direct input will be a third option once the cataloging utility is up and running.

Just as there are multiple ways into the catalog, the catalog records serve multiple purposes as well. For the first time, MIC allows users to search across multiple repositories to find current, detailed descriptions of moving images, and the images themselves. As with any large aggregation of records, searching can reveal previously unseen relationships that suggest new areas of research. Because the MIC Union Catalog contains only bibliographic descriptions of moving images, users wishing to search only moving images can do so easily. By contrast, in many large academic institutions, users need a complicated set of instructions simply to limit search results to moving images; in some major research organizations, a comprehensive search of only moving images is impossible. MIC users can search across all or selected repositories, or limit their search to moving images within a particular institution.

Finally, MIC enables research and development in emerging technologies by making a sizeable set of bibliographic records representing a cross section of the archival moving image community available in a variety of metadata schemas. Computer science researchers can partner with the library and archival community to explore low-level indexing, authority control, events-based directories, FRBR implementations, digital rights management, active privacy policies, and fair use.

A Metadata-Driven Strategy

MIC’s innovative design employs a metadata-driven strategy to simultaneously address multiple goals of expanding education, outreach, access, preservation, and research in culture and information technology. This strategy is five-fold:

1. Embrace the inherent diversity in the field.
2. Promote metadata standards.
3. Democratize digital resource management.
4. Enable exploration of new technologies.
5. Provide a model extensible to other archive and library communities.

The most salient characteristic of the archival moving image field is its diversity. Currently moving images can reside in almost any type of organization, from large national institutions like the Library of Congress to small public, private, or non-profit institutions with specialized collections. The diversity of these organizations, like the diversity of materials they hold, continues to grow as media permeates all aspects of society. Organizations can be big or small; often, critically important works are held by individuals. With differences in size come differences in available financial resources. Similarly, differences in repository missions suggest a variety of user needs. The end result is that repositories everywhere employ a huge variety of metadata schemas, many of them local or proprietary. Developing a system that would embrace these differences rather than force conformity would attract the largest possible number of participants. Since MIC’s original mission was preservation, it was important to document extant moving images wherever they might be found.

Central to the system is MIC’s Core Registry, a list of about 50 data elements for moving image description. This schema’s context-independence is the key to its accommodating the multiplicity of extant schemas. Context independence also means that the schema will address needs not just of current users, but also future users and constituencies, some of them as yet unidentified.

The MIC Core Registry is a rigorously maintained and standardized application recorded in a modified ISO/IEC 11179 registry format. Data elements from virtually any schema can be mapped to these fields for later export in any of several other standard schemas, including MARC21, Dublin Core, and MPEG-7. Other mappings in the works include PBCore, MODS, IEEE-LOM, and SMPTE.

Mapping a few core data elements across schemas is relatively simple. More difficult to achieve is a rich mapping that supports a range of user information needs, some as yet unforeseen. MIC data must also meaningfully participate in other collaborations using other metadata standards, and be extensible beyond descriptive metadata, to incorporate METS-compliant preservation, rights, and technical metadata. This range of functionality requires more careful design and effort, so that data can be exported in different schemas for different purposes.


Figure 1. MIC Mapping 

For example, MARC records allow interoperability, especially with print materials, and can be retrieved via the Z39.50 protocol. Dublin Core enables OAI harvesting for the National Science Digital Library and other consortia. MPEG-7 is one of the few metadata schemas developed specifically to describe, manage, and provide access to moving images. Unlike MARC and Dublin Core, MPEG-7 was designed to support multiple manifestations and accommodate technical and administrative metadata. Moreover, MPEG-7 supports the automatic generation of segments for storyboards and summary clips that can be combined into learning objects. By providing schemas that process the digital bit stream for low-level, non-textual, digital video indexing applications, it allows for retrieval of digital objects by properties like color, shape, and texture. Using the MIC export utility, an archive employing Dublin Core for ease of use in-house can make its records available in MARC for Z39.50 access or in MPEG-7 for low-level indexing. It has been said that “the historical antecedent for this kind of thing is the Rosetta Stone.”
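The hub-style mapping described above, in which a local schema is mapped once to a core registry and then exported in any supported schema, can be sketched in a few lines of Python. Everything in this sketch is hypothetical: the element names, the local field names, and the target tags are invented for illustration and are not MIC's actual Core Registry or its real MARC and Dublin Core maps.

```python
# Hypothetical local (in-house) field names mapped to hypothetical core elements.
LOCAL_TO_CORE = {
    "Film Title": "title",
    "Made By": "creator",
    "Year": "date",
}

# Illustrative export maps from core elements to target schemas.
CORE_TO_DC = {"title": "dc:title", "creator": "dc:creator", "date": "dc:date"}
CORE_TO_MARC = {"title": "245$a", "creator": "508$a", "date": "260$c"}


def to_core(local_record, local_to_core):
    """Translate a local record into core elements, dropping unmapped fields."""
    core = {}
    for field, value in local_record.items():
        element = local_to_core.get(field)
        if element is not None:
            core[element] = value
    return core


def export(core_record, core_to_target):
    """Re-express a core record in a target schema."""
    return {
        core_to_target[element]: value
        for element, value in core_record.items()
        if element in core_to_target
    }


# Invented sample record; "Shelf" has no core equivalent and is dropped.
local = {"Film Title": "Sample Home Movie", "Made By": "Unknown", "Year": "1948", "Shelf": "B-12"}
core = to_core(local, LOCAL_TO_CORE)
print(export(core, CORE_TO_DC))
print(export(core, CORE_TO_MARC))
```

The point of the hub design is that each archive writes one map (local to core) rather than one map per export schema; adding a new export format then requires only one new core-to-target map.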

The MIC Core Registry is detailed enough to retain the richness of extensive archival descriptions, but simple enough to provide users with readable, succinct displays of heterogeneous metadata from widely divergent source records. The MIC Core Registry supports the core user needs as defined by IFLA’s Functional Requirements for Bibliographic Records (FRBR): to find entities that correspond to the user’s stated search criteria, to identify an entity, to select an entity appropriate to the user’s needs, and to obtain a copy of a selected entity.

Once MIC’s Core Registry was established, with maps to MARC21, Dublin Core, and MPEG-7, a mechanism was needed to allow organizations with local systems to map their own schema to MIC’s schema. As recently as 1998, Footage: the Worldwide Moving Image Sourcebook documented 3,000 moving image archives in the US alone, and numbers have increased exponentially since then. To ingest millions of records into the MIC Union Catalog would require a production line-like mechanism. In the archival moving image community, a significant number of small institutions rely on commercial databases like FileMaker Pro, or even spreadsheets, to hold their descriptive metadata. For maximum participation, a self-service mapping tool that worked for just about everyone with minimal use of intermediaries was needed. Yang Yu, database programmer at Rutgers University Libraries, worked with system architect Grace Agnew and Project Manager Jane D. Johnson to design the MIC mapping utility to serve this need.

The utility is currently in beta testing and has been successfully used by several organizations to map homegrown schemas into MIC. The first step in the mapping process is submission of an online application by the archive. The 20-question form takes a few minutes to fill out and includes contact information, cataloging practices, record format, intended mode of delivery, and space for comments. If an organization uses more than one schema, it can assign a memorable name to each for reference purposes. Once the application is submitted and reviewed, the organization uploads a small set of sample records and its own list of data elements. (If the organization finds it easier to submit its entire record set, it can do that, and we will sample the records at our end.) The system then populates the online mapping form with the field list. The form leads the user through the list of MIC data elements. It asks the user to select, from a pulldown menu, the organization’s own equivalent for each MIC data element and to provide a sample value for each. At any point in the mapping, the user can opt to preview the sample values in a MIC display. Once completed, the mapping is reviewed by the MIC administrator, revisions are made as necessary, and the mapping is approved. Finally, the organization submits its full record set, the MIC administrator identifies links to digital video for indexing, and the records are ready for ingest.
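The form-filling and review steps above can be pictured with a small Python sketch. It is not the Rutgers utility's code: the element subset, field names, and function names are all assumptions made for illustration. The mapping here runs in the direction the form uses, from each MIC data element to the organization's own field.

```python
# Hypothetical subset of MIC data elements presented by the mapping form.
MIC_ELEMENTS = ["title", "creator", "date", "format"]


def review_mapping(mic_elements, mapping, field_list):
    """Return (MIC elements left unmapped, selections missing from the uploaded field list)."""
    unmapped = [e for e in mic_elements if not mapping.get(e)]
    unknown = [f for f in mapping.values() if f and f not in field_list]
    return unmapped, unknown


def preview(mapping, sample_record):
    """Show how a sample record would appear under the mapping so far."""
    return {
        element: sample_record.get(field, "")
        for element, field in mapping.items()
        if field
    }


# Invented field list and sample record, as an archive might upload them.
field_list = ["Film Title", "Made By", "Year"]
sample = {"Film Title": "Sample Home Movie", "Made By": "Unknown", "Year": "1948"}

# The user has picked equivalents for three elements and left "format" blank.
mapping = {"title": "Film Title", "creator": "Made By", "date": "Year", "format": None}

unmapped, unknown = review_mapping(MIC_ELEMENTS, mapping, field_list)
print(unmapped, unknown)
print(preview(mapping, sample))
```

A report like the one `review_mapping` returns is the kind of information an administrator would need before approving a mapping, though the article does not say how the actual review is carried out.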

The MIC mapping utility has the potential to transform and democratize cataloging. Any moving image repository, using any metadata schema, can easily map its own records for sharing globally through the MIC Union Catalog and elsewhere. Small institutions can make their holdings accessible at a low cost on the Web, comply with national and international standards, and use existing personnel and infrastructure. Larger institutions employing multiple or legacy metadata schemas can map different collections through the mapping utility and export records in a single metadata schema. Individual collectors are given a platform for making their materials available to a wider public, exposing previously unknown footage and genres to scholars, educators, and the public.

With standard mappings and the mapping utility in place, it is possible for almost any archive with machine-readable records to load records readily into the MIC Union Catalog. Still, Union Catalog records are limited largely to descriptive metadata, which is not enough for managing archival resources through their lifecycles. The next development will be the MIC cataloging utility, which will provide a front-end form for entering records directly into MIC. This form would accommodate descriptive, administrative (including preservation, technical, rights), and structural metadata. A downloadable METS-compliant cataloging utility, available from Library of Congress and MIC websites, would extend metadata for collection description, management, and access to any repository, regardless of technical readiness or cataloging expertise. This product would leverage LC-supported standards for description (MODS) and digital preservation (PREMIS), but go further to address the integrated management of legacy analog source materials and digital resources and rights for access to digital video resources. Rutgers has developed the bulk of this utility as its Workflow Management System. The Rutgers University Libraries is a recognized leader and innovator in the development of digital library technologies, particularly in the development of repository architectures and services with a dual focus on the long-term preservation and discovery of digital resources.

Although MIC is dedicated to accommodating local and proprietary schemas, one of MIC’s core missions is to promote metadata standards, reflecting the Library of Congress’ leadership role in this area. MIC is committed to open source, standards-based interoperability protocols. By easily ingesting records in standard formats and providing homogeneous displays from widely divergent collections, MIC illustrates by example the value of standards. Beyond that, MIC’s informational resources promote standards use by educating archivists in their use.

Indeed, MIC’s informational resources were originally conceived to educate archivists in the use of cataloging standards and practices. The scope of MIC’s education and outreach space has now expanded to include preservation, exhibition, and collection management information, as well as resources for the general public. In the General Users Portal, links to popular sites such as Internet Movie Database and Rotten Tomatoes are intended to draw people to the MIC site, where they can learn about preserving their own home movies or find out how they can contribute to public moving image preservation efforts.

Conclusion

MIC is a collaborative effort to promote discovery, preservation, and educational use of moving image materials. By acknowledging the diversity in the field and building an innovative and extensible platform to accommodate it, MIC provides broad and open access to asset management, promotes standards, and supports collaborative preservation. Building a preservation infrastructure on an organization-by-organization basis is not practical; collective action is required. MIC’s metadata strategy provides an exemplar in its dual commitment to widespread access and collaborative preservation of moving image resources, in both analog and digital form. Its impact is demonstrated by impressive statistics of use. As of March 2006, the site has had 2.8 million total hits, currently averaging over 3,000 hits per day. There have been 323,000 total visitors from 92,000 unique IP addresses. The site averages over 350 visitors per day.

Notes:
[1] The Association of Moving Image Archivists (AMIA) is a non-profit professional association established to advance the field of moving image archiving by fostering cooperation among individuals and organizations concerned with the acquisition, preservation, exhibition, and use of moving image materials. The AMIA website may be found at: http://amianet.org/.
[2] Melville, Annette and Scott Simmon.  Redefining Film Preservation, a National Plan: Recommendations of the Librarian of Congress in Consultation with the National Film Preservation Board. Washington, D.C.: Library of Congress, 1997.  Murphy, William.  Television and Video Preservation 1997. Washington, D.C.: Library of Congress, 1997.  Copies of the plans are available at: http://www.loc.gov/film/filmpres.html.
[3] The Women Artists Archives National Directory, or WAAND, is a Web directory to US archival collections holding primary source materials by and about women visual artists active in the U.S. since 1945. The WAAND website may be found at: http://waand.rutgers.edu.
[4] The New Jersey Digital Highway is a portal providing access to the collections of several New Jersey cultural heritage institutions.  The website may be found at: http://www.njdigitalhighway.org.
[5] “National Archives and Google Launch Pilot Project to Digitize and Offer Historic Films Online” (press release posted on Business Wire, February 24, 2006); Borland, John, “Google puts National Archives video online,”  ZDNet News, February 24, 2006.  
[6] See, for example, Rick Prelinger’s February 24, 2006 posting on AMIA-L:  http://lsv.uky.edu/scripts/wa.exe?A1=ind0602&L=amia-l.
[7] Olsen, Stefanie, “Coming soon: Google TV?” ZDNet Australia, November 30, 2004.
[8] Waters, Donald J., “Managing Digital Assets in Higher Education: an Overview of Strategic Issues,” Managing Digital Assets: Strategic Issues for Research Libraries, October 28, 2005:  Forum Proceedings, p. 9.
[9] Joseph J. Esposito, cited in Waters, Donald J., “Managing Digital Assets in Higher Education: an Overview of Strategic Issues,” Managing Digital Assets: Strategic Issues for Research Libraries, October 28, 2005:  Forum Proceedings, p. 14.
[10] See, for example, the UNESCO Archives Portal at: http://portal.unesco.org/ci/en/ev.php-URL_ID=5761&URL_DO=DO_TOPIC&URL_SECTION=201.html.


 Highlighted Web Site

Five Blogs




RLG DigiNews highlighted David Mattison’s informative Ten Thousand Year Blog back in August 2004. Since then, others have found a voice for their digital efforts via blog-based platforms. This issue’s Highlighted Web Site points readers to five additional blogs on digitization and/or digital preservation. For each blog, we note its author and description, as well as a link to the blog’s first post, which sometimes suggests the blogger’s motivation and serves as a nice introduction.


digitizationblog

Blogger: Mark Jordan, Head of Library Systems, W.A.C. Bennett Library, Simon Fraser University
From the “About” page: “digitizationblog focuses on digitization and related activities (such as electronic publishing) in libraries, archives, and museums, and is intended to be a source of news relevant to people who manage and implement digitization projects.”
First entry: Sunday, November 14, 2004


Digitization 101

Blogger: Jill Hurst-Wahl, MLS, consultant and owner of Hurst Associates, Ltd.
From the “About” page: “This blog is the creation of Hurst Associates, Ltd. (http://www.HurstAssociates.com) and is THE PLACE for staying up-to-date on issues, topics, and lessons learned surrounding the creation, management, marketing and preservation of digital assets. (A few other topics are covered when the mood hits!)”
First entry: Monday, August 30, 2004


digitize everything

Blogger: Michael Yunkin, Web Content/Metadata Manager, University of Nevada, Las Vegas, Libraries
From the “About” page: “Digitize Everything is a blog about digitization of all types.”
First entry: Friday, January 13, 2006


D A V A
Digital Audiovisual Archiving

Blogger: Gilad L. Rosner, Media Matters, LLC
From the tagline: “A blog focused on the digital transformation and preservation of audiovisual material.”
First entry: Thursday, March 10, 2005


File Formats Blog

Blogger: Gary McGath, Digital Library Software Engineer, Harvard University Libraries
From the tagline: “News and comments about technical issues relating to file formats, file validation, and archival software.”
First entry: Sunday, November 28, 2004


 FAQ

You've Got Mail—Now What? Regulatory & Policy Dilemmas in Email Management
Part I: US Federal Environment


Author: Richard Entlich - Cornell University (rge1@cornell.edu)

Embarrassing or incriminating email messages seem to be at the core of more and more government and corporate scandals. What’s to keep these entities from simply erasing these files as a means to avoid future problems?

Note: This FAQ will be responded to in two parts. Part I (below) will cover the US Federal government’s laws and regulations. Part II (in a subsequent RLG DigiNews) will provide a detailed survey of regulations and policies in the 50 United States.

Introduction

Consider the humble email message. Though email has evolved considerably in the 35 or so years since it was formally recognized as a distinct means of communications, at its core an email message is nothing more than a packet of character data with an address of origin and destination, labeled with a subject line (optionally), and the date and time of sending. Yet this simple technology has become astoundingly popular and increasingly integral to the operations and productivity of many organizations. There are over 1 billion active email accounts worldwide, generating over 30 billion daily messages. In the US, a Pew Internet poll conducted in December 2005 found that email usage averaged 91% across all age groups, with remarkably little variation in age cohorts from pre-teens to septuagenarians.

Any technology that provides access to such a large proportion of the population is bound to be misused. The 2005 annual report from UK Internet security firm MessageLabs found that an annual average of 68.6% of emails were spam, 2.8% contained a virus or trojan, and phishing accounted for 0.3% of all email traffic. For users, email has become very much a mixed blessing. It is a powerful and eminently useful technology, but sifting through spam can be a major annoyance (as well as a significant waste of time), while viruses and phishing are a threat to security and privacy.

Yet as difficult as email has become to handle for users, the challenge its management presents to information technology (IT) and records management (RM) personnel has become especially daunting. This may seem odd given how long email has been around, but the growth in usage has been extremely nonlinear. More than 20 years after its development, in 1992, only 2% of the US population had access to email and that number had grown only to 15% in 1997, less than 10 years ago. The recent dramatic surge in usage and the ever-changing legal and technological landscape around it have placed email at the center of controversy and scandal in recent years.

In profiling some real cases, it becomes evident that improper management of email messages can cause organizations a great deal of trouble. Consider a few recent examples from the corporate and government arenas:

  • Zubulake v. UBS Warburg: This was a sexual discrimination lawsuit that resulted in a jury award of $29.3 million to the plaintiff in April 2005. Well before the verdict, in July 2004, the federal district court of the Southern District of New York imposed sanctions on the defendant for failing to preserve emails and their backups after litigation began. One of the consequences for the defendant was a so-called “adverse inference instruction,” meaning that jurors were told that they could conclude that the destroyed emails contained evidence of wrongdoing.
  • For an extended period in 2000 and 2001, a congressional investigation into alleged fundraising improprieties in the Office of the Vice President spun off a secondary investigation into the circumstances leading up to, and the subsequent cover-up of, the failure of a White House email system to archive two and a half years of incoming messages. A GAO (General Accounting Office) report on the problem details how most of the archiving failure resulted from a minor configuration glitch in which an email server named “Mail2” was mistakenly referred to as “MAIL2.” It took staff 15 months to notice the problem, six more to diagnose it, and another five to fix it.
  • United States v. Philip Morris USA: This was one of the massive tobacco litigation cases in which the federal government was seeking to force cigarette manufacturers to give up $280 billion in profits that it claimed had been garnered as a result of improper marketing practices. In the midst of the case (also in July 2004, just one day after the ruling cited in Zubulake v. UBS Warburg), the federal district court in the District of Columbia sanctioned Philip Morris for improper deletion of emails after litigation began. The company was fined $2.75 million and eleven of its defense witnesses were barred from testifying for their role in document destruction.
  • Emails released as a result of freedom of information requests and congressional investigations in the aftermath of Hurricane Katrina held a number of embarrassing revelations. One showed that two days before the storm made landfall in New Orleans, the White House had received a detailed message that accurately predicted the massive flooding and loss of life and property. This contradicted statements from the administration that the level of damage could not have been anticipated. Other messages, particularly those involving former FEMA director Michael Brown, were characterized by some as demonstrating a lack of leadership and concern for the storm’s victims.
  • On December 3, 2002, the SEC (Securities and Exchange Commission), New York Stock Exchange (NYSE), and National Association of Securities Dealers (NASD) jointly fined five securities firms (Deutsche Bank Securities Inc., Goldman, Sachs & Co., Morgan Stanley & Co. Inc., Salomon Smith Barney Inc., and US Bancorp Piper Jaffray Inc.) $1.65 million each for violations of rule 17a-4 of the Securities Exchange Act of 1934, NYSE rule 342, and NASD Rule 3010, all of which deal with requirements to maintain email communications for specified periods of time and to be able to make those communications accessible on demand.
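The White House archiving failure in the list above turned on nothing more than a difference in letter case between “Mail2” and “MAIL2.” The Python sketch below is not a reconstruction of that system, whose internals are not described here; it only illustrates, under invented names, how an exact string match can silently skip a server and how a case-insensitive comparison avoids the problem.

```python
# Hypothetical configuration: the set of servers whose mail should be archived,
# with the server name entered in the wrong case.
ARCHIVED_SERVERS = {"MAIL2"}


def is_archived_exact(server_name):
    """Case-sensitive lookup: 'Mail2' does not match 'MAIL2'."""
    return server_name in ARCHIVED_SERVERS


def is_archived_casefold(server_name):
    """Case-insensitive lookup: compare case-folded names instead."""
    return server_name.casefold() in {s.casefold() for s in ARCHIVED_SERVERS}


actual_name = "Mail2"  # the name the server really uses
print(is_archived_exact(actual_name))
print(is_archived_casefold(actual_name))
```

Nothing fails loudly in the first case; messages from the mismatched server are simply never selected for archiving, which is consistent with how long the problem took to notice.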

After reviewing a few of these cases, it becomes apparent that in the current regulatory and legal environment, deletion of emails containing potentially incriminating evidence may result in outcomes worse than those that disclosure of their contents would have produced. But as we will see in reviewing the legal background of records retention law, the practice of RM for email embodies a number of difficult choices.

Background on US Electronic Records Law

It is a long-established tenet of RM that the nature of the content, rather than the format in which it is recorded, determines whether a particular item qualifies as a record that must be maintained and preserved. Nevertheless, US laws and regulations have gradually been amended to specifically address some of the unique attributes of electronic records.

General records retention requirements for executive branch officials and federal agencies are addressed in a number of US laws and regulations. Of particular importance are several chapters of Title 44 of the US Code (including the Federal and Presidential Records Acts) and substantial portions of Chapter XII of Title 36 of the US Code of Federal Regulations (CFR). Additionally, the rules of many federal agencies, such as SEC, FDA (Food and Drug Administration), and IRS (Internal Revenue Service), which are detailed in several different titles of the CFR, impose records retention obligations on various public and private entities, including financial services companies, pharmaceutical companies, health care providers, and employers. Non-governmental bodies such as NASD and NYSE also regulate record retention policies for some entities.

Updating of existing records retention laws to specifically address electronic records has been a slow and evolutionary process. The Federal Records Act of 1939 mentioned punch cards as potentially archival records. Later revisions used a more generic description of records, but specific mention of “machine-readable materials” was returned to the statute in 1976 (see “Amendments”). NARA (National Archives and Records Administration) did not explicitly identify email messages as potential federal records until 1995. NARA continues to refine its rules for retention of email messages by federal agencies, with the latest change (dealing with “regulations to provide for the appropriate management and disposition of very short-term temporary email”) having gone into effect just a few weeks ago (on March 23, 2006). For a list of pointers to federal records management laws and regulations, see the NARA Federal Laws, Policy & Regulations page.

In general, existing rules regarding email retention lay out criteria for determining whether a message is of sufficient import to be classified as a record, and if so, for how long a period it should be retained. They can also specify what kind and how much metadata must accompany each message, what kind of media they should be stored on, in what manner they should be backed up, and what kind of retrieval should be possible. Two events in the development of the current regulatory environment merit special attention.

In January 1989, on the last day of the Reagan administration, the National Security Archive (a non-profit institution, currently housed at George Washington University in Washington, DC, not to be confused with the National Security Agency) sued the Executive Office of the President (EOP) to block the destruction of several years’ worth of Reagan White House emails. The case became known as Armstrong v. EOP (or just “Armstrong”), after Scott Armstrong, the lead plaintiff, founder and director of the National Security Archive, and former Senate Watergate staff member. The American Library Association was a co-plaintiff in the case.

Armstrong dragged out in the courts through the entire Bush I administration and was finally decided in 1993 in favor of the plaintiffs. This landmark case led to sweeping changes in federal regulations regarding retention of email messages by federal agencies. As described in a detailed review for the ARMA Records Management Quarterly of the case’s implications from an RM perspective, the most important rule change that followed Armstrong requires that the recordkeeping copy of electronic mail messages deemed to be Federal records be moved to a true archival system unless the electronic mail system itself meets several minimum criteria. The system must be able to:

  • allow for “grouping of related records into classifications according to the nature of the business purposes the records serve.”
  • “[p]ermit easy and timely retrieval of both individual records and files or other groupings of related records.”
  • “[r]etain the records in a usable format for their required retention period.” In other words, the system must truly preserve the records and not just warehouse them. The corollary implication is that records managers have to budget for the long-term cost of preservation.
  • make electronic mail messages accessible “by individuals who have a business need for information in the system.” That is, “secret” email records files are not permitted.

And finally,

  • “agencies must permit transfer of permanent records to the [NARA].”

The second event was the passage of the Sarbanes-Oxley Act (commonly referred to as “SOX” in the literature), which became law on July 30, 2002. This legislation was spurred by malfeasance revealed during investigations into the financial collapse of corporate giants Enron and WorldCom. Among other things, SOX stiffened records retention requirements for publicly traded companies and added criminal sanctions to the fines and penalties already on the books for non-compliance. According to a recent report by the National Electronic Commerce Coordinating Council (NECCC, “an alliance of state and local associations dedicated to the advancement of electronic commerce within governments”), in addition to publicly traded companies, SOX “has been interpreted as applying to those public agencies (such as public universities) that operate foundations and report financial transactions to oversight boards of trustees.”

So What’s the Problem?

None of these rules by itself explains why management of email has been problematic for so many. Basic storage of email messages isn’t particularly difficult, though the task became more complicated in 1996 when the original Internet Message Format Standard, which allowed only ASCII characters in the message header and body, was superseded by MIME (Multipurpose Internet Mail Extensions), which accommodates non-US-ASCII characters, binary attachments, and multi-part message bodies.
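To make the added complexity concrete, here is a short Python sketch using the standard library's email package. It builds a message with a non-ASCII subject and body plus a binary attachment, then inspects the resulting multi-part structure. The addresses, subject, and attachment bytes are invented sample values.

```python
from email.message import EmailMessage

# Build a message that plain ASCII-only mail could not carry as-is.
msg = EmailMessage()
msg["From"] = "sender@example.org"
msg["To"] = "recipient@example.org"
msg["Subject"] = "Résumé attached"              # non-ASCII header text
msg.set_content("Voilà, the file you asked for.")  # non-ASCII body text

# Adding a binary attachment converts the message to multipart/mixed.
msg.add_attachment(
    b"\x00\x01\x02",
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",
)

parts = list(msg.iter_parts())
print(msg.get_content_type())
for part in parts:
    print(part.get_content_type(), part.get_filename())
```

An archival system has to preserve every part, its declared type, and its encoding in order to reproduce the message faithfully later, which is a larger job than storing a single block of ASCII text.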

Even though fully complying with existing record retention laws for email can be technically complex, that isn’t what has tripped up many of the companies and government entities found to be in violation. In most cases, the culprit has been human nature.

The entities covered by email record retention laws recognize that the preservation of those records is not necessarily for their own benefit and that the most likely circumstances under which disclosure will be required are as part of the legal discovery process in a lawsuit or criminal investigation or in response to a Freedom of Information Act request. Retention of emails can be expensive and time-consuming and may seem like an investment in self-incrimination.

It’s not hard to understand why managers try to cut corners and save money while complying with email retention requirements, but the options for doing so are not all that attractive. For example, given the volume of email the typical organization produces today, saving everything indiscriminately would require huge amounts of storage. Furthermore, keeping non-record email increases the risk that highly personal email messages will be exposed, as happened in 2003 when the Federal Energy Regulatory Commission posted half a million Enron emails from about 150 different accounts on its website. Records retention statutes don’t require that every email be saved, but sorting out the ones that don’t qualify as records, or only need to be saved short-term, can also be expensive and time-consuming.

There are also tensions between access and confidentiality. In some cases, the conflicts result from competing intentions in federal law. For example, for health care providers, the privacy requirements of HIPAA (Health Insurance Portability and Accountability Act) can run head-on into the disclosure requirements of other federal laws and regulations.

For those who fear that there may be evidence of wrongdoing in retained emails, there is the knowledge that the more messages are saved, and the longer they are kept, the greater the risk that something in them will be the target of an adverse legal action or unfriendly freedom of information request. These managers are left in the unenviable position of weighing the risk of discovery against that of possible sanctions for flouting retention rules. In the case of Philip Morris, the fine for violation of email retention statutes was a minuscule percentage of the profits the company stood to lose, but as we shall soon see, this kind of cost-benefit calculation can go dangerously awry.

Even for the organization that feels it has nothing to hide, it may be difficult to strike a balance between the risks of keeping too much for too long and those of keeping too little for too short a period.

Lessons Learned??

One of the reasons courts have chosen to impose such hefty fines on intentional or, in some cases, even merely sloppy violators of email retention laws has been to send a message to others who may be contemplating similar action. However, it seems the adage “once bitten, twice shy” may not apply here. For example:

  • On July 13, 2005, UBS Securities (the company previously known as UBS Warburg, see above) was fined $2.1 million for willful violation of SEC rule 17a-4 for failure to preserve email messages sufficiently long or in systems designed for preservation.
  • In early February 2006, Special Counsel Patrick Fitzgerald revealed in the case of the prosecution of I. Lewis “Scooter” Libby, former chief of staff to Vice President Dick Cheney, that “not all e-mail of the Office of Vice President and the Executive Office of the President for certain time periods in 2003 was preserved through the normal archiving process on the White House computer system.” The missing emails were discovered a few weeks later and turned over to the Special Counsel, but not before speculation raged about the reasons for their absence.
  • Also in February 2006, investment bank Morgan Stanley (again, see above) was fined $15 million by the SEC for failure to preserve email messages. The size of the fine was determined in part by Morgan Stanley’s failure to comply with previous SEC orders to improve its handling of email following the smaller fine in 2002. Of much greater consequence, however, is the fact that the missing emails played a pivotal role in Morgan Stanley’s loss of a $1.58 billion lawsuit filed against it by financier Ronald Perelman. Due to the bank’s inability to produce emails requested by the plaintiff, the judge reversed the burden of proof, requiring Morgan Stanley to prove its innocence. Morgan Stanley is appealing its loss to Perelman, but if it stands, the judgment will dwarf the SEC fine by a factor of more than 100.

Clearly, as these repeat offenses demonstrate, despite stiff fines (for corporations) and political costs (for the White House), compliance with federal email retention laws and regulations is still being resisted. On a larger scale, a survey of 2000 members of ARMA (Association of Records Managers and Administrators) and AIIM (Association for Information and Image Management) International by Cohasset Associates in 2005 found that nearly half (49%) of the respondents had no formal email retention policy. Though this is an improvement from the previous survey in 2003, when the number was 59%, Cohasset concluded that “Given the extraordinary publicity e-mail evidence has had in the mainstream press, the real ‘news’ may not be [that] there was substantial improvement, [but] rather [that] the rate of improvement was not greater.”

Some politicians have adopted a different technique for sidestepping the risks associated with federal email retention requirements. In her review of Armstrong for the ARMA Records Management Quarterly (cited earlier), Catherine Pasterczyk showed remarkable prescience when she postulated that “Agency personnel may learn not to use email in order to avoid dealing with recordkeeping requirements and ‘unwanted premature’ disclosure of recorded thoughts.”

That might have seemed more plausible in 1998 when the article was written and email wasn’t nearly as widely used as it is today, yet the congressional investigation into the response to Hurricane Katrina revealed that neither Homeland Security Secretary Michael Chertoff nor Defense Secretary Donald Rumsfeld uses email. Chertoff was quoted in an interview with the New York Times saying, “I don’t use email. One reason is when you write an e-mail, you have to be mindful of the fact that nothing ever disappears. It can be deleted, but it is still in the system somewhere.”

Naturally, this revelation was greeted with some surprise. In the article about Chertoff and Rumsfeld’s non-use of email, Dr. Irwin Redlener, a disaster-preparedness expert at Columbia University, reacted by saying, “This can’t be true. It’s almost inconceivable in 2006 for officials at that level of government not to be directly connected to systems of communications.”

Some politicians and political appointees may feel that they can afford to abstain from email use while in office and avoid the consequences of unwanted disclosure. That’s not an option for organizations. Fortunately, however, despite some difficult dilemmas in the management of email, there is a growing body of literature on best practices in the existing regulatory environment. See the resources below for starting points.

Resources

The Sedona Conference Working Group on Best Practices for Electronic Document Retention & Production, “Best Practice Guidelines & Commentary for Managing Information & Records in the Electronic Age,” September 2005. (Name and email address required.)

NECCC Records Management Work Group, “Regulatory Impacts on E-Records Management Decisions,” 2005.

Mayer, Paul, “Email Storage & Management Best Practices: A Regulatory & Business Requirement,” October 2003.

SANS Institute, model “Email Retention Policy,” July 2003. (Also available as a Microsoft Word document.)

Wallace, David A., “Recordkeeping and Electronic Mail Policy: The State of Thought and the State of the Practice,” June 1998.

Graduate School of Library and Information Science, The University of Texas at Austin, “Managing E-mail as Records,” Managing Electronic Records Seminar, Technology Summer Camp 1997.

In the next (and final) installment, we’ll survey how individual states are coping with the email retention issue by taking a detailed look at their policies.


Calendar of Events

Archival Perspectives in Digital Preservation
April 27-28, 2006
State College, Pennsylvania

This Society of American Archivists (SAA) seminar explores how concepts such as integrity, authenticity, and trust are embedded in specific digital preservation programs, including OCLC/RLG, InterPARES, and selected European initiatives.

Vanishing Bits & Bytes: Preserving Information for the Future
May 8, 2006
Houston, Texas

Clifford Lynch, Director of the Coalition for Networked Information (CNI), Victoria Reich, Director and co-founder of the LOCKSS Program, and others will speak at this one-day conference sponsored by the Houston Academy of Medicine / Texas Medical Center, University of Houston, and Rice University.

DELOS Summer School
June 4-10, 2006
San Miniato, Italy

This six-day course led by internationally established lecturers will broadly cover issues of digital preservation including: selection and appraisal, workflow modeling, metadata definition, ingest process management, audit and certification of digital repositories, and other techniques and practices.

Digital Curation & Trusted Repositories: Seeking Success
June 11-15, 2006
Chapel Hill, North Carolina

This workshop, to be held in conjunction with the Joint Conference on Digital Libraries (JCDL 2006), will facilitate dialogue about emerging principles of digital curation, present technical and managerial models for producing successful digital repositories, and explore ways to evaluate and measure the success of digital repositories.

An Expedition to European Digital Cultural Heritage: Collecting, Connecting–and Conserving
June 21-22, 2006
Salzburg, Austria

This International Conference on the Digitisation of Cultural Heritage, An Expedition to European Digital Cultural Heritage: Collecting, Connecting–and Conserving, will present the practical challenges of collecting, connecting, and digitally conserving cultural treasures and scientific information and lead attendees on “an expedition to the vision of a common European digital cultural heritage space.”

Preserving Photographs in a Digital World
August 19-24, 2006
Rochester, New York

The Image Permanence Institute will be sponsoring this week-long introduction to photographic preservation technology, digital imaging, and archival practice. Sessions will include lectures, round tables with experts, and hands-on labs.

International Web Archiving Workshop
September 21-22, 2006
Alicante, Spain

This workshop will take place in conjunction with the 10th European Conference on Research and Advanced Technologies for Digital Libraries (ECDL). The workshop organizers have opened the call for papers on topics relevant to Web archiving. Submissions are due by June 10, 2006.

International Conference on the Preservation of Digital Objects
October 8-10, 2006
Ithaca, New York

The theme of this year’s International Conference on the Preservation of Digital Objects (iPRES2006) is Words to Deeds: Collaboration in the Realm of Digital Preservation. Sessions will address preserving multimedia objects, e-journal preservation, certification of digital repositories, and national efforts in digital preservation. Proposals for presentations are being accepted through August 15, 2006; send a brief abstract to ipres2006@cornell.edu.

Sofia 2006: Globalization, Digitization, Access and Preservation of Cultural Heritage
November 8-10, 2006
Sofia, Bulgaria

Sofia 2006 is the fourth in this bi-annual conference series. The meeting will feature keynote speaker Michael Gorman, President of the American Library Association.
Topics to be addressed will include:

  • Libraries, museums, archives, and record centers
  • Digitization and access
  • Intellectual property
  • National and international information policies and projects
  • Preservation
  • Library/information science education

Announcements

Scholarship and Libraries in Transition Symposium: Webcasts and Blog

Webcasts and an event blog are available from this symposium, which was presented by the University of Michigan University Library and the National Commission on Libraries and Information Science on March 10-11, 2006.

DPC Decision Tree

The Digital Preservation Coalition has released a new digital preservation tool: a “Decision Tree for Selection of Digital Materials for Long-term Retention.” The Web-based, interactive form is designed to help organizations develop and test their selection policies for digital content considered for long-term preservation.

Mind the Gap

The Digital Preservation Coalition has released a report, “Mind the gap: assessing digital preservation needs in the UK,” based on a survey that polled a wide range of institutions about their current efforts in digital preservation. While most institutions acknowledged awareness of the threats to digital information, “less than 20% of UK organisations surveyed have a strategy in place to deal with the risk of loss or degradation to their digital resources.”

Standards and Best Practices for Preserving Sound Recordings

The Library of Congress (LC) and the Council on Library and Information Resources (CLIR) have issued a new report that assesses current standards and best practices for capturing sound from analog discs and tapes. The 43-page report, based on a meeting of audio experts hosted by LC in January 2004, is available free of charge from the CLIR website.

Call for Papers: Special Issue on Digital Preservation

The International Journal on Digital Libraries is soliciting manuscripts for a special issue focusing on digital preservation. Submissions are due June 30, 2006.

Digitization Project Management Workshop

The presentation and several supplementary materials from the Digitization Project Management Essentials Workshop given at Computers in Libraries on March 25, 2006 by K. Matthew Dames and Jill Hurst-Wahl have recently been made available on Hurst-Wahl’s Digitization 101 blog.

IMLS Report: The Status of Technology and Digitization

The Institute of Museum and Library Services has released a new report, The Status of Technology and Digitization, which details the results of their second survey (conducted in 2004 as a follow up to their 2001 survey) to document “status of new technology adoption and digitization in the nation’s museums and libraries.”