RLG DigiNews
  October 15, 2003, Volume 7, Number 5
ISSN 1093-5371


Table of Contents


Editor's Interview
The Digital Opportunity Investment Trust: An Interview with Anne G. Murphy

Feature Article 2
PRONOM: A Practical Online Compendium of File Formats, by Jeffrey Darlington

Highlighted Web Site
JHOVE

FAQ
Recent developments and improvements in hardware for scanners and digital cameras, by Don Williams

Calendar of Events

Announcements


Editor's Interview

The Digital Opportunity Investment Trust (DO IT)

Anne G. Murphy
Digital Promise Project



Your Web site includes references to several project names: the Digital Promise, the Digital Opportunity Investment Trust (DO IT), and the Digital Gift to the Nation. The historical precedents cited for the proposed funding are interesting. Would you provide a brief summary of the project and the ways in which it has evolved since its inception in 2001?

The names reflect the evolution of the project. In 2001, the Century Fund and the Carnegie Endowment asked Mr. Minow, former Chair of the FCC, and Mr. Grossman, former President of NBC News and PBS, to look at questions concerning telecommunications and the not-for-profit sector. This project was dubbed the Digital Promise Project.

In 2002, the project published its findings and recommendations in a book called A Digital Gift to the Nation, which proposed the creation of the Digital Opportunity Investment Trust. The recommendation was built on historical precedent. In each of the past three centuries, Congress made a bold, farsighted investment in educating all our citizens. The Northwest Ordinance of 1787 set aside public land to support public schools in every state. The Land-Grant Colleges Act of 1862 established 105 land-grant colleges that made America’s agriculture and industry the most advanced in the world. The GI Bill of 1944 profoundly expanded educational opportunities for veterans of World War II.

Now, we propose, it is time for the fourth major educational initiative to advance those great legacies into the 21st century. Congress should create the Digital Opportunity Investment Trust (DO IT) to open the door to a knowledge-based future for Americans of all ages. The Trust would be funded by revenues from a publicly owned asset, the electromagnetic spectrum, the 21st century equivalent of the publicly owned land that earlier financed America’s public schools and public higher education. Within the next decade, congressionally mandated federal auctions and fees for the commercial exploitation of these spectrum frequencies are expected to produce tens of billions of dollars. Over the years, DO IT will accumulate approximately $20 billion of this revenue. Conservatively invested, this would back the Trust with an annual budget of approximately $1 billion.

The focus is on education, but digitizing materials has a prominent place in the project. What is the scope of the digitization portion of the Digital Promise and have you set guidelines for conversion and types of content to be included? Will born-digital materials be included as well?

We need to keep in mind that, at this stage, the Digital Opportunity Trust is a proposal. No activity has taken place, and certainly no definitive allocations have been made. As a result of our studies, we propose that one of the main objectives of the Digital Opportunity Trust be to assist in the digitization of the collections of universities, museums, libraries, and cultural institutions, where America’s heritage is stored. DO IT will help to digitize these collections and to set standards to conserve born-digital materials, ensuring their accessibility to all. It will assist in the development of content and software to integrate the riches of our cultural institutions into classroom curricula and stimulate research in the humanities.

The art of digital capture is in a state of flux, which may be unavoidable at this point in order to ensure that progress in the technology continues. But the dizzying array of individual projects being undertaken makes it difficult to develop consistent, reliable management of the growing digital collections. It’s not enough to rely entirely on powerful search engines that struggle to make sense of countless formats and products of highly varying quality. Compatibility must be assured and standards must be set by an impartial public body such as DO IT. One major issue is the core set of information that should be attached to digital information—whether it’s a text, a recording, an image, or some other form. Standard methods have been developed for book and article citations (the ISBN provides a unique identifier for books), and the music industry has a variety of standards. The problem becomes increasingly challenging when the original material is an artifact, an excerpt from an animation, or another format.

There is disagreement today about the minimum set of information that should be provided (author, location, etc.) and how it should be recorded. The Dublin Core Metadata Initiative has received approval as an international “resource discovery metadata standard on the Internet” (ISO 15836). It provides “a foundation block of modular, interoperable metadata for distributed resources.” But much work remains before the standard is universally accepted and ambiguities about interpreting the standard are resolved.

How will this initiative complement recent and current digitization projects by libraries, archives, and museums? Does this initiative relate to the National Digital Information Infrastructure and Preservation Program (NDIIPP) under way by the Library of Congress?

Our work has been deeply informed by the work of Laura Campbell and her staff at the Library of Congress. We see the work of DO IT as a complement to the fine work being done by the Library, the NEH, and IMLS.

The digital content created for the project will exist within a distributed network of education providers. How will the content be managed over time?

As proposed, DO IT would provide funding to advisory groups to develop criteria and help to establish urgently needed guidelines and interoperable standards. The Trust will encourage their widespread adoption by acting as an important funding source for digitization projects. It would establish task forces to tackle the issues of intellectual property, metadata, and the variety of problems encountered in ensuring continuous migration of archival collections to ensure compatibility and the longevity of these resources.

Your project has been compared to work by the National Science Foundation (NSF), National Institutes of Health (NIH), and the Defense Advanced Research Projects Agency (DARPA). All of these groups, but NSF in particular, have included support for long-term access to content created by their projects. What priority will preserving the digital content created by your project have? Will the resources for the project include sustainable funding for preserving your digital content?

We need to build trusted digital repositories where these sources will persist, where they will be unaltered in form or content by hackers and others, where their provenance is known, and where they can be easily located using sophisticated user interfaces.

What is the current status of Congressional support for the Digital Promise, and what are your plans for promoting the initiative this year?

In 2003, Congress allocated funds for an in-depth study of the rationale for the creation of DO IT, the development of a Research and Development Roadmap, and a proposal for the structure and governance of the new entity. This report will be delivered to Congress in October 2003. Shortly thereafter, Senator Dodd and others will introduce legislation to implement the report.

We invite your readers to read the report on our Web site (available October 24, 2003). If they agree with our recommendations, we request that they ask their senators to support the Dodd legislative initiative.

What do you see as the greatest enablers for the project? … the greatest barriers?

In the course of the project, we have developed an outstanding Leadership Circle and Coalition of Organizations that support this initiative. In the coming months we will be working with members of both groups and their constituencies to raise awareness of the proposal and to develop widespread expressions of support.

The greatest barrier is the burden on the federal budget. We are reminded, however, that each of the other major transformations in learning was enacted during a period of war. The Land-Grant Colleges Act, perhaps the greatest piece of education legislation ever enacted, was signed by Abraham Lincoln at the height of the Civil War.

 

 


PRONOM—A Practical Online Compendium of File Formats

Jeffrey Darlington
The National Archives (UK)


“There is a pressing need to establish reliable, sustained repositories of file format specifications, documentation, and related software. We recommend the establishment of such repositories for format-specific materials related to migration as a preservation strategy.”

Risk Management of Digital Information: A File Format Investigation,
Council on Library and Information Resources, Washington, D.C.,
June 2000

The problem faced by the National Archives and others who aim to preserve history by preserving records is that records are not always designed for longevity. They may be as ephemeral as messages written in the sand at low tide, or as permanent as the inscriptions on granite monuments. It is ironic that the primitive technology of ancient times has produced records lasting hundreds of years, while today’s advanced electronic world is creating records that may become unreadable in a few years’ time.

The correct interpretation of records has always required knowledge of the language in which they were written, and sometimes of other subjects too - medieval penmanship, for example. Fortunately enough of this knowledge has survived that we can make sense of most of the records that have come down to us. Modern technology has further complicated the problem of interpretation by making the viewing of records dependent on hardware and software environments whose own longevity is doubtful. Just as interpretation of the 1086 Domesday Book depends on the dictionaries and grammars for medieval Latin painstakingly compiled by long-dead scholars, interpretation of contemporary electronic records in the future will only be possible if the necessary methods and tools are compiled, documented, and preserved now.

When, in the evolution of computing, punched cards were superseded by magnetic tape, computer records became invisible and readable only by computers. Preservation programs were started up as early as 1962, and it was soon recognized that the incompatibility of tape formats would complicate the task of preservation. This realization led to the development of the ASCII character code, and other standards for media and recording formats that have served the preservation community well over the years. The new problems generated by the evolution of the personal computer and the word processing application were not so readily recognized. The new tools were seen at first as merely a method of creating paper documents that could be archived on paper. File formats were not standardized and soon proliferated alarmingly, every new software product having its own format.

Dimensions of Incompatibility

Besides the incompatibility between products, there is the time dimension. For each product, new versions of its file formats quickly superseded earlier ones. As more and more facilities were added, the formats became more complex. Sometimes the old format was a subset of the new, so that newer versions of the software could still read older files, but often the appearance of an old record was not rendered correctly by new versions of the software. These problems have multiplied as records have evolved from simple texts to complex assemblies of diverse elements, including embedded formulae, images, and charts. Web sites go a stage further with animations, video clips, and dynamic content. There is no paper analog that could be archived in these cases. And yet another dimension is that of dependencies. There are layers of application software that depend on operating systems that in turn depend on hardware.

Anyone responsible for managing and sustaining access to electronic records, even over relatively short timescales, must deal with these challenges to ensure that valuable records are not left stranded in formats that are no longer supported. One approach to a solution is to migrate records from obsolete formats into current ones, ideally into formats with published standards. Another is to preserve records in their original formats and keep copies of software products that can interpret those formats. To preserve the ability to run those products, copies of operating systems must also be kept. And in principle, old models of hardware must be either preserved (the museum approach) or emulated in software. The emulation approach is described by Rothenberg. Technical information about file formats and the software products that support them is a prerequisite for any digital preservation regime.

Introducing PRONOM

The National Archives is committed to preserving historic electronic records indefinitely, and has embarked on a program to make this feasible on a practical level. One strand in this program is PRONOM, our database of file formats, and its supporting library of software products. This collection of electronic materials is aimed initially at helping with the problem of software obsolescence, which seems the most urgent of the problems facing the preservers of electronic records. Hardware and media obsolescence are also to be addressed in future.

The first version of PRONOM was developed in March 2002 in parallel with the development of our Digital Archive system. It was designed to hold reliable technical information about the nature of the electronic records to be stored in the archive. For example, for Microsoft Word 97, PRONOM will tell you when it was launched, by whom, whether it is still on the market and whether it is still supported, what formats it writes, and what formats it reads. Interest expressed by other national archives led to the concept of distributing the database on CD, and PRONOM 2 was released in December 2002 to provide support for multilingual versions of the system.

The system was designed from the outset with a Web-based user interface and XML system interfaces, to conform to the UK e-Government Interoperability Framework. The latest version, PRONOM 3, is now being launched on the Web to make it available to the whole preservation community. We have simplified the user interface in the light of our own usability testing. The main search page allows the user to look up a file extension and see all the formats PRONOM recognizes with that extension, some extensions being shared by a number of products. It also allows the user to search for potential migration paths for a given format.
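The two searches described above (formats that share an extension, and possible migration paths for a format) can be pictured as lookups over a small graph of products and the formats they read and write. The following is a minimal sketch of that idea only, not PRONOM's actual schema or interface: every product name, format name, and function name in it is invented for illustration.

```python
from collections import deque

# Hypothetical records; none of these names come from PRONOM itself.
FORMATS_BY_EXTENSION = {
    "doc": ["ExampleWriter 1 document", "OtherWord 3 document"],
    "txt": ["Plain text"],
}

# Each hypothetical product lists the formats it can read and write.
PRODUCTS = {
    "ExampleWriter 2": {
        "reads": {"ExampleWriter 1 document", "ExampleWriter 2 document"},
        "writes": {"ExampleWriter 2 document", "Plain text"},
    },
    "OtherWord 4": {
        "reads": {"OtherWord 3 document", "ExampleWriter 2 document"},
        "writes": {"OtherWord 4 document"},
    },
}


def formats_for_extension(extension):
    """Return every known format that uses the given file extension."""
    return FORMATS_BY_EXTENSION.get(extension.lower().lstrip("."), [])


def migration_paths(source_format, max_steps=3):
    """Breadth-first search for chains of (product, target format) steps."""
    paths = []
    queue = deque([(source_format, [])])
    while queue:
        current, steps = queue.popleft()
        if len(steps) == max_steps:
            continue
        # Formats already on this chain, so a path never loops back.
        visited = {source_format} | {fmt for _, fmt in steps}
        for product, info in sorted(PRODUCTS.items()):
            if current not in info["reads"]:
                continue
            for target in sorted(info["writes"]):
                if target in visited:
                    continue
                new_steps = steps + [(product, target)]
                paths.append(new_steps)
                queue.append((target, new_steps))
    return paths


if __name__ == "__main__":
    print(formats_for_extension(".DOC"))
    for path in migration_paths("ExampleWriter 1 document"):
        print(" -> ".join(f"{fmt} (via {product})" for product, fmt in path))
```

A path found this way is only a candidate. As the article goes on to say, such data needs interpretation and validation before it is practically useful.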

Content Development

In parallel with system development we set out to collect information about file formats from software vendors and from the Internet. This turned out to be more difficult than we expected. Vendors did not generally have systematic records of their old formats that were easily available. Many software vendors have been taken over by companies that have been bought up in their turn, and contacts for their products are difficult to find, especially if the product has been discontinued. A number of enthusiasts have Web sites containing collections of file format information, but their coverage is incomplete, particularly for vendor and product details. The Representation and Rendering Project at the University of Leeds provides a useful survey of these sources.

We also began to build up our library of software products. This was an easier task, and it turned up a useful new source of information. The boxes in which distribution CDs are packed often display information about operating system dependencies and compatibility with other products that is not published elsewhere.

Since that time, considerable effort has been devoted to the collection of PRONOM content. National Archives staff have undertaken intensive research and liaison with major software developers in order to create an initial core data set of software product information. Microsoft has been particularly helpful. For the initial data load the focus is on the most commonly used office products for PC operating systems from PC-DOS onwards. We intend to load information on about 450 products over the next few months. The content development work is ongoing, and we have at least some information on over 3,000 file formats yet to be verified. We encourage software developers and others to be proactive in providing information - there is an online submission form on the PRONOM Web site.

Preservation Strategies

Information on file formats read and written by a product supports the migration of data by suggesting migration paths. However, this data needs interpretation and validation to be practically useful. Ideally, migrating a record into a new format is done at an early stage in its life and by the original author, who can check that the essentials of his or her work have been preserved. Later migrations are more risky, especially if more than one generation of format has intervened and the original author is no longer available. Unfortunately these are the conditions under which an archive may have to work. For example, there survive large collections of documents created by the early word processor WordStar that need to be preserved. Other word processing products may claim to read WordStar files, but how can we be certain that the full contents of the migrated document are captured?

We need a measure of how much the information content of a record is altered as a result of a particular migration process. Information content includes formatting, and functionality that is integral to the record rather than to the creating software (the difference between a hyperlink in a Word document and the fact that Word includes a spell check function). The design of PRONOM allows us to keep a measure of the “content invariance” of a migration path. Our intention is to define an objective and rigorous methodology for testing migration paths to measure content invariance, and to record this within PRONOM for each format that a particular product can read.

Migration paths for WordStar data are complicated because early DOS versions of WordStar used seven-bit ASCII characters, the eighth bit being used as a line wrap marker. When viewed by later products, these characters are wrongly interpreted as eight-bit ASCII equivalents, and to achieve a successful migration it is necessary to strip out the marker bits from the WordStar files. Since line wrapping is handled differently in later products, the loss of the eighth bit normally makes no difference, and at worst causes the text to be adjusted to different margins. This example shows the part that detailed technical knowledge plays in implementing a workable migration strategy, and also the importance of keeping the original bit-streams.
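The bit-stripping step described above can be sketched in a few lines. This is an illustration under simplifying assumptions, not a complete WordStar converter: it only clears the eighth bit of each byte and does nothing about other WordStar control codes or formatting commands. The function name and sample bytes are invented for the example.

```python
def strip_high_bits(data: bytes) -> bytes:
    """Return a copy of data with the eighth (most significant) bit cleared."""
    return bytes(b & 0x7F for b in data)


# Sample bytes: the word "Hello" with the eighth bit set on the final letter,
# standing in for a marker bit (0x6F is 'o'; 0x6F | 0x80 is 0xEF).
original = bytes([0x48, 0x65, 0x6C, 0x6C, 0xEF])

migrated = strip_high_bits(original)
print(migrated)           # b'Hello'
print(original[-1:])      # b'\xef' (the original bit-stream is left untouched)
```

Because the function returns a new byte string rather than modifying its input, the original bit-stream stays available alongside the migrated copy, which is the practice the paragraph above recommends.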

The alternative to migration is to have the original bit-streams interpreted by preserved software from the software library. To support this strategy, it is necessary to preserve old operating systems and the ability to run them. Fortunately the dominance of the PC platform and Microsoft operating systems makes this approach practical for many collections of records, and the compatibility between successive PC models means that a PC emulator is not always needed. However, some older operating systems are dependent on a 16-bit architecture, and future PCs may eventually outgrow the current 32-bit architecture; emulation will be needed to overcome these problems.

Technologies are constantly evolving, and archivists must be aware of these changes and the implications for the electronic records they are preserving. As old software products cease to be supported and become obsolete, preservation activity will be needed for the file formats that depend on those products, whether by content invariant migration or by preserving the software. A “technology watch” process to identify triggers for preservation actions is a component of the PRONOM program.

There would be many advantages in migrating records to an XML-based standard format. The development of practical tools for this task depends on detailed technical descriptions of how each format actually works. This information is beyond the present scope of PRONOM, but we plan to collect it for the next major release. A subset of this information would be useful to develop tools for the recognition of file formats, a function included in our Digital Archive and at present provided by commercial viewer software. We expect to include this recognition function in a later version of PRONOM.

[Figure: PRONOM development roadmap]

The Web-enabled PRONOM 3, completed this month, marks the latest stage in the evolution of the system. At the same time, it is a starting point for the development of PRONOM as a major shared online resource for the international digital preservation community. The National Archives has plans for major enhancements over the next few years, including the development of a number of specific tools to support digital preservation activities. Further information about our future plans is available on the National Archives Web site. We hope that through our continuing research and contributions from the Web community, the content of PRONOM will expand to give a comprehensive coverage of file formats that will support worldwide preservation initiatives.

Acknowledgments
Thanks to Adrian Brown, Jo Pettitt, and Rob Taylor of the National Archives Digital Preservation team.



Highlighted Web Site

JHOVE


Digital preservation would be fairly straightforward if all computer files used the same format and that format didn't evolve over time. In reality, there are thousands of file format variants, with new ones introduced each year. Almost every significant step in the management of files in digital repositories, from ingest to delivery to making migration decisions, can be executed more effectively if accurate and detailed file format information is available.

JHOVE (pronounced "jove"), the JSTOR/Harvard Object Validation Environment, is a tool to automate the validation of file formats. Unlike less reliable approaches that rely on superficial indicators such as file extensions and MIME types, JHOVE uses format-specific modules to probe a file's internal structure. JHOVE's plug-in style architecture will allow the work of developing format modules to be shared. The JHOVE site will eventually include a tutorial on module writing, and a full explanation of the module interface.

The pre-release version of JHOVE (requires Sun's Java) can be downloaded for review and testing purposes. (In its present state of development, JHOVE is easiest to install in a Unix environment, but with some tinkering it can be made to run under Windows.) It includes modules for the identification, validation and characterization of arbitrary byte streams, ASCII and UTF-8 encoded text, TIFF, and PDF. A set of example files for testing the five modules is also available, and the JHOVE Web site includes a tutorial. When complete, JHOVE will be made available under an open source license, with support for Unix/Linux and Windows. Development of JHOVE is funded in part by the Andrew W. Mellon Foundation through a grant to JSTOR for the recently launched Electronic-Archiving Initiative.
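To illustrate the difference between trusting a file extension and looking inside the file, the sketch below checks the opening bytes of a file for the TIFF and PDF signatures. It is not JHOVE code and does not use JHOVE's module interface (JHOVE is written in Java). The function name is invented, and a signature check like this is far shallower than the structural validation a JHOVE module performs.

```python
def sniff_format(header: bytes) -> str:
    """Guess a format from a file's opening bytes, ignoring its name."""
    # TIFF: byte-order mark "II" (little-endian) or "MM" (big-endian),
    # followed by the number 42 stored in that byte order.
    if header[:4] in (b"II*\x00", b"MM\x00*"):
        return "TIFF"
    # PDF: files begin with "%PDF-" followed by a version number.
    if header.startswith(b"%PDF-"):
        return "PDF"
    return "unknown"


# A file named "scan.tif" whose content is really a PDF is still
# reported as PDF, because only the bytes are consulted.
print(sniff_format(b"%PDF-1.4\n"))                   # PDF
print(sniff_format(b"MM\x00*\x00\x00\x00\x08"))      # TIFF
print(sniff_format(b"just some text"))               # unknown
```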



FAQ

What are recent developments and improvements in hardware for scanners and digital cameras?

For this FAQ, we asked Kodak's Don Williams to look back over the past five years and identify significant new developments as well as important incremental changes in scanner and digital camera design, with special attention to the needs of libraries and archives. Don Williams is a senior research engineer in the Image Science Division of Eastman Kodak Co. He has written extensively about digital image capture specifications and imaging performance metrics and is a regular participant on digital imaging standards committees.

Recent developments and improvements in hardware for scanners and digital cameras

When asked to contribute an update on advances in digital image capture technologies to this forum, I hesitated momentarily, gauged my instincts, and accepted. After all, in the past five years all sorts of new image capture devices have been introduced from which I could draw. What could be easier? Then the penny dropped.

I realized, somewhat humbly, that there actually have been few fundamentally new approaches applied to digital image capture, especially for museum and library community level tasks. Rather than chasing promising but unproven new scanning technologies, most efforts have focused on perfecting existing ones. The good news is ... this is not bad news. Freed from the onerous learning curve of technology adolescence, manufacturers have concentrated on multiple incremental improvements that come with maturity. For the user, the impact is nothing but positive. Imaging performance, cost, and speed (think workflow) have dramatically improved. This benefits not only research organizations but, notably, resource-poor local/regional sites with their own conversion tasks.

But to suggest that nothing is new would be remiss. Certainly some exciting technologies have been introduced and are implemented in a few products. These and the cited maturity improvements are briefly discussed below. Being a scanner gearhead at heart, I have chosen to organize these according to four scanner subsets. They are 1) document handling, 2) illumination, 3) sensors/detection, and 4) data processing. Some items may be scan mode (e.g., transmissive vs. reflective) or hardware specific (e.g., flatbeds vs. cameras) and will be emphasized as such.

  1. Document handling: Two words immediately come to mind, "Book Cradles." This class of camera hardware has improved from the yawning, static, manual contraptions of several years ago to the robotically articulated page-turning wonders of today. Like any new technology there is likely to be an optimization period for these devices, but the forecast is good. See, for example, Conservation by Design's Preservation Book Cradle and 4DigitalBooks' automatic digitizing system.
    Less seductive but equally pragmatic is the trend from cameras with horizontally constrained document placement to those with a vertical document mounting option. Gravity and conservator concerns have dictated this change. Although appearing to be a trivial modification to existing camera design, doing so while maintaining resilience, portability, and utility of the supporting structures can be a challenge, especially for very large documents. Nevertheless, many designers have achieved this adaptability with minimal compromise, some elegantly so.
  2. Illumination: Though hardly noticed, illumination systems of flatbed scanners, both reflective and transmissive, have improved considerably. Largely, this is attributable to improvements in cold cathode fluorescent illumination sources used in these scanners. Their low cost, rapid warmup, stability, and improved color quality have made them nearly a universal choice for illumination sources in this class of scanners. Improvements in illumination optics and increased bit depth for these scanners have also dramatically improved illumination uniformity.
    Several years ago it was advisable to avoid the platen margins on these scanners because of the uncompensated illumination falloff. Today, low cost scanners can be found where literally all of the platen area is effectively illuminated within 5% uniformity. Epson flatbed scanners are particularly good, but any scanner can be tested simply by scanning a known uniform flat field document, like a Munsell paper sheet.
    From a conservator perspective, it is encouraging to see that some camera system manufacturers (e.g., Lumiere Technology) are proactive in designing ultraviolet and infrared friendly light sources for especially sensitive documents. The fading and heat characteristics of these portions of the radiation spectrum are a very real concern from a conservation perspective, particularly for high quality scans of long duration.
  3. Sensors/Detectors: Despite the hype several years ago about the benefits (lower cost, higher level of feature integration) of CMOS (Complementary Metal Oxide Semiconductor) sensors, they continue to have inferior imaging performance (higher noise, lower dynamic range) compared with their CCD (Charge Coupled Device) counterparts. To my knowledge they are used exclusively as area array camera sensors and not as scanning linear arrays. This makes them perfectly suitable for many consumer or prosumer (i.e., professional consumer) camera applications but risky for demanding conversion projects. For this reason, CCDs continue to be used as the imagers of choice for conversion grade scanning applications. Several important changes to the sensor "imager package" are noteworthy. They can apply to either CMOS or CCD type imagers and are:
    1. Pigmented color filters: For color scanners where the color filters are coated onto the sensor, some manufacturers are beginning to use pigmented rather than dye based filters. The reason for doing so is the same as for using pigmented inks in inkjet print applications: stability.
    2. Depth-wise color detection: This is a new color detection technology for digital cameras developed by Foveon. Its claim to fame is that it can capture a fully pixel-populated RGB digital image using a single area array detector in a single frame. Most of today’s studio cameras use scanning linear arrays (slow), color filter wheels with area arrays (requires multiple frames), or sparsely populated RGB color filter arrays (requires de-mosaic interpolation). Foveon has accomplished this by taking advantage of the well-known fact that different colored light penetrates to different depths within the detector bulk. Red light penetrates the furthest, green light less so, and blue light even less. By reading out the charge associated with different depths within the sensor one can in fact create a color image without the explicit use of color filters. This is not an easy task though and may require aggressive data processing to achieve the demanding image performance levels of imaging for the cultural heritage community. Currently cameras employing this technology cater to the prosumer market.
    3. Smaller pixel sizes: The individual sensors associated with a single image pixel have become progressively smaller over the years. Indeed, this has allowed prosumer/consumer digital cameras to increase their total pixel count without significantly changing overall detector size. Today, typical sizes may range between 3-5 microns per pixel compared to 7-11 microns of the past. These smaller sizes are not without their imaging performance tradeoffs. To achieve the same signal levels per pixel, about four times the illumination level is required (can you hear the paper conservators gasp?). Without these increased levels, a greater reliance is placed on subsequent image processing to deliver the image. Depending on the processing aggressiveness, this almost always increases image noise levels, which leads to lower image quality.
    4. Support Electronics—Perhaps the most impressive changes have come in terms of reducing the size of the camera/scanner’s support electronics. This is where the analog-to-digital conversion as well as much of the data processing (see next section) occurs. What used to be the size of a deck of cards has now been reduced, via CMOS integration, to that of a nickel.
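
The "about four times" figure under smaller pixel sizes above can be checked with simple arithmetic. The sketch below assumes that the signal collected by a pixel scales with its light-collecting area, that is, with the square of the pixel pitch, and that everything else (fill factor, sensor efficiency, exposure time) stays equal. The function name is illustrative only and does not come from any camera or scanner software.

```python
def illumination_factor(old_pitch_um: float, new_pitch_um: float) -> float:
    """Relative illumination needed to keep the same signal per pixel.

    Assumption: signal per pixel is proportional to pixel area (pitch squared),
    with all other sensor properties held constant.
    """
    if old_pitch_um <= 0 or new_pitch_um <= 0:
        raise ValueError("pixel pitch must be positive")
    return (old_pitch_um / new_pitch_um) ** 2


# Halving the pitch (8 microns down to 4 microns) quarters the pixel area,
# so four times the light is needed for the same signal.
print(illumination_factor(8.0, 4.0))

# Midpoints of the ranges quoted above: 9 microns (past) versus 4 microns (today).
print(illumination_factor(9.0, 4.0))
```

Under this assumption, any halving of pixel pitch costs a factor of four in light, which is why smaller pixels shift the burden onto either brighter illumination or heavier image processing.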

  4. Data Processing—Rather than cumbersomely performing image processing functions offline, there is a trend to integrate common scanner-related functions such as OCR (Optical Character Recognition) and distortion correction within the support electronics. One of these functions, licensed from Applied Science Fiction (ASF) as Digital ICE™, is automatic scratch and blemish removal. It was first introduced for film scanners (Nikon) and more recently for reflection scanners (Microtek). Truly a technology change, Digital ICE™ relies on the scattering of infrared light by scratch and blemish artifacts in film and photographic paper. An infrared scan is made of the sample in addition to the RGB color scans. The infrared scan is used to identify where the scratches are located. This information is then used to mask the blemishes through image processing in the other three color records. It works quite well for minor defects in color negative and incorporated-coupler color slide films (e.g., Ektachrome). Unfortunately, this technology has been known to behave erratically on film media common to the library and museum communities. For instance, mixed results occur for non-incorporated-coupler films (e.g., Kodachrome), and it will fail completely on all black-and-white silver halide films.
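
The general idea of using an infrared scan to locate defects and then repairing the color records can be sketched in a few lines. This is not ASF's Digital ICE™ algorithm, whose details are proprietary; it is a minimal illustration under two assumptions: defects show up as infrared readings below a chosen threshold, and a flagged pixel can be repaired with the average of its unflagged neighbors. The function names and the threshold value are hypothetical.

```python
def defect_mask(ir, threshold):
    """Flag pixels whose infrared reading falls below `threshold`.

    Assumption: scratches and dust scatter infrared light, so they
    appear darker than clean film in the infrared scan.
    """
    return [[value < threshold for value in row] for row in ir]


def fill_defects(channel, mask):
    """Replace each flagged pixel with the mean of its unflagged neighbors."""
    height, width = len(channel), len(channel[0])
    out = [row[:] for row in channel]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            # Collect clean pixels from the surrounding 3x3 window.
            neighbors = [
                channel[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if not mask[ny][nx]
            ]
            if neighbors:  # leave the pixel unchanged if no clean neighbor exists
                out[y][x] = sum(neighbors) / len(neighbors)
    return out


# Toy 3x3 example: one dust speck in the center of the frame.
ir_scan = [[200, 200, 200], [200, 40, 200], [200, 200, 200]]
red = [[100, 100, 100], [100, 10, 100], [100, 100, 100]]

mask = defect_mask(ir_scan, threshold=100)
repaired_red = fill_defects(red, mask)
print(repaired_red[1][1])
```

The same mask would be applied to the green and blue records. The sketch also suggests why the approach depends on the medium: it only works when the image itself is largely transparent to infrared, so that dark infrared readings reliably indicate defects rather than picture content.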

Finally, a few words on multi-spectral or hyper-spectral image capture for artwork. In concept, performing these types of captures has always been easy. Through multiply-filtered frames and suitably designed light sources, a number of demonstration projects of this nature have been documented. (For some examples, see RLG DigiNews, October 15, 1999.) But let’s face it, these projects have not been the epitome of productive workflows. They have, however, supplied critical examples of ways to improve the process and of what shortcuts can or cannot be taken. Over the next several years I predict that large gains in productivity, economy, and quality will be made in this area of digital image capture. Some university and commercial partnerships are exercising new models for multi-spectral capture, and it will be exciting to see the future levels of improvement.

Calendar of Events

Online Course on Digital Licensing
September 22-November 20, 2003
The course is designed for information professionals who wish to learn more about licensing digital and online content (such as periodicals, databases, and images) without attending an in-person seminar. The target audience includes librarians, archivists, publishers, photographers, Web site owners, content developers, and those in museums, educational institutions, and governments. Participants will receive three e-lessons per week for nine weeks; each e-lesson has a self-marking quiz. Participants also will have access to an exclusive online discussion list on the course content.

The Next Generation of Access: OpenURL and Metasearch
Washington, DC
October 29 & 30, 2003
NISO will hold two one-day conferences to inform you about the two leading standards initiatives that promise to re-shape information access for all actors in the information delivery equation—publisher, aggregator, librarian, student, scholar, and author. You can attend one or both events. Both meetings will be held at the conference center at the American Geophysical Union.

'Parallel Lives': Digital and Analog Options for Access and Preservation
London, UK
November 10, 2003
A joint conference of the National Preservation Office and King's College addressing the importance and interrelated lifecycles of digital images, microfilm, photographs, and other surrogates. It explores how we should create, store, provide access to, and manage digital objects for the benefit of culture and society.

International Workshop on the Trusted Digital Repository for Cultural Heritage
Rome, Italy
November 17-19, 2003
ERPANET and the Accademia Nazionale Dei Lincei are jointly sponsoring this workshop to identify and discuss the key scientific, technical, management, and policy considerations for the successful implementation of a trusted repository for preserving cultural heritage.

6th International Conference of Asian Digital Libraries (ICADL 2003)
Digital Libraries: Technology and Management of Indigenous Knowledge for Global Access
Kuala Lumpur, Malaysia
December 8-11, 2003
Topics include data mining in digital libraries, multimedia digital libraries, intellectual property rights and copyright, metadata issues, data storage and retrieval, and knowledge management.

International Archiving Workshop on the Selection, Appraisal, and Retention of Scientific Data
Lisbon, Portugal
December 15-17, 2003
The aim of the workshop is to identify and discuss the key scientific, technical, management, and policy considerations for the successful implementation of appraisal and selection guidelines and retention policies.

International Workshop on Document Image Analysis for Libraries
Palo Alto, CA
January 23-24, 2004
This workshop aims to bring together researchers, practitioners, and users who are interested in new technologies to help integrate imaged and encoded documents within digital libraries. Topics include imaging and compression standards, content and metadata extraction, multimedia document analysis, and digital library best practices.

Victorian Association for Library Automation 12th Conference
Breaking Boundaries: Integration & Interoperability
Melbourne, Australia
February 3-5, 2004
This conference will explore the successes and the key challenges in the field of library and information technology. Session topics include archiving radio and television, managing digital objects, open archive services, and electronic publishing.

Museums and the Web
Washington, DC/Arlington, VA
March 31-April 3, 2004
MW2004 will feature a variety of sessions exploring all aspects of the creation, development, maintenance, and evaluation of Web sites in museums, archives, libraries and other cultural and heritage organizations.



Announcements

NISO Publishes Metadata Demystified: A Guide for Publishers
The National Information Standards Organization (NISO) announces the joint publication with the Sheridan Press of Metadata Demystified: A Guide for Publishers. The guide presents an overview of evolving metadata conventions in publishing, as well as related initiatives designed to standardize how metadata is structured and disseminated online. Focusing on strategic rather than technical considerations, it offers insight into how publishers can streamline metadata operations. The guide is available for free downloading from NISO.

The GPO and National Archives Unite in Support of Permanent Online Public Access
The US Government Printing Office (GPO) and the National Archives and Records Administration (NARA) have announced an agreement to ensure that free online public access to more than 250,000 federal government titles will remain available permanently. NARA will assume legal custody of the titles as part of the official Archives of the United States, and the GPO will retain physical custody and responsibility for permanent public access and preservation.

3rd ECDL Workshop on Web Archives Proceedings Available
This year's 3rd annual ECDL workshop featured presentations from national libraries and researchers about their experiences and projects in the area of Web archiving. Proceedings from the workshop are available on its Web site.

Dublin Core Metadata Element Set Recognized by ISO
The International Organization for Standardization (ISO) has approved the Dublin Core Metadata Element Set as an international metadata standard. Dublin Core was developed for use on the Web and in other information networks across a wide variety of subject areas, languages, and economic sectors. ISO approval signifies international recognition of the standard.

Fedora v.1.1 Released
The University of Virginia Library announces the release of Fedora v.1.1, an open-source digital object repository management system. The Fedora Project, a joint effort of the University of Virginia and Cornell University, has made available the first version of a system based on the Flexible Extensible Digital Object and Repository Architecture, originally developed at Cornell. This first version of the software is designed to support a repository containing one million objects using freely available software. It fully implements the Fedora architecture, provides the first version of a graphical user interface to manage the repository, and provides facilities to create and ingest batches of objects.

National Library of New Zealand Preservation Metadata Data Model Released
The National Library of New Zealand has released a data model for implementation of its preservation metadata process. The data model is based on a logical preservation metadata model released earlier and maintains the overall structure and data relationships. The model, which includes XML schema definitions, is intended to provide a step toward the implementation of a repository for preservation metadata.

The National Archives of the UK Announces the Launch of a New Archive of UK Central Government Web Sites
This initiative will collect and preserve 50 UK government Web sites, including the Hutton Inquiry, 10 Downing Street, and the Northern Ireland Office. Sites are gathered in weekly or 6-monthly snapshots, using a modified version of the Internet Archive's Web crawler. The complete archive will be available on the Web and in the National Archives public search rooms. A copy of each snapshot will also be accessioned for long-term preservation.


Publishing Information

RLG DigiNews (ISSN 1093-5371) is a Web-based newsletter conceived by the RLG preservation community and developed to serve a broad readership around the world. It is produced by staff in the Department of Research, Cornell University Library, in consultation with RLG and is published six times a year at www.rlg.org.

Materials in RLG DigiNews are subject to copyright and other proprietary rights. Permission is hereby given to use material found here for research purposes or private study. When citing RLG DigiNews, include the article title and author referenced plus "RLG DigiNews." Any uses other than for research or private study require written permission from RLG and/or the author of the article. To receive this, and prior to using RLG DigiNews contents in any presentations or materials you share with others, please contact Jennifer Hartzell, RLG Corporate Communications.

Please send comments and questions about this or other issues to the RLG DigiNews editors.

Co-Editors: Anne R. Kenney and Nancy Y. McGovern; Associate Editor: Robin Dale (RLG); Technical Researcher: Richard Entlich; Contributor: Erica Olsen; Copy Editor: Martha Crowe; Production Coordinator: Carla DeMello; Assistant: Valerie Jacoski.

All links in this issue were confirmed accurate as of October 15, 2003.
