Table of Contents
Editor's Interview
The Digital Opportunity Investment Trust: An Interview with Anne G. Murphy
Feature Article 2
PRONOM—A
Practical Online Compendium of File Formats, by Jeffrey Darlington
Highlighted Web Site
JHOVE
FAQ
Recent developments and improvements in hardware for scanners and digital cameras, by Don Williams
Calendar of Events
Announcements

Editor's Interview
The Digital Opportunity Investment Trust (DO IT)
Anne G. Murphy
Digital Promise Project
Your
Web site includes references to several project names: the Digital Promise,
the Digital Opportunity Investment Trust (DO IT), and the Digital Gift
to the Nation. The historical precedents cited for the proposed funding
are interesting. Would you provide a brief summary of the project and
the ways in which it has evolved since its inception in 2001?
The names
reflect the evolution of the project. In 2001, the Century Fund and
the Carnegie Endowment asked Mr. Minow, former Chair of the FCC, and
Mr. Grossman, former President of NBC News and PBS, to look at questions
concerning telecommunications and the not-for-profit sector. This project
was dubbed the Digital Promise Project.
In
2002, the project published its findings and recommendations in a book
called The Digital Gift to the Nation, which proposed the creation of
the Digital Opportunity Investment Trust. The recommendation was built
on historical precedent. In each of the past
three centuries, Congress made a bold, farsighted investment in educating
all our citizens. The Northwest Ordinance of 1787 set aside public
land to support public schools in every state. The Land-Grant Colleges
Act of 1862 established 105 land-grant colleges that made America’s
agriculture and industry the most advanced in the world. The GI Bill
of 1944 profoundly expanded educational opportunities for veterans of
World War II.
Now,
we propose, it is time for the fourth major educational initiative to
advance those great legacies into the 21st century.
Congress should create the Digital Opportunity Investment Trust (DO
IT) to open the door to a knowledge-based future for Americans of all
ages. The Trust would be funded by revenues from a publicly owned asset,
the electromagnetic spectrum, the 21st century equivalent of the publicly
owned land that earlier financed America’s public schools and
public higher education. Within the next decade, congressionally mandated
federal auctions and fees for the commercial exploitation of these spectrum
frequencies are expected to produce tens of billions of dollars. Over
the years, DO IT will accumulate approximately $20 billion of this revenue.
Conservatively invested, this would back the Trust with an annual budget
of approximately $1 billion.
The
focus is on education, but digitizing materials has a prominent place
in the project. What is the scope of the digitization portion of the Digital
Promise and have you set guidelines for conversion and types of content
to be included? Will born-digital materials be included as well?
We need to keep in mind that, at this stage, the Digital Opportunity
Investment Trust is a proposal. No activity has taken place, and certainly
no definitive allocations have been made. As a result of our studies, we
propose that one of the main objectives of the Digital Opportunity
Investment Trust be to assist
in the digitization of the collections of universities, museums, libraries,
and cultural institutions—America’s heritage is stored there.
DO IT will help to digitize these collections and to set standards to
conserve born-digital materials, ensuring their accessibility to all.
It will assist in the development of content and software to integrate
the riches of our cultural institutions into classroom curricula and
stimulate research in the humanities.
The art
of digital capture is in a state of flux, which may be unavoidable at
this point in order to ensure that progress in the technology continues.
But the dizzying array of individual projects being undertaken makes
it difficult to develop consistent, reliable management of the growing
digital collections. It’s not enough to
rely entirely on powerful search engines that struggle to make sense
of countless formats and products of highly varying quality. Compatibility
must be assured and standards must be set by an impartial public body
such as DO IT. One major issue is the core set of information
that should be attached to digital information—whether it’s
a text, a recording, an image, or some other form. Standard methods
have been developed for book and article citations (the ISBN provides
a unique identifier for books), and the music industry has a variety
of standards. The problem becomes increasingly challenging when the
original material is an artifact, an excerpt from an animation, or another
format.
There is
disagreement today about the minimum set of information that should
be provided (author, location, etc.) and how it should be recorded.
The Dublin Core Metadata Initiative has received approval as an international
“resource discovery metadata standard on the Internet” (ISO
15836). It provides “a foundation block of modular, interoperable
metadata for distributed resources.” But much work remains before
the standard is universally accepted and ambiguities about interpreting
the standard are resolved.
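To make the idea of a minimum set of attached information concrete, here is a small Python sketch that builds a record from the fifteen elements of the Dublin Core element set and serializes it as XML. The element names and namespace come from the published standard; the function name `dublin_core_xml`, the `metadata` wrapper element, and the sample record are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace and element names of the Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_xml(record):
    """Serialize a dict of Dublin Core values as a simple XML record.

    Every element is optional and repeatable, so each value may be a
    single string or a list of strings. The <metadata> wrapper is an
    assumption of this sketch, not part of the standard.
    """
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for name in DC_ELEMENTS:
        values = record.get(name, [])
        if isinstance(values, str):
            values = [values]
        for value in values:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record for a digitized artifact.
example = dublin_core_xml({
    "title": "Example map",
    "creator": ["Unknown"],
    "type": "Image",
    "format": "image/tiff",
})
print(example)
```

Even this small record shows where ambiguity creeps in: nothing in the element set itself says how a creator's name or a date should be written, which is the kind of interpretation question that still has to be settled.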
How will this initiative complement recent and current digitization projects
by libraries, archives, and museums? Does this initiative relate to the
National Digital Information Infrastructure and Preservation Program (NDIIPP)
under way by the Library of Congress?
Our work
has been deeply informed by the work of Laura Campbell and her staff
at the Library of Congress. We see the work of DO IT as a complement
to the fine work being done by the Library, the NEH, and IMLS.
The
digital content created for the project will exist within a distributed
network of education providers. How will the content be managed over time?
As proposed,
DO IT would provide funding to advisory groups
to develop criteria and help to establish urgently needed guidelines
and interoperable standards. The Trust will encourage their widespread
adoption by acting as an important funding source for digitization projects.
It would establish task forces to tackle the issues of intellectual
property, metadata, and the variety of problems encountered in ensuring
continuous migration of archival collections to ensure compatibility
and the longevity of these resources.
Your
project has been compared to work by the National Science Foundation (NSF),
National Institutes of Health (NIH), and the Defense Advanced Research
Projects Agency (DARPA). All of these groups, but NSF in particular, have
included support for long-term access to content created by their projects.
What priority will preserving the digital content created by your project
have? Will the resources for the project include sustainable funding for
preserving your digital content?
We
need to build trusted digital repositories where these sources will
persist, where they will be unaltered in form or content by hackers
and others, where their provenance is known, and where they can be easily
located using sophisticated user interfaces.
What is the current status of Congressional support for the Digital Promise,
and what are your plans for promoting the initiative this year?
In 2003,
Congress allocated funds for an in-depth study of the rationale for
the creation of DO IT, the development of a Research and Development
Roadmap, and a proposal for the structure and governance of the new
entity. This report will be delivered to Congress in October 2003. Shortly
thereafter, Senator Dodd and others will introduce legislation to implement
the report.
We
invite your readers to read the report on our Web
site (available October 24, 2003). If they agree with our recommendations,
we request that they ask their senators to support the Dodd legislative
initiative.
What
do you see as the greatest enablers for the project? … the greatest
barriers?
In
the course of the project, we have developed an outstanding Leadership
Circle and Coalition of Organizations that support this initiative.
In the coming months we will be working with members of both groups
and their constituencies to raise awareness of the proposal and to develop
widespread expressions of support.
The
greatest barrier is the burden on the federal budget. We are reminded,
however, that each of the other major transformations in learning was
enacted during a period of war.
The Land-Grant Colleges Act, perhaps the greatest piece of education
legislation ever enacted, was signed by Abraham Lincoln at the height
of the Civil War.

PRONOM—A
Practical Online Compendium of File Formats
Jeffrey Darlington
The National Archives (UK)
“There
is a pressing need to establish reliable, sustained repositories of file
format specifications, documentation, and related software. We recommend
the establishment of such repositories for format-specific materials related
to migration as a preservation strategy.”
Risk
Management of Digital Information: A File Format Investigation,
Council on Library and Information Resources, Washington, D.C.,
June 2000
The
problem faced by the National Archives and others who aim to preserve
history by preserving records is that records are not always designed
for longevity. They may be as ephemeral as messages written in the sand
at low tide, or as permanent as the inscriptions on granite monuments.
It is ironic that the primitive technology of ancient
times has produced records lasting hundreds of years, while today’s
advanced electronic world is creating records that may become unreadable
in a few years’ time.
The correct
interpretation of records has always required knowledge of the language
in which they were written, and sometimes of other subjects too, such as medieval
penmanship. Fortunately, enough of this knowledge has survived
that we can make sense of most of the records that have come down to us.
Modern technology has further complicated the problem of interpretation
by making the viewing of records dependent on hardware and software environments
whose own longevity is doubtful. Just as interpretation of the 1086 Domesday
Book depends on the dictionaries and grammars for medieval Latin painstakingly
compiled by long-dead scholars, interpretation of contemporary electronic
records in the future will only be possible if the necessary methods and
tools are compiled, documented, and preserved now.
When, in
the evolution of computing, punched cards were superseded by magnetic
tape, computer records became invisible and readable only by computers.
Preservation programs were started up as early as 1962, and it was soon
recognized that the incompatibility of tape formats would complicate the
task of preservation. This realization led to the development of the ASCII
character code, and other standards for media and recording formats that
have served the preservation community well over the years. The new problems
generated by the evolution of the personal computer and the word processing
application were not so readily recognized. The new tools were seen at
first as merely a method of creating paper documents that could be archived
on paper. File formats were not standardized and soon proliferated alarmingly,
every new software product having its own format.
Dimensions
of Incompatibility
Besides
the incompatibility between products, there is the time dimension. For
each product, new versions of its file formats quickly superseded earlier
ones. As more and more facilities were added, the formats became more
complex. Sometimes the old format was a subset of the new, giving backward
compatibility (new software could still read old files), but often the
appearance of an old record was not rendered
correctly by new versions of the software. These problems have multiplied
as records have evolved from simple texts to complex assemblies of diverse
elements, including embedded formulae, images, and charts. Web sites go
a stage further with animations, video clips, and dynamic content. There
is no paper analog that could be archived in these cases. And yet another
dimension is that of dependencies. There are layers of application software
that depend on operating systems that in turn depend on hardware.
Anyone responsible
for managing and sustaining access to electronic records, even over relatively
short timescales, must deal with these challenges to ensure that valuable
records are not left stranded in formats that are no longer supported.
One approach to a solution is to migrate records from obsolete formats
into current ones, ideally into formats with published standards. Another
is to preserve records in their original formats and keep copies of software
products that can interpret those formats. To preserve the ability to
run those products, copies of operating systems must also be kept. And
in principle, old models of hardware must be either preserved (the museum
approach) or emulated in software. The emulation approach is described
by Rothenberg.
Technical information about file formats and the software products that
support them is a prerequisite for any digital preservation regime.
Introducing
PRONOM
The
National Archives is committed to preserving historic electronic records
indefinitely, and has embarked on a program to make this feasible on a
practical level. One strand in this program is PRONOM, our database of
file formats, and its supporting library of software products. This
collection of electronic materials is aimed initially at helping with
the problem of software obsolescence, which seems the most urgent of the
problems facing the preservers of electronic records. Hardware and media
obsolescence are also to be addressed in future.
The first
version of PRONOM was developed in March 2002 in parallel with the development
of our Digital Archive system. It was designed to hold reliable technical
information about the nature of the electronic records to be stored in
the archive. For example, for Microsoft Word 97, PRONOM will tell you
when it was launched, by whom, whether it is still on the market and whether
it is still supported, what formats it writes, and what formats it reads.
Interest expressed by other national archives led to the concept of distributing
the database on CD, and PRONOM 2 was released in December 2002 to provide
support for multilingual versions of the system.
The system
was designed from the outset with a Web-based user interface and XML system
interfaces, to conform to the UK e-Government Interoperability Framework.
The latest version, PRONOM 3, is now being launched on the Web to make
it available to the whole preservation community. We have simplified the
user interface in the light of our own usability testing. The main search
page allows the user to look up a file extension and see all the formats
PRONOM recognizes with that extension, some extensions being shared by
a number of products. It also allows the user to search for potential
migration paths for a given format.
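The two lookups described above can be sketched in a few lines of Python. This is only an illustration of the idea, not PRONOM's data model or interface: the product names, format identifiers, and function names are all hypothetical, and a migration step is assumed to exist wherever one product reads one format and writes another.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Product:
    """A software product and the formats it can read and write."""
    name: str
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


# Hypothetical sample data: format identifier -> file extension.
FORMATS = {
    "oldwriter-1": ".doc",
    "newwriter-5": ".doc",   # extensions can be shared between formats
    "plain-text": ".txt",
    "open-standard": ".xml",
}

PRODUCTS = [
    Product("OldWriter 1", reads={"oldwriter-1"}, writes={"oldwriter-1"}),
    Product("NewWriter 5", reads={"oldwriter-1", "newwriter-5"},
            writes={"newwriter-5", "plain-text"}),
    Product("Converter X", reads={"newwriter-5"}, writes={"open-standard"}),
]


def formats_for_extension(extension):
    """Return every known format that uses the given file extension."""
    return sorted(f for f, ext in FORMATS.items() if ext == extension.lower())


def migration_path(source, target, products=PRODUCTS):
    """Breadth-first search for the shortest chain of formats from source
    to target, where each step is performed by some product that reads
    the current format and writes the next one. Returns None if no
    chain exists."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == target:
            return path
        for product in products:
            if current in product.reads:
                for nxt in sorted(product.writes):
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(path + [nxt])
    return None


print(formats_for_extension(".doc"))
print(migration_path("oldwriter-1", "open-standard"))
```

A path found this way is only a candidate. It says that products exist which claim to read and write the formats involved, not that the record survives the journey intact.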
Content
Development
In
parallel with system development we set out to collect information about
file formats from software vendors and from the Internet. This turned
out to be more difficult than we expected. Vendors did not generally have
systematic records of their old formats that were easily available. Many
software vendors have been taken over by companies that have been bought
up in their turn, and contacts for their products are difficult to find,
especially if the product has been discontinued. A number of enthusiasts
have Web sites containing collections of file format information, but
their coverage is incomplete, particularly for vendor and product details.
The Representation and Rendering Project at the University of Leeds
provides a useful survey of these sources.
We also began
to build up our library of software products. This was an easier task,
and it turned up a useful new source of information. The boxes in which
distribution CDs are packed often display information about operating
system dependencies and compatibility with other products that is not
published elsewhere.
Since that
time, considerable effort has been devoted to the
collection of PRONOM content. National Archives staff have undertaken
intensive research and liaison with major software developers in order
to create an initial core data set of software product information.
Microsoft has been particularly helpful. For the initial data load the
focus is on the most commonly used office products for PC operating systems
from PC-DOS onwards. We intend to load information on about 450 products
over the next few months. The content development work is ongoing, and
we have at least some information on over 3,000 file formats yet to be
verified. We encourage software developers and others to be proactive
in providing information; there is an online submission form on the PRONOM
Web site.
Preservation
Strategies
Information
on file formats read and written by a product supports the migration of
data by suggesting migration paths. However, this data needs interpretation
and validation to be practically useful. Ideally,
migrating a record into a new format is done at an early stage in its
life and by the original author, who can check that the essentials of
his or her work have been preserved. Later migrations are more risky,
especially if more than one generation of format has intervened and the
original author is no longer available. Unfortunately, these are
the conditions under which an archive may have to work. For example, there
survive large collections of documents created by the early word processor
WordStar that need to be preserved. Other word processing products may
claim to read WordStar files, but how can we be certain that the full
contents of the migrated document are captured?
We need
a measure of how much the information content of a record is altered as
a result of a particular migration process. Information content includes
formatting, and also functionality that is integral to the record rather than
to the creating software: a hyperlink in a Word document belongs to the record,
whereas Word's spell check function does not. The
design of PRONOM allows us to keep a measure of the “content invariance”
of a migration path. Our intention is to define an objective and rigorous
methodology for testing migration paths to measure content invariance,
and to record this within PRONOM for each format that a particular product
can read.
Migration
paths for WordStar data are complicated because early DOS versions of
WordStar used seven-bit ASCII characters, the eighth bit being used as
a line wrap marker. When viewed by later products, these characters are
wrongly interpreted as eight-bit ASCII equivalents, and to achieve a successful
migration it is necessary to strip out the marker bits from the WordStar
files. Since line wrapping is handled differently in later products, the
loss of the eighth bit normally makes no difference, and at worst causes
the text to be adjusted to different margins. This example shows the part
that detailed technical knowledge plays in implementing a workable migration
strategy, and also the importance of keeping the original bit-streams.
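The stripping step itself is simple once the encoding is understood. The following Python sketch clears the eighth bit of every byte, as described above, and returns a new byte string so that the original bit-stream is left untouched. The function names are hypothetical, and a real migration would also have to deal with WordStar's formatting control codes, which this sketch ignores.

```python
def count_marked_bytes(raw):
    """Count the bytes whose eighth (high) bit is set."""
    return sum(1 for byte in raw if byte & 0x80)


def strip_marker_bits(raw):
    """Return a copy of the data with the eighth bit of every byte cleared,
    leaving plain seven-bit ASCII. The input is not modified."""
    return bytes(byte & 0x7F for byte in raw)


# Hypothetical fragment: the final 'd' carries the marker bit (0x64 | 0x80).
original = b"Hello worl\xe4"
migrated = strip_marker_bits(original)

print(count_marked_bytes(original))  # number of bytes that carried a marker
print(migrated)
```

Recording how many bytes were changed, and keeping `original` alongside `migrated`, gives a later archivist the means to check or redo the migration.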
The
alternative to migration is to have the original bit-streams interpreted
by preserved software from the software library. To support this strategy,
it is necessary to preserve old operating systems and the ability to run
them. Fortunately, the dominance of the PC platform and Microsoft operating
systems makes this approach practical for many collections of records,
and the compatibility between successive PC models means that a PC emulator
is not always needed. However, some older operating systems are dependent
on a 16-bit architecture, and future PCs may eventually outgrow the current
32-bit architecture; emulation will be needed to overcome these problems.
Technologies
are constantly evolving, and archivists must be aware of these changes
and the implications for the electronic records they are preserving. As
old software products cease to be supported and become obsolete, preservation
activity will be needed for the file formats that depend on those products,
whether by content invariant migration or by preserving the software.
A “technology watch” process to identify
triggers for preservation actions is a component of the PRONOM program.
There would
be many advantages in migrating records to an XML-based standard format.
The development of practical tools for this task depends on detailed technical
descriptions of how each format actually works. This information is beyond
the present scope of PRONOM, but we plan to collect it for the next major
release. A subset of this information would be useful to develop tools
for the recognition of file formats, a function included in our Digital
Archive and at present provided by commercial viewer software. We expect
to include this recognition function in a later version of PRONOM.

The Web-enabled
PRONOM 3, completed this month, marks the latest stage in the evolution
of the system. At the same time, it is a starting point for the development
of PRONOM as a major shared online resource for the international digital
preservation community. The National Archives has plans for major enhancements
over the next few years, including the development of a number of specific
tools to support digital preservation activities. Further information
about our future plans is available on the National
Archives Web site. We hope that through our continuing research and
contributions from the Web community, the content of PRONOM will expand
to give a comprehensive coverage of file formats that will support worldwide
preservation initiatives.
Acknowledgments
Thanks to Adrian Brown, Jo Pettitt, and Rob Taylor of the National Archives
Digital Preservation team.

Highlighted
Web Site
JHOVE
Digital
preservation would be fairly straightforward if all computer files
used the same format and that format didn't evolve over time. In
reality, there are thousands of file format variants, with new ones
introduced each year. Almost every significant step in the management
of files in digital repositories, from ingest to delivery to making
migration decisions, can be executed more effectively if accurate
and detailed file format information is available.
JHOVE
(pronounced "jove"), the JSTOR/Harvard Object Validation
Environment, is a tool to automate the validation of file formats.
Unlike less reliable approaches that rely on superficial indicators
such as file extensions and MIME types, JHOVE uses format-specific
modules to probe a file's internal structure. JHOVE's plug-in style
architecture will allow the work of developing format modules to
be shared. The JHOVE site will eventually include a tutorial on
module writing, and a full explanation of the module interface.
The pre-release version of JHOVE (requires Sun's Java) can be downloaded
for review and testing purposes.
(In its present state of development, JHOVE is easiest to install
in a Unix environment, but with some tinkering it can be made to
run under Windows.) It includes modules for the identification,
validation and characterization of arbitrary byte streams, ASCII
and UTF-8 encoded text, TIFF, and PDF. A set of example files for
testing the five modules is also available, and the JHOVE Web site
includes a tutorial. When complete, JHOVE will be made available
under an open source license, with support for Unix/Linux and Windows.
Development of JHOVE is funded in part by the Andrew W. Mellon Foundation
through a grant to JSTOR for the recently launched Electronic-Archiving
Initiative.
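To illustrate why looking inside a file is more reliable than trusting its extension, here is a minimal Python sketch of signature-based identification for the five kinds of content named above. It is not JHOVE's code or its module interface: the function name `identify` and the order of the checks are assumptions made for illustration, and a header check of this kind is only identification, far short of the validation and characterization a real format module performs.

```python
def identify(data):
    """Guess the format of a byte string from its content alone."""
    # PDF files begin with the header "%PDF-".
    if data.startswith(b"%PDF-"):
        return "PDF"
    # TIFF files begin with a byte-order mark ("II" or "MM")
    # followed by the number 42 in that byte order.
    if data[:4] in (b"II*\x00", b"MM\x00*"):
        return "TIFF"
    # ASCII text uses only byte values below 0x80.
    if data and all(byte < 0x80 for byte in data):
        return "ASCII"
    # UTF-8 text must decode without error.
    if data:
        try:
            data.decode("utf-8")
            return "UTF-8"
        except UnicodeDecodeError:
            pass
    # Anything else is treated as an arbitrary byte stream.
    return "bytestream"


# A file named "report.txt" would still be recognized by its content.
print(identify(b"%PDF-1.4\n% hypothetical file body"))
```

The PDF and TIFF checks come first because both formats can start with bytes that would also pass the ASCII test.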

FAQ
What are recent developments and improvements in hardware for scanners and
digital cameras?
For this
FAQ, we asked Kodak's Don Williams to look back over the past five years
and identify significant new developments as well as important incremental
changes in scanner and digital camera design, with special attention to
the needs of libraries and archives. Don Williams is a senior research
engineer in the Image Science Division of Eastman Kodak Co. He has written
extensively about digital image capture specifications and imaging performance
metrics and is a regular participant on digital imaging standards committees.
Recent developments and improvements in hardware for scanners and digital cameras
When asked
to contribute an update on advances in digital image capture technologies
to this forum, I hesitated momentarily, gauged my instincts, and accepted.
After all, in the past five years all sorts of new image capture devices
have been introduced from which I could draw. What could be easier? Then
the penny dropped.
I realized,
somewhat humbly, that there actually have been few fundamentally new approaches
applied to digital image capture, especially for museum and library community
level tasks. Rather than chasing promising but unproven new scanning technologies,
most efforts have focused on perfecting existing ones. The good news is
... this is not bad news. Freed from the onerous learning curve of technology
adolescence, manufacturers have concentrated on multiple incremental improvements
that come with maturity. For the user, the impact is nothing but positive.
Imaging performance, cost, and speed (think workflow) have dramatically
improved. This benefits not only research organizations but, notably,
resource-poor local/regional sites with their own conversion tasks.
But to suggest
that nothing is new would be remiss. Certainly some exciting technologies have
been introduced and are implemented in a few products. These and the cited
maturity improvements are briefly discussed below. Being a scanner gearhead
at heart, I have chosen to organize these according to four scanner subsets.
They are 1) document handling, 2) illumination, 3) sensors/detection,
and 4) data processing. Some items may be scan mode (e.g., transmissive
vs. reflective) or hardware specific (e.g., flatbeds vs. cameras) and
will be emphasized as such.
Document handling—Two words immediately come
to mind, "Book Cradles." This class of camera hardware has
improved from the yawning, static, manual contraptions of several years
ago to the robotically articulated page-turning wonders of today. Like
any new technology, there is likely to be an optimization period for
these devices, but the forecast is good. See, for example, Conservation
by Design's Preservation
Book Cradle and 4DigitalBooks' ™ automatic
digitizing system.
Less seductive but equally pragmatic is the trend from cameras with
horizontally constrained document placement to those with a vertical
document mounting option. Gravity and conservator concerns have dictated
this change. Although appearing to be a trivial modification to existing
camera design, doing so while maintaining resilience, portability, and
utility of the supporting structures can be a challenge, especially
for very large documents. Nevertheless, many designers have achieved
this adaptability with minimal compromise, some elegantly so.
- Illumination—Though
hardly noticed, illumination systems of flatbed scanners, both reflective
and transmissive, have improved considerably. Largely, this is attributable
to improvements in cold cathode fluorescent illumination sources used
in these scanners. Their low cost, rapid warmup, stability, and improved
color quality have made them nearly a universal choice for illumination
sources in this class of scanners. Improvements in illumination optics
and increased bit depth for these scanners have also dramatically improved
illumination uniformity.
Several years ago it was advisable to avoid the platen margins on these
scanners because of the uncompensated illumination falloff. Today, low
cost scanners can be found where literally all of the platen area is
effectively illuminated within 5.0% uniformity. Epson flatbed scanners
are particularly good, but any scanner can be tested simply by scanning
a known uniform flat field document, like a Munsell paper sheet.
From a conservator perspective, it is encouraging to see that some camera
system manufacturers (e.g., Lumiere Technology) are proactive in designing
ultraviolet and infrared friendly light sources for especially sensitive
documents. The fading and heat
characteristics of these portions of the radiation spectrum are a very
real concern from a conservation perspective, particularly for high
quality scans of long duration.
- Sensors/Detectors—Despite
the hype on the benefits (lower cost, higher level of feature integration)
of CMOS (Complementary Metal Oxide Semiconductor) sensors several years
ago, they continue to deliver poorer imaging performance (higher noise,
lower dynamic range) than their CCD (Charge Coupled Device) counterparts.
To my knowledge they are used exclusively as area array camera sensors
and not as scanning linear arrays. This makes them perfectly suitable
for many consumer or prosumer (i.e., professional consumer) camera applications
but risky for demanding conversion projects. For this reason, CCDs continue
to be used as the imagers of choice for conversion grade scanning applications.
Several important changes to the sensor "imager package" are
noteworthy. They can apply to either CMOS or CCD type imagers and are:
-
Pigmented color filters—For color scanners where the color
filters are coated onto the sensor, some manufacturers are beginning
to use pigmented rather than dye-based filters. The reason for doing
so is the same as for using pigmented inks in inkjet print applications: stability.
-
Depth-wise color detection—This is a new color detection technology
for digital cameras developed by Foveon.
Its claim to fame is that it can capture a fully pixel-populated
RGB digital image using a single area array detector in a single
frame. Most of today’s studio cameras use scanning linear
arrays (slow), color filter wheels with area arrays (requires multiple
frames), or sparsely populated RGB color filter arrays (requires
de-mosaic interpolation). Foveon has accomplished this by taking
advantage of the well-known fact that different colored light penetrates
to different depths within the detector bulk. Red light penetrates
the furthest, green light less so, and blue light even less. By
reading out the charge associated with different depths within the
sensor one can in fact create a color image without the explicit
use of color filters. This is not an easy task though and may require
aggressive data processing to achieve the demanding image performance
levels of imaging for the cultural heritage community. Currently
cameras employing this technology cater to the prosumer market.
-
Smaller pixel sizes—The individual sensors associated with
a single image pixel have become progressively smaller over the
years. Indeed, this has allowed prosumer/consumer digital cameras
to increase their total pixel count without significantly changing
overall detector size. Today, typical sizes may range between 3-5
microns per pixel compared to 7-11 microns of the past. These smaller
sizes are not without their imaging performance tradeoffs. Halving the
pixel pitch cuts the light-collecting area to a quarter, so to achieve
the same signal levels per pixel, about four times the illumination
level is required (can you hear the paper conservators gasp?). Without
these increased levels, a greater reliance is placed on subsequent
image processing to deliver the image. Depending on the processing
aggressiveness, this almost always increases image noise levels,
which leads to lower image quality.
-
Support Electronics—Perhaps the most impressive changes have
come in terms of reducing the size of the camera/scanner’s
support electronics. This is where the analog-to-digital conversion
as well as much of the data processing (see next section) occurs.
What used to be the size of a deck of cards has now been reduced,
via CMOS integration, to that of a nickel.
- Data Processing—Instead of cumbersome offline image processing,
there is a trend to integrate common scanner-related
functions such as OCR (Optical Character Recognition) and distortion
correction within the support electronics. One of these functions, licensed
from Applied Science Fiction (ASF) as Digital ICE™, is automatic
scratch and blemish removal. It was first introduced for film
scanners (Nikon) and more recently for reflection
scanners (Microtek). Truly a technology change, Digital ICE™
relies on the scattering of infrared light by scratch and blemish artifacts
in film and photographic paper. An infrared scan is made of the sample in
addition to the RGB color scans. The infrared scan is used to identify
where the scratches are located. This information is then used to mask
the blemishes through image processing in the other three color records.
It works quite well for minor defects in color negative and
incorporated-coupler color slide films (e.g., Ektachrome). Unfortunately, this technology
has been known to behave erratically on film media common to the library
and museum communities. For instance, results are mixed for non-incorporated-coupler
films (e.g., Kodachrome), and it fails completely on all
black-and-white silver halide films.
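The general idea of using an infrared scan to locate defects and then repairing the color records can be sketched in a few lines. This is not the Digital ICE™ algorithm, which is proprietary. It is a minimal illustration that assumes defects show up as low values in the infrared scan and fills each flagged pixel with the median of its clean neighbors. The function name `repair_with_infrared` and the `threshold` parameter are hypothetical.

```python
from statistics import median


def repair_with_infrared(channel, infrared, threshold):
    """Return a copy of one color record with infrared-flagged pixels filled in.

    `channel` and `infrared` are equally sized 2-D lists of numbers.
    Assumption: a pixel is a defect when its infrared value is below
    `threshold`. Each defect is replaced by the median of the non-defect
    pixels among its eight neighbors (a simple stand-in for the masking step).
    """
    rows, cols = len(channel), len(channel[0])
    defect = [[infrared[r][c] < threshold for c in range(cols)]
              for r in range(rows)]
    repaired = [row[:] for row in channel]  # leave the input untouched
    for r in range(rows):
        for c in range(cols):
            if not defect[r][c]:
                continue
            clean = [
                channel[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if not defect[rr][cc]
            ]
            if clean:  # with no clean neighbor, the pixel stays as scanned
                repaired[r][c] = median(clean)
    return repaired


if __name__ == "__main__":
    red = [[10, 10, 10], [10, 90, 10], [10, 10, 10]]                  # made-up data
    ir = [[200, 200, 200], [200, 40, 200], [200, 200, 200]]           # made-up data
    print(repair_with_infrared(red, ir, threshold=100))
```

The same mask would be applied to each of the three color records in turn. A sketch like this depends entirely on the infrared scan separating defects from image content, which is consistent with the method failing where that separation breaks down.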
Finally, a few words on multi-spectral or hyper-spectral image capture for
artwork. In concept, performing these types of captures has always been easy.
Through multiply-filtered frames and suitably designed light sources, a number
of demonstration projects of this nature have been documented. (For some
examples, see RLG DigiNews, October 15, 1999.) But let’s face it, these
projects have not been the epitome of productive workflows. They have, however,
supplied critical examples of ways to improve the process and of which shortcuts
can or cannot be taken. Over the next several years I predict that large
gains in productivity, economy, and quality will be made in this area
of digital image capture. Some university and commercial partnerships are
trying out new models for multi-spectral capture, and
it will be exciting to see the future levels of improvement.
Calendar of Events
Online Course on Digital Licensing
September 22-November 20, 2003
The course is designed for information professionals who wish to learn
more about licensing digital and online content (such as periodicals,
databases, and images) without attending an in-person seminar. The target
audience includes librarians, archivists, publishers, photographers, Web
site owners, content developers, and those in museums, educational institutions,
and governments. Participants will receive three e-lessons per week for
nine weeks; each e-lesson has a self-marking quiz. Participants also will
have access to an exclusive online discussion list on the course content.
The Next Generation of Access: OpenURL and Metasearch
Washington, DC
October 29 & 30, 2003
NISO will hold two one-day conferences to inform you about the two leading
standards initiatives that promise to re-shape information access for
all actors in the information delivery equation—publisher, aggregator,
librarian, student, scholar, and author. You can attend one or both events.
Both meetings will be held at the conference center at the American Geophysical
Union.
'Parallel Lives': Digital and Analog Options for Access and Preservation
London, UK
November 10, 2003
A joint conference of the National Preservation Office and King's College
that will address the importance and interrelated lifecycles of digital images,
microfilm, photographs, and other surrogates. It explores how we should create,
store, provide access to, and manage digital objects for the benefit of culture
and society.
International Workshop on the Trusted Digital Repository for Cultural Heritage
Rome, Italy
November 17-19, 2003
ERPANET and the Accademia Nazionale Dei Lincei are jointly sponsoring
this workshop to identify and discuss the key scientific, technical, management,
and policy considerations for the successful implementation of a trusted
repository for preserving cultural heritage.
6th International Conference on Asian Digital Libraries (ICADL 2003)
Digital Libraries: Technology and Management of Indigenous Knowledge for
Global Access
Kuala Lumpur, Malaysia
December 8-11, 2003
Topics include data mining in digital libraries, multimedia digital libraries,
intellectual property rights and copyright, metadata issues, data storage
and retrieval, and knowledge management.
International Archiving Workshop on the Selection, Appraisal, and Retention of Scientific Data
Lisbon, Portugal
December 15-17, 2003
The aim of the workshop is to identify and discuss the key scientific,
technical, management, and policy considerations for the successful implementation
of appraisal and selection guidelines and retention policies.
International Workshop on Document Image Analysis for Libraries
Palo Alto, CA
January 23-24, 2004
This workshop aims to bring together researchers, practitioners, and users
who are interested in new technologies to help integrate imaged and encoded
documents within digital libraries. Topics include imaging and compression
standards, content and metadata extraction, multimedia document analysis,
and digital library best practices.
Victorian Association for Library Automation 12th Conference
Breaking Boundaries: Integration & Interoperability
Melbourne, Australia
February 3-5, 2004
This conference will explore the successes and the key challenges in the
field of library and information technology. Session topics include archiving
radio and television, managing digital objects, open archive services,
and electronic publishing.
Museums and the Web
Washington, DC/Arlington, VA
March 31-April 3, 2004
MW2004 will feature a variety of sessions exploring all aspects of the
creation, development, maintenance, and evaluation of Web sites in museums,
archives, libraries and other cultural and heritage organizations.

Announcements
NISO Publishes Metadata Demystified: Guide for Publishers
The National Information Standards Organization (NISO) announces the joint
publication with the Sheridan Press of Metadata Demystified: A Guide for
Publishers. The guide presents an overview of evolving metadata conventions
in publishing, as well as related initiatives designed to standardize
how metadata is structured and disseminated online. Focusing on strategic
rather than technical considerations, it offers insight into how publishers
can streamline metadata operations. The guide is available as a free download
from NISO.
The GPO and National Archives Unite in Support of Permanent Online Public Access
The US Government Printing Office (GPO) and the National Archives and
Records Administration (NARA) have announced an agreement to ensure that
free online public access to more than 250,000 federal government titles
will remain available permanently. NARA will assume legal custody of the
titles as part of the official Archives of the United States, and the
GPO will retain physical custody and responsibility for permanent public
access and preservation.
3rd ECDL Workshop on Web Archives Proceedings Available
This year's 3rd annual ECDL workshop featured presentations from national libraries
and researchers about their experiences and projects in the area of Web
archiving. Proceedings from the workshop are available on its Web
site.
Dublin Core Metadata Element Set Recognized by ISO
The International Organization for Standardization (ISO) has approved the Dublin
Core Metadata Element Set as an international metadata standard. Dublin
Core was developed for use on the Web and in other information networks
across a wide variety of subject areas, languages and economic sectors.
ISO approval signifies international recognition of the standard.
Fedora v.1.1 Released
The University of Virginia Library announces the release of Fedora v.1.1,
an open-source digital object repository management system. The Fedora
Project, a joint effort of the University of Virginia and Cornell University,
has made available the first version of a system based on the Flexible
Extensible Digital Object Repository Architecture, originally developed
at Cornell. This first version of the software is designed to support
a repository containing one million objects using freely available software.
It fully implements the Fedora architecture, provides the first version
of a graphical user interface to manage the repository, and provides facilities
to create and ingest batches of objects.
National Library of New Zealand Preservation Metadata Data Model Released
The National Library of New Zealand has released a data model for implementation
of its preservation metadata process. The data model is based on a logical
preservation metadata model released earlier and maintains the overall
structure and data relationships. The model, which includes XML schema
definitions, is intended to provide a step toward the implementation of
a repository for preservation metadata.
The National Archives of the UK Announces the Launch of a New Archive of UK Central Government Web Sites
This initiative will collect and preserve 50 UK government Web sites,
including the Hutton Inquiry, 10 Downing Street, and the Northern Ireland
Office. Sites are gathered in weekly or 6-monthly snapshots, using a modified
version of the Internet Archive's Web crawler. The complete archive will
be available on the Web and in the National Archives public search rooms.
A copy of each snapshot will also be accessioned for long-term preservation.

Publishing Information
RLG DigiNews (ISSN 1093-5371) is a Web-based newsletter
conceived by the RLG preservation community and developed to serve a broad
readership around the world. It is produced by staff in the Department
of Research, Cornell University Library, in consultation with RLG and
is published six times a year at www.rlg.org.
Materials in RLG DigiNews are subject to copyright and other proprietary
rights. Permission is hereby given to use material found here for research
purposes or private study. When citing RLG DigiNews, include
the article title and author referenced plus "RLG DigiNews."
Any uses other than for research or private study require written permission
from RLG and/or the author of the article. To receive this, and prior
to using RLG DigiNews contents in any presentations or materials
you share with others, please contact Jennifer Hartzell, RLG Corporate Communications.
Please send comments and questions about this or other issues to the RLG DigiNews editors.
Co-Editors: Anne R. Kenney and Nancy Y. McGovern; Associate Editor: Robin
Dale (RLG); Technical Researcher: Richard Entlich; Contributor:
Erica Olsen; Copy Editor: Martha Crowe; Production Coordinator:
Carla DeMello; Assistant: Valerie Jacoski.
All links in this issue were confirmed accurate as of October 15, 2003.
