
Overview
In 1999, through generous awards from the National Science Foundation and the Andrew W. Mellon Foundation, the Macaulay Library of Natural Sounds (MLNS)[1] at Cornell University began the enormous task of converting its analog tape-based sound collection to digital storage. Our collection contains over 160,000 recordings of bird, insect, frog, and mammal vocalizations. Analog formats included acetate disk, cassette, and open reel. A considerable number of the open-reel tapes were in various stages of deterioration and thus required specialized treatment prior to transfer. The project began with three archival studios, but with the help of additional funding from the National Science Foundation and the Office of Naval Research we are currently running six fully equipped studios. This article provides an overview of the critical steps relevant to the digitization process. All photos are courtesy of the Macaulay Library.
Tape Inspection
Prior to beginning the digitization process, a careful review of the materials to be transferred was undertaken. As mentioned above, many of our Mylar-based open-reel tapes had already begun to deteriorate. Some of this deterioration, also known as sticky-shed, is the result of binder breakdown and requires that the tapes be stabilized prior to transferring.
![[Analog files]](file106.gif)
Storage facility for analog collection
This process consists of controlled baking (typically 50°C for a period of 24 hours) to temporarily improve binder integrity.[2] Other tapes had splices that needed either repair or replacement. The latter requires a great deal of skill and patience to carefully release the old splice and adhesive without damaging the fragile oxide layer. Still others required a special relubrication process to minimize friction.[3] The final phase of inspection was to identify the track format. Many of our collection recordings have good, solid data records that accurately describe the make and model of the audio recorder used to create the tape. For those whose records are missing or questionable, we use a magnetic developer. This product allows one to literally see the magnetic track format on the tape created by the record head, e.g., full-track, ½-track, ¼-track, 4-track, etc. Once the track format is identified, the proper head assembly can be selected for an accurate playback.
Playback Equipment Calibration
To extract every nuance of fidelity from the original analog tapes, we took great care to ensure that the master open-reel playback machines (Studer A-820) were in excellent condition. Playback heads and stationary tape guides were visually inspected for excessive wear patterns that could potentially degrade the transfer through increased friction. Tape tensions were adjusted to ensure intimate tape-to-head contact with the least amount of tension, while spooling speeds were set low to provide gentle handling of fragile tapes. Each machine was then calibrated to known international standards using a series of precision calibration tapes from Magnetic Reference Laboratories, Inc. This calibration included adjusting head alignment (height, wrap, and azimuth), playback equalization, playback reference levels, absolute speed, and wow and flutter. Master cassette playback machines (Nakamichi CR-7A) were similarly inspected and calibrated using a series of calibration tapes from BASF prior to beginning the transfer process. The initial calibration/alignment process and all subsequent ones (machines are calibrated biweekly) were accomplished using a computer-based test set manufactured by Audio Precision. The resulting test data are stored and routinely compared to current tests. This allows us to closely monitor each machine’s performance over time and makes it relatively easy to spot problems before they have a negative effect on the transfer process.
Analog-to-Digital Conversion
Perhaps the most important link in the digitization process is the analog-to-digital (A/D) converter. Often overlooked and taken for granted, this sole piece of equipment can make or break the digitization/preservation effort. In our situation we knew that we had many outstanding-quality, wide-spectrum open-reel field recordings. Our ultimate goal was to digitize without compromising quality in any manner whatsoever. To achieve this goal, we first reviewed A/D converters based on published specifications, then narrowed the playing field down to six possibilities and finally requested actual units for in-house testing and audition. The results were nothing short of amazing. Even though all six had very similar published specifications, the actual sound character or lack thereof was very different. Our final decision, the Prism Dream AD-2, was the only device that did not color (alter) our signals. Through grueling A/B listening tests and spectral analysis we confirmed that the digitized signals created by the Prism device were indistinguishable from our highest-quality analog sources.
What makes a good A/D converter? Many key factors determine the quality of a converter system. To make the proper selection, we first needed to make sure that the device could handle the full frequency spectrum of the signals to be digitized. Second, we determined the dynamic range required. For us that meant a converter that used a 96.0kHz sampling rate to cover a signal range from 4Hz through at least 32kHz, and 24-bit output resolution to provide a dynamic range of roughly 128.0dB (unweighted RMS). The Nyquist Theorem states that the sampling rate must be at least twice the highest frequency of interest to achieve lossless sampling.[4]
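The two selection criteria above reduce to simple arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and the dynamic-range formula is the textbook figure for an ideal quantizer driven by a full-scale sine wave, not a property of any particular converter. Real converters are limited by analog noise, which is why a measured figure sits below the ideal 24-bit value.

```python
import math


def nyquist_frequency(sample_rate_hz: float) -> float:
    """Highest frequency that a given sampling rate can represent."""
    return sample_rate_hz / 2.0


def ideal_dynamic_range_db(bits: int) -> float:
    """Signal-to-noise ratio of an ideal N-bit quantizer for a full-scale sine.

    This is the textbook 6.02 * N + 1.76 dB figure, computed exactly.
    """
    return 20.0 * math.log10(2.0 ** bits) + 10.0 * math.log10(1.5)


if __name__ == "__main__":
    # A 96kHz sampling rate leaves margin above a 32kHz upper signal limit.
    print(nyquist_frequency(96_000))  # 48000.0
    # Ideal figures for 16-bit and 24-bit word lengths, in dB.
    print(round(ideal_dynamic_range_db(16), 1))
    print(round(ideal_dynamic_range_db(24), 1))
```

The gap between the Nyquist frequency and the highest frequency of interest matters in practice because the converter's anti-aliasing filter needs room to roll off.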
Once these criteria had been defined, we looked for the following other important specifications:
- Total harmonic distortion and noise (1kHz @ -1dBFS): <-108.0dBFS (0.00045%) unweighted RMS
- Intermodulation distortion: <-90dB
- Spurious aharmonic levels: <-130dBFS (1kHz @ -1dBFS)
- Interference susceptibility (CMRR): >110dB (50Hz); >85dB (15kHz)
- Crosstalk: <-130dB (50Hz, -1dBFS in either channel, other terminated); <-140dB (15kHz, -1dBFS in either channel, other terminated)
- Intrinsic conversion jitter: <18ps RMS
- Phase linearity: <1°
- Internal high-precision clock accuracy: ±5ppm
Specifications like these are not typically found in the sound cards that often come built into, or bundled with, computers. Nor are they found in stand-alone compact disc recorders. The precision needed to execute exacting A/D conversion requires ultra-clean power supplies; exceptional grounding procedures; ultra-stable clocking devices; high-quality, low-tolerance components; and exceptional printed circuit board design. All of this comes at a high price, but the end results are certainly worth the cost.
Transfer-Level Setting and Monitoring
![[Archival studio]](file51.gif)
One of the six archival studios
Another key element in the digitization process is setting the transfer signal level. The MLNS uses Benchmark Media ultra-low noise, low-distortion preamps as the variable-gain stage between the playback devices and the A/D converters. To fully use all the available digital resolution, e.g., 16 bits or 24 bits, it is important to maximize the analog signal level being fed to the A/D converter. At the same time, great care must be taken to never exceed the maximum allowable input, so that the A/D converter is never overdriven during the transfer process. It is far better to adjust the signal levels before the A/D process than to try to use a “normalize” function once the signal is digitized. The Prism A/D converters provide sample-accurate precision metering that aids in this critical step.
In addition to the meters, a second form of signal monitoring is accomplished aurally, with high-resolution, near-field monitor speakers. To make this possible, the digital output signals are converted back to analog via high-resolution Prism DA-2 digital-to-analog converters (D/A). Once again, special care was taken to make the D/A selection. To make accurate quality assessments, the D/A must be able to resolve to the same degree of precision as the A/D converter. The monitor speakers are then coupled with a precision switching device allowing the archivists to easily switch the monitors between the feed from the master playback device, the output of the Benchmark preamp, and the output of the D/A converter. Any aural difference is a sign of potential problems in the transfer chain.
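The level-setting guidance earlier in this section can be made concrete with a little arithmetic. The sketch below is a minimal illustration, assuming signed 24-bit integer samples; the function names are hypothetical and do not describe the converters' metering. It reports the peak level relative to full scale and estimates how many bits of resolution are left unused by the remaining headroom, using the rule of thumb of about 6.02dB per bit.

```python
import math

FULL_SCALE_24BIT = 2 ** 23  # magnitude of the most negative 24-bit sample


def peak_dbfs(samples, full_scale=FULL_SCALE_24BIT):
    """Peak level of integer samples in dB relative to full scale (0 dBFS)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence has no defined level
    return 20.0 * math.log10(peak / full_scale)


def unused_bits(peak_level_dbfs):
    """Approximate bits of resolution left unused by the headroom."""
    return -peak_level_dbfs / (20.0 * math.log10(2.0))


# Made-up samples whose largest magnitude is one quarter of full scale.
example = [0, 2 ** 21, -1000]
level = peak_dbfs(example)
print(round(level, 2), round(unused_bits(level), 2))
```

A transfer that peaks far below full scale has already discarded those low-order bits at the converter, and a later "normalize" step only scales the coarser data; it cannot restore them.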
Digital Editing System
Once the analog signals have been converted to digital, a digital audio workstation (DAW) is used to edit (if needed), create fade-ins/-outs, add cataloging information (voice ID), add unique file names to each individual recording, and build the DVD project files that will ultimately be written out to DVD-R discs. We use a Sonic Solutions Sonic Studio HD system. Once again, great care was given to the DAW selection. The Sonic system was selected for its ability to preserve the signal quality from initial recording (digital input) to final digital output by incorporating a full 48-bit data path throughout the system.
Most workstations use 32-bit floating-point math to perform their editing, processing, and gain changes. Thirty-two-bit floating point uses 24 bits of precision (a 23-bit mantissa and one sign bit) and 8 bits of exponent. This exponent is useful for preserving precision over a wide dynamic range but does not actually add to the maximum precision.
Sonic Studio HD uses 48-bit math in all audio processing. It is set up to provide 40 bits of precision and 8 bits of linear headroom. The 40 bits of precision in Sonic Studio HD offer 16 bits of additional precision compared to 24-bit audio samples. This greatly reduces the round-off error for audibly superior performance.
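The precision argument can be illustrated with a short Python sketch. It rests on two assumptions that are not taken from the vendor's documentation: that "32-bit floating point" means IEEE 754 single precision, and that a 40-bit precision path can be modeled as a fixed-point grid with 39 fractional bits. The helper names are hypothetical, and this is not a description of Sonic Studio HD's actual arithmetic.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def quantize_fixed(x: float, frac_bits: int) -> float:
    """Round x to the nearest multiple of 2**-frac_bits (a fixed-point grid)."""
    scale = float(1 << frac_bits)
    return round(x * scale) / scale


# A 32-bit float carries a 24-bit significand, so 2**24 + 1 is not representable.
largest_exact = 2 ** 24
print(to_float32(largest_exact + 1.0) == float(largest_exact))  # True

# Round-off error for the same value on a coarse grid and on a fine grid.
value = 1.0 / 3.0
error_23 = abs(quantize_fixed(value, 23) - value)
error_39 = abs(quantize_fixed(value, 39) - value)
print(error_23 / error_39)  # roughly 2**16
```

Each additional bit of precision halves the worst-case round-off error of a single operation, so sixteen extra bits shrink it by a factor of about 65,000. That matters most when many gain changes and edits are applied in sequence.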
Digital Media and Data Format Selection
Several factors drove our decisions regarding the selection and formatting of a long-term storage medium. We had already determined early on that our digital collection was going to be optical-disc rather than tape based. We also knew that our storage requirements were going to be huge (roughly 32MB/minute of 2-channel audio) due to the high-resolution (96kHz sampling rate, 24-bit word length) digital audio files we would be generating. We examined in depth the three options available at the time: CD-R, DVD-RAM, and DVD-R. CD-R certainly had merit with good player compatibility, excellent life expectancy, and low cost but fell short in terms of capacity. DVD-RAM offered increased capacity and long life expectancy but was expensive and did not offer the archival security of a write-once format. DVD-R, however, offered everything we required. Initially it, too, was relatively expensive, but we knew, based on the CD-R’s history, that as the technology gained momentum, the costs would fall significantly.
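The storage figure quoted above follows directly from the audio parameters. This minimal Python sketch assumes uncompressed PCM and ignores file-header overhead; the function name is hypothetical. It also reads "MB" as 2^20 bytes, which is an interpretation rather than something the text states.

```python
def bytes_per_minute(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed PCM data rate in bytes per minute."""
    bytes_per_sample = bits_per_sample // 8
    return sample_rate_hz * bytes_per_sample * channels * 60


rate = bytes_per_minute(96_000, 24, 2)
print(rate)                       # 34560000
print(round(rate / 2 ** 20, 2))   # 32.96
```

For comparison, the same function gives the per-minute cost of any other combination of sampling rate, word length, and channel count.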
The next issue at hand was how to format the data. Our goal was to make the digital collection as generic as possible, thereby maximizing accessibility and setting the stage for easy migration to the next generation of digital storage. Initially we contemplated using the DVD-Audio format (similar to CD-Audio) but soon realized that industry-imposed copy-protection schemes would significantly hamper our accessibility and future migration requirements. Instead, we chose to write each disc as a DVD-ROM using the Universal Disc Format (UDF) standard. Our audio recordings reside on the discs as Audio Interchange File Format (AIFF) data files. Every audio file has a voice ID at the beginning of the selection announcing the asset number, which in our case is the MLNS catalog number. This same number is also used as the digital file name. No other metadata is embedded in the audio file. Due to the common occurrences of species splitting and renaming we chose to store all relevant metadata in a separate relational database.
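The decision to keep only the catalog number with the audio and everything else in a relational database can be sketched as follows. This is an illustration under stated assumptions, not the library's actual schema: the table names, column names, the `.aif` extension, the catalog number 12345, and the species names are all placeholders. It uses Python's built-in `sqlite3` module with an in-memory database.

```python
import sqlite3


def audio_file_name(catalog_number: int) -> str:
    """File name derived only from the catalog number (extension is assumed)."""
    return f"{catalog_number}.aif"


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a hypothetical two-table schema: species and recordings."""
    conn.executescript(
        """
        CREATE TABLE species (
            species_id   INTEGER PRIMARY KEY,
            current_name TEXT NOT NULL
        );
        CREATE TABLE recordings (
            catalog_number INTEGER PRIMARY KEY,
            species_id     INTEGER NOT NULL REFERENCES species(species_id)
        );
        """
    )


def species_for(conn: sqlite3.Connection, catalog_number: int) -> str:
    """Look up the current species name for a recording."""
    row = conn.execute(
        "SELECT s.current_name FROM recordings r "
        "JOIN species s ON s.species_id = r.species_id "
        "WHERE r.catalog_number = ?",
        (catalog_number,),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO species VALUES (1, 'Old species name')")
conn.execute("INSERT INTO recordings VALUES (12345, 1)")

name_before = audio_file_name(12345)
# A taxonomic rename touches one database row and no audio files.
conn.execute("UPDATE species SET current_name = 'New species name' WHERE species_id = 1")
name_after = audio_file_name(12345)
print(name_before == name_after, species_for(conn, 12345))
```

Because the write-once discs carry only the catalog number, a species split or rename never requires rewriting a disc.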
Disc-Writing Strategies
Our DVD-R discs are written using Pioneer DVR-S201 recorders. These devices are professional writers that use the DVD-R 4.7GB Authoring version 2.0 media. These media require a 635nm laser wavelength instead of the 650nm laser that the general-use versions utilize; however, they are still fully compatible with all DVD readers. The discs, designed for authoring and replica masters, are generally of a higher quality and more consistent from batch to batch than the general-use discs. We currently use discs from Maxell, TDK, and Pioneer and purchase in lots of 100 to 200 at a time. This purchasing strategy will help minimize any catastrophic batch-related problems.
![[Plasmon D-480 robotic jukebox]](file3699.jpg)
DVD jukeboxes
A custom DVD authoring program from Sonic Solutions handles the actual disc formatting, writer control, and bit-for-bit verification. Writing 4.3GB of data takes approximately 60 minutes. We do not write the disc to the full 4.7GB capacity. Our in-house testing has revealed that the disc quality decreases near the extreme outer diameter of the disc, so we limit our data to 4.3 to 4.4GB/disc.
We also create two first-generation discs for our archive. Each disc contains roughly 125 minutes of stereo or 250 minutes of mono material. One disc is placed in a large Plasmon D-480 robotic jukebox for in-house distribution, while the second is stored off site at a secure, climate-controlled, underground storage facility.
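The playing times quoted above are consistent with the 4.3GB payload limit. The following Python sketch checks the arithmetic, assuming that 1GB means 10^9 bytes (the convention used for DVD capacities), that the audio is uncompressed 96kHz/24-bit PCM, and that file-system overhead is negligible. The function name is hypothetical.

```python
def minutes_per_disc(
    payload_bytes: float,
    sample_rate_hz: int = 96_000,
    bits_per_sample: int = 24,
    channels: int = 2,
) -> float:
    """Minutes of uncompressed PCM audio that fit in a given payload."""
    bytes_per_min = sample_rate_hz * (bits_per_sample // 8) * channels * 60
    return payload_bytes / bytes_per_min


payload = 4.3e9  # 4.3GB, taken as decimal gigabytes
stereo_minutes = minutes_per_disc(payload, channels=2)
mono_minutes = minutes_per_disc(payload, channels=1)
print(round(stereo_minutes, 1), round(mono_minutes, 1))
```

Under these assumptions the results land just under 125 and 250 minutes, matching the rounded figures in the text.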
Disc-Quality Control
Having many years of hands-on experience testing CD-R technologies, we are well aware of the more-subtle problems associated with CD-Rs caused by writer/disc compatibility, writing-speed issues, and dye-formulation problems. Many of these can have a negative impact on the discs’ playability over time. It is a little-known fact that not all blank discs perform equally well in all writers. The differences often manifest themselves as significantly higher-than-acceptable error rates or tracking problems. While this may not pose a playability issue immediately, thanks to the error correction/concealment systems employed, there might come a point where the slightest disc degradation could render the disc useless. DVD-R technologies appear to share some of these same issues.
With the above in mind, as we grow our digital archive, we do everything in our power to ensure that each and every disc is the very best quality, thereby maximizing its useful life. To reach that goal, every disc created undergoes a series of rigorous tests.
Using an AudioDev Computer Aided Test System (CATS), we first test every blank disc. During this test phase the disc is subjected to 20 different tests that measure such values as disc reflectivity, push-pull, wobble signal-to-noise, land pre-pit level, block error rate (BLER), etc. A successful test will certify the blank disc’s ability to perform to specification during the writing phase. The ability to test blank, unwritten discs is a valuable asset. Not only do we save time and money by not writing on known defective discs, but we also save valuable hours on the expensive writing lasers.
The next and final step in the Q/C process is the disc verification after writing. The primary goal of this test phase is to quantify the quality of the writing process. Over 50 important parameters are tested during this phase, including servo and tracking, jitter analysis, digital errors, dropouts, HF parameters, and physical measurements. The results of this process offer a pass/fail report detailing all tests and their respective values. The CATS also provides surface-analysis testing to help reveal defects due to disc flatness, focus error, radial noise, and other anomalies that can occur during the manufacturing process. Failed discs are carefully scrutinized and either retested or rejected.
Archive Monitoring
All of the discs’ Q/C data are stored electronically. Discs are randomly pulled from the local jukeboxes and retested. Current test data are compared with prior test results to monitor the integrity of the digital medium. Any disc degradation or manufacturing-batch-related problems can be readily identified and digital clones created on new stock.
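The comparison of current and prior test data can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the single `error_rate` summary metric, the disc identifiers, the sample values, and the threshold of twice the baseline are placeholders, not the measurements or criteria actually used with the test system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscTest:
    disc_id: str
    error_rate: float  # hypothetical summary metric; higher is worse


def flag_degraded(baseline, current, max_ratio=2.0):
    """Return IDs of discs whose current error rate exceeds max_ratio x baseline."""
    prior_by_id = {t.disc_id: t.error_rate for t in baseline}
    flagged = []
    for test in current:
        prior = prior_by_id.get(test.disc_id)
        if prior is None:
            continue  # no stored result to compare against
        if test.error_rate > prior * max_ratio:
            flagged.append(test.disc_id)
    return flagged


# Made-up values: disc B has degraded, disc C has no stored baseline.
stored = [DiscTest("A", 10.0), DiscTest("B", 10.0)]
retest = [DiscTest("A", 11.0), DiscTest("B", 25.0), DiscTest("C", 99.0)]
print(flag_degraded(stored, retest))  # ['B']
```

A relative threshold is used here so that discs are judged against their own history rather than a single absolute limit; a real policy would likely combine both.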
Accessibility/Distribution
We consider the DVD-R discs our high-resolution “core-archive.” These hi-res versions are available only in-house via a high-speed network. For external distribution via the Internet we create a variety of down-sampled versions. These include a CD-Audio quality 44.1kHz/16-bit wave file, a 96kbps MP3 file, a multi-bitrate RealAudio streaming file, and, coming soon, a QuickTime streaming file. All these files reside locally on a 25-terabyte Apple Xserve RAID storage system. A full backup of these files is maintained off site using an Exabyte LTO tape library system.
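To give a rough sense of how the distribution copies compare with the 96kHz/24-bit masters, the Python sketch below computes the sample-rate conversion ratio and approximate per-minute sizes. It assumes stereo for the CD-quality file, takes 96kbps to mean 96,000 bits per second, and ignores container overhead. The function names are hypothetical, and nothing here describes the conversion tools actually used.

```python
from fractions import Fraction


def pcm_bytes_per_minute(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed PCM data rate in bytes per minute."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * 60


def compressed_bytes_per_minute(bitrate_bps: int) -> int:
    """Data rate of a constant-bitrate stream in bytes per minute."""
    return bitrate_bps * 60 // 8


# Ratio of output to input sampling rate for a 96kHz to 44.1kHz conversion.
resample_ratio = Fraction(44_100, 96_000)
print(resample_ratio)  # 147/320

master = pcm_bytes_per_minute(96_000, 24, 2)
cd_quality = pcm_bytes_per_minute(44_100, 16, 2)
mp3 = compressed_bytes_per_minute(96_000)
print(master, cd_quality, mp3)
```

The non-integer ratio of 147/320 is one reason the down-sampling step deserves care: it requires true resampling rather than simply discarding samples.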
Future Proofing
Only time will tell, but if history is any indication, we assume that in the not-too-distant future some new and better digital format for long-term preservation will appear in the marketplace. Unfortunately, as technology changes, it typically renders existing hardware and software obsolete. In the meantime we will monitor new digital storage technologies while continuing to grow what we believe to be a very robust and accessible digital storage solution. When an improved, standardized digital format does appear, we feel confident that we have set the stage to migrate data from our current storage strategy to the next generation in a relatively painless and automated fashion.
Notes
[1] The MLNS Web site is currently being completely reworked. Once set up, it will be easily found from a link on the Laboratory of Ornithology’s main site.
[2] Van Bogart, John W.C., “Magnetic Tape Storage and Handling,” National Media Laboratory, June 1995.
[3] Stosich, Michael N., “Problems and Solutions in Long-Term Tape Performance,” Audio, November 1990.
[4] Pohlmann, Ken C., “Principles of Digital Audio,” 2d ed., Howard W. Sams & Company, 1989.
