Best Practices for Digitization - General Guidelines and Information

File/Directory naming

All image files should follow an 8.3 naming convention: an eight-digit sequential number, padded with leading zeros, followed by a lowercase three-character file extension (e.g. 00000013.tif). This keeps the images in a consistent, meaningful order. Directory names and structure should reflect the collection.

  • For material that has been described bibliographically, the directory containing the image files should be named with the bibliographic ID number (the 001 field in the catalog record). For example: 6124186 (bib ID)/00000001.tif (file names), i.e. 6124186/00000001.tif. For multi-volume items, an intermediate directory per volume is appropriate. For example: 6124186 (bib ID)/01 (volume number)/00000001.tif (file names), i.e. 6124186/01/00000001.tif. Directory names should not contain punctuation or spaces.

  • For archival materials described in a finding aid, the directory structure follows the structure of the finding aid: the collection code and component identification numbers are used in place of a bib ID. For example: C0744 (collection code)/c002 (component ID)/00000001.tif (file names), i.e. C0744/c002/00000001.tif.
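The naming rules above can be checked mechanically before files are delivered. The following is a minimal sketch in Python, not an official tool: the function names image_name and image_path are hypothetical, and two details are assumptions rather than stated rules (numbering starts at 1, as in the examples, and digits are permitted in the three-character extension).

```python
import re
from pathlib import PurePosixPath
from typing import Sequence

# Eight digits, a dot, and a lowercase three-character extension, e.g. 00000013.tif.
# Assumption: digits are allowed in the extension.
IMAGE_NAME = re.compile(r"[0-9]{8}\.[a-z0-9]{3}")

# Directory names may not contain punctuation or spaces: letters and digits only.
DIR_NAME = re.compile(r"[A-Za-z0-9]+")


def image_name(sequence: int, extension: str = "tif") -> str:
    """Return an 8.3 file name, zero-padded to eight digits."""
    # Assumption: sequences start at 1, as in the 00000001.tif examples.
    if not 1 <= sequence <= 99_999_999:
        raise ValueError("sequence number must fit in eight digits")
    name = f"{sequence:08d}.{extension}"
    if not IMAGE_NAME.fullmatch(name):
        raise ValueError(f"not a valid 8.3 file name: {name!r}")
    return name


def image_path(directories: Sequence[str], sequence: int, extension: str = "tif") -> str:
    """Join directory names (bib ID, volume, collection code, component ID)
    and an 8.3 file name into a relative path."""
    for directory in directories:
        if not DIR_NAME.fullmatch(directory):
            raise ValueError(f"invalid directory name: {directory!r}")
    return str(PurePosixPath(*directories, image_name(sequence, extension)))
```

For instance, image_path(["6124186", "01"], 1) yields 6124186/01/00000001.tif, and image_path(["C0744", "c002"], 1) yields C0744/c002/00000001.tif.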

Some materials, such as audiovisual materials or ephemera, may be named after a barcode or other unique identifier assigned to each physical asset. The file name may also include indicators for the side (for two-sided media) and the derivative status, appended at the end. For example: 32101047381338_1_pm.wav (where 32101047381338 = barcode, 1 = side 1, pm = preservation master, and .wav = file extension).
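As an illustration, a file name of this form can be split back into its parts. This is a sketch under assumptions: the name parse_av_name and the AVName type are hypothetical, the side and derivative indicators are treated as optional, the side is assumed to be numeric, and the derivative code is assumed to be lowercase letters (only "pm" is documented here).

```python
import re
from typing import NamedTuple, Optional

# identifier[_side][_derivative].ext, e.g. 32101047381338_1_pm.wav
AV_NAME = re.compile(
    r"(?P<identifier>[0-9A-Za-z]+)"   # barcode or other unique identifier
    r"(?:_(?P<side>[0-9]+))?"         # optional side number (assumed numeric)
    r"(?:_(?P<derivative>[a-z]+))?"   # optional derivative code, e.g. pm
    r"\.(?P<extension>[a-z0-9]{3})"   # lowercase three-character extension
)


class AVName(NamedTuple):
    identifier: str
    side: Optional[int]
    derivative: Optional[str]
    extension: str


def parse_av_name(file_name: str) -> AVName:
    """Split an audiovisual file name into identifier, side, derivative, and extension."""
    match = AV_NAME.fullmatch(file_name)
    if match is None:
        raise ValueError(f"unrecognized file name: {file_name!r}")
    side = match.group("side")
    return AVName(
        identifier=match.group("identifier"),
        side=int(side) if side is not None else None,
        derivative=match.group("derivative"),
        extension=match.group("extension"),
    )
```

Calling parse_av_name("32101047381338_1_pm.wav") returns the barcode 32101047381338, side 1, derivative pm, and extension wav.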

Different Goals of Digitization

When digitizing to support access or to fulfill user requests, it is generally desirable to follow these standards and ingest the content into the digital repository, so that it can be reused and the materials do not have to be re-digitized in the future. However, staff will need to judge what quality is appropriate based on the available equipment, the timeframe, the quality of the source material, and storage costs. Some of these issues are already accounted for in the guidelines, such as using the lower-quality "Optimized for OCR" standards for mostly textual material, as opposed to the "Special Collections on Paper and Film" standards.

Metadata Standards

In general, descriptive metadata should be created prior to digitization. At a minimum, there must be a unique identifier, such as a Voyager bib ID or a Finding Aids component ID, connecting the digitized content to a metadata record.

When to Outsource Digitization

The Digital Imaging Studio can digitize many types of materials, and should generally be used for rare, valuable, and/or fragile material. The Digital Projects Steering Group (DPSG) issues a call for digitization projects three times a year, and a small number of items can be digitized by the studio at the discretion of the studio manager.

Several factors tend to make outsourcing digitization more practical:

  • AV digitization for any formats beyond what the Mendel Music Library can support

  • Mass paper digitization, or any digitization with a short timeline and a large volume

  • Projects with dedicated external funding for digitization, where outsourcing may be more practical than hiring staff and purchasing equipment

Schedule for Review and Updating

The Digital Projects Steering Group (DPSG) will review this documentation at the end of each calendar year and, when necessary, designate a person or working group to make updates.