Archives & Museum Informatics








last updated:
October 7, 2014 2:55 PM

J. Trant. "Framing the Picture: Standards for Imaging Systems", ICHIM/MCN, San Diego, CA, Oct.1995.

3. Image Storage

Once an image has been captured, it must be stored in a particular file format on a particular storage device. Because digital image files are very large, image data is often compressed to reduce the amount of storage space required. File formats, compression schemes and storage media change over time. While the cultural heritage community needs to be aware of these advances, it is beyond its ability to control or develop technical standards.

Image File Formats

The choice of image file format is critical to the interchangeability of image data. If images are not stored in a widely supported format, it will be difficult, if not impossible, to interchange them. (It may be possible to transmit the file, but not to display its contents.) If images are to be made available on the WWW, only a limited number of image formats (including GIF and JFIF/JPEG) are supported by current generations of browsers (although this is changing rapidly).

The only standard in this area, Standard Recommended Practice, File Format for Storage and Exchange of Images,29 applies to document imaging and, at present, deals only with black and white image data. Subsequent additions are being developed to handle colour and gray-scale images.30 The file format recommended by this standard is TIFF. Of the file formats outlined below, only JFIF (the file format defined with the JPEG compression standard) can be considered a standard. Other image file formats, some proprietary, have become de facto standards through widespread use.31

TIFF (Tagged Image File Format), originally developed by Aldus Corporation. Version 6.0

TIFF files are widely supported, and can store image data captured with up to 24 bits of colour, compressed with LZW, CCITT Group 3 or Group 4, or JPEG. Issues arise from the wide variance in the use of the tagged data fields in the TIFF file, and from the range of extensions to the format available.32
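The tagged data fields mentioned above can be illustrated with a minimal sketch, which assumes only the basic TIFF layout: a byte-order mark, the number 42, and an offset to a directory of 12-byte tag entries. The function names `read_tiff_tags` and `build_sample` are hypothetical, and the sample file is built in memory rather than taken from any real collection.

```python
import struct

def read_tiff_tags(data: bytes) -> dict:
    """Return {tag: (field_type, count, raw_value_or_offset)} for the first directory."""
    byte_order = data[:2]
    if byte_order == b"II":
        endian = "<"  # little-endian
    elif byte_order == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file: bad magic number")
    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    tags = {}
    for i in range(entry_count):
        start = ifd_offset + 2 + 12 * i
        # The last four bytes are returned raw: they hold either the value
        # itself or an offset to it, depending on the field type and count.
        tag, field_type, count, raw = struct.unpack(
            endian + "HHII", data[start:start + 12]
        )
        tags[tag] = (field_type, count, raw)
    return tags

def build_sample() -> bytes:
    """Build a tiny little-endian TIFF header and directory (no pixel data)."""
    header = b"II" + struct.pack("<HI", 42, 8)
    entries = [
        (256, 4, 1, 3072),  # ImageWidth
        (257, 4, 1, 2048),  # ImageLength
        (259, 3, 1, 5),     # Compression: 5 means LZW
    ]
    ifd = struct.pack("<H", len(entries))
    for entry in entries:
        ifd += struct.pack("<HHII", *entry)
    ifd += struct.pack("<I", 0)  # offset of the next directory: 0 means none
    return header + ifd

tags = read_tiff_tags(build_sample())
```

Because each application decides which tags to write and how to use private ones, a reader built this way may encounter tags it does not recognise, which is the interchange problem noted above.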

GIF (Graphics Interchange Format), originally developed by CompuServe Inc.

Widely supported, and originally the only file format which could be used with WWW documents, GIF is limited to 8-bit colour.33 The file format and accompanying compression algorithm, LZW, have recently been the subject of a licensing dispute, which originally appeared to threaten the use of the format, but in retrospect appears not to have had a significant effect.34

JPEG File Interchange Format (JFIF) ISO/IEC 10918-1 and ISO/IEC 10918-2

A widely supported file format developed by the Joint Photographic Experts Group, which stores images encoded using JPEG compression, in up to 24-bit colour. It is now used to distribute images on the WWW, as it is supported by the most recent versions of the Netscape browser software.35

Kodak Photo CD (Image Pac), developed by Eastman Kodak

A proprietary CD-ROM based storage medium, which bundles a series of image resolutions into a single "Image Pac". For example, the Master Photo CD format offers five resolutions ranging from 128 x 192 pixels to 2048 x 3072 pixels.36 It stores 24-bit colour images in PhotoYCC format, a method of representing the colour spectrum where the 24 bits of data per pixel "are distributed among three color components, called Y (luminance information), C1, and C2 (two chrominance channels)."37 The Photo CD format has become popular because of the ease with which it enables existing photographic collections to be converted to digital form.38
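To give a sense of scale, the following sketch computes the uncompressed size of each resolution at 24 bits per pixel. Only the smallest and largest resolutions are stated above; the three intermediate steps are assumed here to double in each dimension. The function names are hypothetical, and the figures describe raw pixel data, not the size of an actual Image Pac on disc.

```python
BYTES_PER_PIXEL = 3  # 24 bits of colour data per pixel

def resolution_ladder(smallest=(128, 192), steps=5):
    """Assumed ladder: each step doubles both dimensions of the previous one."""
    width, height = smallest
    return [(width * 2 ** i, height * 2 ** i) for i in range(steps)]

def uncompressed_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size of the raw pixel data, before any compression."""
    return width * height * bytes_per_pixel

for width, height in resolution_ladder():
    size = uncompressed_bytes(width, height)
    print(f"{width} x {height}: {size:,} bytes")
```

Under these assumptions the largest resolution alone holds 18,874,368 bytes of raw data, which is why the compression and storage choices discussed in the rest of this section matter.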

BMP (Microsoft Windows Bitmap), developed by Microsoft

Microsoft Windows-based format, supporting up to 24-bit colour. Images are most often stored uncompressed, resulting in a larger file size.39

PICT (Macintosh Picture), developed by Apple Computer Inc.

Macintosh-based format, supporting up to 24-bit colour, and used with JPEG compression.40

Still Picture Interchange File Format (SPIFF) ISO/IEC CD 10918-3

A proposal now being developed by Subcommittee 29 of the Joint Technical Committee of the International Organization for Standardization and the International Electrotechnical Commission (ISO/IEC JTC 1/SC 29), "intended to be a generic format that is simple in nature and does not include many of the features found in application specific file formats." It is still some years away from implementation.41

Image Compression

The large file size of digital images often leads imaging system implementors to compress files in order to reduce the amount of storage space they occupy. Image compression can be either "lossy," a one-way process which results in a reduction of the amount of image data available, or "lossless," a reversible process which maintains the integrity of the original image file. The purpose for which an image file is being created will dictate whether it is possible to tolerate the data loss incurred with lossy compression. Imperceptible changes in an image file may be acceptable for a public access application, or for the distribution of images over a network, but be intolerable for an image archive which is designed to have lasting value. Changes in image data, for example, would make automatic comparison of images and their copies impossible, and preclude image processing applications which analyze the statistical characteristics of image files.
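The difference can be sketched with stand-ins from the Python standard library. This is an illustration under stated assumptions, not either of the formats discussed below: `zlib` (DEFLATE) stands in for a lossless scheme, a simple rounding of pixel values stands in for a lossy one, and the "image" is a synthetic 64 x 64 grey-scale pattern.

```python
import hashlib
import zlib

def digest(data: bytes) -> str:
    """Fingerprint used to compare an image with a copy automatically."""
    return hashlib.sha256(data).hexdigest()

def quantize(data: bytes, step: int = 16) -> bytes:
    """Lossy stand-in: round every value down to a multiple of `step`."""
    return bytes((value // step) * step for value in data)

# Synthetic 64 x 64 grey-scale image, one byte per pixel.
pixels = bytes((x * 7 + y * 13) % 256 for y in range(64) for x in range(64))

# Lossless: decompression restores every byte, so the digests agree.
restored = zlib.decompress(zlib.compress(pixels))
lossless_matches = digest(restored) == digest(pixels)

# Lossy: each pixel changes by less than `step`, but the digests differ.
degraded = quantize(pixels)
lossy_matches = digest(degraded) == digest(pixels)
largest_change = max(abs(a - b) for a, b in zip(pixels, degraded))
```

Here `lossless_matches` is true and `lossy_matches` is false, even though no pixel has moved by more than 15 levels out of 256. A change too small to notice on screen is still enough to defeat an automatic comparison.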

The two image compression formats in widespread use are JPEG and LZW.

JPEG (Joint Photographic Experts Group) ISO/IEC 10918

A lossy compression format, used in JFIF files. JPEG allows for a choice of the level of compression. Various 'Quality' settings within JPEG compliant applications enable the selection of a best possible compression ratio.42 Compression ratios of 25:1 are common; a ratio somewhere between 10:1 and 40:1 is likely to be acceptable for a 'working image'.43
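As a rough worked example, the ratios quoted above can be applied to a 2048 x 3072 pixel, 24-bit image, the largest Photo CD resolution mentioned earlier. The function name is hypothetical and the results are estimates only, since the ratio actually achieved depends on the image content and the quality setting chosen.

```python
def estimated_compressed_bytes(uncompressed_bytes: int, ratio: float) -> int:
    """Estimate file size for a compression ratio of `ratio`:1."""
    return round(uncompressed_bytes / ratio)

raw_size = 2048 * 3072 * 3  # width x height x 3 bytes per pixel

for ratio in (10, 25, 40):
    estimate = estimated_compressed_bytes(raw_size, ratio)
    print(f"{ratio}:1 -> about {estimate:,} bytes")
```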

LZW (Lempel-Ziv-Welch)

A lossless compression algorithm used in GIF and TIFF files.44 LZW offers compression ratios between 50 and 90%.45 The LZW algorithm was at issue in the recent GIF licensing controversy (see above).
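The reversibility of LZW can be shown with a textbook sketch of the algorithm. It is a simplification: the codes are returned as a plain list of integers, whereas GIF and TIFF pack them into variable-width bit fields and add control codes, none of which is modelled here.

```python
from typing import List

def lzw_compress(data: bytes) -> List[int]:
    """Encode bytes as a list of LZW codes."""
    table = {bytes([i]): i for i in range(256)}  # one entry per byte value
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate  # keep extending the known sequence
        else:
            codes.append(table[current])
            table[candidate] = next_code  # learn the new sequence
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes: List[int]) -> bytes:
    """Rebuild the original bytes, reconstructing the same table on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    output = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the sequence that is about to be defined.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        output += entry
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(output)

sample = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(sample)
round_trip = lzw_decompress(codes)
```

The decoder needs no copy of the table, because it rebuilds the same entries in the same order as the encoder. Repetitive data, such as large areas of flat colour, benefits most.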

CCITT or Huffman Encoding

Referred to as CCITT Group 3 and Group 4, these compression formats are commonly used for compressing two-colour images (page images), and are used in fax machines and fax modems. Their uses are limited within the cultural heritage community, which tends to require colour imaging.

Emerging compression technologies, based on wavelets and fractals, may provide alternatives to JPEG or LZW compression, but these applications are not yet widely available or supported.
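These schemes suit two-colour pages because a scan line can be described as alternating runs of white and black pixels. The sketch below shows only that run-length step, assuming a row of 0 (white) and 1 (black) values; the Huffman code tables that the CCITT schemes then apply to the run lengths are omitted, and the function names are hypothetical.

```python
def run_lengths(row):
    """Return alternating run lengths for a row of 0/1 pixels, white first.

    A row that starts with black yields a leading white run of length 0.
    """
    runs = []
    colour = 0  # rows are described starting with a white run
    length = 0
    for pixel in row:
        if pixel == colour:
            length += 1
        else:
            runs.append(length)
            colour = pixel
            length = 1
    runs.append(length)
    return runs

def expand_runs(runs):
    """Inverse of run_lengths: rebuild the row of 0/1 pixels."""
    row = []
    colour = 0
    for length in runs:
        row.extend([colour] * length)
        colour = 1 - colour
    return row

line = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
runs = run_lengths(line)  # [3, 2, 4, 1]
```

A page of text consists mostly of long white runs, so a few numbers replace many pixels. A continuous-tone colour photograph has no such runs to exploit.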

Storage Devices

Digital images can be stored on magnetic, magneto-optical or optical media. Storage architectures may employ one or all of these kinds of media for on-line image storage, backup or long-term storage. The standards which apply to these media are constant regardless of the kind of data written on them. For example, the ISO standard for the CD-ROM file structure (ISO 9660) applies to both image and text data.46 However, the large size of image databases poses a challenge for image database designers, often resulting in the creation of hybrid information storage architectures, which keep frequently accessed or low-resolution images quickly accessible on magnetic media (stored on-line on a hard drive) and rely on media with a slower access time, such as CD-ROM, for the storage of high-resolution images.
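The placement rule behind such a hybrid architecture can be sketched as follows. Everything specific here is an assumption made for illustration: the record fields, the tier labels, and both thresholds are hypothetical, and a real system would set them from its own usage figures.

```python
from dataclasses import dataclass

@dataclass
class ImageRecord:
    name: str
    width: int
    height: int
    accesses_per_month: int

def choose_tier(image: ImageRecord,
                max_online_pixels: int = 512 * 768,
                min_online_accesses: int = 100) -> str:
    """Keep small or frequently used images on fast media, the rest on slower media."""
    pixels = image.width * image.height
    if pixels <= max_online_pixels or image.accesses_per_month >= min_online_accesses:
        return "magnetic (on-line)"
    return "optical (CD-ROM)"

thumbnail = ImageRecord("thumbnail", 128, 192, 2)
popular_master = ImageRecord("popular master", 2048, 3072, 500)
rarely_used_master = ImageRecord("rarely used master", 2048, 3072, 1)
```

With these thresholds, the thumbnail and the popular master stay on-line, while the rarely used master is assigned to CD-ROM.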

Issues in Image Storage: Archival Integrity

Critical to the choice of storage formats for imaging systems are concerns regarding the longevity of media and the migration of data. Migration of data from one generation of technology to another is inevitable; it is necessary to plan for the refreshment of technology, and the migration of data from one format to another. This successive transformation, however, raises concerns about the long-term integrity of digital information. Each migration could introduce errors or, by altering the physical format of digital data, alter its interpretation.47 There is much more to learn about the mechanical and intellectual issues surrounding the integrity of digital information.48
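One mechanical safeguard against errors introduced when files are copied to new media is to record a checksum for each file beforehand and compare it afterwards. The sketch below assumes a bit-for-bit copy; it cannot confirm that an image converted to a different file format still carries the same content. The function names and the in-memory "files" are hypothetical.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Checksum recorded for a file before it is migrated."""
    return hashlib.sha256(data).hexdigest()

def verify_migration(manifest: dict, migrated: dict) -> list:
    """Return the names of files that are missing or altered after migration."""
    problems = []
    for name, expected in manifest.items():
        data = migrated.get(name)
        if data is None or fingerprint(data) != expected:
            problems.append(name)
    return problems

# Before migration: record a checksum for every file.
originals = {"img001.tif": b"\x00\x01\x02\x03", "img002.tif": b"\x10\x20\x30\x40"}
manifest = {name: fingerprint(data) for name, data in originals.items()}

# After migration: one file has a single corrupted byte.
migrated = {"img001.tif": b"\x00\x01\x02\x03", "img002.tif": b"\x10\x20\x30\x41"}
damaged = verify_migration(manifest, migrated)
```

In this example `damaged` contains only "img002.tif". The check detects that a file has changed, but not whether the change affects how the image is interpreted.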

The cultural heritage community must heed the discussions and recommendations of the broader Digital Library community. Symposia such as Digital Imaging Technology for Preservation, hosted by the Research Libraries Group in 1994,49 and the discussions of such interdisciplinary groups as the Task Force on Archiving of Digital Information50 provide a forum for the definition of the issues, and a means for determining the 'best practices' in this rapidly evolving area. Coordinated action is critical if we are to develop means to ensure the longevity of digital information and preserve the integrity of its intellectual content.

Standards and consistent practices offer some reassurance that information migration will at least be predictable. Documentation of image capture methodologies and image file formats is the first step towards ensuring the accurate interpretation of visual information when it is displayed in the future.
