Document Scanning

File formats

The file formats commonly used in imaging systems are mostly defined by industry standards, established and maintained by working groups in the imaging and telecommunications fields.

1. TIFF files

TIFF files are in widespread use in the imaging field, but the term is often used too loosely, as if "TIFF" alone defined the whole file format. A TIFF file is better visualized as a wrapper with an image file inside it. The wrapper carries information about the image so that the software package can open it correctly.

The image inside can be in any one of a wide range of formats, and this format needs to be specified along with the TIFF wrapper. Most imaging software will open the TIFF wrapper, but not necessarily the image that is inside it. When TIFF files are referred to, the image type needs to be included to be certain of compatibility. For instance, a monochrome file may be TIFF CCITT G4, and a colour file may be TIFF JPEG.
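
As an illustration of the wrapper idea, the sketch below reads the Compression tag (tag number 259 in the TIFF 6.0 specification) from the first image directory of a TIFF file, using only the Python standard library. The function names tiff_compression and make_sample are hypothetical, the table of names is deliberately incomplete, and the sample file is a minimal header built in memory rather than a real scan, so treat this as a rough sketch rather than a complete TIFF reader.

```python
import struct

COMPRESSION_TAG = 259  # "Compression" tag number in TIFF 6.0

# A few well-known values of the Compression tag (not a complete list).
COMPRESSION_NAMES = {
    1: "uncompressed",
    2: "CCITT modified Huffman RLE",
    3: "CCITT Group 3",
    4: "CCITT Group 4",
    5: "LZW",
    6: "JPEG (old-style)",
    7: "JPEG",
    32773: "PackBits",
}


def tiff_compression(data: bytes) -> str:
    """Return a name for the compression used by the first image in a TIFF."""
    if data[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        pos = ifd_offset + 2 + 12 * i  # each directory entry is 12 bytes
        (tag,) = struct.unpack_from(endian + "H", data, pos)
        if tag == COMPRESSION_TAG:
            # A single SHORT value sits in the first 2 bytes of the value field.
            (value,) = struct.unpack_from(endian + "H", data, pos + 8)
            return COMPRESSION_NAMES.get(value, f"unknown ({value})")
    return COMPRESSION_NAMES[1]  # TIFF 6.0 default when the tag is absent


def make_sample(compression: int) -> bytes:
    """Build a minimal little-endian TIFF header with one directory entry."""
    header = struct.pack("<2sHI", b"II", 42, 8)
    entry = struct.pack("<HHIHH", COMPRESSION_TAG, 3, 1, compression, 0)
    return header + struct.pack("<H", 1) + entry + struct.pack("<I", 0)


print(tiff_compression(make_sample(4)))  # CCITT Group 4
print(tiff_compression(make_sample(7)))  # JPEG
```

A check of this kind shows why "TIFF" alone is not enough to guarantee compatibility: two files with the same extension can report entirely different compression schemes.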

2. Common monochrome file formats

The following are the most commonly used monochrome file formats in commercial imaging:
  • TIFF CCITT Group 4 (or 3)
  • CALS (defence industry standard)

3. Common greyscale and colour file formats

The following are the most commonly used greyscale and colour file formats in commercial imaging:
  • TIFF uncompressed (or raw)
  • TIFF JPEG
  • JPEG
  • GIF
  • TIFF LZW

4. Compressed files

Because image files are large, a range of compression techniques has been devised. These techniques encode the file content, in effect describing lines (runs of identical pixels) rather than giving simple pixel-by-pixel values.
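
The idea of describing lines rather than individual pixels can be sketched with plain run-length encoding. This is only an illustration of the principle: the function names are hypothetical, and real schemes such as the CCITT fax codes go further by assigning short codes to common run lengths.

```python
from itertools import groupby


def run_lengths(row):
    """Describe a row of pixels as (value, run length) pairs."""
    return [(pixel, len(list(group))) for pixel, group in groupby(row)]


def expand(runs):
    """Rebuild the original row of pixels from its runs."""
    return [pixel for pixel, length in runs for _ in range(length)]


# One scan line of a monochrome image: 0 = white, 1 = black.
row = [0] * 12 + [1] * 3 + [0] * 9
runs = run_lengths(row)
print(runs)                 # [(0, 12), (1, 3), (0, 9)]
print(expand(runs) == row)  # True
```

Here 24 pixel values are described by three runs, and expanding the runs restores the row exactly, so nothing is lost.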

Monochrome files can be compressed with no data loss using the standards developed by the telecommunications industry. These CCITT compression techniques are widely adopted.

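Colour and greyscale files can also be compressed without data loss, for example with LZW. The established LZW method was patented for many years and had to be licensed, although the patents expired in 2003 and 2004. The two techniques most commonly used for colour and greyscale images both involve some data loss.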

The first of these is colour reduction, whereby the range of colours in the image is analysed and reduced to a palette of 256 or 16 colours. Each pixel is then best matched to a palette colour.
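
The matching step can be sketched as follows. The sketch assumes the palette has already been chosen (how the image is analysed to pick the palette is not shown), and the function names, the four-colour palette and the sample pixels are all hypothetical. Closeness is measured here as squared distance in RGB, which is one simple choice among several.

```python
def nearest_palette_index(pixel, palette):
    """Return the index of the palette colour closest to an (R, G, B) pixel."""
    def distance(colour):
        return sum((a - b) ** 2 for a, b in zip(pixel, colour))
    return min(range(len(palette)), key=lambda i: distance(palette[i]))


def reduce_colours(pixels, palette):
    """Replace every pixel by the index of its best-matching palette colour."""
    return [nearest_palette_index(pixel, palette) for pixel in pixels]


# Hypothetical palette: black, white, red, blue.
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]
pixels = [(250, 10, 5), (12, 8, 20), (240, 240, 250), (10, 20, 200)]
print(reduce_colours(pixels, palette))  # [2, 0, 1, 3]
```

Each pixel is then stored as a small palette index instead of three full colour values, at the cost of losing the original shade.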

The second technique is JPEG compression, developed by the Joint Photographic Experts Group. In this approach the tonal variation in the image is analysed and very small tonal variations are filtered out. In this way the number of distinct colours in an area can be significantly reduced with little or no visible effect. Once the tonal variation has been reduced, the data becomes far more repetitive, so run-based encoding of the kind described above can be applied effectively. By these means JPEG compression can be efficient with no appreciable loss of colour detail at moderate compression settings.
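
The filtering of small tonal variations can be illustrated with a heavily simplified sketch. Real JPEG works on 8 x 8 pixel blocks with a two-dimensional transform, standard quantisation tables and an entropy-coding stage. The version below uses a single row of eight greyscale values, a one-dimensional discrete cosine transform and one uniform step size. The function names and the step size of 16 are assumptions chosen for illustration only.

```python
import math


def _scale(k, n):
    """Normalising factor for coefficient k of an n-point transform."""
    return math.sqrt((1 if k == 0 else 2) / n)


def dct(samples):
    """One-dimensional discrete cosine transform (orthonormal DCT-II)."""
    n = len(samples)
    return [
        _scale(k, n) * sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                           for i, x in enumerate(samples))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct(): rebuild sample values from coefficients."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]


def quantise(coeffs, step):
    """Round each coefficient to a whole number of steps (the lossy part)."""
    return [round(c / step) for c in coeffs]


def dequantise(levels, step):
    """Convert step counts back to approximate coefficient values."""
    return [level * step for level in levels]


STEP = 16  # assumed step size; larger steps discard more tonal detail

row = [100, 101, 100, 99, 100, 101, 100, 100]  # nearly flat grey area
levels = quantise(dct(row), STEP)
restored = idct(dequantise(levels, STEP))

print(levels)  # only the first value is non-zero
print([round(v) for v in restored])
```

For this nearly flat row, every coefficient except the first rounds to zero, leaving a long run of zeros that is cheap to encode. The restored row is a uniform grey within about three levels of the original values.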