Image format

From SubSurfWiki

Choosing a file format for your images can be tricky. It seems simple enough on the outside, but the details turn out to be full of nuance and gotchas. Plenty of reports, presentations, and papers are spoiled by low-quality images. Don't let yours be one of them!

Spoiler: my rules of thumb are

  1. Save lots of pixels, at least 1 million for most images.
  2. Save everything in the PNG format.
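
As a rough check on the first rule, the pixel count of a PNG can be read straight from the file header, without any imaging library. This is only a minimal sketch: `png_pixel_count` and `has_enough_pixels` are hypothetical helper names, and the code assumes the bytes begin with a well-formed PNG signature followed by the IHDR chunk.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_pixel_count(data: bytes) -> int:
    """Return width * height, read from the IHDR chunk of PNG bytes."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Bytes 8-11 hold the chunk length, 12-15 the chunk type,
    # and 16-23 the width and height as big-endian unsigned 32-bit integers.
    if data[12:16] != b"IHDR":
        raise ValueError("IHDR chunk missing")
    width, height = struct.unpack(">II", data[16:24])
    return width * height

def has_enough_pixels(data: bytes, minimum: int = 1_000_000) -> bool:
    """Apply the 'at least 1 million pixels' rule of thumb."""
    return png_pixel_count(data) >= minimum
```

Only the first 24 bytes of the file are needed, so something like `has_enough_pixels(open("figure.png", "rb").read(24))` would do, where `figure.png` is a placeholder file name.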

What determines quality?

The main determinants of image — also called raster — quality are:

  • The number of pixels in the image
  • The size at which the image is displayed or printed
  • If the image is compressed (e.g. JPG), the fidelity of the compression
  • If the image is indexed (e.g. GIF), the number of colours available

There is a subtle footnote to all of this: what really matters is the lowest-quality version of the image file over its entire history. In other words: it doesn't matter if you have a 1200 × 800 TIF today, if this same file was previously saved as a 600 × 400 GIF with 16 colours. You will never get the lost pixels or colours back, though you can try to mitigate the quality loss with filters and careful editing. This seems obvious, but I have seen people caught out by it.
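
To see why the lost pixels never come back, here is a small sketch in plain Python. The `downsample` and `upsample` helpers are hypothetical names, and nearest-neighbour resampling is assumed purely for simplicity; real image editors use smarter filters, but they cannot recover the discarded detail either.

```python
def downsample(image, factor=2):
    """Keep every `factor`-th pixel in each direction (nearest neighbour)."""
    return [row[::factor] for row in image[::factor]]

def upsample(image, factor=2):
    """Repeat each pixel `factor` times in each direction."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

# A 4 x 4 checkerboard: the finest detail a 4 x 4 grid can hold.
original = [[(x + y) % 2 for x in range(4)] for y in range(4)]

# Save small, then blow it back up to the original dimensions.
restored = upsample(downsample(original))

print(len(restored), len(restored[0]))  # dimensions of the restored image
print(restored == original)             # whether the detail survived
```

The restored image has the same dimensions as the original, but the checkerboard pattern is gone: every pixel comes back with the same value.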

JPG is only for photographs

[Figure: Quality comparison, JPG vs size]

The problem with JPG is that the lossy compression can bite you, even if you're careful. What is lossy compression? The JPEG algorithm makes files much smaller by throwing some of the data away. It transforms small blocks of the image (8 × 8 pixels) into the wavenumber domain with a discrete cosine transform. Smooth image content is sparse in this domain, meaning it is captured by a few low-wavenumber coefficients, so the algorithm stores the remaining coefficients, mostly fine detail, at reduced precision or not at all. Once discarded, the data cannot be recovered. In discontinuous data (images with hard edges) you might see artifacts (e.g. see How to cheat at spot the difference). Bottom line: only use JPG for photographs with lots of pixels (more than about 1200 wide).
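
The effect on hard edges can be illustrated with a one-dimensional toy version of the idea. This is a sketch, not the JPEG algorithm: it assumes a single row of 8 pixels, and it simply zeroes the high-wavenumber coefficients as a crude stand-in for JPEG's quantization step. The function names (`dct`, `idct`, `lossy_roundtrip`, `max_error`) are my own.

```python
import math

def dct(signal):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(signal)))
    return out

def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        out.append(sum(math.sqrt((1 if k == 0 else 2) / n) * c
                       * math.cos(math.pi * (i + 0.5) * k / n)
                       for k, c in enumerate(coeffs)))
    return out

def lossy_roundtrip(signal, keep=3):
    """Zero all but the `keep` lowest-wavenumber coefficients, then invert."""
    coeffs = dct(signal)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return idct(truncated)

def max_error(signal, keep=3):
    """Largest absolute pixel error after the lossy round trip."""
    restored = lossy_roundtrip(signal, keep)
    return max(abs(a - b) for a, b in zip(signal, restored))

ramp = [0, 36, 73, 109, 146, 182, 219, 255]  # smooth gradient
edge = [0, 0, 0, 0, 255, 255, 255, 255]      # hard edge

print(round(max_error(ramp), 1), round(max_error(edge), 1))
```

With only three coefficients kept, the smooth gradient survives almost intact, while the hard edge comes back with a much larger error. That error is the ringing you see around text and line art in over-compressed JPGs.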

The chart shows how image compression affects file size. Note that the exact relationship depends on your image.

What is an indexed image file?

Indexed files (e.g. GIFs) do not store the full colour properties (with an 8-bit red value, green value, and blue value, say) of each pixel. Instead, they store a look-up table of colours — 16 colours, perhaps — then reference the colour of each pixel to this table (say, with an integer between 1 and 16). Non-indexed file formats simply store the full colour of each pixel individually.
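
The look-up table idea can be sketched in a few lines. The `to_indexed` and `from_indexed` helpers are hypothetical names, the pixels are assumed to be RGB tuples, and the indices here start at 0 rather than 1; real formats also limit the table size, which this sketch does not.

```python
def to_indexed(pixels):
    """Split a list of RGB tuples into (palette, indices)."""
    palette = []   # the look-up table of distinct colours
    lookup = {}    # colour -> position in the palette
    indices = []   # one small integer per pixel
    for colour in pixels:
        if colour not in lookup:
            lookup[colour] = len(palette)
            palette.append(colour)
        indices.append(lookup[colour])
    return palette, indices

def from_indexed(palette, indices):
    """Rebuild the full-colour pixels from the palette and indices."""
    return [palette[i] for i in indices]

pixels = [(255, 0, 0), (255, 255, 255), (255, 0, 0),
          (0, 0, 255), (255, 255, 255), (255, 0, 0)]
palette, indices = to_indexed(pixels)

print(palette)
print(indices)
```

Six pixels with three colour values each become a three-entry table plus six small integers. The saving grows with the number of pixels, as long as the image uses few distinct colours.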

CMYK vs RGB

If you are ever involved in printing on trade platforms, such as a web press (commonly used for colour books), then you need to know about colour profiles. Since colours can be described additively (as for lights and computer screens, for which the primary colours are red, green, and blue), or subtractively (for inks, for which the primaries are cyan, magenta, and yellow), graphics files can encode their colours in two different ways (actually far more than 2). This is a minefield, but suffice to say: you may need to seek expert help if you want to print colour images in a book or brochure.

Some formats, like PNG, were designed with the web in mind, and do not support non-RGB profiles. You will probably need serious software, like Adobe Photoshop, and you will probably need to make a TIFF or very high-resolution, high fidelity JPEG. These formats have special versions that support CMYK profiles.
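
To give a feel for the additive and subtractive descriptions, here is the textbook, profile-free conversion from RGB to CMYK. It is a naive sketch only: `rgb_to_cmyk` is a hypothetical helper, it assumes 8-bit RGB input, and it ignores colour profiles entirely, so it is not what a print workflow would actually use.

```python
def rgb_to_cmyk(r, g, b):
    """Naive, profile-free conversion of 8-bit RGB to CMYK fractions (0 to 1)."""
    if (r, g, b) == (0, 0, 0):
        return (0.0, 0.0, 0.0, 1.0)  # pure black: key (black) ink only
    # Subtractive primaries are the complements of the additive ones.
    c, m, y = 1 - r / 255, 1 - g / 255, 1 - b / 255
    # Pull the shared grey component out into the black channel.
    k = min(c, m, y)
    return tuple((v - k) / (1 - k) for v in (c, m, y)) + (k,)

print(rgb_to_cmyk(255, 0, 0))    # red
print(rgb_to_cmyk(0, 255, 255))  # cyan
```

Red on screen becomes full magenta plus full yellow in ink, and so on. The gap between this arithmetic and how real inks behave on real paper is exactly why colour profiles, and expert help, exist.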

Image formats in a nutshell

Rather than list advantages and disadvantages exhaustively, I've tried to summarize everything you need to know in the table below. There are lots of other formats, but you can do almost anything with the ones I've listed... except BMP, which you should just avoid completely.

There are a couple of footnotes to the table. PGM is really only for programmers and tinkerers. In its plain form it's about the simplest possible image file, containing only ASCII characters. You will probably never use it. GIF has a special feature none of the other formats has: it supports animation. Using a tool like GIMP, you can add multiple layers, which most display software (web browsers, presentation software, etc) will interpret as a looping movie.
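
To show just how simple PGM is, this sketch builds the text of a plain ('P2') PGM file from a list of grey levels. `to_plain_pgm` is a hypothetical helper name, and the input is assumed to be a rectangular list of integer rows with values between 0 and `maxval`.

```python
def to_plain_pgm(rows, maxval=255):
    """Return the text of a plain (ASCII, 'P2') PGM file for a 2-D list of grey levels."""
    height = len(rows)
    width = len(rows[0])
    # Header: magic number, then width and height, then the maximum grey value.
    lines = ["P2", f"{width} {height}", str(maxval)]
    # Body: one line of space-separated grey levels per image row.
    lines += [" ".join(str(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"

# A tiny 4 x 2 image: a left-to-right gradient from black to white.
gradient = [[x * 85 for x in range(4)] for _ in range(2)]
text = to_plain_pgm(gradient)
print(text)
```

Writing `text` to a file with a `.pgm` extension should give something GIMP and the Netpbm tools can open.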

Type | Name | Indexed? | Compressed? | Lossy? | Transparency? | CMYK? | Use for | Avoid for | Comments
JPG | Joint Photographic Experts Group | No | Yes | Yes | No | No | When small file size is more important than quality | Text, line art, and other two-tone images | Variable fidelity can be useful
TIF | Tagged Image File Format | Optional | Optional | No | Yes | TIFF/IT | When file size is not an issue | When it is! | Widely used for georeferenced images
PNG | Portable Network Graphics | Optional | Usually | No | Yes | No | Almost anything | When you need small files | Best all-rounder; pronounced 'ping'
PGM | Portable Gray Map | Optional | No | No | No | No | Playing with data | General use | Part of the Netpbm family; plain text file is good for scripting
GIF | Graphics Interchange Format | ≤256 colours | Yes | No | Yes | No | Animations | Photographs | Multiple layers interpreted as animation; doesn't scale
BMP | Windows bitmap | No | Optional | Optional | Yes | No | Nothing | Always avoid it | Esoteric and uncompressed, it has little going for it

What is WebP?

WebP is a new image format from Google. So far it is not that widely supported (e.g. by MediaWiki, GIMP, etc, or by Internet Explorer) but it could see wide adoption. Essentially, it's a bit like PNG but supports lossy compression... or like JPG that supports transparency. It is an open standard and unencumbered by patents. Find out more.

Use PNG for everything!

All this advice could have been much shorter: use PNG for all raster data. Unless file size is your main concern, you really can't go wrong.
