I expect that most graphics artists - amateur and pro - are comfortable
using the mainstays of graphic image formats for the Internet: GIFs and
JPGs. Well, your expertise is about to be thrown a curveball. Since these
formats were created, research has produced more sophisticated approaches.
Besides, JPEG originated about ten years ago; current hardware boasts far more
powerful processors and cheap memory, enough to support more demanding schemes.
On the horizon is JPEG 2000 (from those wonderful folks who brought you JPG),
which promises a host of improvements and introduces two new file extensions:
JP2 and JPX.
Ideally, any compression scheme should feature:
- high levels of compression (leading to minimal file size),
- no loss of detail,
- open standard (no royalties),
- no visible artefacts,
- no visible posterization (loss of colour information),
- easy use (few user-controlled variables), and
- fast execution.
The Internet has fuelled much of the demand for image compression since
end-users expect a website to pop up instantly on-screen. Despite faster
connections, large image file sizes are self-defeating, so smaller file sizes
were championed at the expense of image quality. Current schemes fall short of
the ideal: as compression ratios increase (and file sizes shrink), visual
artefacts become increasingly evident. Early in the life-cycle of the current
image formats, these inadequacies were largely masked by the comparatively low
resolution of desktop monitors and a restricted colour palette. The landscape
has changed: the explosion in popularity of digital cameras and photo-quality
colour inkjets has fuelled demand for algorithms (often referred to as
'transforms') that can deliver images with the resolution and colour fidelity
of the original intact.
Why JPEG can't 'cut' it
Most of the current compression schemes rely upon some variant of a
scheme called Discrete Cosine Transform (DCT) - relax, I will spare you
the math - combined with other tweaks to minimize the perceived degradation
in image quality (as interpreted by the human eye and brain). This scheme is
lossy; data that is squeezed out during encoding is lost forever. Yes, lossless
encoding schemes exist: Huffman coding and run-length encoding (RLE) are
examples. Simplified: instead of recording a string (run) of 200 red pixels
as 'red, red, red...', RLE would code it more economically as '200 times red.'
DCT is more complex: suffice it to say that it divides the image into 8x8
sub-blocks; each one is then processed separately, using the rules of the
algorithm. Since the division of an image into blocks is consistent, these
blocks occupy the same relative position within the
image. Abrupt transitions in the original image (light colour to dark,
for example) tend to create edge artefacts that become increasingly
evident as the compression ratio is increased. As an analogy, think of
the mosaic effect that is used to hide a person's identity on television
in an investigative report. Lossy schemes are preferred since they offer
higher compression ratios. Hence, a conflict exists between image file
size and quality.
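To make the run-length idea above concrete, here is a minimal Python sketch;
the function names and the toy 'red pixel' data are mine, purely for illustration:

    def rle_encode(pixels):
        # Collapse each run of identical values into a (count, value) pair.
        runs = []
        for value in pixels:
            if runs and runs[-1][1] == value:
                runs[-1][0] += 1         # extend the current run
            else:
                runs.append([1, value])  # start a new run
        return [tuple(run) for run in runs]

    def rle_decode(runs):
        # Expand the (count, value) pairs back into the original sequence.
        out = []
        for count, value in runs:
            out.extend([value] * count)
        return out

    row = ['red'] * 200 + ['blue'] * 3
    print(rle_encode(row))                     # [(200, 'red'), (3, 'blue')]
    assert rle_decode(rle_encode(row)) == row  # lossless: nothing is thrown away

The point is that decoding reproduces the original exactly; the catch is that
long runs are rare in photographic images, which is why such lossless schemes
cannot approach the ratios of lossy transforms.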
The mathematics aside, JPEG imposes other constraints: most notably,
colour is limited to eight bits per channel and only the RGB model is supported.
Indeed, the 'stacked blocks' render JPEG unsuitable for use in video: because
the block artefacts are the most stable feature in a video frame, they are
particularly evident while the rest of the frame's contents move around them.
While commercial graphics printing is far from my forte, JPEG's failings
would be glaringly evident in any printed material.
A bit of oblique theory
Regardless of the transform employed, if the compression is lossy then
artefacts are a fact of life. Evaluation of artefacts is tricky science.
First, you must quantify the error. On that basis you may have the numbers
to support your conclusion that method A is better than method B. The kicker
with numbers: they ignore the human conscious interpretation of the 'raw'
eye-brain data. Assume that you are evaluating two versions of the same
image: one has been compressed using an algorithm that results in some
loss of sharpness; the other, as a result of compression, exhibits DCT-style
blockiness. While the numerical amount of distortion may be the same, in
most cases the former image is perceived as better since the error pattern
is more diffused.
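The article does not name a metric, but a common way to 'quantify the error' is
a pixel-by-pixel measure such as mean squared error (MSE) or the related peak
signal-to-noise ratio (PSNR). The sketch below (a toy example of my own) shows
how two very different-looking distortions can score identically:

    import numpy as np

    def mse(original, degraded):
        # Mean squared error: the average of the squared pixel differences.
        o = np.asarray(original, dtype=float)
        d = np.asarray(degraded, dtype=float)
        return float(np.mean((o - d) ** 2))

    def psnr(original, degraded, peak=255.0):
        # Peak signal-to-noise ratio in decibels; higher means less distortion.
        err = mse(original, degraded)
        return float('inf') if err == 0 else 10 * np.log10(peak ** 2 / err)

    original = np.full((8, 8), 128.0)        # a flat grey patch

    # Distortion A: a small error diffused over every pixel (loss of sharpness).
    diffuse = original + 4.0

    # Distortion B: the same total error piled into one corner (DCT-style block).
    blocky = original.copy()
    blocky[:2, :2] += 16.0

    print(mse(original, diffuse), mse(original, blocky))    # 16.0 and 16.0
    print(psnr(original, diffuse), psnr(original, blocky))  # identical, ~36 dB

Both distortions produce exactly the same numbers, yet most viewers would judge
the diffused version the better image; that is precisely the limitation of raw
error figures described above.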
JPEG 2000 to the rescue
The new codec employs Wavelet Theory (a search on 'wavelet' on the Internet
will satisfy your curiosity), which operates on the entire image (as a single
entity rather than the blocks of DCT) and produces a continuous data stream
instead of the chunks inherent in DCT; this approach eliminates the blocking
artefacts. Wavelet theory deals with 'trends' - large image areas that have
a slow rate of variation (for example, a large expanse of a single colour) -
and 'outliers' - concentrated nodes of intense activity (for example,
edges). Compression can be lossy or lossless, with lossy compression ratios
up to 300:1. Practically, a compression ratio closer to 140:1 would define
the ceiling for 'visibly lossless' (compression artefacts would remain
invisible to a trained observer). The wavelet encoding process creates
artefacts that are more difficult to perceive and characterize. They are
much less visible in video sequences.
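The flavour of the transform can be shown with the simplest member of the
wavelet family, the one-dimensional Haar transform, applied here to a single
row of pixel values. (JPEG 2000 actually uses more sophisticated
two-dimensional wavelets, so treat this strictly as a sketch of the
trend/outlier idea, with data and function names invented for the purpose.)

    def haar_step(signal):
        # Split a row into 'trends' (pairwise averages: the slowly varying part)
        # and 'details' (pairwise differences: significant only at edges/outliers).
        pairs = list(zip(signal[0::2], signal[1::2]))
        trends = [(a + b) / 2 for a, b in pairs]
        details = [(a - b) / 2 for a, b in pairs]
        return trends, details

    def haar_inverse(trends, details):
        # Perfect reconstruction: trend plus/minus detail restores each pair.
        signal = []
        for t, d in zip(trends, details):
            signal.extend([t + d, t - d])
        return signal

    # A flat region (a 'trend') followed by an abrupt edge (an 'outlier').
    row = [100, 100, 100, 100, 100, 200, 200, 200]

    trends, details = haar_step(row)
    print(trends)   # [100.0, 100.0, 150.0, 200.0] -> a smooth, half-resolution copy
    print(details)  # [0.0, 0.0, -50.0, 0.0]       -> activity only where the edge sits

    # Keep everything and the reconstruction is exact (lossless).
    assert haar_inverse(trends, details) == row

    # Throw the details away and only the coarse trend survives (lossy, but the
    # error is a gentle smearing of the edge rather than an 8x8 block).
    print(haar_inverse(trends, [0.0] * len(details)))

Repeating the step on the trends yields a whole ladder of resolutions, which is
what makes the multi-pass, progressively improving decode described below
possible: send the coarsest trends first, then successively finer details until
the reconstruction is lossless.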
In comparison, the present JPEG would exhibit noticeable degradation
at 30:1. The Wavelet transform encodes trends at a lower resolution and
devotes more processing to the outliers. Wavelet images can be decompressed
in a series of passes; each iteration improves the image quality (resolution
and colour bit depth) up to a totally lossless reconstruction. The incorporation
of a defined colour space, sRGB, provides built-in colour management. Considerable
flexibility is possible since the standard offers definitions for extensions
that would incorporate (singly or in any combination):
- variable colour bit depths (up to 32 bits),
- ICC colour profiles (necessary to support CMYK),
- spot colours,
- alpha layers (transparency),
- metadata (copyright, camera/lens used, date/location),
- error tolerance (repair of damage due to noisy transmission),
- multiple coding or mixed raster content (proposed), and
- a video variant for QuickTime (proposed).
The multiple coding is intended for text and graphical content in the
same document. When implemented, it would segment the graphics and text
portions of a page. The graphics would be encoded with 'standard' JPEG
2000; a variation of the algorithm, tuned to maintain maximum sharpness
of the letter forms, would be applied to the text portions to enhance OCR accuracy.
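As an illustration of the segmentation step only (nothing here is prescribed by
the standard; the threshold, names, and toy page are my own assumptions), a
crude split might look like this:

    import numpy as np

    def segment_page(page, threshold=64):
        # Crude mixed-raster segmentation: dark pixels are treated as text or
        # line art, everything else as continuous-tone graphics.
        text_mask = page < threshold                    # binary mask of letter forms
        text_layer = np.where(text_mask, page, 255)     # text on a white field
        picture_layer = np.where(text_mask, 255, page)  # page with the text lifted out
        return text_mask, text_layer, picture_layer

    # A toy 'page': a mid-grey photograph with a few black strokes of text on it.
    page = np.full((6, 6), 180, dtype=np.uint8)
    page[1, 1:4] = 0
    page[3, 2:5] = 0

    mask, text_layer, picture_layer = segment_page(page)
    print(int(mask.sum()), "pixels routed to the text-tuned coder")
    # picture_layer would go to 'standard' JPEG 2000, text_layer to the
    # sharpness-preserving variant, and the mask tells a decoder how to recombine them.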
Another player, FlashPix, is a format developed by Kodak and partners
for digital camera images. It can achieve many of the goals set for the
JPEG 2000 format; however, the file sizes are larger.
In the larger scheme there are many Wavelet-dependent image storage
formats; indeed, JPEG 2000 is not the first. The proprietary details differ from
vendor to vendor; in consequence, these formats are mutually incompatible.
Compression engine suppliers include: LuRa Tech, Infinop, Summus, LizardTech,
ER Mapper, and PIC Tools (there are likely others). Interoperability, using
import/export filters, seems likely once JPEG 2000 has formed a user-base.
JPEG 2000 looks like a 'go.' It provides an incremental improvement
on JPEG by offering higher compression ratios with less (visibly perceived)
image degradation. Unlike GIF, it is unencumbered by proprietary issues (royalties).
Many major players endorse the JPEG 2000 standard. While that is not a guarantee
of success, a dearth of players would guarantee an early death. Once adopted,
market penetration will depend upon its addition to graphics applications.
Given its native ability to incorporate many of the features that define
current proprietary image formats (for example, transparency and alpha
channels), combined with varying degrees of compression (even none), it has
the credentials to replace these formats. Acceptance by the commercial
printing industry remains an unknown; however, many industry leaders have
apparently joined the JPEG 2000 standards committee. While nothing is guaranteed,
this format has, at least, the attributes that could satisfy that industry's
technical requirements - a standard, at last?
Originally published: June, 2001