Table of Contents
Preface
Acknowledgments
Chapter 1 Introduction
The Representation of Images
Vector and Bitmap Graphics
Color Models
True Color versus Palette
Compression
Byte and Bit Ordering
Color Quantization
A Common Image Format
Conclusion
Chapter 2 Windows BMP
Data Ordering
File Structure
Compression
Conclusion
Chapter 3 XBM
File Format
Reading and Writing XBM Files
Conclusion
Chapter 4 Introduction to JPEG
JPEG Compression Modes
What Part of JPEG Will Be Covered in This Book?
What are JPEG Files?
SPIFF File Format
Byte Ordering
Sampling Frequency
JPEG Operation
Interleaved and Noninterleaved Scans
Conclusion
Chapter 5 JPEG File Format
Markers
Compressed Data
Marker Types
JFIF Format
Conclusion
Chapter 6 JPEG Huffman Coding
Usage Frequencies
Huffman Coding Example
Huffman Coding Using Code Lengths
Huffman Coding in JPEG
Limiting Code Lengths
Decoding Huffman Codes
Conclusion
Chapter 7 The Discrete Cosine Transform
DCT in One Dimension
DCT in Two Dimensions
Basic Matrix Operations
Using the 2-D Forward DCT
Quantization
Zigzag Ordering
Conclusion
Chapter 8 Decoding Sequential-Mode JPEG Images
MCU Dimensions
Decoding Data Units
Decoding Example
Processing DCT Coefficients
Up-Sampling
Restart Marker Processing
Overview of JPEG Decoding
Conclusion
Chapter 9 Creating Sequential JPEG Files
Compression Parameters
Output File Structure
Doing the Encoding
Down-Sampling
Interleaving
Data Unit Encoding
Huffman Table Generation
Conclusion
Chapter 10 Optimizing the DCT
Factoring the DCT Matrix
Scaled Integer Arithmetic
Merging Quantization and the DCT
Conclusion
Chapter 11 Progressive JPEG
Component Division in Progressive JPEG
Processing Progressive JPEG Files
Processing Progressive Scans
MCUs in Progressive Scans
Huffman Tables in Progressive Scans
Data Unit Decoding
Preparing to Create Progressive JPEG Files
Encoding Progressive Scans
Huffman Coding
Data Unit Encoding
Conclusion
Chapter 12 GIF
Byte Ordering
File Structure
Interlacing
Compressed Data Format
Animated GIF
Legal Problems
Uncompressed GIF
Conclusion
Chapter 13 PNG
History
Byte Ordering
File Format
File Organization
Color Representation in PNG
Device-Independent Color
Gamma
Interlacing
Critical Chunks
Noncritical Chunks
Conclusion
Chapter 14 Decompressing PNG Image Data
Decompressing the Image Data
Huffman Coding in Deflate
Compressed Data Format
Compressed Data Blocks
Writing the Decompressed Data to the Image
Conclusion
Chapter 15 Creating PNG Files
Overview
Deflate Compression Process
Huffman Table Generation
Filtering
Conclusion
Glossary
Bibliography
Index
Forewords & Introductions
The purpose of this book is to instruct the reader on how to write software that can read and write files using various 2-D image formats. I wanted to write a book that explains the most frequently used file formats with enough depth for the reader to implement them, as opposed to one that covered many different formats at a high level or one that avoided the more difficult image formats. As a result, I chose to cover the image file formats that are associated with Web browsers. Those covered in this book (BMP, XBM, JPEG, GIF, and PNG) represent the vast majority of image files that can be found on the Internet. They employ a wide range of encoding techniques and range in implementation difficulty from simple to very complex.
The inspiration for this book was my own frustration resulting from the lack of information on how to implement encoders and decoders for the more complex file formats. Most of the information available was at too high a level, left major gaps, or was very difficult to decipher. I have tried to create a bridge between the programmer and the standards documents.
One issue I faced at the start of this project was which programming language to use for the examples. The intention was to create a book on graphics file formats rather than one on how to write programs to read and write graphics files in a particular language. Therefore, I debated whether to use a language that is easy to read (e.g., Pascal or Ada) or the one most people are likely to use (C++). In the end I felt that its widespread use made C++ the best choice. To make the examples more understandable for non-C++ programmers, I have carefully avoided certain C++ language constructs (e.g., expressions with side effects and integer/boolean interchangeability) that would make the code difficult for them to understand.
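As a rough illustration of the constructs mentioned above, the following sketch sums a buffer of pixel bytes in two ways. It is not taken from the book or its CD-ROM; the function names `SumTerse` and `SumPlain` and the choice of task are assumptions made purely for this example. The first version leans on side effects and on an integer used as a boolean, while the second spells out each step in the plainer style the preface describes.

```cpp
// Hypothetical example: both functions add up `count` bytes starting at `data`.

// Terse style: the loop test uses an integer as a boolean, and both the
// test (count--) and the body (*data++) change a variable as a side effect
// of an expression whose value is also being used.
unsigned int SumTerse(const unsigned char *data, unsigned int count)
{
  unsigned int sum = 0;
  while (count--)
    sum += *data++;
  return sum;
}

// Plain style: an explicit comparison in the loop test, array indexing
// instead of pointer arithmetic, and one modification per statement.
unsigned int SumPlain(const unsigned char *data, unsigned int count)
{
  unsigned int sum = 0;
  for (unsigned int ii = 0; ii < count; ii = ii + 1)
  {
    sum = sum + data[ii];
  }
  return sum;
}
```

Both functions return the same result for the same input. The second is longer, but a reader who knows Pascal or Ada can likely follow it without knowing C++ operator rules.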
In order to make the encoding and decoding processes as clear as possible, I have used a Pascal-like pseudo-code. C++ is used for complete function implementations and pseudo-code for illustrative fragments. These fragments generally contain no error checking.
Because of their generally large size, it was not possible to include working source code for the formats in the book itself. Instead, the accompanying CD-ROM contains the complete source code for encoders and decoders for almost all of the image formats covered. The reader should use the pseudo-code in the text to learn how processes work and the C++ examples on the CD to see how to implement them.
Generally, the decoders implement more features than the encoders. In the decoders I have implemented all of the features needed to decode files that a reader will have any likelihood of encountering on the Internet. For the sake of clarity, the encoders generally implement a smaller feature subset.
In writing the programming examples I have given clarity precedence over execution efficiency and instant portability. The source examples will compile, without modifications, on Microsoft Windows using both Borland C++Builder V3.0 and Microsoft Visual C++ V5.0. Other compilers generally require some modifications to the code.
The descriptions of the encoders and decoders for the various file formats frequently employ the term "user" to describe the source of certain input parameters to the encoding or decoding process. By this I mean the user of the encoder or decoder, not necessarily the person typing at the keyboard. Since image encoders and decoders are incorporated into other applications, such as image viewers and editors, the user in this case would most likely be another piece of software. However, in many situations the "user" application may get some of these parameters directly from a human.
Just as this is not intended to be a book on C++ programming, it is also not intended to be a book on programming in a specific environment. For that information readers will need a book for their particular system.
A project as large as producing a book requires the involvement of many people. Mike Bailey, Eric Haines, Tom Lane, Shawn Neely, and Glenn Randers-Pehrson reviewed the manuscript and provided many invaluable suggestions. Glenn also arranged for me to get the latest proposed PNG standards for the CD. My fellow aviator, Charlie Baumann, was kind enough to provide several of the photographs. Ralph Miano and Margaret Miano assisted with preparing the manuscript. Jean-loup Gailly answered all my questions on ZLIB. Albert "The Chipster" Copper compiled examples on systems I did not have access to. Most important, Helen Goldstein at AWL guided the process from start to finish.
John M. Miano
Summit, New Jersey
miano@colosseumbuilders.com