For some unknown reason, our almost-resident expert Hermitian has suggested that FlateDecode is a lossy compression.
Wikipedia may be helpful in correcting that impression. Of course, anyone familiar with zlib would already know that Flate is lossless.
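To see what lossless means in practice, here is a minimal Python sketch using only the standard library's zlib, which implements the same Deflate compression that FlateDecode undoes: a compress/decompress round trip returns byte-identical data, for any input.

```python
import zlib

# Deflate (what FlateDecode undoes) is lossless: decompressing the
# compressed stream restores the original bytes exactly.
original = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET\n" * 100

compressed = zlib.compress(original, level=9)
restored = zlib.decompress(compressed)

assert restored == original  # holds for any input whatsoever
print(f"{len(original)} bytes -> {len(compressed)} compressed; round trip exact")
```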
Lossy compression examples for PDFs include:
- DCTDecode – aka JPEG; always lossy, even at the highest quality setting (a short demonstration follows this list)
- JPXDecode – JPEG 2000, wavelet based; lossless or lossy
- JBIG2Decode – lossless or lossy
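To back up the DCTDecode point, here is a small sketch, assuming Pillow and NumPy are available (the image size and content are arbitrary): even at the encoder's maximum quality setting, a JPEG round trip changes pixel values.

```python
import io

import numpy as np
from PIL import Image

# Even at quality=100 the DCT coefficients are rounded to integers,
# so a JPEG round trip perturbs non-trivial content.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, (64, 64), dtype=np.uint8)  # grayscale noise
buf = io.BytesIO()
Image.fromarray(pixels, mode="L").save(buf, format="JPEG", quality=100)

decoded = np.asarray(Image.open(io.BytesIO(buf.getvalue())))
print("byte-identical:", np.array_equal(pixels, decoded))  # False
print("max pixel error:",
      int(np.abs(pixels.astype(int) - decoded.astype(int)).max()))
```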
Let me also explain why I believe most programs do not touch DCTDecoded data. It has been argued and observed that Preview preserves the exact DCTDecode stream, and for good reason: an editor that re-encoded the image could either compress the decoded bitmap losslessly, in which case the file size would explode, or run it through JPEG again, stacking a second round of lossy compression on top of the first. I believe Adobe's tools are more destructive here.
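A rough sketch of that trade-off, again assuming Pillow and NumPy (the noise image merely stands in for a real photograph, where the gap is typically far larger):

```python
import io
import zlib

import numpy as np
from PIL import Image

# Build a throwaway JPEG to stand in for an image already in the PDF.
rng = np.random.default_rng(1)
buf = io.BytesIO()
Image.fromarray(rng.integers(0, 256, (512, 512, 3), dtype=np.uint8),
                "RGB").save(buf, format="JPEG", quality=85)
jpeg_bytes = buf.getvalue()

# Option A: decode and store the bitmap losslessly -- the size explodes.
bitmap = np.asarray(Image.open(io.BytesIO(jpeg_bytes))).tobytes()
flated = zlib.compress(bitmap, level=9)

# Option B: run JPEG again -- smaller, but adds a second lossy pass.
print(f"original DCTDecode stream: {len(jpeg_bytes):>9,} bytes")
print(f"bitmap behind FlateDecode: {len(flated):>9,} bytes")
```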
The way a PDF editor typically works is that it maintains two separate ‘trees’: one holds the PDF object tree with all the raw objects, the other holds the rendered information. When objects are not touched, their raw data is written back verbatim, which avoids the JPEG recompression problem entirely. This is also why Preview maintains the landscape orientation of the images.
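In code, the idea might look roughly like this hypothetical sketch (all names are mine, not any particular editor's): untouched objects carry their original bytes and are written back verbatim, and only objects the user actually modified get re-serialized.

```python
from dataclasses import dataclass

@dataclass
class PdfObject:
    obj_id: int
    raw_bytes: bytes           # exact bytes as read from the file
    dirty: bool = False        # set only when the editor changes the object
    edited_bytes: bytes = b""  # re-serialized form, used only when dirty

class PdfDocument:
    def __init__(self) -> None:
        self.objects: dict[int, PdfObject] = {}

    def edit(self, obj_id: int, new_bytes: bytes) -> None:
        obj = self.objects[obj_id]
        obj.edited_bytes, obj.dirty = new_bytes, True

    def save(self) -> bytes:
        # Untouched objects go out byte-for-byte, so an unmodified
        # DCTDecode stream is never recompressed.
        return b"".join(
            obj.edited_bytes if obj.dirty else obj.raw_bytes
            for obj in self.objects.values()
        )
```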
I can appreciate that PDF-encoded data may seem overwhelming to someone recently introduced to it, so on seeing /FlateDecode /DCTDecode in a stream's filter entry, one may be initially confused by the order. However, a quick logical analysis of the two possibilities quickly eliminates the flow in which the bitmap was first zipped up and the zip data then somehow encoded in a lossy fashion: imagine the surprise, on trying to inflate such data, at finding it is no longer a valid zlib stream because the lossy DCT step has mangled it.
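A sketch of the decoding side, assuming Pillow for the JPEG step (the helper name is mine): the filters are applied in the order listed, so the only encoding order that can produce a decodable stream is JPEG first, Flate second.

```python
import io
import zlib

from PIL import Image

def decode_image_stream(stream: bytes, filters: list) -> bytes:
    """Apply the /Filter entries in the order they are listed."""
    data = stream
    for name in filters:
        if name == "/FlateDecode":
            data = zlib.decompress(data)                   # lossless inflate
        elif name == "/DCTDecode":
            data = Image.open(io.BytesIO(data)).tobytes()  # JPEG decode
        else:
            raise ValueError(f"unhandled filter {name}")
    return data

# Build a sample stream the only way that works: JPEG first, then Flate.
buf = io.BytesIO()
Image.new("RGB", (8, 8), "red").save(buf, format="JPEG")
stream = zlib.compress(buf.getvalue())

pixels = decode_image_stream(stream, ["/FlateDecode", "/DCTDecode"])
print(len(pixels), "raw pixel bytes")  # 8 * 8 * 3 = 192
```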
Alternatively, one could simply have read the PDF specification, which states that the filters in a /Filter array are listed in the order in which they are to be applied during decoding.