I never really gave it much thought, but my recent research into the images supported by Xerox WorkCentres caused me to ponder the issue of DCTDecode. DCTDecode is used to encode an image using JPEG compression, and the resulting stream is also a fully functional JPEG image. In PDF, however, one should consider the object as a TIFF bitmap encoded using JPEG compression. I recently noticed the relevance of the notation in the Xerox 7655 specification:
It helps explain why JFIF is not a required format: the PDF format itself contains the information that would typically be encoded in the APP14 tag.
A minor mystery laid to rest. Also note that the YCbCr tag is obviously inserted by some older software to help with managing the JPEG encoding, as JPEG itself provided no way to indicate the colorspace used. PDF resolved this by recording the colorspace in the PDF data (the /ColorSpace entry), while JFIF was proposed as an extension to the original JPEG standard.
Now the relevance of DCTDecode objects makes much better sense: they are JPEG-encoded data streams used to encode a TIFF image.
The YCbCr comment tag is likely added to inform the PDF generator as to the nature of the DCTDecode object. It all starts to make sense.
This also makes sense once you realize that Xerox uses the same DCTDecode object to encode LinearGray images, a color space not supported by JPEG.
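To make the role of these tags a bit more concrete, here is a minimal Python sketch that walks the header segments of a JPEG stream (such as the raw contents of a DCTDecode object) and reports whether it carries a JFIF APP0 segment, an Adobe APP14 segment, or any comment (COM) segments. The function names `list_jpeg_segments` and `describe_color_hints` and the synthetic `sample` stream are my own illustrations, not anything produced by a Xerox machine, and the sketch only reads the header up to the first scan.

```python
import struct

# Names for a few common JPEG markers; anything else is reported as hex.
MARKER_NAMES = {0xE0: "APP0", 0xEE: "APP14", 0xFE: "COM", 0xDA: "SOS"}


def list_jpeg_segments(data: bytes):
    """Return (marker_name, payload) pairs for the header segments of a JPEG stream."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing SOI marker")
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker == 0xD9:  # EOI: end of image
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        segments.append((MARKER_NAMES.get(marker, f"0x{marker:02X}"), payload))
        if marker == 0xDA:  # SOS: compressed scan data follows, stop here
            break
        pos += 2 + length
    return segments


def describe_color_hints(data: bytes):
    """Collect the colorspace-related hints found in the JPEG header."""
    hints = {"jfif": False, "adobe_transform": None, "comments": []}
    for name, payload in list_jpeg_segments(data):
        if name == "APP0" and payload.startswith(b"JFIF\x00"):
            hints["jfif"] = True
        elif name == "APP14" and payload.startswith(b"Adobe") and len(payload) >= 12:
            # Last byte of the Adobe segment: 0 = unknown, 1 = YCbCr, 2 = YCCK.
            hints["adobe_transform"] = payload[11]
        elif name == "COM":
            hints["comments"].append(payload.decode("latin-1"))
    return hints


def _segment(marker: int, payload: bytes) -> bytes:
    """Build one marker segment (helper for the synthetic sample only)."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


# Synthetic header: a "YCbCr" comment plus an Adobe segment, but no JFIF segment.
sample = (
    b"\xff\xd8"
    + _segment(0xFE, b"YCbCr")
    + _segment(0xEE, b"Adobe" + b"\x00\x64" + b"\x00\x00" + b"\x00\x00" + b"\x01")
    + b"\xff\xd9"
)
hints = describe_color_hints(sample)
```

Running this kind of check on a stream extracted from a PDF would show at a glance whether the JPEG relies on a comment or an Adobe segment rather than on JFIF to describe its colorspace.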
August 28: A sad day, as I have decided to no longer allow Hermitian to submit comments on this blog, since he has now several times accused me of behavior for which he has no evidence (hinting that I may be the forger, work for Obama, that I withhold data or manipulate data, and other non sequiturs). I feel saddened because, despite his shortcomings, he did serve a useful purpose. I wish him well and will continue to address the issues he raises, to help him better understand why the Xerox workflow stands unassailed. Thank you, Hermitian, for your efforts to debunk the workflow, which have only helped strengthen it further.
From FreeRepublic comes some good feedback from a poster named Butterdezilion.
Butterdezilion: If the Xerox machine is substituting exact replicas every time a certain “blob” (such as a box) appears, then that should happen with every box, every letter, etc. If the Xerox is switching 6’s for 8’s then where are those numbers switched around in the White House PDF?
A good question, but as I have shown, Mixed Raster Content (MRC) compression is anything but exact: it appears to be extremely sensitive to small variations. I have seen examples with anywhere from 4 to 17 foreground images. The same holds for JBIG2: whether two blobs, such as two boxes, are treated as the same symbol depends on how similar they are. In the samples I have, JBIG2 failed to capture the boxes, although it did capture other letters.
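To illustrate why small variations matter, here is a toy Python sketch of threshold-based symbol matching. It is emphatically not the algorithm the Xerox uses; the functions `mismatch_fraction` and `same_symbol`, the 5% threshold, and the tiny 5x5 "box" bitmaps are all assumptions made up for the illustration.

```python
def mismatch_fraction(a, b):
    """Fraction of pixels that differ between two equally sized bitmaps.

    Bitmaps are lists of strings, one character per pixel.
    Bitmaps of different dimensions are treated as a complete mismatch.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return 1.0
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    differing = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return differing / total


def same_symbol(a, b, threshold=0.05):
    """Treat two blobs as the same symbol if few enough pixels differ."""
    return mismatch_fraction(a, b) <= threshold


box = ["#####", "#...#", "#...#", "#...#", "#####"]
one_off = ["#####", "#...#", "#.#.#", "#...#", "#####"]  # one stray pixel
two_off = ["#####", "##..#", "#.#.#", "#...#", "#####"]  # two stray pixels

print(same_symbol(box, one_off))  # 1 of 25 pixels differ
print(same_symbol(box, two_off))  # 2 of 25 pixels differ
```

With this made-up threshold, a single stray pixel still lets the two boxes match, while a second one pushes them apart. Scanner noise at that scale would be enough to explain why one box gets replaced by a stored copy and the next one does not.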