Examining Johanna’s LFBC PDF

Orly has made available a fully unredacted PDF of “Hawaii Girl” Johanna XXXX’s long form birth certificate which was obtained in 1995. Orly is, unsurprisingly, confused about the whole issue and believes that this document was obtained in 2011… Sigh…

But now let’s look at the document in more detail:

Size: 122,387 bytes
Shasum: 5c14e5659931ac3db0278ec4207578bdf59c5a13

PDF Format: 1.4
Media Box: 612×396
Content Creator: Canon iR-ADV 8105  PDF
Encoding software: Adobe PSL 1.1e for Canon
Creation Date: May 4, 2011, 10:17 PM
Modification Date: Nov 25, 2013, 10:01 AM
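
The size and checksum quoted above can be reproduced on any copy of the file. A minimal sketch, assuming the 40-digit Shasum values in this post are SHA-1 digests (the default of the shasum tool); the function name file_fingerprint is my own and not part of any tool:

```python
import hashlib
from pathlib import Path


def file_fingerprint(path):
    """Return (size in bytes, SHA-1 hex digest) for the file at `path`."""
    data = Path(path).read_bytes()
    return len(data), hashlib.sha1(data).hexdigest()


# Hypothetical usage, assuming the PDF was saved as Johanna-BC.pdf:
#   size, digest = file_fingerprint("Johanna-BC.pdf")
#   print(size, digest)
```

If the numbers differ from the ones listed here, the copy being examined is not byte-for-byte the same file.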

Running pdfimages -j Johanna-BC.pdf Johanna creates 4 files:

Rotated: 90 degrees clockwise
Shasum: a71d0fa7abcbc5dd92a2bcc09d1ee8b1cf0e3443  Johanna-000.jpg
150×150 dpi 825×1275 pixels (lots of white at right side (bottom jpeg))
Embedded Comment: “Canon Inc”

Rotated: 90 degrees clockwise
Shasum: 4798c65ba40fe11743045ae1ede44f48fceab93e  Johanna-001.pbm
1416×1792 pixels

Rotated: 90 degrees clockwise
Shasum: 678ad4edae3ba01e332b3e102ffc441d02261908  Johanna-002.pbm
554×1040 pixels

Rotated: 90 degrees clockwise
Shasum: 4992b1910c50520847720aea848b69e8da63d398  Johanna-003.jpg
150×150 dpi 825×1275 pixels
Embedded Comment: “Canon Inc”
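
The pixel dimensions and the embedded comment of an extracted JPEG can be read straight from its marker segments. A minimal sketch, assuming the comment is stored in a standard JPEG COM segment (I have not verified how the Canon stores it); scan_jpeg_header is a hypothetical helper, not part of pdfimages:

```python
import struct

# Start-of-frame markers carry the image size (0xC4, 0xC8, 0xCC are not SOF).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def scan_jpeg_header(data):
    """Walk the marker segments before the scan data of a JPEG byte string."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    info = {"width": None, "height": None, "comments": [], "app_segments": []}
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:            # padding byte before a marker
            pos += 1
            continue
        if marker in (0xD9, 0xDA):    # EOI or SOS: the header is over
            break
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:  # markers without a length
            pos += 2
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xFE:            # COM: free-text comment
            info["comments"].append(payload.decode("latin-1"))
        elif 0xE0 <= marker <= 0xEF:  # APPn: keep the 5-byte identifier
            info["app_segments"].append((f"APP{marker - 0xE0}", payload[:5]))
        elif marker in SOF_MARKERS:   # SOF: precision, height, width, ...
            info["height"], info["width"] = struct.unpack(">HH", payload[1:5])
        pos += 2 + length
    return info
```

Calling scan_jpeg_header(open("Johanna-000.jpg", "rb").read()) on one of the extracted files would list its comments, its APPn segments and its width and height.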

There are no obvious halos and, unlike in the WH LFBC PDF, the gaps left behind where the bitmap text was removed are filled with a background color, which obviously reduces the halo effect.

Reality Check – We accept your challenge Mr Zullo

Reality Check has challenged Mr Zullo.

What will Zullo do? He can choose to show how a ‘criminal investigation’ deals with information that contradicts its findings, or he can ignore it.

Either way, Zullo is faced with quite an unfortunate situation of his own creation.


PDF Images

I never really gave it much thought, but my recent research into the images supported by Xerox WorkCentres caused me to ponder the issue of DCTDecode. DCTDecode is used to encode an image using JPEG compression, and the resulting stream is also a fully functional JPEG image. In PDF, however, one should consider the object as a TIFF bitmap encoded using JPEG compression. I recently noticed the relevance of the notation in the Xerox 7655 specification:

This helps explain why JFIF is not a required format: the PDF format itself contains the information that would typically be encoded in the APP14 tag.

A minor mystery laid to rest. Also note that the YCbCr tag is obviously inserted by some older software to help with managing the JPEG encoding, as JPEG itself failed to provide a way to indicate the colorspace used. While PDF resolved this by encoding it in the PDF data (/ColorSpace tag), JFIF was proposed to extend the original JPEG standard.

Now the relevance of DCTDecode objects makes much better sense: they are JPEG-encoded data streams used to encode a TIFF image.

The YCbCr comment tag is likely added to inform the PDF generator as to the nature of the DCTDecode object. It all starts to make sense.

This also makes sense when one realizes that Xerox uses the same DCTDecode object to encode LinearGray images, a color space not supported by JPEG.
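
The point that the PDF itself records the color space of a DCTDecode stream can be checked by listing the image dictionaries in a file. A rough sketch, assuming a PDF 1.4 style file whose object dictionaries are stored as plain text (no object streams). This is a regular-expression scan and not a real PDF parser, and list_image_dicts is a hypothetical name:

```python
import re

# Text between "N G obj" and the first "stream" or "endobj" keyword.
OBJECT_HEAD = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)(?:stream|endobj)", re.DOTALL)


def _first(pattern, blob):
    """Return the first captured group as text, or None if absent."""
    match = re.search(pattern, blob)
    return match.group(1).decode("latin-1") if match else None


def list_image_dicts(pdf_bytes):
    """List filter, color space and size for every image XObject found."""
    images = []
    for match in OBJECT_HEAD.finditer(pdf_bytes):
        body = match.group(3)
        if not re.search(rb"/Subtype\s*/Image\b", body):
            continue
        width = _first(rb"/Width\s+(\d+)", body)
        height = _first(rb"/Height\s+(\d+)", body)
        images.append({
            "object": int(match.group(1)),
            "filter": _first(rb"/Filter\s*(\[[^\]]*\]|/\w+)", body),
            "colorspace": _first(
                rb"/ColorSpace\s*(\[[^\]]*\]|/\w+|\d+\s+\d+\s+R)", body),
            "width": int(width) if width else None,
            "height": int(height) if height else None,
        })
    return images
```

An entry whose filter is /DCTDecode carries its color space in the dictionary, so a reader does not need a JFIF header inside the stream to interpret it.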

Wherever the data takes us…


I am tracking a score sheet for features explained versus objections raised.

Confirmation Documents

August 28: A sad day, as I have decided to no longer allow Hermitian to submit comments on this blog. He has now several times accused me of behavior for which he has no evidence (hinting that I may be the forger, that I work for Obama, that I withhold or manipulate data, and other non sequiturs). I feel saddened because, despite his shortcomings, he did serve a useful purpose. I wish him well and will continue to address the issues he raises, to help him better understand why the Xerox workflow stands unassailed. Thank you, Hermitian, for your efforts to debunk the workflow, which helped strengthen it further.


Butterdezillion – Some good questions

From Free Republic we receive some good feedback from a poster named Butterdezillion.

Butterdezillion: If the Xerox machine is substituting exact replicas every time a certain “blob” (such as a box) appears, then that should happen with every box, every letter, etc. If the Xerox is switching 6’s for 8’s then where are those numbers switched around in the White House PDF?

A good question, but as I have found and shown, the Mixed Raster Compression is anything but exact, as it appears to be extremely sensitive to small variations. I have seen examples with anywhere from 4 to 17 foreground images. The same holds for JBIG2: whether a blob is reused depends on how similar two blobs, such as two boxes, are. In the samples I have, JBIG2 fails to capture the boxes, but it does capture letters.
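
Why small variations matter can be shown with a toy similarity test. This is only an illustration of threshold-based blob matching, not the actual JBIG2 or Xerox algorithm; the 5% threshold and the function names are assumptions of mine:

```python
def mismatch_ratio(a, b):
    """Fraction of differing pixels between two equal-sized binary bitmaps.

    Bitmaps are lists of strings, one string per row ('#' ink, '.' paper).
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("bitmaps must have the same dimensions")
    total = sum(len(row) for row in a)
    differing = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return differing / total


def can_share_symbol(a, b, threshold=0.05):
    """True if the two blobs are similar enough to be stored as one symbol."""
    return mismatch_ratio(a, b) <= threshold


box = ["#####",
       "#...#",
       "#...#",
       "#####"]
noisy_box = ["#####",
             "#..##",   # one stray pixel from the scan
             "#...#",
             "####."]   # one pixel lost in the corner

print(can_share_symbol(box, box))        # identical blobs are merged
print(can_share_symbol(box, noisy_box))  # 2 of 20 pixels differ: kept separate
```

Under these assumptions two boxes that look the same to the eye are stored separately as soon as scan noise pushes them past the threshold, so replicas appear for some blobs and not for others.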
