Long Form Birth Certificate – The gory details – Part 2

As I showed in Part 1, Obj 6 is the resource dictionary that lists the 9 images (XObjects or ‘external objects’). It is time to explore the nature of these images in more detail.


obj 6 0
 Type:
 Referencing: 26 0 R, 11 0 R, 20 0 R, 22 0 R, 24 0 R, 9 0 R, 
              14 0 R, 7 0 R, 18 0 R, 12 0 R, 16 0 R
<<
   /ProcSet [ /PDF /ImageB /ImageC /ImageI ]
   /ColorSpace
   <<
      /Cs2 26 0 R
      /Cs1 11 0 R
   >>
   /XObject
   <<
      /Im7 20 0 R
      /Im8 22 0 R
      /Im9 24 0 R
      /Im2  9 0 R
      /Im4 14 0 R
      /Im1  7 0 R
      /Im6 18 0 R
      /Im3 12 0 R
      /Im5 16 0 R
   >>
>>
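
The name-to-object mapping in the /XObject dictionary can be pulled out mechanically. Here is a minimal sketch that runs a regular expression over the dump text as printed above; the function name `xobject_map` and the `RESOURCES` string are mine, for illustration only, and this is not how the parser tool itself works.

```python
import re

# The /XObject part of obj 6, copied from the dump above.
RESOURCES = """
/XObject
<<
   /Im7 20 0 R
   /Im8 22 0 R
   /Im9 24 0 R
   /Im2  9 0 R
   /Im4 14 0 R
   /Im1  7 0 R
   /Im6 18 0 R
   /Im3 12 0 R
   /Im5 16 0 R
>>
"""

def xobject_map(text):
    """Return {name: object number} for every '/ImN <obj> <gen> R' entry."""
    pattern = re.compile(r"/(Im\d+)\s+(\d+)\s+(\d+)\s+R")
    return {name: int(obj) for name, obj, _gen in pattern.findall(text)}

images = xobject_map(RESOURCES)
# List the images in name order rather than dictionary order.
for name in sorted(images, key=lambda n: int(n[2:])):
    print(f"/{name} -> obj {images[name]}")
```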

/Im7 – Obj 20 – “Non” Stencil Mask Bitmap (34×70), 173 bytes compressed

obj 20 0
 Type: /XObject
 Referencing: 21 0 R
 Contains stream
  <<
     /Length           21 0 R
     /Type             /XObject
     /Subtype          /Image
     /Width              34
     /Height             70
     /ImageMask        true
     /BitsPerComponent 1
     /Filter           /FlateDecode
  >>
obj 21 0
 Type: 
 Referencing: 
 [(1, '\n'), (3, '173'), (1, '\n')]

bc-006

As explained in the PDF 1.3 Specification:

Stencil Masking

An image mask (an image XObject whose ImageMask entry is true) is a monochrome image, in which each sample is specified by a single bit. However, instead of being painted in opaque black and white, the image mask is treated as a stencil mask that is partly opaque and partly transparent. Sample values in the image do not represent black and white pixels; rather, they designate places on the page that should either be marked with the current color or masked out (not marked at all). Areas that are masked out retain their former contents. The effect is like applying paint in the current color through a cut-out stencil, which allows the paint to reach the page in some places and masks it out in others.
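
The stencil behaviour can be sketched in a few lines of Python. This is only an illustration under some assumptions: the stream is plain FlateDecode with no predictor, each row is padded to a whole byte with the high-order bit first, and the default /Decode [0 1] applies, so a sample of 0 paints and a sample of 1 masks out. The function names and the tiny 10×2 test mask are hypothetical, not taken from the birth certificate PDF.

```python
import zlib

def decode_mask(stream, width, height):
    """Inflate a FlateDecode 1-bit image and return rows of 0/1 samples."""
    raw = zlib.decompress(stream)
    row_bytes = (width + 7) // 8  # each row is padded to a whole byte
    rows = []
    for y in range(height):
        row = raw[y * row_bytes:(y + 1) * row_bytes]
        # Samples are packed high-order bit first.
        rows.append([(row[x // 8] >> (7 - x % 8)) & 1 for x in range(width)])
    return rows

def paint_through_mask(page, mask, color):
    """Sample 0 paints the current color; sample 1 keeps the old contents."""
    return [
        [color if bit == 0 else old for old, bit in zip(page_row, mask_row)]
        for page_row, mask_row in zip(page, mask)
    ]

# Hypothetical 10x2 mask: two paint samples at the start of row 0
# and two at the end of row 1.
stream = zlib.compress(bytes([0x3F, 0xC0, 0xFF, 0x00]))
mask = decode_mask(stream, 10, 2)
page = [["."] * 10 for _ in range(2)]
painted = paint_through_mask(page, mask, "#")
for row in painted:
    print("".join(row))
```

With the same row padding, the 34×70 mask in obj 20 would inflate to 5 bytes per row, or 350 bytes in total.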

/Im8 – Obj 22 – “Speckles” Stencil Mask Bitmap (243×217), 671 bytes compressed

obj 22 0
 Type: /XObject
 Referencing: 23 0 R
 Contains stream
  <<
      /Length           23 0 R
      /Type             /XObject
      /Subtype          /Image
      /Width             243
      /Height            217
      /ImageMask        true
      /BitsPerComponent 1
      /Filter           /FlateDecode
  >>
obj 23 0
 Type: 
 Referencing: 
 [(1, '\n'), (3, '671'), (1, '\n')]

bc-007

/Im9 – Obj 24 – “Speckles” Stencil Mask Bitmap (132×142), 344 bytes compressed

obj 24 0
 Type: /XObject
 Referencing: 25 0 R
 Contains stream
  <<
    /Length           25 0 R
    /Type             /XObject
    /Subtype          /Image
    /Width             132
    /Height            142
    /ImageMask        true
    /BitsPerComponent 1
    /Filter           /FlateDecode
  >>
obj 25 0
 Type: 
 Referencing: 
 [(1, '\n'), (3, '344'), (1, '\n')]

bc-008

/Im2 – Obj 9 – “Text” Stencil Mask Bitmap (1454×1819), 67980 bytes compressed

obj 9 0
 Type: /XObject
 Referencing: 10 0 R
 Contains stream
<<
 /Length           10 0 R
 /Type             /XObject
 /Subtype          /Image
 /Width            1454
 /Height           1819
 /ImageMask        true
 /BitsPerComponent 1
 /Filter           /FlateDecode
 >>
obj 10 0
 Type:
 Referencing:
 [(1, '\n'), (3, '67980'), (1, '\n')]

bc-001

/Im4 – Obj 14 – “Date stamp” (42×274), 480 bytes compressed

obj 14 0
 Type: /XObject
 Referencing: 15 0 R
 Contains stream
<<
 /Length           15 0 R
 /Type             /XObject
 /Subtype          /Image
 /Width              42
 /Height            274
 /ImageMask        true
 /BitsPerComponent 1
 /Filter           /FlateDecode
 >>
obj 15 0
 Type:
 Referencing:
 [(1, '\n'), (3, '480'), (1, '\n')]

bc-003

/Im1 – Obj 7 – 8-bit background JPEG (1652×1276), 299366 bytes compressed

obj 7 0
 Type: /XObject
 Referencing: 8 0 R, 11 0 R
 Contains stream
<<
 /Length 8 0 R
 /Type             /XObject
 /Subtype          /Image
 /Width            1652
 /Height           1276
 /ColorSpace       11 0 R        % Colorspace
 /BitsPerComponent 8             % 8 bits
 /Filter           /DCTDecode    % Discrete Cosine Transform (JPEG)
 >>
obj 8 0
 Type:
 Referencing:
 [(1, '\n'), (3, '299366'), (1, '\n')]

bc-000

/Im6 – Obj 18 – “Date stamp” (47×216), 436 bytes compressed

obj 18 0
 Type: /XObject
 Referencing: 19 0 R
 Contains stream
<<
 /Length           19 0 R
 /Type             /XObject
 /Subtype          /Image
 /Width              47
 /Height            216
 /ImageMask        true
 /BitsPerComponent 1
 /Filter           /FlateDecode
 >>
obj 19 0
 Type:
 Referencing:
 [(1, '\n'), (3, '436'), (1, '\n')]

bc-005

/Im5 – Obj 12 – “Date stamp” (199×778), 5510 bytes compressed

obj 12 0
 Type: /XObject
 Referencing: 13 0 R
 Contains stream
<<
 /Length           13 0 R
 /Type             /XObject
 /Subtype          /Image
 /Width             199
 /Height            778
 /ImageMask        true
 /BitsPerComponent 1
 /Filter           /FlateDecode
 >>
obj 13 0
 Type:
 Referencing:
 [(1, '\n'), (3, '5510'), (1, '\n')]

bc-004

/Im3 – Obj 16 – “Signature DOH” (123×228), 633 bytes compressed

obj 16 0
 Type: /XObject
 Referencing: 17 0 R
 Contains stream
<<
 /Length           17 0 R
 /Type             /XObject
 /Subtype          /Image
 /Width             123
 /Height            228
 /ImageMask        true
 /BitsPerComponent 1
 /Filter           /FlateDecode
 >>
obj 17 0
 Type:
 Referencing:
 [(1, '\n'), (3, '633'), (1, '\n')]

bc-002
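
To put the compressed sizes in perspective, here is a small sketch that estimates the uncompressed size of each 1-bit mask from its /Width and /Height and compares it with the stream length listed above. It assumes each row is padded to a whole byte; the entries are keyed by object number, and the helper name `raw_size` is mine.

```python
# (object number, width, height, compressed bytes) copied from the dumps above
MASKS = [
    (20, 34, 70, 173),
    (22, 243, 217, 671),
    (24, 132, 142, 344),
    (9, 1454, 1819, 67980),
    (14, 42, 274, 480),
    (18, 47, 216, 436),
    (12, 199, 778, 5510),
    (16, 123, 228, 633),
]

def raw_size(width, height):
    """Bytes of 1-bit sample data when every row is padded to a whole byte."""
    return ((width + 7) // 8) * height

for obj, w, h, packed in MASKS:
    raw = raw_size(w, h)
    print(f"obj {obj:2d}: {w}x{h}  raw {raw:6d} bytes  "
          f"compressed {packed:5d} bytes  ratio {raw / packed:.1f}")
```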

6 thoughts on “Long Form Birth Certificate – The gory details – Part 2”

  1. I see all of the images are ‘rotated’ (or to be precise, require rotation to be viewed in the normal orientation). Some of the birthers made it sound as if the background hadn’t been rotated – this clears it up for me.

    I can think of three explanations for the rotation. First, that it was fed into the scanner in landscape orientation and the scanner rotated the page after separating the layers. Second, that the pdf created by the scanner was in landscape, and Preview was used to rotate it (also would explain why it was saved in Preview). Third, the graphics program used by the scanner saves bitmaps with “top-down” format. Bitmaps can be saved with the origin in the top left corner (“top-down”) or in the bottom left corner (“bottom-up”). PDF uses “bottom-up” for its origin. If you try to display a “top-down” bitmap “bottom-up” it will rotate counter-clockwise like the images shown above. For more on this, see:

    http://www.fileformat.info/format/bmp/egff.htm

    One experiment suggests itself. Create a pdf file in Illustrator that has a background image and several monochrome layers, that is in landscape. Open the file in Preview, rotate it, and save as pdf. I wonder how that would show up in the gory details – particularly if it causes tell-tale changes.

  2. Upon further digging, it appears that the top-down bitmap would not be rotated, but rather mirrored upside-down. However, TIFF, a very common format for scanned documents, does allow for “top-right”, which would have the proper rotation, and this choice of orientation could be explained by the way the scanner scanned the document (i.e., scanning the page from left to right, which is the scan order for a fed document in Portrait orientation, and which just happens to be the most common way you scan a document on a Xerox WorkCentre).

  3. This may be getting somewhere

    In the case of a TIFF, even when a portrait manuscript is read as Long Edge Feed (LEF), when it is displayed at the PC, it will be displayed as an oblong (Short Edge Feed – SEF). If the customer desires that TIFFs be displayed in Portrait orientation, the documents must be scanned in SEF.

    Here

    Let me think how this would reconcile with the 8-bit alignment on the top and left side of the objects. Need to ponder a bit.

  4. I think I figured out the alignment of the monochrome layers: it all comes back to the JBIG2 compression scheme. Described briefly, JBIG2 compresses by creating a library of repeated symbols plus the locations where those symbols start. The closer to the origin a symbol is located, the fewer bits are needed to describe the location (e.g., 3,5 takes fewer keystrokes than 367,1239). Therefore, it is advantageous to set the origin as close to the symbols as possible. For JBIG2, the origin is in the lower left. The non-axis boundaries do not actually increase the size of the compressed file. So as long as they are at least far enough away to encompass every symbol location, the upper and right borders are immaterial to the compression.

    Thus, if you start with a layer built from 8×8 blocks, it is beneficial to trim the left and lower boundaries as tight as possible, but there is no need to trim the upper and right boundaries.

    See more here.

    The money quote:

    Symbol Compression Components
    • Symbol Dictionaries stored in Symbol Libraries,
    containing:
    – Symbol images of similar height grouped together in sorted width
    – Each height class is CCITT Group 4 compressed
    – Height class delta Huffman encoded.
    – Each dictionary symbol has unique ID
    • Symbol location data, or position blocks:
    – List of X,Y locations with associated token ID
    – Symbol positions of lower left-hand corner, grouped in rasterscan order
    – X- and Y-positions delta Huffman encoded.
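
The last bullets of the quote mention delta coding of symbol positions. Here is a minimal sketch of plain delta coding, to show what moving the origin does to the stored values. It is not a JBIG2 implementation, it leaves out the Huffman stage entirely, and the coordinates are made up for illustration.

```python
def delta_encode(values):
    """Store the first value as-is, then each difference from the previous one."""
    deltas, previous = [], 0
    for v in values:
        deltas.append(v - previous)
        previous = v
    return deltas

def delta_decode(deltas):
    """Rebuild the original values by accumulating the differences."""
    values, total = [], 0
    for d in deltas:
        total += d
        values.append(total)
    return values

xs = [367, 381, 402, 415]          # hypothetical X positions of four symbols
shifted = [x - 364 for x in xs]    # same symbols with the origin moved closer

print(delta_encode(xs))       # [367, 14, 21, 13]
print(delta_encode(shifted))  # [3, 14, 21, 13]
```

In this simplified model, moving the origin changes only the first stored value; the later deltas stay the same. Whether a real JBIG2 encoder behaves this way is not something the sketch establishes.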
