Xerox/WH PDF – Part 19 Building the PDF

[NBC: Since our friend Hermitian pointed out a minor error which did not affect the calculations, I have updated the article indicated in red]

There are some interesting observations from both the Xerox PDF and the Preview version of the Xerox PDF. Anyone using only Illustrator would have missed these low-level actions, and with them the telltale signs about the software used.

Remember that when scanning the original document the ‘right way up’, the resulting PDF shows its embedded objects to be rotated 90 degrees clockwise. This is true for both the original and the Preview-saved PDF. But there are some important differences as well.

The MediaBox for the Xerox version shows a landscape document, while the Preview shows a Portrait document (y dimension larger than x dimension)

Xerox

/MediaBox [0 0 792 612]

or 11”x8.5” (landscape)

Preview

/MediaBox [0 0 612 792]

or 8.5”x11” (portrait)

This explains why Illustrator and some other tools appear to be confused. In Mac OS/X Preview as well as Acrobat Reader, both documents open as if they were Portrait. In fact, I tried various other applications and they all open the document in Portrait.

Okay time for some mathematics and an introduction to 2D affine transforms

The CM matrix a b c d e f cm represents the following transformation for (x,y) into (x’, y’)

x'        | a    c    e|    x
          |            |
y'  =     | b    d    f|    y
          |            |
1         | 0    0    1|    1
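
As a minimal sketch, the matrix equation above can be written out as a small Python function. The name apply_cm is made up for illustration; the arithmetic is simply the expanded matrix product x' = ax + cy + e, y' = bx + dy + f.

```python
def apply_cm(cm, x, y):
    """Map (x, y) to (x', y') using the six cm operands a b c d e f."""
    a, b, c, d, e, f = cm
    return (a * x + c * y + e, b * x + d * y + f)

# A pure translation moves the point by (tx, ty) = (10, 20).
print(apply_cm((1, 0, 0, 1, 10, 20), 3, 4))  # (13, 24)
```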

From the PDF standard:

  • Translations shall be specified as [1 0 0 1 tx ty], where tx and ty shall be the distances to translate the origin of the coordinate system in the horizontal and vertical dimensions, respectively.
  • Scaling shall be obtained by [sx 0 0 sy 0 0]. This scales the coordinates so that 1 unit in the horizontal and vertical dimensions of the new coordinate system is the same size as sx and sy units, respectively, in the previous coordinate system.
  • Rotations shall be produced by [cos q sin q -sin q cos q 0 0], which has the effect of rotating the coordinate system axes by an angle q counter clockwise.
  • Skew shall be specified by [1 tan a tan b 1 0 0] ,which skews the x axis by an angle a and the y axis by an angle b.
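
The primitive transformations listed above can be combined by matrix multiplication. The following is a sketch only: the helper names are hypothetical, and the order used (scale, then rotate, then translate) is an assumption, chosen because it reproduces the Preview matrix discussed in this article. Nothing here shows how Preview itself builds its matrix.

```python
import math

def translate(tx, ty):
    return (1.0, 0.0, 0.0, 1.0, tx, ty)

def scale(sx, sy):
    return (sx, 0.0, 0.0, sy, 0.0, 0.0)

def rotate(q_degrees):
    # Counter clockwise rotation by q, as in [cos q sin q -sin q cos q 0 0]
    q = math.radians(q_degrees)
    return (math.cos(q), math.sin(q), -math.sin(q), math.cos(q), 0.0, 0.0)

def concat(m, n):
    """Apply m first, then n (row-vector convention: p' = p * M * N)."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
            e1 * a2 + f1 * c2 + e2, e1 * b2 + f1 * d2 + f2)

# Scale, then rotate 90 degrees counter clockwise, then translate.
combined = concat(concat(scale(798.72, 614.4), rotate(90)),
                  translate(613.2, -3.36))
print([round(v, 2) for v in combined])
# [0.0, 798.72, -614.4, 0.0, 613.2, -3.36]
```

Rounding is needed because cos(90°) evaluates to a tiny non-zero floating point value rather than exactly 0.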

Transformations

e and f are the x and y translations called tx and ty, nothing tricky here.

The scaling factors sx and sy can be calculated as follows:

sx = sqrt(a**2 +c**2)

sy = sqrt(b**2 +d**2)

the rotation matrix for angle q is

|   cos q      sin q|

| -sin  q      cos q|

So let’s do the calculations.

The original PDF shows the following Coordinate Transformation Matrix

798.72 0.00 0.00 614.40 -3.36 -1.20 cm /XIPLAYER0 Do Q

sx = 798.72   ; 1664 pixels in the 150×150 system, which is 14 pixels larger than the expected 1650 (11*150)

sy = 614.40   ; 1280 pixels, which is 5 pixels larger than the expected 1275 (8.5*150)

tx = -3.36      ;  7 pixels in the 150×150 system

ty = -1.20       ;  2.5 pixels in the 150×150 system

angle q = 0°

angles a, b = 0
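
The conversion from points to pixels used in these figures can be checked with a few lines of Python. This is a sketch under the article's own assumption of a 150 DPI scan; the helper name pts_to_px is made up for illustration.

```python
DPI = 150  # scan resolution assumed throughout the article

def pts_to_px(pts, dpi=DPI):
    """Convert PDF points (1/72 inch) to pixels at the given resolution."""
    return pts * dpi / 72

# Operands of the Xerox cm operator
a, b, c, d, e, f = (798.72, 0.00, 0.00, 614.40, -3.36, -1.20)

width_px = round(pts_to_px(a), 3)    # 1664.0
height_px = round(pts_to_px(d), 3)   # 1280.0
shift_x_px = round(pts_to_px(e), 3)  # -7.0
shift_y_px = round(pts_to_px(f), 3)  # -2.5

# How much larger the image is than an 11 x 8.5 inch page at 150 DPI
excess_x = width_px - 11 * DPI       # 14.0
excess_y = height_px - 8.5 * DPI     # 5.0

print(width_px, height_px, shift_x_px, shift_y_px, excess_x, excess_y)
```

The shifts are exactly half of the excess in each direction, which is what one expects if the oversized image is centered on the page.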

In other words, the JPEG is scaled and moved so that it extends beyond the MediaBox by 14 pixels in total in the x direction and by 5 pixels in total in the y direction. The JPEG is 1664×1280 in size and is centered on the MediaBox, overlapping it by 7 pixels on either side in x and by 2.5 pixels on either side in y; once the page is displayed in portrait, that is 2.5 pixels on the left and right and 7 at the top and bottom. The explanation is simple. Remember that a JPEG is always made up of whole 8×8 blocks (dimensions MOD 8 = 0), and thus a JPEG which does not match this requirement is padded.
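
The padding can be sketched as rounding each dimension up to the next multiple of the block size. The article only mentions blocks of 8; the 16-pixel case below is my own assumption (a 16×16 unit is typical when a JPEG uses chroma subsampling) and is included because it is the one that yields 1664 for the width.

```python
def round_up(n, multiple):
    """Round n up to the next multiple of `multiple`."""
    return -(-n // multiple) * multiple

expected_w, expected_h = 1650, 1275  # 11 x 8.5 inches at 150 DPI

for block in (8, 16):
    print(block, round_up(expected_w, block), round_up(expected_h, block))
# 8 1656 1280
# 16 1664 1280
```

Whether the scanner really pads to 16 pixels cannot be read off from the numbers alone; the sketch only shows which block size is consistent with a 1664×1280 image.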

The Preview created PDF shows the following Coordinate Transformation Matrix

0 798.72 -614.4 0 613.2 -3.36 cm /Im1 Do Q

sx = 798.72  ; 1664 pixels in the 150×150 system, which is 14 pixels larger than the expected 1650 (11*150)

sy = 614.40  ; 1280 pixels, which is 5 pixels larger than the expected 1275 (8.5*150)

tx = 613.2     ; 1277.5 pixels, 2.5 pixels over the boundary set by 1275

ty = -3.36      ;  7 pixels in the 150×150 system

angle q = 90° (counter clockwise)

angles a, b = 0
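
One way to compare the two placements is to push the corners of the image's unit square through each matrix and look at the resulting bounding box in points. This is a sketch; the helper names are hypothetical, and it relies on the fact that PDF paints an image into the unit square of the current coordinate system.

```python
def apply_cm(cm, x, y):
    a, b, c, d, e, f = cm
    return (a * x + c * y + e, b * x + d * y + f)

def placed_bbox(cm):
    """Bounding box (llx, lly, urx, ury) of the transformed unit square."""
    corners = [apply_cm(cm, x, y) for x in (0, 1) for y in (0, 1)]
    xs = [round(p[0], 2) for p in corners]
    ys = [round(p[1], 2) for p in corners]
    return (min(xs), min(ys), max(xs), max(ys))

xerox = (798.72, 0.0, 0.0, 614.40, -3.36, -1.20)
preview = (0.0, 798.72, -614.4, 0.0, 613.2, -3.36)

print(placed_bbox(xerox))    # (-3.36, -1.2, 795.36, 613.2)
print(placed_bbox(preview))  # (-1.2, -3.36, 613.2, 795.36)
```

Against MediaBox [0 0 792 612] and [0 0 612 792] respectively, both boxes overhang by 3.36 points along the long side and 1.2 points along the short side, so the image covers the page in the same way in both files even though the matrices differ.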

In both cases they are scaled by the same amount. In the Xerox case they remain aligned with the landscape document, and in the Preview case they are rotated 90 degrees (counter clockwise, per the matrix above) to align with the portrait document. No skew is applied in either case.

So why does the Xerox generated document still look okay in Preview or other viewers? Because it contains a /Rotate 270 entry, which rotates the page 270 degrees clockwise when displayed, restoring its ‘proper orientation’.
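
The effect of /Rotate on the displayed page size can be sketched as follows. The function name is made up for illustration, and the Preview file is assumed to have no /Rotate entry (equivalent to 0), which the article does not state explicitly.

```python
def displayed_size(media_box, rotate=0):
    """Width and height of the page as shown, after applying /Rotate."""
    llx, lly, urx, ury = media_box
    width, height = urx - llx, ury - lly
    if rotate % 180 == 90:  # 90 or 270 degrees swaps the two dimensions
        width, height = height, width
    return (width, height)

print(displayed_size((0, 0, 792, 612), rotate=270))  # Xerox:   (612, 792)
print(displayed_size((0, 0, 612, 792), rotate=0))    # Preview: (612, 792)
```

A viewer that honours /Rotate shows both files as 612×792 portrait pages; a tool that shows the Xerox file in landscape is presumably presenting the unrotated MediaBox.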

Based on these findings I observe that the WH PDF was scanned in upside down which explains why the embedded bitmaps and JPEG are rotated counter clockwise.

And I leave it up to Hermitian to deduce from this information what we should find for the alignment of the monochrome bitmaps….

Let’s see how good his deductive skills are.

30 thoughts on “Xerox/WH PDF – Part 19 Building the PDF”

  1. Just to show my prediction its sha1 hash is 82b2bef0b225b3c008b9960cb71108803cef5184 as calculated at this site

    I will post it as soon as Hermitian has shared with us his insights…

  2. “This explains why Illustrator and some other tools appear to be confused.”

    Even among other Adobe products. Photoshop and Reader opens it in Portrait.

    President Obama’s 2010 tax returns also open in landscape in Illustrator.

  3. It makes sense since rotating a JPEG is not lossless IIRC and therefore it makes sense that editing software maintains it in its rotated form when it was captured.

    Again, confounding a forgery explanation.

  4. NBC

    “Remember that when scanning the original document the ‘right way up’, the resulting PDF shows its embedded objects to be rotated 90 degrees clockwise. This is true for both the original and preview saved pdf. But there is are some important differences as well

    The MediaBox for the Xerox version shows a landscape document, while the Preview shows a Portrait document (y dimension larger than x dimension)

    Xerox

    /MediaBox [0 0 792 612]

    or 11”x8.5” (landscape)

    Preview

    /MediaBox [0 0 612 792]

    or 8.5”x11” (portrait)

    This explains why Illustrator and some other tools appear to be confused. In Mac OS/X Preview as well as Acrobat Reader, both documents open as if they were Portrait. In fact, I tried various other applications and they all open the document in Portrait.”

    HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

    In an earlier posting of results comparing Xerox with Preview you found that the W and H were not opposite.

    See:

    https://nativeborncitizen.wordpress.com/2013/07/02/pdf-scanned-on-xerox-workcentre-7535-part-3/

    So what’s the storyline here?

    Are you saying that Xerox and Preview are reversed for the MediaBox but not for the XObjects ?

    Or is your imaginary secretary playing around again with upside-down versus right-side up?

    Remember that when she rotates the original, everything gets rotated.

  5. NBC

    “Okay time for some mathematics and an introduction to 2D affine transforms”

    Your use of the label “affine transforms” is misleading here in two ways.

    1. I think you meant to say “affine transformations”

    [NBC: Nope affine transforms but I see that the term causes you some confusion.]

    “transforms” has a different meaning in mathematics. Ex. Laplace Transforms, Fourier Transforms

    [NBC: Affine transforms.. Affine transformation matrix.. Google is your friend.]

    2. Your use of the “affine transforms” label in conjunction with the 3 x 3 matrix equation immediately thereafter suggests a 3D Affine Coordinate Transformation with the special case of z’ = z = 1. In reality the PDF standard uses a special coordinate transformation matrix that requires extreme care in its interpretation. There are some hidden traps that get fixed in the calculations. So it comes down to this. You must follow the interpretations laid out within the PDF standards (in this case 1.4 for Xerox and 1.3 for Preview).

    [NBC: The transformation matrix is hardly that special. The 3×3 matrix is the one the PDF specification uses as well. So reality and your interpretation do crash into each other once again.]

    For example a translation between the user space (x ,y) and the device space (x’, y’) has no effect (on x’, y’) or on (x, y). The reason is that the PDF coordinate transformation is a coordinate transformation between the origin points of the two coordinate frames rather than between two positions of an object. If you don’t believe it then do the MATH. But do it in strict compliance with the PDF standards or else your calculations are just garbage.

    [NBC: You are now mixing up apples and oranges. I am talking about user space and how objects are placed inside the user space using the cm operator. Sigh… Apples and oranges. You confuse the various spaces that exist when rendering a PDF:

    • user space: Device independent coordinate system used internally
    • Device space: Coordinate system suitable for the rendering device
    • CTM: Current Transformation Matrix – Transforms user space to device space
    • cm: can be used to modify the local CTM
    • other spaces…

    Now look carefully: the cm operator is embedded in a q a b c d e f cm /XIPLAYER0 Do Q statement. This means that the object /XIPLAYER0 is rendered within a temporary coordinate system defined by the cm matrix. The q/Q combination pushes and pops the graphics state stack, leaving the cm transformation to apply locally only.

    The PDF reference provides a helpful note here as well:

    The default user space provides a consistent, dependable starting place for PDF page descriptions regardless of the output device used. If necessary, a PDF content stream may modify user space to be more suitable to its needs by applying the coordinate transformation operator, cm (see 8.4.4, “Graphics State Operators”). Thus, what may appear to be absolute coordinates in a content stream are not absolute with respect to the current page because they are expressed in a coordinate system that may slide around and shrink or expand. Coordinate system transformation not only enhances device-independence but is a useful tool in its own right.

    ]

    All PDF coordinate transformations are between coordinate frames and not between objects. Just remember that the object just goes along for the ride.

    [NBC: Exactly… You got it…]

    So, for instance, a scaling in the x direction stretches the x axis and the object’s width. Consequently, the result is x’ = x. Just stare at Figure 4.6 on page 131 of PDFReference13.pdf for Preview and at Figure 4.3 on page 144 of PDFReference14.pdf for Xerox.

    [NBC: Uh… nope x’= a x perhaps a minor oversight. Did you mean translation? But that would have moved the x’ by tx… I understand that the math may be somewhat confusing to you here. Let’s see if we can help you work through this carefully.]

    These conventions (i.e. interpretations) are unique to the PDF files and the associated PDF standards. The correct interpretation and application of each transformation is buried within the internal calculations for a given PDF file.

    [NBC: No kidding… And if one does not look at the original raw data, one may miss out on the essential differences as to how an object is rendered. The two pdf’s show exactly this.]

    In fact, I have found that the matrix definitions and operations detailed within the pages of the PDF standards are misleading. I personally believe that this may be intentional and may date back to when Adobe controlled the PDF standard and PDF was proprietary. Adobe had to reveal (within each PDFReferenceXX.pdf) only enough information to facilitate the licensed use of PDF files. Supposedly, the ISO has fixed this but I believe some of the past has been retained.

    [NBC: sigh… And yet people all managed to interpret the cm operation, apparently correctly… I wonder how that happened?]

    You are not privy to how the PDF code does its internal calculations. Consequently, your hand calculations are subject to your interpretation of a given six-number transformation vector (i.e. cm). Hence your calculations will reflect your interpretations which may be wrong.

    [NBC: The PDF code does its calculations its own way, following the specifications or it would not render documents correctly. By repeating the same calculations by hand one gains a proper insight into the differences between two documents that appear on the surface to be identical… That’s what forensic examination thrives on… Finger prints so to speak… Or at least something like bloodtype…]

    This is especially true for the interpretation of the physical effects on the object for a specified transformation six-vector “cm” = [a b c d e f]. The PDF standards refer to “cm” as a “coordinate transformation operator” or an “array containing six elements”.

    [NBC: Indeed… Your point? three of the elements are fixed and need not be stored.]

    I would strongly advise against an attempt to decipher the physical effects of a given “cm” by means of hand calculations. If you insist, I would strongly recommend that you first carefully work through the birth-certificate-long-form.pdf file and then tackle the wh-lfbc-scanned-xerox-7535-wc.pdf. We know much more about the WH LFCOLB than the Xerox scanned image.

    [NBC: we know as much about both as we have the raw PDF code and the final rendering. The problem is that very different internals can lead to the same looking document at high level. You would miss out… And yes, the calculations have to be done very carefully as order does matter in these transforms.]

    Each PDF standard implies that a given “cm” is determined by matrix multiplication of up to four simple transformations in a specified order.

    [NBC: Yes the rotation, scaling, translation and skew transformations are their own matrices and they can be multiplied in the order applied to form a single matrix]

    The simple transformations are the four six-vectors that you have posted. However, if you carry out the matrix multiplication you will find that it doesn’t work using the matrices for each simple transformation as defined within the PDF standard.

    [NBC: Please elaborate]

    I challenge each reader to take say the PDFReference13.pdf and carry out the matrix multiplication according to the recipe spelled out by the PDF standard matrix definitions. Then post your findings. I will confirm your results for the reader who gets it right.

    [NBC: I challenge you to show that ‘it does not work out’ and we can take it from there.]

    Of course, the bottom line here is, that you should be using the usual software tools that everybody else uses to interpret PDF files.

    [NBC: That would cause one to miss the interesting data found in the raw format… What a waste… As that’s where the real information is hiding…]


  6. The CM matrix a b c d e f cm represents the following transformation for (x,y) into (x’, y’)

    x’        | a    b    e|    x
              |            |
    y’  =     | c    d    f|    y
              |            |
    1         | 0    0    1|    1

    HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

    1. Your [3x 3] Matrix is different from the one specified in PDFReference13.pdf and PDFReference14.pdf.

    2. Also your use of column vectors is in violation of the PDF specification. The PDF specification imposes row vectors (i.e. [x’ y’ 1], [x y 1]).

    3. The order of your matrix multiplication is backwards. The correct order is that the row vector [x y 1] premultiplies the correct [3 x 3] matrix.

    4. Finally, your matrix definitions and order of, matrix multiplication yields the wrong equations.

    x’ = ax + by + e
    y’ = cx + dy + f

    Rather than your incorrect results the correct matrices and math operations are defined on page 133 of PDFReference13.pdf or page 146 of PDFReference14.pdf.

    The correct equations are:

    x’ = ax + cy + e
    y’ = bx + dy + f

    So is your first load of garbage ready to dump yet ?

    Cry Uncle ???

    It’s just going to get worse.

  7. NBC

    The scaling factors sx and sy can be calculated as follows:

    sx = sqrt(a**2 +b**2)

    sy = sqrt(c**2 +d**2)

    I entered each of your two equations into the search box of my PDF Reader and searched the document PDFReference14.pdf for matches.

    I got zero hits for each equation.

    So next I tried the two partial equations:

    sx = sqrt

    sy = sqrt

    Again I got zero hits.

    Is that the garbage truck that I hear ?

    [NBC: I have tried to simplify matters for you and done the math assuming no skew but if you need a step by step explanation then please let me know. The math is quite simple actually… Sure, the reference may not provide you with help here, so you have to do some hard work yourself….]

  8. Are you saying that Xerox and Preview are reversed for the MediaBox but not for the XObjects ?

    Or is your imaginary secretary playing around again with upside-down versus right-side up?

    Remember that when she rotates the original, everything gets rotated.

    Sigh, I have to use even simpler words to help my friend understand.

    1. The width and height of the objects are the same as they remain rotated.
    2. In Xerox the mediabox indicates landscape which is then rotated 270 degrees after the document is assembled
    3. in Preview the mediabox indicates portrait and all objects are rotated in place.
    4. Placing the document upside down in the scanner only affects the direction of the rotation of the images.

    You do understand PDF and the many ways a page can be built?

    This is PDF 101 my friend. And I showed you step by step how the objects are built onto the page and how the page is subsequently presented.

  9. In fact, I have found that the matrix definitions and operations detailed within the pages of the PDF standards are misleading.

    Again our poor friend has to point to his failure to understand properly the PDF standards and yet he points us to the actual software tools which rely on the interpretation by developers of said standard.

    Looking at the software tools would have hidden all the details I have shown to exist in the PDF files, details which show exactly what happened during the scanning.

    I believe that is called ‘research’ and ‘hypothesis testing’

    I have worked carefully through the examples and hope that this can serve as a starting point to those who may not understand the cm operator.

    I am glad that you have opened the PDF standard, as you may have come to realize how much information was lost to those looking only through the high level tools.

  10. Sigh… do you ever attempt to understand ?…

    I present to you in a simplified manner the steps that are taken in the two pdf’s to get to the same looking document which however opens differently in Illustrator…

  11. A careful mathematician would have noticed that this mistake is not affecting the conclusions. But thank you for pointing out a minor issue and I will correct it.

    Are you still ignoring the findings?…

  12. Also your use of column vectors is in violation of the PDF specification. The PDF specification imposes row vectors (i.e. [x’ y’ 1], [x y 1]).

    You do know that you can specify the matrix onto a row or column and this does not affect the outcome? Simple mathematics.

  13. NBC

    “So let’s do the calculations.”

    Said NBC, the guy who wants to do the math but doesn’t want to adhere to the PDF standard.

    [NBC: Show me where I am going wrong :-)]

    So here’s another example of NBC taking great license with the rules of the road. Using pixels rather than points to do his image size calculations.

    [NBC: Sigh… I am not sure what causes Hermitian all these problems. Of course the problem Hermitian is facing here is that he has to catch up with those who have already familiarized themselves with the PDF raw data. So let’s give him some time..]

    However, so that my calculations and numbers are trustworthy, I’m hereafter going to use only lines of code taken directly from the WH LFCOLB PDF file. Thus I’m using my favorite 010 Editor to lift individual lines of code from the PDF file birth-certificate-long-form.pdf. I am using the Internet Archive WayBack Machine archive for the URL:

    [NBC: I am glad that you are doing the hard work that you should have done before drawing conclusions.]

    http://www.whitehouse.gov/sites/default/files/rss_viewer/birth-certificate-long-form.pdf

    I selected the first snapshot taken at 17:11:11 on 04/27/2011 after the release of the WH LFCOLB PDF file at 12:09:24 PM. Hence, I downloaded the file “birth-certificate-long-form.pdf” from here:

    http://web.archive.org/web/20110427171111/http://www.whitehouse.gov/sites/default/files/rss_viewer/birth-certificate-long-form.pdf

    NBC had posted the dimensions for the MediaBox from Preview as:


    /MediaBox [0 0 612 792]

    The Media Box is the window frame within user space within which the page is painted.

    [NBC: I see you too can read the PDF standard.. Small steps.]

    So the PDF standard requires that the default unit of length for the PDF user coordinate space to be one point. One point is equal to 1/72 inches. Therefore, if not explicitly expressed as some other unit of length within the PDF file, the default unit of points must prevail.

    [NBC: Pixels and points can be trivially transformed as you showed yourself. If you’d rather work in points, then by all means. The results will be the same]

    Consequently, the dimensions in inches for the MediaBox are:

    W = 612 pts/72 ppi = 8.5 in.

    H = 792 pts/72 ppi = 11.0 in.

    [NBC: Huraah….]

    I find the definition for the media box on line 15 as:

    Line 15 << /Type /Page /Parent 3 0 R /Resources 6 0 R /Contents 4 0 R /MediaBox [0 0 612 792].

    [NBC: Rather than look at line numbers you should include the object, in this case Obj 2 0.]

    Now NBC asked me to anticipate his next move so I’m guessing he will move on to the background image object. I find this Xobject on line 601 as follows:

    Line 601 << /Length 8 0 R /Type /XObject /Subtype /Image /Width 1652 /Height 1276 /ColorSpace

    [NBC: You are still way behind what I asked you to do but fine, you found the JPEG… Not too hard to find although you missed some components. 7 0 obj …. and 11 0 R /BitsPerComponent 8 /Filter /DCTDecode >>]

    Quoting from this post here…

    “So the PDF standard requires that the default unit of length for the PDF user coordinate space to be one point. One point is equal to 1/72 inches. Therefore, if not explicitly expressed as some other unit of length, the default unit of points must prevail.”

    Consequently, we must find the dimensions in inches of the background layer to be:

    W = 1652pts/72ppi = 22.944444 in.

    H = 1276pts/72ppi = 17.722222 in.

    [NBC: Well, yes and no. Remember that the object is scaled onto the page so your calculation misses an essential part here. You need to translate to user coordinates before you can conclude its size. Basic PDF.]

    These are the dimensions of the background image as displayed on the screen of the forger’s vector graphics program. The screen resolution is 72ppi x 72 ppi = 72PPI x 72PPI. Thus each pixel is again a square of size 1pt x 1pt.

    Next moving on to the mostly text layer we find it on line 60 as follows:

    Line 60 << /Length 10 0 R /Type /XObject /Subtype /Image /Width 1454 /Height 1819

    Again, following the mandatory method of the PDF standard there results:

    W = 1454pts/72ppi = 20.194444 in.

    H = 1819pts/72ppi = 25.263889 in.

    These are the dimensions of the mostly text layer as displayed on the forger’s screen on his MAC OS computer. The screen resolution is again 72ppi x 72 ppi = 72PPI x 72 PPI. Thus each pixel is a square of size 1pt x 1pt.

    [NBC: Still no evidence of a forger.. Can you not first look at the data before showing that the data will not change your mind?]

    At this stopping point, I will ask the readers to remember the following relationships for the background layer:

    W = 1652pts/72ppi = 22.944444 in. = 11.013333in./0.48

    H = 1276pts/72ppi = 17.722222 in. = 8.506667in./0.48

    Likewise for the mostly text layer, the reader should please remember these relationships:

    W = 1454pts/72ppi = 20.194444 in. = 4.846667in./0.24

    H = 1819pts/72ppi = 25.263889 in. = 6.063333in./0.24

    Also the reader should remember that the reduction scale factor applied to the background image when the forger placed it into the WH LFCOLB PDF image in Adobe Illustrator was 48%. The background image was also rotated by an angle of 90 degrees clockwise.

    [NBC: Anyone familiar with how MRC happens would realize that the 24 and 48% are common factors that result from the downsampling of the JPEG background to 150 DPI and the foreground to 300 DPI and then displaying in a 72 DPI document. This is again simple stuff. 72/150=0.48 72/300=0.24. My goodness sakes, no need for any forger, just common workflow. I explained this in my earliest postings.]

    Likewise, the reader should remember that the reduction scale factor applied to the mostly text image when the forger placed it into the WH LFCOLB PDF image in Adobe Illustrator was 24%.
    The mostly text image was also rotated by an angle of 90 degrees clockwise.

    [NBC: All images were rotated because the document was ‘scanned’ in landscape and since jpegs do not rotate well, all images remained stored in their landscape form. Again simple workflow]

    Therefore, the final dimensions of the background layer in the WH LFCOLB PDF image must be:

    W = 8.506667 in.

    H = 11.013333 in.

    The pixel resolution of the background layer is 72 PPI/.48 = 150 PPI.

    [NBC: Huraah…]

    Hence, the background layer dimensions are slightly greater than the corresponding dimensions of the 8.5 in. x 11.0 in. Artboard.

    [NBC: Yes, and I have explained why this is the case… Can you?]

    Likewise, the final dimensions of the mostly text layer in the WH LFCOLB PDF image are:

    W = 6.063333 in.

    H = 4.846667 in.

    These layer dimensions have been confirmed for the WH LFCOLB PDF file opened within Adobe Illustrator CS6 and Adobe Illustrator CC. The dimensions were read off the layer info panel for each layer.

    Likewise the reduction scale factors (0.48 and 0.24) have also been read out from the links panel data for each layer in both Adobe Illustrator CS6 and Adobe Illustrator CC.

    Also the rotations of 90 degrees clockwise for each layer have also been read out from the same links panel data for each layer in both Adobe Illustrator CS6 and Adobe Illustrator CC.

    Finally, each non-background image is subjected to the same reduction scale factor (24%) and rotation of 90 degrees clockwise as it is placed within the WH LFCOLB PDF image in Adobe Illustrator.

    Therefore, the composite image of the WH LFCOLB opens in the portrait (i.e.letter) orientation within Adobe Illustrator CS6 and Adobe Illustrator CC. The same is the case for Adobe Acrobat XI Pro, Adobe Reader XI, PDF Xchange Viewer Pro version 2.5 and the newly released PDFXChange Editor Pro version 3.0.

    [NBC: Huraah… So far so good. I am glad that you are finally doing the hard work. How does it feel?]

  14. One omission from my last post…

    The pixel resolution of the non-background layers as displayed within Adobe Illustrator are 72PPI/.24 = 300PPI.

  15. NBC

    “[NBC: Since our friend Hermitian pointed out a minor error which did not affect the calculations, I have updated the article indicated in red]”

    If I accept this latest wild claim at face value, then it can only mean that you were not actually using the erroneous equations that you posted to make your calculations. Since you have by the same claim verified that was the case, let me request that in the future you cease posting any equations that you are just making up as you go.

    That would be better for everyone. Then your constant state of confusion would only affect you.

  16. So far Hermitian has shown that his analysis matches mine. He appears confused about the 24 and 48% scaling which follows logically from displaying a 150 PPI and 300 PPI onto a 72 PPI background. Simple stuff. So the simple workflow is that the MRC separated into a background layer at 150 PPI and foreground layers at 300 PPI. This is a common workflow for MRC.

    No need for a forger.

    So far you have done nothing more than followed the Xerox workflow after it was saved by Preview on a Mac. I have shown the impact of this last step on the orientation and why it shows up correctly inside Illustrator unlike the Xerox created PDF.

    I am proud of you though… This is important work you are doing and while you jump to a forgery conclusion without anything that points to this, I am impressed by your diligence, although a bit late.

    Now perhaps you can get to address my challenge based on the Xerox and the Preview versions of the Xerox scans I provided. Do you need a quick recap of what I asked you?

  17. That would be better for everyone. Then your constant state of confusion would only affect you.

    As I showed, the omission did not affect the final conclusions. These are not ‘made up’ equations, they help you parse the cm operator’s matrix into its proper components.

    You see, because of the 90 or 0 degree rotations, one of the terms was always zero so nothing changed.

    Let me help you out with parts you do not understand about my analysis. It’s relatively simple math, once you understand Matrix multiplications.

    Let me know if you could benefit from a refresher course?

    Take your time Hermitian, due diligence takes its time and you are a few months behind my efforts so I do not blame you for coming up to speed slowly.

    If you disagree with my findings/math then please correct it and we can work from there.

    Looking forward to your further contributions.

    PS: You do know how to transpose a vector/matrix?

  18. Yes, I am trying to figure out a more legible format. The previous one had problems with italics for some.

  19. W = 1652pts/72ppi = 22.944444 in. = 11.013333in./0.48

    H = 1276pts/72ppi = 17.722222 in. = 8.506667in./0.48

    Have you figured out why it is a smidge larger than the mediabox? I have already given the answer but I’d like to hear about your explanation.

  20. Likewise the reduction scale factors (0.48 and 0.24) have also been read out from the links panel data for each layer in both Adobe Illustrator CS6 and Adobe Illustrator CC.

    Have you already figured out how and why the images were scaled?

    Hint: It follows from a common work flow…

  21. NBC

    “Remember that when scanning the original document the ‘right way up’, the resulting PDF shows its embedded objects to be rotated 90 degrees clockwise. This is true for both the original and preview saved pdf.”

    I for one believe that we could expect a trained paralegal to place a one-page document “right side up on the glass”. That’s especially true when that sheet is the purported certified copy of the birth certificate of the purported President of the United States.

    [NBC: A trained paralegal or perhaps an assistant, who knows. This is what the evidence shows however. You cannot ignore the data, wherever they lead. And your appeal to your own ‘ignorance’ shows again that you let your conclusions ignore the data]

    Also, the very reason for the existence of the Portable Document Format is so that the image will be device independent. Said differently, the image will be the same for many different devices that can display or print the image.

    [NBC: Exactly..]

    And we have already proven this device independence for the WH LFCOLB PDF image.

    [NBC: Circular is it not… It is a PDF file and thus it is device independent.]

    In fact it seems reasonable to assume that the same PDF file will display the image that is scanned on any device capable of displaying or printing PDF files. Thus if the paralegal is drunk and places the original upside down on the glass of the scanner then the resulting scanned PDF image will be upside down on any device that is capable of displaying or printing PDF images.

    [NBC: I am sure the paralegal realized this and did a 180 rotate]

    The Current Transformation Matrix CTM is device dependent. The CTM controls the handoff from the user space to the device space.

    See Chapter 4 page 131 of PDFReference14.pdf.

    “The graphics state includes the current transformation matrix (CTM), which maps
    user space coordinates used within a PDF content stream into output device
    coordinates.”

    You should be comparing the lines of code that are created by Preview (for your Xerox scanned PDF) to the lines of code in the birth-certificate-long-form.pdf. They should be the same for each of the nine layers.

    [NBC: They should be similar, not necessarily the same. And I did this, IIRC. But if not, it’s still not too late. Remember that the Xerox created preview PDF does not contain the same number of layers, but the coding is very similar. Another reason why one has to look at the raw data. Remember, I now have given you a hint as to the scanning process and you can now predict what the bitmaps should show for the Xerox bitmap, which was not scanned upside down.]

  22. NBC

    “PS: You do know how to transpose a vector/matrix?”

    Interchange the row for the column. One is a row vector and the other is a column vector.

    Did you know that a Hermitian matrix is equal to its transjugate?

  23. I am aware of the irony of your name but now apply it to the PDF transformation matrix. And yes, I am quite familiar with Hermitian matrices.

  24. NBC

    “So far Hermitian has shown that his analysis matches mine. He appears confused about the 24 and 48% scaling which follows logically from displaying a 150 PPI and 300 PPI onto a 72 PPI background. Simple stuff. So the simple workflow is that the MRC separated into a background layer at 150 PPI and foreground layers at 300 PPI. This is a common workflow for MRC.”

    Wrong Doofus !

    Let’s get something straight. I am not the confused one here.

    [NBC: And yet downscaling/downsampling to 150 and 300 PPI in case of MRC is quite common.]

    I don’t want any part of your analysis. A guy who can’t even read and follow a published standard is just a disaster waiting to happen.

    [NBC: ROTFL… Just because you do not comprehend something does not make it a disaster, other than to those who are presuming forgery]

    My analysis is not the same as yours. If you don’t understand why that is then I can’t help you.

    [NBC: Your analysis is exactly the same as mine, although you focus on points rather than on pixels. Same result. But you introduced 48 and 24% ad hoc. Do you not understand how these are a standard outcome in PDF where the background is downsampled to 150 and the foreground to 300 ppi? It’s so trivial that you will slap your head once I show it.]

    Here’s a check on your analysis thus far.

    Please post the exact lines from your Preview code where the pixel resolutions are specified to be 150 PPI for the background image and 300 PPI for the non-background image.

    [NBC: Valid question and of course they are not specified but rather can be observed when you look at how the images are mapped onto the canvas]

    Or else post the line numbers where the default unit of length for user space was changed from 1/72 in. to 1/150 in. And then from 1/150 in. to 1/300 in.

    [NBC: You do not understand the nuances here. Of course the user coordinates are still in the 72 ppi format, but what happens if you take 48% of 150 PPI? You get 72 PPI, same with 24% of 300 PPI… Geez..]

    And here’s a hint on the answers that you are going to find. I have already searched the file wh-lfbc-scanned-xerox-7535-wc.pdf for the numbers “150” and “300” and both searches returned zero hits.

    [NBC: Well, yes, it requires a logical step.]

    And stop adding your erroneous comments beside your numbers. Your added comments are not factual per the lines of code that you think you know how to read.

    [NBC: ROTFL… I have no idea what you are talking about here but so far you have yet to show that the numbers or analyses are wrong. Somewhat embarrassing, would you not say so? But I will let you putz along fully confident that you too will eventually get up to speed enough in this new world of PDF analysis. I know it’s scary to take off the training wheels and go for a ‘deep dive’]
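The 48% and 24% factors argued over in this exchange can be checked with a few lines of arithmetic. The sketch below assumes only that PDF user space defaults to 72 units per inch; the pixel dimensions are hypothetical letter-size scans chosen for illustration, not values read from either PDF.

```python
# Default PDF user space: 1 unit = 1/72 inch.
USER_SPACE_UNITS_PER_INCH = 72


def scale_factor(image_ppi):
    """User-space units per pixel needed to draw a bitmap at its physical size."""
    return USER_SPACE_UNITS_PER_INCH / image_ppi


def placed_size(width_px, height_px, image_ppi):
    """Size in user-space units of a bitmap drawn at its physical size."""
    return (width_px * USER_SPACE_UNITS_PER_INCH / image_ppi,
            height_px * USER_SPACE_UNITS_PER_INCH / image_ppi)


background_scale = scale_factor(150)  # 72 / 150
foreground_scale = scale_factor(300)  # 72 / 300
print(background_scale, foreground_scale)

# Hypothetical 8.5" x 11" scans at each resolution (assumed pixel counts).
print(placed_size(1275, 1650, 150))
print(placed_size(2550, 3300, 300))
```

Both hypothetical bitmaps land on the same 612 x 792 footprint, which is why neither "150" nor "300" has to appear literally in the file: the resolution is implied by the ratio between pixel dimensions and placed size.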

  25. NBC

    “NBC: I have tried to simplify matters for you and done the math assuming no skew but if you need a step by step explanation then please let me know. The math is quite simple actually… Sure, the reference may not provide you with help here, so you have to do some hard work yourself….]”

    Hey Dude ! The correct matrix equations are spelled out in black and white in the PDFReferenceXX.pdf. These are the only equations that one should use to interpret the PDF code.

    [NBC: Unless of course one understands how to transform a row vector multiplication into a column vector multiplication… But if you are confused by the math, you can work out the same in your preferred notation. Same difference.. I will give you a hint though: (AB)^T is B^T A^T and thus anyone can do the simple math involved..]

    The fact that you won’t use the specified equations and methodology is cause for concern.

    [NBC: I understand that to a novice to Matrix manipulations, this indeed may be a concern. Do the work yourself which is an excellent exercise…]

    Here’s a nice little exercise for you to carry out since you claim to already know the answer.

    Starting with your definitions for the [3 x 3] matrix equations and your row vectors for each simple transformation (i.e. translate, rotate and scale) carry out the ordered matrix multiplication (your way) to combine the three simple transformations into one.

    [NBC: I just did that…]

    Then carry out the same exercise except this time use the [3 x 3] matrix and 3-vector definitions exactly as given in the standard. Carry out the multiplication of the three simple transformations in the order of matrix multiplication specified in the standard.

    Then show that you obtained the same results with your formulation as with the formulation specified in the standard.

    After all, it’s just hard work — but believe me you need to do it.

    [NBC: Why do it when you can show mathematically speaking they are equivalent? If you disagree, show that you can do the math… It’s time you make some real effort here… I can only hold your hand for so long and walk you through these steps.]
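The exercise requested in this comment can be sketched numerically. The code below builds the three simple transformations in the row-vector form given in the PDF reference, combines them, and then repeats the calculation in column-vector form using the identity (AB)^T = B^T A^T. The scale, angle, translation and test point are arbitrary illustrative values, not numbers taken from either PDF.

```python
import math


def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def pdf_matrix(a, b, c, d, e, f):
    """Row-vector form: [x' y' 1] = [x y 1] * M."""
    return [[a, b, 0], [c, d, 0], [e, f, 1]]


def translate(tx, ty):
    return pdf_matrix(1, 0, 0, 1, tx, ty)


def scale(sx, sy):
    return pdf_matrix(sx, 0, 0, sy, 0, 0)


def rotate(degrees):
    q = math.radians(degrees)
    return pdf_matrix(math.cos(q), math.sin(q), -math.sin(q), math.cos(q), 0, 0)


# Illustrative values only (assumptions, not read from the files).
S, R, T = scale(0.48, 0.48), rotate(90), translate(612, 0)

# Row-vector convention: scale, then rotate, then translate, left to right.
row_combined = matmul(matmul(S, R), T)

# Column-vector convention: transpose each factor and reverse the order.
col_combined = matmul(matmul(transpose(T), transpose(R)), transpose(S))

point = (100, 200)
row_result = matmul([[point[0], point[1], 1]], row_combined)[0][:2]
col_result = [r[0] for r in matmul(col_combined, [[point[0]], [point[1]], [1]])][:2]
print(row_result, col_result)
```

The two conventions give the same transformed point up to floating-point rounding, because the column-form matrix is simply the transpose of the row-form one.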

  26. Starting with your definitions for the [3 x 3] matrix equations and your row vectors for each simple transformation (i.e. translate, rotate and scale) carry out the ordered matrix multiplication (your way) to combine the three simple transformations into one.

    You’re way behind… Read up posting 20…🙂

    Still a few steps behind I notice..

  27. NBC

    “[NBC: I just did that…]”

    Then post your results from the two different methods.

    And by the way, also post the two lines of code from the Preview where the pixel resolution was specified to be 150 PPI for the background image and 300 PPI for the mostly text image.

    And then you never did answer my question about the conflict between your MediaBox and your XObjects.

    “HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

    “In an earlier posting of results comparing Xerox with Preview you found that the W and H were not opposite.

    See:

    https://nativeborncitizen.wordpress.com/2013/07/02/pdf-scanned-on-xerox-workcentre-7535-part-3/

    So what’s the storyline here?

    Are you saying that Xerox and Preview are reversed for the MediaBox but not for the XObjects?

    Or is your imaginary secretary playing around again with upside-down versus right-side up?

    Remember that when she rotates the original, everything gets rotated.”

    Number 1 on your response was:

    “1. The width and height of the objects are the same as they remain rotated.”

    A rotation of 90 degrees interchanges W for H and H for W.

    Duh!

  28. “[NBC: I just did that…]”

    Then post your results from the two different methods.

    Since you are claiming that these methods are different, I suggest you do it.

    Need some help?

    “Remember that when she rotates the original, everything gets rotated.”

    At the visible level that is correct but that does not mean that the internal objects get rotated.

    I know, this is quite a bit of information to have to take in and your recent discovery of the PDF raw data means that you have not fully experienced how many different ways can lead to the same overall high level picture.

    I am not sure what you are talking about with W and H. As I said, the objects remain rotated within the stream; only when they get rendered are they rotated into their proper position.

    Good luck
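The W and H disagreement in this exchange comes down to the difference between an image's stored pixel dimensions and its footprint on the page. A minimal sketch, assuming a hypothetical bitmap and two hypothetical `cm` matrices (one upright, one with a quarter turn), shows that the stored width and height never change while the rendered footprint swaps.

```python
def apply_cm(cm, point):
    """Apply a PDF 'a b c d e f cm' matrix to a point (row-vector convention)."""
    a, b, c, d, e, f = cm
    x, y = point
    return (a * x + c * y + e, b * x + d * y + f)


def footprint(cm):
    """Width and height of the page area covered by the unit image square."""
    corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
    pts = [apply_cm(cm, p) for p in corners]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (max(xs) - min(xs), max(ys) - min(ys))


# Hypothetical stored bitmap: its /Width and /Height stay fixed in the stream.
stored_width_px, stored_height_px = 1650, 1275

# Hypothetical matrices: scale only, versus scale plus quarter turn plus shift.
cm_upright = (792, 0, 0, 612, 0, 0)
cm_rotated = (0, 792, -612, 0, 612, 0)

print(footprint(cm_upright))  # (792, 612)
print(footprint(cm_rotated))  # (612, 792)
```

In both cases the image dictionary would still report the same pixel width and height; only the matrix applied at render time decides whether the page footprint is landscape or portrait.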

Comments are closed.