[NBC: Our friend Hermitian pointed out a minor error which did not affect the calculations, so I have updated the article, with the changes indicated in red]

There are some interesting observations from both the Xerox PDF and the Preview version of the Xerox PDF. Anyone using Illustrator would have missed these low-level actions, and with them the telltale signs of the software used.

Remember that when scanning the original document the ‘right way up’, the resulting PDF shows its embedded objects to be rotated 90 degrees clockwise. This is true for both the original and the Preview-saved PDF. But there are some important differences as well.

The MediaBox for the Xerox version shows a landscape document, while the Preview shows a Portrait document (y dimension larger than x dimension)

**Xerox**

/MediaBox [0 0 792 612]

or 11”x8.5” (landscape)

**Preview**

/MediaBox [0 0 612 792]

or 8.5”x11” (portrait)

This explains why Illustrator and some other tools appear to be confused. In Mac OS/X Preview as well as Acrobat Reader, both documents open as if they were Portrait. In fact, I tried various other applications and they all open the document in Portrait.

Okay time for some mathematics and an introduction to 2D affine transforms

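The CM matrix **a b c d e f cm** represents the following transformation of (x, y) into (x’, y’):

$$
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} =
\begin{pmatrix} a & c & e \\ b & d & f \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
$$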

From the PDF standard:

- Translations shall be specified as [1 0 0 1 tx ty], where tx and ty shall be the distances to translate the origin of the coordinate system in the horizontal and vertical dimensions, respectively.

- Scaling shall be obtained by [sx 0 0 sy 0 0]. This scales the coordinates so that 1 unit in the horizontal and vertical dimensions of the new coordinate system is the same size as sx and sy units, respectively, in the previous coordinate system.

- Rotations shall be produced by [cos q sin q -sin q cos q 0 0], which has the effect of rotating the coordinate system axes by an angle q counter clockwise.

- Skew shall be specified by [1 tan a tan b 1 0 0], which skews the x axis by an angle a and the y axis by an angle b.

e and f are the x and y translations called tx and ty, nothing tricky here.
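The four simple transformations quoted above can be sketched in a few lines of Python. This is only an illustration: the function names `translate`, `scale`, `rotate`, `skew` and `apply` are my own and do not come from the PDF standard; only the six-number layout and the equations x’ = ax + cy + e, y’ = bx + dy + f do.

```python
import math

# Each helper returns the six numbers [a b c d e f] of a cm operand.
# The names are hypothetical; the layouts follow the list quoted above.

def translate(tx, ty):
    return (1, 0, 0, 1, tx, ty)

def scale(sx, sy):
    return (sx, 0, 0, sy, 0, 0)

def rotate(q_degrees):
    q = math.radians(q_degrees)
    return (math.cos(q), math.sin(q), -math.sin(q), math.cos(q), 0, 0)

def skew(a_degrees, b_degrees):
    return (1, math.tan(math.radians(a_degrees)),
            math.tan(math.radians(b_degrees)), 1, 0, 0)

def apply(cm, x, y):
    """Map (x, y) to (x', y') with x' = ax + cy + e and y' = bx + dy + f."""
    a, b, c, d, e, f = cm
    return (a * x + c * y + e, b * x + d * y + f)

print(apply(translate(10, 20), 1, 1))  # (11, 21)
print(apply(scale(2, 3), 1, 1))        # (2, 3)
print(apply(rotate(90), 1, 0))         # approximately (0, 1), up to rounding
```

Rotating the point (1, 0) by 90° lands it on the y axis, which is the counter clockwise sense described in the standard.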

The scaling factors sx and sy can be calculated as follows:

sx = sqrt(a**2 + b**2)

sy = sqrt(c**2 + d**2)

The rotation matrix for angle q, written in the order a b over c d, is

$$
\begin{pmatrix} \cos q & \sin q \\ -\sin q & \cos q \end{pmatrix}
$$
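Pulling a cm apart into scale, rotation and translation can be sketched as follows. It assumes no skew and no mirroring, and it takes the scale factors as the lengths of the transformed unit axes, (a, b) and (c, d); the function name `decompose` is mine. For the 0° and 90° cases in this post one term of each pair is zero, so the arithmetic is trivial.

```python
import math

def decompose(cm):
    """Split [a b c d e f] into (sx, sy, angle in degrees, tx, ty).

    Assumes a pure scale + rotation + translation (no skew, no mirroring).
    """
    a, b, c, d, e, f = cm
    sx = math.hypot(a, b)               # length of the transformed x unit vector
    sy = math.hypot(c, d)               # length of the transformed y unit vector
    q = math.degrees(math.atan2(b, a))  # counter clockwise rotation angle
    return sx, sy, q, e, f

# A made-up matrix: scale by (2, 3), rotate 90 degrees, translate by (5, 7).
print(decompose((0, 2, -3, 0, 5, 7)))  # (2.0, 3.0, 90.0, 5, 7)
```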

So let’s do the calculations.

The original PDF shows the following Coordinate Transformation Matrix

798.72 0.00 0.00 614.40 -3.36 -1.20 cm /XIPLAYER0 Do Q

sx = 798.72 ; 1664 pixels in the 150×150 system, which is 14 pixels larger than the expected 1650 (11*150)

sy = 614.40 ; 1280 pixels, which is 5 pixels larger than the expected 1275 (8.5*150)

tx = -3.36 ; 7 pixels in the 150×150 system

ty = -1.20 ; 2.5 pixels in the 150×150 system

angle q = 0°

angles a, b = 0

In other words, the JPEG is scaled and moved in such a way that it extends beyond the MediaBox by 14 pixels in the x direction (7 on either side) and by 5 pixels in the y direction (2.5 on either side). The JPEG is 1664×1280 in size and is centered on the MediaBox; as displayed in portrait, that leaves 2.5 pixels overlapping on the left and right and 7 at the top and bottom. The explanation is simple. Remember that JPEG data is encoded in 8×8 blocks, so the dimensions must satisfy size MOD 8 = 0, and a JPEG which does not match this requirement is padded.
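The pixel figures can be rechecked with a short sketch. It assumes the 150 PPI scan resolution used throughout this post and the default 72 points per inch; the variable and function names are mine.

```python
PT_PER_INCH = 72   # PDF default user space: 72 units per inch
SCAN_PPI = 150     # assumed resolution of the JPEG layer

def pt_to_px(points, ppi=SCAN_PPI):
    """Convert a length in PDF points to pixels at the given resolution."""
    return points * ppi / PT_PER_INCH

# Xerox: 798.72 0.00 0.00 614.40 -3.36 -1.20 cm, /MediaBox [0 0 792 612]
a, b, c, d, e, f = 798.72, 0.00, 0.00, 614.40, -3.36, -1.20
media_w, media_h = 792, 612

# How far the image sticks out past each edge of the MediaBox, in points.
overhang_pt = {
    "left": -e,
    "right": (e + a) - media_w,
    "bottom": -f,
    "top": (f + d) - media_h,
}

print("image size in px:", round(pt_to_px(a), 2), "x", round(pt_to_px(d), 2))
for side, pts in overhang_pt.items():
    print(side, round(pt_to_px(pts), 2), "px")
```

Left and right should come out at about 7 px each and bottom and top at about 2.5 px each, which is what centering a 1664×1280 image on a 1650×1275 page implies.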

The Preview created PDF shows the following Coordinate Transformation Matrix

0 798.72 -614.4 0 613.2 -3.36 cm /Im1 Do Q

sx = 798.72 ; 1664 pixels in the 150×150 system, which is 14 pixels larger than the expected 1650 (11*150)

sy = 614.40 ; 1280 pixels, which is 5 pixels larger than the expected 1275 (8.5*150)

tx = 613.2 ; 1277.5 pixels, 2.5 pixels over the boundary set by 1275

ty = -3.36 ; 7 pixels in the 150×150 system

angle q = 90° (counter clockwise)

angles a, b = 0
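One way to see where the Preview matrix puts the image is to push the corners of the unit square through it, since a PDF image is painted into the unit square of the current coordinate system. The sketch below uses the equations x’ = ax + cy + e and y’ = bx + dy + f; the helper name `apply` is mine.

```python
def apply(cm, x, y):
    """Map (x, y) to (x', y') with x' = ax + cy + e and y' = bx + dy + f."""
    a, b, c, d, e, f = cm
    return (a * x + c * y + e, b * x + d * y + f)

# Preview: 0 798.72 -614.4 0 613.2 -3.36 cm, /MediaBox [0 0 612 792]
preview_cm = (0, 798.72, -614.4, 0, 613.2, -3.36)

# Corners of the unit square that the image occupies before the cm is applied.
corners = [apply(preview_cm, x, y) for x, y in [(0, 0), (1, 0), (1, 1), (0, 1)]]
xs = [p[0] for p in corners]
ys = [p[1] for p in corners]
bbox = (min(xs), min(ys), max(xs), max(ys))

print([round(v, 2) for v in bbox])  # [-1.2, -3.36, 613.2, 795.36]
```

The image's own x axis (its 1664 pixel side) ends up running along the page's y axis, and the box overhangs the 612×792 MediaBox by 1.2 points (2.5 pixels) left and right and 3.36 points (7 pixels) top and bottom.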

In both cases they are scaled by the same amount. In the Xerox case they remain aligned with the landscape document, and in Preview they are rotated 90° counter clockwise to align with the portrait document. No skew is applied in either case.

So why does the Xerox-generated document still look okay in Preview or other viewers? Because it contains a /Rotate 270, which rotates the document 270° clockwise, restoring its ‘proper orientation’.
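The effect of /Rotate on the displayed page size is easy to sketch. The function name `displayed_size` is mine, and the sketch only models the width and height swap for rotations in multiples of 90°, not the rendering itself.

```python
def displayed_size(media_box, rotate=0):
    """Return (width, height) in points as a viewer would show the page.

    media_box is [x0, y0, x1, y1]; rotate is the page's /Rotate value,
    a multiple of 90 degrees.
    """
    x0, y0, x1, y1 = media_box
    width, height = x1 - x0, y1 - y0
    if rotate % 180 == 90:   # 90 or 270: width and height trade places
        width, height = height, width
    return width, height

print(displayed_size((0, 0, 792, 612), rotate=270))  # Xerox:   (612, 792)
print(displayed_size((0, 0, 612, 792), rotate=0))    # Preview: (612, 792)
```

Both files end up as a 612×792 portrait page in a viewer that honours /Rotate, even though their MediaBox entries differ.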

Based on these findings I observe that the WH PDF was scanned in upside down which explains why the embedded bitmaps and JPEG are rotated counter clockwise.

And I leave it up to Hermitian to deduce from this information what we should find for the alignment of the monochrome bitmaps….

Let’s see how good his deductive skills are.

To commit to my prediction in advance, here is its SHA-1 hash: 82b2bef0b225b3c008b9960cb71108803cef5184, as calculated at this site
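For anyone who wants to check such a commitment later without a web site, here is a minimal sketch using Python's standard `hashlib`. The function names and the sample text are placeholders of mine, not the actual prediction, and the hash only matches if the revealed text is byte-for-byte identical to what was hashed (same spacing, line endings and encoding).

```python
import hashlib

def commit(text):
    """Return the SHA-1 hex digest of the UTF-8 encoded text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def verify(revealed_text, published_hash):
    """True if the revealed text hashes to the previously published value."""
    return commit(revealed_text) == published_hash.lower()

# Placeholder text, purely for illustration.
prediction = "my sealed prediction goes here"
published = commit(prediction)

print(verify(prediction, published))           # True
print(verify(prediction + " (edited)", published))  # False
```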

I will post it as soon as Hermitian has shared with us his insights…

“This explains why Illustrator and some other tools appear to be confused.”

Even among other Adobe products. Photoshop and Reader open it in Portrait.

President Obama’s 2010 tax returns also open in landscape in Illustrator.

It makes sense since rotating a JPEG is not lossless IIRC and therefore it makes sense that editing software maintains it in its rotated form when it was captured.

Again, confounding a forgery explanation.

NBC

“Remember that when scanning the original document the ‘right way up’, the resulting PDF shows its embedded objects to be rotated 90 degrees clockwise. This is true for both the original and preview saved pdf. But there is are some important differences as well

The MediaBox for the Xerox version shows a landscape document, while the Preview shows a Portrait document (y dimension larger than x dimension)

Xerox

/MediaBox [0 0 792 612]

or 11”x8.5” (landscape)

Preview

/MediaBox [0 0 612 792]

or 8.5”x11” (portrait)

This explains why Illustrator and some other tools appear to be confused. In Mac OS/X Preview as well as Acrobat Reader, both documents open as if they were Portrait. In fact, I tried various other applications and they all open the document in Portrait.”

HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

In an earlier posting of results comparing Xerox with Preview you found that the W and H were not opposite.

See:

https://nativeborncitizen.wordpress.com/2013/07/02/pdf-scanned-on-xerox-workcentre-7535-part-3/

So what’s the storyline here?

Are you saying that Xerox and Preview are reversed for the MediaBox but not for the XObjects?

Or is your imaginary secretary playing around again with upside-down versus right-side up?

Remember that when she rotates the original, everything gets rotated.

Your use of the label “affine transforms” is misleading here in two ways.

1. I think you meant to say “affine transformations”

“transforms” has a different meaning in mathematics. Ex. Laplace Transforms, Fourier Transforms

2. Your use of the “affine transforms” label in conjunction with the 3 x 3 matrix equation immediately thereafter suggests a 3D Affine Coordinate Transformation with the special case of z’ = z = 1. In reality the PDF standard uses a special coordinate transformation matrix that requires extreme care in its interpretation. There are some hidden traps that get fixed in the calculations. So it comes down to this. You must follow the interpretations laid out within the PDF standards (in this case 1.4 for Xerox and 1.3 for Preview).

For example a translation between the user space (x ,y) and the device space (x’, y’) has no effect (on x’, y’) or on (x, y). The reason is that the PDF coordinate transformation is a coordinate transformation between the origin points of the two coordinate frames rather than between two positions of an object. If you don’t believe it then do the MATH. But do it in strict compliance with the PDF standards or else your calculations are just garbage.

All PDF coordinate transformations are between coordinate frames and not between objects. Just remember that the object just goes along for the ride.

So, for instance, a scaling in the x direction stretches the x axis and the object’s width. Consequently, the result is x’ = x. Just stare at Figure 4.6 on page 131 of PDFReference13.pdf for Preview and at Figure 4.3 on page 144 of PDFReference14.pdf for Xerox.

These conventions (i.e. interpretations) are unique to the PDF files and the associated PDF standards. The correct interpretation and application of each transformation is buried within the internal calculations for a given PDF file.

In fact, I have found that the matrix definitions and operations detailed within the pages of the PDF standards are misleading. I personally believe that this may be intentional and may date back to when Adobe controlled the PDF standard and PDF was proprietary. Adobe had to reveal (within each PDFReferenceXX.pdf) only enough information to facilitate the licensed use of PDF files. Supposedly, the ISO has fixed this but I believe some of the past has been retained.

You are not privy to how the PDF code does its internal calculations. Consequently, your hand calculations are subject to your interpretation of a given six-number transformation vector (i.e. cm). Hence your calculations will reflect your interpretations which may be wrong.

This is especially true for the interpretation of the physical effects on the object for a specified transformation six-vector “cm” = [a b c d e f]. The PDF standards refer to “cm” as a “coordinate transformation operator” or an “array containing six elements”.

I would strongly advise against an attempt to decipher the physical effects of a given “cm” by means of hand calculations. If you insist, I would strongly recommend that you first carefully work through the birth-certificate-long-form.pdf file and then tackle the wh-lfbc-scanned-xerox-7535-wc.pdf. We know much more about the WH LFCOLB than the Xerox scanned image.

Each PDF standard implies that a given “cm” is determined by matrix multiplication of up to four simple transformations in a specified order.

The simple transformations are the four six-vectors that you have posted. However, if you carry out the matrix multiplication you will find that it doesn’t work using the matrices for each simple transformation as defined within the PDF standard.

I challenge each reader to take say the PDFReference13.pdf and carry out the matrix multiplication according to the recipe spelled out by the PDF standard matrix definitions. Then post your findings. I will confirm your results for the reader who gets it right.

Of course, the bottom line here is, that you should be using the usual software tools that everybody else uses to interpret PDF files.

”

The CM matrix a b c d e f cm represents the following transformation for (x,y) into (x’, y’)

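$$
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} =
\begin{pmatrix} a & b & e \\ c & d & f \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
$$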

”

HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

1. Your [3 x 3] matrix is different from the one specified in PDFReference13.pdf and PDFReference14.pdf.

2. Also your use of column vectors is in violation of the PDF specification. The PDF specification imposes row vectors (i.e. [x’ y’ 1], [x y 1]).

3. The order of your matrix multiplication is backwards. The correct order is that the row vector [x y 1] premultiplies the correct [3 x 3] matrix.

4. Finally, your matrix definitions and order of matrix multiplication yield the wrong equations.

x’ = ax + by + e

y’ = cx + dy + f

Rather than your incorrect results the correct matrices and math operations are defined on page 133 of PDFReference13.pdf or page 146 of PDFReference14.pdf.

The correct equations are:

x’ = ax + cy + e

y’ = bx + dy + f

So is your first load of garbage ready to dump yet ?

Cry Uncle ???

It’s just going to get worse.

I entered each of your two equations into the search box of my PDF Reader and searched the document PDFReference14.pdf for matches.

I got zero hits for each equation.

So next I tried the two partial equations:

sx = sqrt

sy = sqrt

Again I got zero hits.

Is that the garbage truck that I hear ?

Sigh, I have to use even simpler words to help my friend understand.

1. The width and height of the objects are the same as they remain rotated.

2. In Xerox the MediaBox indicates landscape, and the page is then rotated 270 degrees after the document is assembled.

3. In Preview the MediaBox indicates portrait and all objects are rotated in place.

4. Placing the document upside down in the scanner only affects the direction of the rotation of the images.

You do understand PDF and the many ways a page can be built?

This is PDF 101 my friend. And I showed you step by step how the objects are built onto the page and how the page is subsequently presented.

Again our poor friend has to point to his failure to understand properly the PDF standards and yet he points us to the actual software tools which rely on the interpretation by developers of said standard.

Looking at the software tools would have hidden all the details I have shown to exist in the PDF files, details which show exactly what happened during the scanning.

I believe that is called ‘research’ and ‘hypothesis testing’

I have worked carefully through the examples and hope that this can serve as a starting point to those who may not understand the cm operator.

I am glad that you have opened the PDF standard, as you may have come to realize how much information was lost to those looking only through the high level tools.

Sigh… do you ever attempt to understand ?…

I present to you in a simplified manner the steps that are taken in the two pdf’s to get to the same looking document which however opens differently in Illustrator…

A careful mathematician would have noticed that this mistake does not affect the conclusions. But thank you for pointing out a minor issue, and I will correct it.

Are you still ignoring the findings?…

You do know that you can specify the matrix onto a row or column and this does not affect the outcome? Simple mathematics.
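The row versus column point can be checked directly. The sketch below writes the same six numbers once as a matrix acting on a column vector and once, transposed, as the matrix a row vector premultiplies; the function names are mine, and plain lists are used so nothing beyond the standard library is needed.

```python
def column_form(cm, x, y):
    """[x' y' 1]^T = M . [x y 1]^T with M = [[a c e], [b d f], [0 0 1]]."""
    a, b, c, d, e, f = cm
    m = [[a, c, e], [b, d, f], [0, 0, 1]]
    v = [x, y, 1]
    out = [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]
    return out[0], out[1]

def row_form(cm, x, y):
    """[x' y' 1] = [x y 1] . M with M = [[a b 0], [c d 0], [e f 1]]."""
    a, b, c, d, e, f = cm
    m = [[a, b, 0], [c, d, 0], [e, f, 1]]
    v = [x, y, 1]
    out = [sum(v[k] * m[k][j] for k in range(3)) for j in range(3)]
    return out[0], out[1]

cm = (0, 2, -3, 0, 5, 7)        # made-up numbers for illustration
print(column_form(cm, 1, 1))    # (2, 9)
print(row_form(cm, 1, 1))       # (2, 9)
```

Both forms reduce to x’ = ax + cy + e and y’ = bx + dy + f; one matrix is simply the transpose of the other.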

Said NBC, the guy who wants to do the math but doesn’t want to adhere to the PDF standard.

So here’s another example of NBC taking great license with the rules of the road. Using pixels rather than points to do his image size calculations.

However, so that my calculations and numbers are trustworthy, I’m hereafter going to use only lines of code taken directly from the WH LFCOLB PDF file. Thus I’m using my favorite 010 Editor to lift individual lines of code from the PDF file birth-certificate-long-form.pdf. I am using the Internet Archive WayBack Machine archive for the URL:

http://www.whitehouse.gov/sites/default/files/rss_viewer/birth-certificate-long-form.pdf

I selected the first snapshot taken at 17:11:11 on 04/27/2011 after the release of the WH LFCOLB PDF file at 12:09:24 PM. Hence, I downloaded the file “birth-certificate-long-form.pdf” from here:

http://web.archive.org/web/20110427171111/http://www.whitehouse.gov/sites/default/files/rss_viewer/birth-certificate-long-form.pdf

NBC had posted the dimensions for the MediaBox from Preview as:

“

/MediaBox [0 0 612 792]

“

The Media Box is the window frame within user space within which the page is painted.

So the PDF standard requires the default unit of length for the PDF user coordinate space to be one point. One point is equal to 1/72 inch. Therefore, if not explicitly expressed as some other unit of length within the PDF file, the default unit of points must prevail.

Consequently, the dimensions in inches for the MediaBox are:

W = 612 pts/72 ppi = 8.5 in.

H = 792 pts/72 ppi = 11.0 in.
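The conversion from MediaBox units to inches can be checked with a short sketch, assuming the default user space of 72 units per inch. The function name `media_box_inches` is illustrative only.

```python
def media_box_inches(media_box, units_per_inch=72):
    """Convert a /MediaBox [x0 y0 x1 y1] to (width, height) in inches."""
    x0, y0, x1, y1 = media_box
    return ((x1 - x0) / units_per_inch, (y1 - y0) / units_per_inch)

print(media_box_inches((0, 0, 612, 792)))  # (8.5, 11.0)
print(media_box_inches((0, 0, 792, 612)))  # (11.0, 8.5)
```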

I find the definition for the media box on line 15 as:

Line 15 << /Type /Page /Parent 3 0 R /Resources 6 0 R /Contents 4 0 R /MediaBox [0 0 612 792].

Now NBC asked me to anticipate his next move so I’m guessing he will move on to the background image object. I find this Xobject on line 601 as follows:

Line 601 << /Length 8 0 R /Type /XObject /Subtype /Image /Width 1652 /Height 1276 /ColorSpace

Quoting from this post here…

“So the PDF standard requires that the default unit of length for the PDF user coordinate space to be one point. One point is equal to 1/72 inches. Therefore, if not explicitly expressed as some other unit of length, the default unit of points must prevail.”

Consequently, we must find the dimensions in inches of the background layer to be:

W = 1652pts/72ppi = 22.944444 in.

H = 1276pts/72ppi = 17.722222 in.

These are the dimensions of the background image as displayed on the screen of the forger’s vector graphics program. The screen resolution is 72ppi x 72 ppi = 72PPI x 72PPI. Thus each pixel is again a square of size 1pt x 1pt.

Next moving on to the mostly text layer we find it on line 60 as follows:

Line 60 << /Length 10 0 R /Type /XObject /Subtype /Image /Width 1454 /Height 1819

Again, following the mandatory method of the PDF standard there results:

W = 1454pts/72ppi = 20.194444 in.

H = 1819pts/72ppi = 25.263889 in.

These are the dimensions of the mostly text layer as displayed on the forger’s screen on his MAC OS computer. The screen resolution is again 72ppi x 72 ppi = 72PPI x 72 PPI. Thus each pixel is a square of size 1pt x 1pt.

At this stopping point, I will ask the readers to remember the following relationships for the background layer:

W = 1652pts/72ppi = 22.944444 in. = 11.013333in./0.48

H = 1276pts/72ppi = 17.722222 in. = 8.506667in./0.48

Likewise for the mostly text layer, the reader should please remember these relationships:

W = 1454pts/72ppi = 20.194444 in. = 4.846667in./0.24

H = 1819pts/72ppi = 25.263889 in. = 6.063333in./0.24

Also the reader should remember that the reduction scale factor applied to the background image when the forger placed it into the WH LFCOLB PDF image in Adobe Illustrator was 48%. The background image was also rotated by an angle of 90 degrees clockwise.

Likewise, the reader should remember that the reduction scale factor applied to the mostly text image when the forger placed it into the WH LFCOLB PDF image in Adobe Illustrator was 24%.

The mostly text image was also rotated by an angle of 90 degrees clockwise.

Therefore, the final dimensions of the background layer in the WH LFCOLB PDF image must be:

W = 8.506667 in.

H = 11.013333 in.

The pixel resolution of the background layer is 72 PPI/.48 = 150 PPI.

Hence, the background layer dimensions are slightly greater than the corresponding dimensions of the 8.5 in. x 11.0 in. Artboard.

Likewise, the final dimensions of the mostly text layer in the WH LFCOLB PDF image are:

W = 6.063333 in.

H = 4.846667 in.

These layer dimensions have been confirmed for the WH LFCOLB PDF file opened within Adobe Illustrator CS6 and Adobe Illustrator CC. The dimensions were read off the layer info panel for each layer.

Likewise the reduction scale factors (0.48 and 0.24) have also been read out from the links panel data for each layer in both Adobe Illustrator CS6 and Adobe Illustrator CC.

Also the rotations of 90 degrees clockwise for each layer have also been read out from the same links panel data for each layer in both Adobe Illustrator CS6 and Adobe Illustrator CC.

Finally, each non-background image is subjected to the same reduction scale factor (24%) and rotation of 90 degrees clockwise as it is placed within the WH LFCOLB PDF image in Adobe Illustrator.

Therefore, the composite image of the WH LFCOLB opens in the portrait (i.e.letter) orientation within Adobe Illustrator CS6 and Adobe Illustrator CC. The same is the case for Adobe Acrobat XI Pro, Adobe Reader XI, PDF Xchange Viewer Pro version 2.5 and the newly released PDFXChange Editor Pro version 3.0.

One omission from my last post…

The pixel resolution of the non-background layers as displayed within Adobe Illustrator is 72PPI/.24 = 300PPI.

If I accept this latest wild claim at face value, then it can only mean that you were not actually using the erroneous equations that you posted to make your calculations. Since you have by the same claim verified that was the case, let me request that in the future you cease posting any equations that you are just making up as you go.

That would be better for everyone. Then your constant state of confusion would only affect you.

So far Hermitian has shown that his analysis matches mine. He appears confused about the 24% and 48% scaling, which follows logically from placing 300 PPI and 150 PPI images onto a 72 PPI page (72/300 = 0.24 and 72/150 = 0.48). Simple stuff. So the simple workflow is that the MRC separated the scan into a background layer at 150 PPI and foreground layers at 300 PPI. This is a common workflow for MRC.
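The arithmetic behind the two scale factors can be sketched as below. It takes the pixel dimensions quoted earlier in this thread (1652×1276 and 1454×1819) and assumes the 150 PPI and 300 PPI resolutions; the function name `placement` is mine.

```python
USER_SPACE_PPI = 72   # PDF default user space: 72 units per inch

def placement(width_px, height_px, ppi):
    """Return (scale factor on a 72 PPI page, width in inches, height in inches)."""
    scale = USER_SPACE_PPI / ppi
    return scale, width_px / ppi, height_px / ppi

layers = [
    ("background", 1652, 1276, 150),   # assumed 150 PPI layer
    ("mostly text", 1454, 1819, 300),  # assumed 300 PPI layer
]
for name, w, h, ppi in layers:
    scale, w_in, h_in = placement(w, h, ppi)
    print(name, round(scale, 2), round(w_in, 6), round(h_in, 6))
# background 0.48 11.013333 8.506667
# mostly text 0.24 4.846667 6.063333
```

These are the same 48% and 24% factors and the same final dimensions in inches that were read out of Illustrator above.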

No need for a forger.

So far you have done nothing more than followed the Xerox workflow after it was saved by Preview on a Mac. I have shown the impact of this last step on the orientation and why it shows up correctly inside Illustrator unlike the Xerox created PDF.

I am proud of you though… This is important work you are doing and while you jump to a forgery conclusion without anything that points to this, I am impressed by your diligence, although a bit late.

Now perhaps you can get to address my challenge based on the Xerox and the Preview versions of the Xerox scans I provided. Do you need a quick recap of what I asked you?

You are trying different WordPress themes I see. 😉

As I showed, the omission did not affect the final conclusions. These are not ‘made up’ equations, they help you parse the cm operator’s matrix into its proper components.

You see, because of the 90 or 0 degree rotations, one of the terms was always zero so nothing changed.

Let me help you out with parts you do not understand about my analysis. It’s relatively simple math, once you understand Matrix multiplications.

Let me know if you could benefit from a refresher course?

Take your time Hermitian, due diligence takes its time and you are a few months behind my efforts so I do not blame you for coming up to speed slowly.

If you disagree with my findings/math then please correct it and we can work from there.

Looking forward to your further contributions.

PS: You do know how to transpose a vector/matrix?

Yes, I am trying to figure out a more legible format. The previous one had problems with italics for some.

Have you figured out why it is a smidge larger than the mediabox? I have already given the answer but I’d like to hear about your explanation.

Have you already figured out how and why the images were scaled?

Hint: It follows from a common work flow…

I for one believe that we could expect a trained paralegal to place a one-page document “right side up on the glass”. That’s especially true when that sheet is the purported certified copy of the birth certificate of the purported President of the United States.

Also, the very reason for the existence of the Portable Document Format is so that the image will be device independent. Said differently, the image will be the same for many different devices that can display or print the image.

And we have already proven this device independence for the WH LFCOLB PDF image.

In fact it seems reasonable to assume that the same PDF file will display the image that is scanned on any device capable of displaying or printing PDF files. Thus if the paralegal is drunk and places the original upside down on the glass of the scanner then the resulting scanned PDF image will be upside down on any device that is capable of displaying or printing PDF images.

The Current Transformation Matrix CTM is device dependent. The CTM controls the handoff from the user space to the device space.

See Chapter 4 page 131 of PDFReference14.pdf.

“The graphics state includes the current transformation matrix (CTM), which maps

user space coordinates used within a PDF content stream into output device

coordinates.”

You should be comparing the lines of code that are created by Preview (for your Xerox scanned PDF) to the lines of code in the birth-certificate-long-form.pdf. They should be the same for each of the nine layers.

Interchange the row for the column. One is a row vector and the other is a column vector.

Did you know that a Hermitian matrix is equal to its transugate?

Oops!

That is transjugate rather than transugate.

I am aware of the irony of your name but now apply it to the PDF transformation matrix. And yes, I am quite familiar with Hermitian matrices.

Wrong Doofus !

Let’s get something straight. I am not the confused one here.

I don’t want any part of your analysis. A guy who can’t even read and follow a published standard is just a disaster waiting to happen.

My analysis is not the same as yours. If you don’t understand why that is then I can’t help you.

Here’s a check on your analysis thus far.

Please post the exact lines from your Preview code where the pixel resolutions are specified to be 150 PPI for the background image and 300 PPI for the non-background image.

Or else post the line numbers where the default unit of length for user space was changed from 1/72 in. to 1/150 in. And then from 1/150 in. to 1/300 in.

And here’s a hint on the answers that you are going to find. I have already searched the file wh-lfbc-scanned-xerox-7535-wc.pdf for the numbers “150″ and “300″ and both searches returned zero hits.

And stop adding your erroneous comments beside your numbers. Your added comments are not factual per the lines of code that you think you know how to read.

Hey Dude ! The correct matrix equations are spelled out in black and white in the PDFReferenceXX.pdf. These are the only equations that one should use to interpret the PDF code.

The fact that you won’t use the specified equations and methodology is cause for concern.

Here’s a nice little exercise for you to carry out since you claim to already know the answer.

Starting with your definitions for the [3 x 3] matrix equations and your row vectors for each simple transformation (i.e. translate, rotate and scale) carry out the ordered matrix multiplication (your way) to combine the three simple transformations into one.

Then carry out the same exercise except this time use the [3 x 3] matrix and 3-vector definitions exactly as given in the standard. Carry out the multiplication of the three simple transformations in the order of matrix multiplication specified in the standard.

Then show that you obtained the same results with your formulation as with the formulation specified in the standard.
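A sketch of that exercise in Python, under stated assumptions: the three simple transformations are a scale by (2, 3), a rotation by 90° and a translation by (5, 7), with made-up integer values so the comparison is exact, and the multiplication order scale, rotate, translate is chosen only for illustration. The helper names are mine. The row-vector product is compared with the column-vector product taken in the reverse order.

```python
def matmul(p, q):
    """Multiply two 3x3 matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def row_matrix(cm):
    """Row-vector layout: [x' y' 1] = [x y 1] . [[a b 0], [c d 0], [e f 1]]."""
    a, b, c, d, e, f = cm
    return [[a, b, 0], [c, d, 0], [e, f, 1]]

S = row_matrix((2, 0, 0, 3, 0, 0))    # scale by (2, 3)
R = row_matrix((0, 1, -1, 0, 0, 0))   # rotate 90 degrees (cos = 0, sin = 1)
T = row_matrix((1, 0, 0, 1, 5, 7))    # translate by (5, 7)

# Row-vector convention: a point meets S first, then R, then T.
row_product = matmul(matmul(S, R), T)

# Column-vector convention: transposed matrices, multiplied in reverse order.
col_product = matmul(matmul(transpose(T), transpose(R)), transpose(S))

print(row_product)                            # [[0, 2, 0], [-3, 0, 0], [5, 7, 1]]
print(transpose(row_product) == col_product)  # True
```

The combined six numbers read off the row product are 0 2 -3 0 5 7, and the column formulation is its transpose, so both map every point to the same place.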

After all, it’s just hard work — but believe me you need to do it.

You’re way behind… Read up posting 20… 🙂

Still a few steps behind I notice..

NBC

“[NBC: I just did that…]”

Then post your results from the two different methods.

And by the way, also post the two lines of code from the Preview where the pixel resolution was specified to be 150 PPI for the background image and 300 PPI for the mostly text image.

And then you never did answer my question about the conflict between your MediaBox and your XObjects.

“HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

“In an earlier posting of results comparing Xerox with Preview you found that the W and H were not opposite.

See:

https://nativeborncitizen.wordpress.com/2013/07/02/pdf-scanned-on-xerox-workcentre-7535-part-3/

So what’s the storyline here?

Are you saying that Xerox and Preview are reversed for the MediaBox but not for the XObjects?

Or is your imaginary secretary playing around again with upside-down versus right-side up?

Remember that when she rotates the original, everything gets rotated.”

Number 1 on your response was:

“1. The width and height of the objects are the same as they remain rotated.”

A rotation of 90 degrees interchanges W for H and H for W.

Duh!

Since you are claiming that these methods are different, I suggest you do it.

Need some help?

At the visible level that is correct but that does not mean that the internal objects get rotated.

I know, this is quite a bit of information to have to take in and your recent discovery of the PDF raw data means that you have not fully experienced how many different ways can lead to the same overall high level picture.

I am not sure what you are talking about with W and H. As I said, the objects remain rotated within the stream; only when they get rendered are they rotated into their proper position.

Good luck