MS – Orly v Democrat – Henry Blake Affidavit – Part 3

Our friend Hermitian has provided his latest ‘analysis’ of the document that was submitted to the Court in the case Taitz v Democrat Party in Mississippi. For some inexplicable reason, our friend believes that a scanned version of a printed PDF containing President Obama’s birth certificate shows evidence of a ‘forger’. His argument is that a forger somehow explains the data better than an ordinary workflow does.

As I will show, he ignores more likely workflow scenarios, relies on comparisons between different scanning programs and document resolutions, and believes that the OCR text layer was somehow added by a forger. Of course, a much simpler and more elegant explanation exists: the documents were scanned on a Fujitsu ScanSnap S1500 scanner into ScanSnap Manager, the software package used for scanning documents with that scanner. The scanner in question does not have TWAIN support, and therefore it is logical that the document was scanned in as follows:

The letter to Fuddy was scanned in as a PDF and the printout of the Hawaiian long form birth certificate was scanned in separately. Both documents were combined into a single PDF which showed some ‘artifacts’ such as an OCR text layer and lines and blocks which all remain invisible. The OCR text is an invisible layer on top of the PDF, which means that you can select certain words on the image and copy the OCR’ed word. However, due to the mediocre quality of the scan, the document shows few successfully OCR’ed words and many of the words are misspelled. Only the large-font words were consistently captured accurately, which makes sense in the above workflow.
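To illustrate what an invisible text layer looks like inside a PDF: in the PDF specification, the text rendering mode operator `Tr` with the value 3 means the text is neither filled nor stroked, so it can be selected and searched but never appears on screen. OCR layers are commonly written this way. The sketch below is only an illustration under stated assumptions: the content stream is a made-up, already-decompressed sample (not taken from document 35-1.pdf), and the function name is hypothetical.

```python
import re

# Hypothetical, already-decompressed page content stream (NOT from 35-1.pdf).
# "3 Tr" selects text rendering mode 3 (invisible); "0 Tr" is ordinary filled text.
SAMPLE_STREAM = b"""
q 612 0 0 792 0 0 cm /Im0 Do Q
BT 3 Tr /F1 24 Tf 100 700 Td (STATE OF HAWAII) Tj ET
BT 0 Tr /F1 10 Tf 100 50 Td (VISIBLE LABEL) Tj ET
"""

TEXT_BLOCK = re.compile(rb"\bBT\b(.*?)\bET\b", re.S)   # one BT ... ET text object
RENDER_MODE = re.compile(rb"(\d)\s+Tr")                # text rendering mode operator
SHOWN_TEXT = re.compile(rb"\((.*?)\)\s*Tj")            # simple literal strings only


def split_text_by_visibility(stream: bytes):
    """Return (invisible, visible) lists of strings shown in the stream.

    Simplification: each text object is assumed to set its own rendering
    mode. In a real PDF the mode persists until it is changed.
    """
    invisible, visible = [], []
    for block in TEXT_BLOCK.findall(stream):
        mode_match = RENDER_MODE.search(block)
        mode = int(mode_match.group(1)) if mode_match else 0  # 0 is the default mode
        words = [t.decode("latin-1") for t in SHOWN_TEXT.findall(block)]
        (invisible if mode == 3 else visible).extend(words)
    return invisible, visible


if __name__ == "__main__":
    hidden, shown = split_text_by_visibility(SAMPLE_STREAM)
    print("invisible (OCR-style) text:", hidden)
    print("visible text:", shown)
```

A real file would first need its streams decompressed, and real content streams use more text operators than `Tj`, so this is a reading aid rather than a PDF parser.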

I used my own OCR software to extract the text on the image and while more words were OCR’ed, the quality of the words is incredibly poor, with minor exceptions, including the large font portions.

For someone to properly understand a PDF, one cannot rely on studying it in Illustrator; instead, one has to do the somewhat harder work of decoding the original document into its objects and instructions. Not trivial, but also not that hard.

Hermitian: Assuming that the Obama LFCOLB PDF image on page 4 of the court document 35-1.pdf was created by means of a human operator scanning a printout of page 2 of court document 15-1.pdf in a Fugitsu ScanSnap #S1500 scanner into Acrobat 9 (with OCR turned on) then OCR would assign each word to one of the three following categories:

NBC: Note that the Producer was not the scanner but rather the software: PFU ScanSnap Manager 5.0.21 #S1500, which is to be used to scan. Some quick research reveals that:

ScanSnap Manager
This software is required to scan documents with the ScanSnap. The scanned image data can be converted to a PDF or JPEG file to be saved.

The scanner driver does not support TWAIN, which means that you cannot scan directly into Acrobat. Which explains why the paper capture plugin was used.

Producer: Adobe Acrobat 9.51 Paper Capture]

Hermitian:

1. Those words which are deciphered and made selectable

2. Those words which are deciphered but are flagged as suspect for errors – these words are also made selectable

3. Those words which are not deciphered – these words are not made selectable
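The three categories above can be pictured as a simple thresholding rule. The following is a toy model only: it assumes the OCR engine attaches a confidence score to each candidate word, and the thresholds, scores, and names are all hypothetical. It does not describe how Acrobat's Paper Capture actually decides.

```python
from dataclasses import dataclass


@dataclass
class OcrWord:
    text: str
    confidence: float  # 0.0 to 1.0, as a hypothetical engine might report it


# Hypothetical cut-offs; real OCR engines use their own internal criteria.
ACCEPT_THRESHOLD = 0.90
SUSPECT_THRESHOLD = 0.50


def categorize(word: OcrWord) -> int:
    """Map a word onto the three categories listed above.

    1 = deciphered and selectable
    2 = deciphered, flagged as suspect, still selectable
    3 = not deciphered, not selectable
    """
    if word.confidence >= ACCEPT_THRESHOLD:
        return 1
    if word.confidence >= SUSPECT_THRESHOLD:
        return 2
    return 3


if __name__ == "__main__":
    # Made-up scores for illustration; the words themselves appear in the post.
    samples = [OcrWord("OF", 0.97), OcrWord("Case", 0.70), OcrWord("Informant", 0.20)]
    for w in samples:
        print(f"{w.text!r}: category {categorize(w)}")
```

In such a model, a blurry low-resolution scan pushes most scores downward, so most words land in categories 2 and 3 without anyone intervening.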

The Obot claim is that this assumed work flow produced the LFCOLB image which comprises page 4 of court document 35-1.pdf. However, much of the text appearing on page 4 of court document 35-1.pdf was not deciphered and thus fell into category 3.

NBC: For good reasons. The quality of the scan was pretty poor since it was scanned as a 150 DPI document because it was mixed color/gray. The letter was scanned at a higher DPI setting.

Hermitian: Of the certificate words on the page 4 LFCOLB that were deciphered, most were marked as suspect. Those words (or characters) which were made selectable but were not flagged as suspect include “OF”, “61” (in the certificate number) and the typed Roman numeral “II”. The words (or numbers) “Case”, “Filed 05/04/12”, “Page” which are part of the original case label (i.e. the Green label) were also marked as suspect. However all of the text of both case labels was deciphered and made selectable.

A significant finding of the inspection (of page 4 of document 35-1.pdf) within Adobe Acrobat XI Pro was that none of the form text was deciphered by the purported OCR except for the words “STATE”, “HAWAII”, “CERTIFICATE OF LIVE BIRTH”, and “DEPARTMENT OF HEALTH”. The deciphered words are in the largest font printed on the certificate form. None of the smaller text printed on the form was deciphered and made selectable.

NBC: all of this points to an OCR of a low resolution document.
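A rough back-of-the-envelope calculation shows why resolution matters so much for small print. One point is 1/72 inch, so the pixel height available to a line of text is font size × DPI / 72. The font sizes below are assumptions chosen for illustration (I have not measured the certificate's fonts); only the 150 and 300 DPI figures come from the discussion in this post.

```python
POINTS_PER_INCH = 72  # definition of the typographic point used by PDF


def text_height_px(font_size_pt: float, dpi: int) -> float:
    """Pixel height of a line of text of the given point size at a scan resolution."""
    return font_size_pt * dpi / POINTS_PER_INCH


# Hypothetical font sizes: small form captions versus large headings.
ASSUMED_SIZES_PT = {"small form text": 6, "large heading": 14}

if __name__ == "__main__":
    for label, size in ASSUMED_SIZES_PT.items():
        for dpi in (150, 300):
            height = text_height_px(size, dpi)
            print(f"{label} ({size} pt) at {dpi} DPI: {height:.1f} px")
```

Under these assumed sizes, the small text gets 12.5 pixels of height at 150 DPI against 25 at 300 DPI, and the actual letter bodies are smaller than the nominal size. That leaves an OCR engine very little to work with, while the large headings still have plenty of pixels.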

Hermitian continues to describe various scenarios that have little relevance to the workflow in question. He decides, for no logical reason, to scan another document using another program, and concludes that under his scenario the OCR works better.

Hermitian: These results are atypical because the OCR algorithms included with the various versions of Adobe Acrobat typically detect more words than not – as do most of the popular OCR programs. Two popular programs are ABBY PDF Transformer Pro 3.0, and PDF-XChange Viewer Pro version 2.5.

For reference, I applied the ABBY PDF Transformer 3.0 program to the original WH LFCOLB PDF image. This PDF utility does both OCR and MRC. I turned the MRC off and scanned for OCR only. The ABBY OCR algorithm deciphered all of the typed text except for the word “Male”. The OCR scan also failed to decipher the form text “Sex”, “6a.”, “6c.”, “8.”, “20.”, “Other”, and in box 22 “Date Accepted by Reg.”, and the date stamp “AUG -8 1961”. The WH LFCOLB file is a one-page PDF file.

I also applied PDF-XChange Viewer Pro version 2.5 to scan the WH LFCOLB PDF image for OCR. All of the typed text was made selectable. The form text that was not made selectable included “Sex”, “6a.”, “6c.”, “8.”, “20.” and the Reg. General’s date stamp “AUG -8 1961”. All of the smallest form text was made selectable.

I also applied the ABBY PDF Transformer 3.0 OCR algorithm to document 15-1.pdf. Page 2 of document 15-1.pdf is identical to the WH LFCOLB image except for the case label added to the top edge of the page. The OCR algorithm deciphered the entire case label, and all of the typed text except for the one word “Male”. Additionally the form text (or numbers) “Sex”,“6a.”, “6c.”, “8.”, “20.”,“Other”, “Date Accepted by Reg.” and the associated date AUG -8 1961 were not deciphered. These OCR results (except for the added case label) are the same as for the WH LFCOLB image. Both pages of document 15-1.pdf were scanned for OCR.

Finally I also applied the ABBY PDF Transformer 3.0 OCR program to the four-page document 35-1.pdf. The scan deciphered both case labels and found all of the typed text with the exception of the “X” in the No box within form box 7g. The form text that was not deciphered included “5a. Month”, “5b. Hour”, “6b. Island”, “Town Limits”, “Island”, “7d. Street Address”, “ district”, “7g. Is Residence on a Farm or Plantation?”, “Mother”, “17b. Date Last Worked”, “Signature of Parent”, “Informant”, “Parent”, “Other”, “18b. Date of Signature”, “hour stated”, “M.D.”, “22. Date Accepted by Reg. General”, “AUG -8 19”. Additionally, the following warning was returned by the scan: “Page 4 Warning Check the document language”.

The difference in image resolution of the mostly text layer of the WH LFCOLB PDF image and the uniform resolution of the page 4 LFCOLB PDF image likely explains why less text was detected in this trial OCR scan of page 4 of document 35-1.pdf than the scans of the WH LFCOLB and the page 2 LFCOLB. The resolution (150 PPI) of the page 4 LFCOLB PDF image (last page of 35-1.pdf) is lower than the resolution (300 PPI) of the mostly text layer of the WH LFCOLB PDF image (and the page 2 LFCOLB PDF image). The smallest form text of the page 4 LFCOLB PDF image would be the most affected by the reduced resolution.

The Obot claim is that page 4 of document 35-1.pdf was created by a scan of a paper copy of page 2 of document 15-1.pdf.

NBC: A logical conclusion based on the evidence.

Hermitian: The METADATA from document 35-1.pdf indicates that the PDF document was created by PFU ScanSnap Manager 5.0.21 #S1500 and produced by the Adobe Acrobat 9.51 Paper Capture Plug-in. Thus the document would have been created by means of a Fugitsu ScanSnap S1500 scanner and Adobe Acrobat 9. The PDF document would have been created using the “PDF from scanner” mode in Acrobat 9 in a customized scan with “Make Searchable (Run OCR)” and “Optimized Scanned PDF” options selected.

NBC: You should really pay more attention to the Creator tag and look at the information about the scanner in question as this would lead you to reject your scenario.
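For readers who want to check the two tags themselves: in a PDF's document information dictionary, `/Creator` names the application that created the original content and `/Producer` names the application that wrote or converted the PDF. The sketch below pulls both out of a raw Info dictionary with a regular expression. The sample dictionary is hypothetical and merely reuses the two values quoted in this post, and the helper name is made up. Real files may store these strings hex-encoded, inside compressed object streams, or in XMP metadata, in which case a proper PDF library is needed.

```python
import re

# Hypothetical Info dictionary built from the two values quoted in the post.
SAMPLE_INFO = (
    b"<< /Creator (PFU ScanSnap Manager 5.0.21 #S1500) "
    b"/Producer (Adobe Acrobat 9.51 Paper Capture Plug-in) >>"
)


def info_field(raw: bytes, key: str):
    """Return the literal-string value of /key in raw PDF bytes, or None.

    Only handles simple "(...)" strings without nested parentheses.
    """
    pattern = re.compile(rb"/" + key.encode("ascii") + rb"\s*\((.*?)\)")
    match = pattern.search(raw)
    return match.group(1).decode("latin-1") if match else None


if __name__ == "__main__":
    print("Creator :", info_field(SAMPLE_INFO, "Creator"))
    print("Producer:", info_field(SAMPLE_INFO, "Producer"))
```

Read this way, the metadata describes two programs touching the file in sequence: ScanSnap Manager created the scan, and Acrobat's Paper Capture processed it afterwards.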

Hermitian: If this indeed was the actual workflow, then the results from the trial OCR scans of the three LFCOLB PDF images reported herein do not explain how the assumed workflow could have yielded the observed poor results. The reported results from the trial scans indicate that OCR should have detected most of the text on the page 4/11 LFCOLB but it did not.

NBC: Hence the scenario about the workflow is suspect and in fact, likely wrong.

Hermitian: This was first detected when the page 4/11 LFCOLB PDF image was opened in Adobe Acrobat XI Pro and the “Find All Suspects” tool was applied. The “Select Text” tool was also utilized. Much of the text on page 4 of 35-1.pdf was found to be not selectable. Of the selectable text, most was also flagged as suspect. The words “Case”, “Filed 05/04/12” and “Page” in the original case label (i.e. the Green label) were flagged as suspect. However, both case labels were entirely selectable. Of the identified words and numbers that were made selectable by the purported OCR only the word “OF”, the number “61” and the Roman numeral “II” were not flagged as suspect.

The findings reported herein indicate that the particular words on page 4 of document 35-1.pdf that were made selectable did not result solely from the application of OCR. Rather it is more likely that human intervention also occurred. Otherwise, why was only the largest printed text on the certificate form made selectable? Then, more importantly, why was none of the smaller form text made selectable in this purported OCR scan?

NBC: Quite simple: only the largest print was readable by the OCR software. Why this requires human intervention, why that intervention would insert misspelled words, or why a human would even intervene in such a way is left totally unexplained. Again, a workflow different from the one imagined by Hermitian will be shown to be more likely.

Hermitian: If not this scenario, then the peculiar internal structure of the page 4 PDF image must have defeated the Adobe Acrobat OCR scan. This scenario is also unlikely assuming that the PDF file was created by first scanning a paper document to create a flattened bitmap image and then embedding this bitmap image into a single layer within a PDF document.

NBC: A simple scenario emerges: The scanner does not support TWAIN scanning but instead relies on the software identified as the creator (note the ‘manager’ in the creator part…). The image is scanned to a color PDF with a likely resolution of 150 DPI. When it is subsequently imported into Acrobat, Paper Capture fails to properly capture the information. So simple.

41 thoughts on “MS – Orly v Democrat – Henry Blake Affidavit – Part 3”

  1. “NBC: Note that the Producer was not the scanner but rather the software: PFU ScanSnap Manager 5.0.21 #S1500, which is to be used to scan. Some quick research reveals that:

    “ScanSnap Manager

    “This software is required to scan documents with the ScanSnap. The scanned image data can be converted to a PDF or JPEG file to be saved.

    “The scanner driver does not support TWAIN, which means that you cannot scan directly into Acrobat. Which explains why the paper capture plugin was used.

    “Producer: Adobe Acrobat 9.51 Paper Capture]”

    Make that Adobe Acrobat 9.51 Paper Capture Plug-in. It’s that last word “Plug-in” that’s important here. Now a “Plug-in”, as the name implies, plugs into something. NBC seems to think that the “Plug-in” plugs into his Fugitsu ScanSnap S1500 scanner. However, to the contrary, beginning with Adobe Acrobat 5, the Paper Capture Plug-in is incorporated into Adobe Acrobat.

    [NBC: I am very aware of how the plugin is one belonging to Acrobat duh…]

    See: http://www.adobe.com/support/pdfs/CapturePlugInHelp.pdf

    Because we are now up to version 11 of Acrobat, we can safely assume that the “Adobe Acrobat Paper Capture Plug-in” plugs into Adobe Acrobat. Therefore one can not use the plug-in without using what it plugs into. Which means that one cannot scan to the “Adobe Acrobat Paper Capture Plug-in” without scanning directly into Adobe Acrobat. Ever since version 5 of Acrobat this has been a constant source of confusion for Acrobat users which has continued through, at least, version 9.

    [NBC: The paper capture plugin can import PDF’s]

    See: http://forums.adobe.com/thread/303423

    This poor guy was not able to find the plug-in in his version 9. Fortunately, for him, it was there all along.

    And unfortunately, for NBC’s latest workflow scenario, the “Adobe Acrobat Paper Capture Plug-in 9.5.1 was there in Acrobat 9 when the secretary scanned the print-out of page 2 of document 15-1.pdf.

    So all of this stuff spouted by NBC about the Fugitsu does not support TWAIN and therefore it cannot scan directly into Acrobat is just more Obot Bolony (Bologna).

    Fortunately for all users of Fitgitsu scanners, the Fugitsu company knows better than NBC, and that is why they sell the Fugitsu ScanSnap S1500 bundled with Adobe Acrobat.

    Well Duh !!!

    [NBC: It’s Fujitsu btw… You do know how the workflow scans the document into a Fujitsu-provided software? I assume that you did read the user manuals 🙂. Hermitian should have read the next link in Google Search:

    The Paper Capture Plug-in for Windows is a plug-in for Adobe® Acrobat® 5.0 that is based on the same technology as the Paper Capture Online service. It enables the “capturing” of an Adobe Portable Document Format (PDF) file on a user’s computer without sending the file over the Internet.

    The Paper Capture Plug-in performs OCR (optical character recognition) on a PDF image file. The benefit of capturing the file is that it allows for the text to be searched and copied.

    ]

    Poor Hermitian, such simple research and he stumbles at every step

    Oh and as to TWAIN drivers

    You can create a PDF file directly from a paper document, using your scanner and Acrobat. On Windows, Acrobat supports TWAIN scanner drivers and Windows Image Acquisition (WIA) drivers.

    ]

    Cheers.

  2. Just a brief post to clear up a point of confusion that NBC and Vicklund seem to have about the “low resolution” of the purported color print-out that was purportedly scanned to create page 4 of document 35-1.pdf.

    If I understand their claimed workflow, and believe me we are trying to hit a moving target here, the secretary first used her color laser printer to produce a color print of page 2 of document 15-1.pdf. She then placed the color printout on the glass of her Fugitsu ScanSnap S1500 scanner and scanned directly to PDF in Adobe Acrobat 9. She thus had to have previously selected “Create PDF from scanner” / “Custom Scan” / and would have checked the box “Make Searchable (RUN OCR) within Adobe Acrobat 9.

    Now we know absolutely nothing about the print resolution that she chose for the printout of page 2 of document 15-1.pdf. Thus we know nothing about the resolution of the paper document that she scanned.

    We do know that the PDF image that comprises page 2 of PDF document 15-1.pdf is identical to the WH LFCOLB PDF image except for the case label added to the top edge of the page.

    And therefore we know that the background layer of the page 2 LFCOLB PDF image (as seen on the screen of the secretary’s monitor) was 150 PPI and that the resolution of all the non-background layers was 300 PPI. But we don’t know the resolution of the color printout.

    Consequently, the color printout of page 2 of document 15-1.pdf has some unknown resolution of XXX.dpi.

    Now both NBC and Vicklund are assuming that the resolution of this printout was the same pixel resolution of the LFCOLB PDF image on page 4 of Document 35-1.pdf. So they are assuming that the color printout of page 2 of document 15-1.pdf had an equivalent pixel resolution of 150 PPI. Hence they are assuming that XXX.dpi = 150 PPI.

    And then they are claiming that their “assumed” low resolution for the color printout explains the poor OCR results observed for the page 4 LFCOLB.

    And then I am assuming that they don’t know what the Hell they are talking about !

    A subtle, but not unimportant, fact is that we know absolutely nothing about the source of the paper document that NBC and Vicklund claim that was scanned to produce page 4 of document 35-1.pdf. Consequently, we do not know the actual chain of custody between the LFCOLB PDF image on page 2 of PDF document 15-1.pdf and the LFCOLB PDF image on page 4 of document 35-1.pdf. Thus, there is an unknown chain of custody between page 2 of PDF document 15-1.pdf and page 4 of PDF document 35-1.pdf.

    And of course we also know that PDF document 15-1.pdf and PDF document 35-1.pdf were both filed (by the MDEC attorneys) in the same Obama ballot challenge law suit brought by Orly Taitz and her plaintiffs into Federal District Court, Southern District of Mississippi.

  3. Here’s Hermie’s bizarre workflow:

    1. Document 15-1.pdf is opened in Adobe Acrobat and then page 2 is extracted into a separate one-page PDF document.

    This is actually somewhat reasonable. But not the only reasonable workflow, nor is it the most efficient.

    2. The object containing the existing Document 15-1 case label is then selected and its color is changed from bright Blue to light Green.

    This doesn’t make any sense. Why deliberately change the color of the case label? There’s no reason to change the color on purpose. If it happens as, say part of the printing and scanning process, it’s expected that the colors will shift, but in that case, it’s accidental, not deliberate.

    3. The bright-Blue case label for Document 35-1 is then typed above the light Green case label of Document 15-1 within the margin of the (page 4/11) LFCOLB PDF image.

    This is wrong for three reasons. First, it’s in the wrong order. You don’t type in headers individually. The preferred workflow would be to merge the file with the Fuddy letter first, then create the header and let it do the page numbering automatically. Second, it’s totally unnecessary. As you’ve been told numerous times, it is the court that adds the header when you submit the file. So this step wouldn’t even be in the workflow. Third, in your workflow, the second case label would superimpose the first case label – we’ve seen this happen in other case filings. You would first have to scale down the image without changing the page size, since the header location is fixed by the court.

    4. The resulting (page 4/11) LFCOLB one-page PDF image file is then merged with the three-page Tepper-to-Fuddy letter to create the Document 35-1.

    Other than being out of order, this is fine.

    So to sum up, you have two unnecessary steps (2&3), a step out of order (step 4 should be before step 3, if step 3 were actually necessary), and a missing step (shrink the image so the two case labels don’t overlap).

  4. If I understand their claimed workflow, and believe me we are trying to hit a moving target here, the secretary first used her color laser printer to produce a color print of page 2 of document 15-1.pdf.

    What evidence do you have that the printer was a laser printer rather than an inkjet?

    She then placed the color printout on the glass of her Fugitsu ScanSnap S1500 scanner and scanned directly to PDF in Adobe Acrobat 9.

    This is incorrect. The METADATA indicates that it was scanned directly to PDF in PFU ScanSnap Manager 5.0.21 #S1500.

    She thus had to have previously selected “Create PDF from scanner” / “Custom Scan” / and would have checked the box “Make Searchable (RUN OCR) within Adobe Acrobat 9.

    This is impossible. The ScanSnap Manager is not TWAIN compliant, which means you can’t use Acrobat to run the scanner. You instead have to scan it from the scanner, and then open it in Acrobat and run the OCR plugin.

    You also seem to have missed the middle step: filing the printout for several days while they waited for the response from Fuddy. And it’s possible that there was a photocopy step before that. Here’s the workflow:

    1. Print letter to Fuddy
    2. Sign letter to Fuddy
    3. Print 15-1 page 2 (LFBC with single case file label) once or twice with default “Fit to Page” setting
    4. Copy signed letter to Fuddy (and LFBC if only printed once)
    5. Send Fuddy letter and LFBC
    6. File hardcopy of Fuddy letter and LFBC in same folder
    7. Wait for response from Fuddy
    8. Receive response from Fuddy
    9. Retrieve hardcopy of Fuddy letter and LFBC
    10. Scan each from scanner
    11. Merge the two files in Adobe Acrobat
    12. Run OCR (might be part of step 11)
    13. Send to court electronic filing system

    Your scenario would actually add steps between 10 and 11 – instead of scanning an already existing hardcopy that is filed with another document you have to scan, you would have the poor paralegal open a pdf, extract one page, and shrink it so the case labels won’t overlap (plus the unnecessary steps you threw in for no reason). That would also mean that you are sending to the court something other than a copy of what you sent Fuddy.

  5. W. Kevin Vicklund

    July 10, 2013 13:32

    “This is wrong for three reasons. First, it’s in the wrong order. You don’t type in headers individually. The preferred workflow would be to merge the file with the Fuddy letter first, then create the header and let it do the page numbering automatically. Second, it’s totally unnecessary. As you’ve been told numerous times, it is the court that adds the header when you submit the file. So this step wouldn’t even be in the workflow. Third, in your workflow, the second case label would superimpose the first case label – we’ve seen this happen in other case filings. You would first have to scale down the image without changing the page size, since the header location is fixed by the court.”

    I believe that any law firm who is offering up the official long-form birth certificate of the President of the United States in an official electronic court filing document would place the importance of the chain of custody above the work load of their secretary.

    Given that they had already produced document 15-1.pdf to the court, then they have a legal burden (under the best evidence rule) to produce the second LFCOLB copy with a legal chain of custody to the first LFCOLB copy.

    So let’s examine your proposed work flow (in light of the chain of custody requirement) and assuming that the court applies the case label as you claim.

    You suggest that the secretary first produced a color print of page 2 of document 15-1.pdf. Now we are assuming that the court had previously applied the 15-1 case label in pure Blue color at the top edge of the page 2 in their normal position on the page. Hence the color printout would have a printed image of the 15-1 case label in Blue color.

    Under your assumed work flow, the secretary then scanned the color printout of page 2 of document 15-1.pdf to produce the page 4 PDF image for document 35-1.pdf. She then merged the one-page PDF scanned image of page 2 of document 15-1 with the 3 page PDF document containing the Fuddy letter. The law firm numbers the pages of 35-1.pdf and then files it electronically with the court.

    The court then adds the 35-1 case label to the 35-1.pdf PDF document in pure Blue color which is their normal practice.

    Then under your proposed work flow, the LFCOLB PDF image on page 4 of document 35-1.pdf would have two superimposed Blue labels at the top. The Blue 35-1 case label would be in a scalable font typed over the Blue image of the 15-1 case label.

    Now let’s compare that with my preferred work flow, again assuming that the court applies the case labels.

    The secretary opens the PDF document 15-1.pdf and extracts page 2 into a single-page PDF document. She then selects the Blue 15-1 case label and changes its color from Blue to Green. She also repositions the label to a slightly lower position on the page. She then merges this one-page PDF with the 3-page Fuddy letter.

    The law firm then electronically files the PDF document 35-1.pdf with the court.

    The court then applies the 35-1 case label to document 35-1.pdf.

    Then page 4 LFCOLB PDF image then has two case labels (the 15-1 label in Green and the 35-1 label in Blue). The Blue label is above the Green label.

    Other than the two case labels, the page 4 LFCOLB PDF image is then identical to the WH LFCOLB PDF image. Consequently, the chain of custody between the page 4 LFCOLB PDF image and the WH LFCOLB image is maintained.

  6. So why doesn’t Obama allow Congress to examine the document he has posted on his white house webpage and let them do their own analysis? After all, Congress would simply be proving a positive.

    Sure, he doesn’t have to do this, but it would help clear up this one point and put this particular matter to rest for the good of the country.

    ex animo
    davidfarrar

  7. Wow… The Short Bus Squad has shown up! “So why doesn’t Obama allow Congress to examine the document he has posted on his white house webpage…” He presented the authenticated hard copies at an open press conference! Any one in Congress could have gone to see them, and they didn’t. Why? ‘Cause they’re not as stupid as Birthers.

    BHO isn’t hiding anything! He has proven his birth and citizenship more than ANY OTHER PRESIDENT, EVER. God, you guys are dim.

  8. Mr. Farrar:

    Why should he bother? Is Congress asking for it? Except for a fringe goober faction, no one really cares.

  9. David Farrar – “So why doesn’t Obama allow Congress to examine the document he has posted on his white house webpage and let them do their own analysis? ”

    How do you know he’s not willing to do that? Have you asked your congressman or senator to call or walk down to the White House to view the BC? Maybe some members have already done that. And that’s why Speaker Boehner won’t pursue the BC issue.

  10. I believe that any law firm who is offering up the official long-form birth certificate of the President of the United States in an official electronic court filing document would place the importance of the chain of custody above the work load of their secretary.

    There is no chain of custody to preserve. The birth certificate was never offered as evidence, and even if it was, it was already filed with the court. Sending out copies doesn’t break chain of custody. They are copies, and the court understands that they are copies. Frankly, the “best evidence” of what they sent would be an actual copy of what they sent that includes all the changes caused by printing (such as reduced image size and color differences), rather than a previous version. That way the court could see what Fuddy saw. Also note that an actual paper copy was filed with the court.

    You suggest that the secretary first produced a color print of page 2 of document 15-1.pdf. Now we are assuming that the court had previously applied the 15-1 case label in pure Blue color at the top edge of the page 2 in their normal position on the page. Hence the color printout would have a printed image of the 15-1 case label in Blue color.

    Yes. However, as anyone who has worked with color printers knows, the color on the page is not the color on the screen, and especially with inkjets, colors bleed at the edges. So the blue will not be the same on the printed paper, and will blur with the green of the background, which is also why the basketweave color is no longer a bright green. Also, as anyone who has extensive experience printing PDFs knows, the default print setting in Adobe (Reader and Acrobat) is “Fit to Paper.” With most printers, there is an unprintable margin, and thus PDFs end up printing at about 96%.

    Under your assumed work flow, the secretary then scanned the color printout of page 2 of document 15-1.pdf to produce the page 4 PDF image for document 35-1.pdf.

    Yes. And in so doing, the image was optimized, causing the colors to blur even more. That’s why, when you view the image at 800% or more, you can see that the center of the letters in the first case label are blue, and fade to blue-green at the edges.

    She then merged the one-page PDF scanned image of page 2 of document 15-1 with the 3 page PDF document containing the Fuddy letter.

    Yep. Possibly does an OCR, but without checking the results, or the OCR might have just been automatic.

    The law firm numbers the pages of 35-1.pdf and then files it electronically with the court.

    WRONG! The court does that. It’s part of the case label.

    The court then adds the 35-1 case label to the 35-1.pdf PDF document in pure Blue color which is their normal practice.

    Yep.

    Then under your proposed work flow, the LFCOLB PDF image on page 4 of document 35-1.pdf would have two superimposed Blue labels at the top. The Blue 35-1 case label would be in a scalable font typed over the Blue image of the 15-1 case label.

    Nope! By printing at the default “Fit to Page” the image prints at 96%, causing the first case label to shift downwards and shrink. I measured it, the image is at 96%, and using 5 different printers from 4 manufacturers, each one prints the 15-1 page 2 at 96%.

    Did I mention that I hate “Fit to Page”? It ignores white space at the edges of the page when it determines whether it needs to add in the unprintable margin. So I’m constantly having to remember to change the setting if I’m bouncing from different page sizes.
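The 96% figure is easy to sanity-check with arithmetic. The sketch below assumes US Letter paper, a uniform unprintable margin of 1/6 inch on every edge, and a scaled image that is centered on the page. The margin, the centering, and the 0.25 inch label position are all assumptions for illustration, not measurements from the actual printer or filing.

```python
def fit_scale(page_w: float, page_h: float, margin: float) -> float:
    """Scale factor needed to fit a full page inside the printable area."""
    printable_w = page_w - 2 * margin
    printable_h = page_h - 2 * margin
    # The tighter dimension decides the scale so that nothing is clipped.
    return min(printable_w / page_w, printable_h / page_h)


def shifted_top_offset(offset_from_top: float, page_h: float, scale: float) -> float:
    """New distance from the top edge for an item on a scaled, centered page."""
    top_gap = (page_h - page_h * scale) / 2  # white band added above the image
    return top_gap + offset_from_top * scale


if __name__ == "__main__":
    LETTER_W, LETTER_H = 8.5, 11.0   # inches
    ASSUMED_MARGIN = 1 / 6           # inches, hypothetical printer margin
    ASSUMED_LABEL_TOP = 0.25         # inches, hypothetical case label position

    scale = fit_scale(LETTER_W, LETTER_H, ASSUMED_MARGIN)
    new_top = shifted_top_offset(ASSUMED_LABEL_TOP, LETTER_H, scale)
    print(f"scale: {scale:.1%}")
    print(f"label top moves from {ASSUMED_LABEL_TOP:.2f} in to {new_top:.2f} in")
```

With these assumptions the width is the limiting dimension and the page prints at roughly 96%, which also moves the printed copy of the first case label down the page and out from under the spot where the court stamps the next label.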

  11. So why doesn’t Obama allow Congress to examine the document he has posted on his white house webpage and let them do their own analysis? After all, Congress would simply be proving a positive.

    Why would Congress want to examine the document? And as to ‘being good for the country’, it would not end the foolishness of the birthers who have chosen to drag down our President. Why would he pay attention to these people?…

  12. So let’s examine your proposed work flow (in light of the chain of custody requirement) and assuming that the court applies the case label as you claim.

    There is no chain of custody requirement here. Sigh…

  13. The secretary opens the PDF document 15-1.pdf and extracts page 2 into a single-page PDF document. She then selects the Blue 15-1 case label and changes its color from Blue to Green. She also repositions the label to a slightly lower position on the page. She then merges this one-page PDF with the 3-page Fuddy letter.

    ROTFL… So again, what is the problem, even under your scenario there is no ‘forger’…

    Sigh… The simple reality is that the letter and the birth certificate copied from the original filing were sent together to be verified. The document was scanned in again to reflect what the document sent looked like and merged with the scan of the letter. Nothing nefarious. But even under your preferred scenario, nothing nefarious happened other than your ‘chain of custody’ claims. ROTFL. As I said, nothing much really. And of course by scanning the actual documents sent to Fuddy, there is a great chain of custody.
    So is your hope that the court will reject the filing because of your misunderstanding of the law of evidence?

    PS, Fuddy verified the document found on the White House website.

  14. We’ve been assuming that the OCR was done by the law firm. Is it possible it was done by the court clerk? Not that it really matters, but that could explain why it was never checked. Clerk receives document, adds the case label, runs OCR, puts it in the electronic case file.

  15. We can look at other documents, but this is a good point as the court workflow does add automatic labels. Is it the filing system software?

  16. Using Adobe Acrobat 9.0, with the PaperCapture 9.1 plugin, I just ran an OCR on page 11 of the 12 page combined document 35. This was without flattening or anything, so the existing OCR remained. It was able to pick up only two additional words, “Highway” (from the address) and “African” (from the father’s race). I will run a more complete experiment when I have more time tonight, but it looks like PaperCapture is not as capable an OCR program as the ones Hermie was using. I expect that after flattening, PaperCapture 9.1 will perform OCR similarly (but not identically, as it’s a slightly different version) to what we see in the page 4/11 document – picking up most of what was picked up, missing a couple of words that were picked up, picking up some words that were not picked up, and missing almost all of what was missed.

  17. Results of the experiment after removing bookmarks, hidden text, and hidden objects:

    Differences from the original page 4/11:

    Words missed

    “STATE OF HAWAII” in upper left corner (form text)

    “Maternity & Gynecological Hospital” from Name of Hospital (typed text)

    “University” from father’s Type of Business (typed text)

    Words picked up

    “Highway” from mother’s Address (typed text)

    “African” from father’s Race (typed text)

    All other text picked up or missed by the original page 4/11 are the same

    This is substantial corroboration that using PaperCapture 9.51 was why only a handful of words were picked up by running OCR. The minor differences can be attributed to the difference in versions.
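
The comparison above amounts to a set difference between two OCR runs. A minimal sketch of that bookkeeping follows. Only the differing items come from the lists above. The `SHARED` placeholder stands in for everything both runs recognized identically, and the function name is hypothetical.

```python
def compare_ocr_runs(original_words, rerun_words):
    """Return (missed, gained): items only in the original, items only in the rerun."""
    original, rerun = set(original_words), set(rerun_words)
    return sorted(original - rerun), sorted(rerun - original)


# Only the differing items are taken from the comment; SHARED is a
# placeholder for all the text both runs recognized identically.
SHARED = {"<text recognized by both runs>"}
original_page = SHARED | {
    "STATE OF HAWAII",
    "Maternity & Gynecological Hospital",
    "University",
}
rerun_page = SHARED | {"Highway", "African"}

missed, gained = compare_ocr_runs(original_page, rerun_page)
print("missed by rerun:", missed)
print("picked up by rerun:", gained)
```

Because the shared text cancels out, only the handful of differing items needs to be listed to reproduce the comparison.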

  18. Changes to the METADATA:

    Modified: [today’s date and time]

    PDF Producer: Adobe Acrobat 9.2 Paper Capture Plug-in
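
For readers who want to check the producer string without opening Acrobat, here is a rough sketch that scans raw PDF bytes for `/Producer` entries. It assumes the Info dictionary is unencrypted, stored outside any compressed object stream, and written as a literal string without nested parentheses. The sample bytes are synthetic, not taken from any real filing.

```python
import re

# Matches /Producer (literal string), allowing backslash escapes inside.
PRODUCER_RE = re.compile(rb"/Producer\s*\(((?:\\.|[^\\()])*)\)")


def find_producers(pdf_bytes):
    """Return every literal-string /Producer value found in the raw bytes."""
    return [m.group(1).decode("latin-1") for m in PRODUCER_RE.finditer(pdf_bytes)]


# Synthetic Info dictionary for illustration; not bytes from any real filing.
sample = b"1 0 obj\n<< /Producer (Adobe Acrobat 9.2 Paper Capture Plug-in) >>\nendobj\n"
producers = find_producers(sample)
print(producers)
```

A file saved with incremental updates can carry more than one Info dictionary, so the function returns a list rather than a single value. A proper PDF parser is the safer choice when the file is compressed or encrypted.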

  19. This is substantial corroboration that using PaperCapture 9.51 was why only a handful of words were picked up by running OCR. The minor differences can be attributed to the difference in versions.

    Excellent research!!! Will turn your findings and others into a separate posting.

  20. What’s fascinating about this is how Hermitian is so obviously out of his depth to the casual observer, but has made such a significant emotional investment in a truly impressive waste of time that he cannot let go. What’s that old adage–“the need to believe trumps everything?”

  21. Results of the experiment after removing bookmarks, hidden text, and hidden objects:

    Any evidence of new hidden elements?

  22. I find Hermitian fascinating because he is so convinced of the document being a forgery.

  23. You are right, see my latest posting on black rectangles.

    They are artifacts left behind by forms, I believe.

  24. NBC and Vicklund are still way off base. You would think that by now even an Obot could get it right!

    The PFU ScanSnap Manager 5.0.21 #S1500 can produce PDF files but does not have OCR capability. That’s why the Fujitsu ScanSnap S1500 Scanner is bundled with both Adobe Acrobat X and ABBYY FineReader for ScanSnap. Acrobat can create PDF files directly from the Fujitsu ScanSnap S1500 and apply OCR on the same PDF file. There are three different OCR modes that the operator can choose from.

    ABBYY FineReader for ScanSnap can also do OCR on PDF files (but only the PDF files that are created by the PFU ScanSnap Manager 5.0.21 #S1500). The bundled version of ABBYY FineReader does not do OCR on Acrobat PDF files. The OCR capabilities of ABBYY FineReader are beyond those of the ABBYY PDF Transformer 3.0 that I used in my trials. In spite of that fact, my OCR trials using ABBYY PDF Transformer 3.0 picked up more words than the (purportedly) OCR-deciphered words observed on the page 4 LFCOLB PDF image from document 35-1.pdf.

    From the Fujitsu ScanSnap S1500 Operator’s Manual:

    “Adobe Acrobat X Standard®

    “An industry-standard application to create, edit, and manage electronic documents in
    PDF format.
    *1 : Only bundled with S1500.
    *2 : Only bundled with S1500M.

    “Adobe Acrobat : Adobe® Acrobat®

    “All the descriptions in this manual assume the usage of Adobe Acrobat
    bundled with the ScanSnap. Unless otherwise specified, the term Adobe
    Acrobat refers to the Adobe Acrobat bundled with the ScanSnap.

    “ABBYY FineReader for ScanSnap®

    “All the descriptions in this manual assume the usage of ABBYY
    FineReader for ScanSnap bundled with the ScanSnap. Unless otherwise
    specified, the term ABBYY FineReader for ScanSnap refers to the ABBYY
    FineReader for ScanSnap bundled with the ScanSnap.

    “Note that ABBYY FineReader for ScanSnap may be upgraded without
    notice.

    “If the descriptions differ from the actual displayed screens, refer to the
    ABBYY FineReader for ScanSnap Help.

    “About the OCR function of ABBYY FineReader for ScanSnap

    “ABBYY FineReader for ScanSnap is an application used exclusively with the ScanSnap. This
    program can perform text recognition only for PDF files created by using the ScanSnap. It
    cannot perform text recognition for files created using Adobe Acrobat or other applications.

    “This application can perform text recognition on the scanned images using OCR
    (Optical Character Recognition) and convert the image data to Word, Excel or
    PowerPoint files that can be edited.”

    So the bottom line is that the importance of your PDF Creator Tool: PFU ScanSnap Manager 5.0.21 #S1500 to our debate over OCR is nil.

    If page 4 of document 35-1.pdf was really created by scanning a paper printout of page 2 of document 15-1.pdf as you claim, then the unknown resolution of the printed copy and the choice of OCR software (i.e. Adobe Acrobat 9 vs ABBYY FineReader ScanSnap ???) are far more important to the OCR issue.

    By the way, 150 dpi is the standard print resolution for when a printer is used to make document copies in lieu of a Xerox.

  25. Good to see our message finally sunk in. It’s amusing to see you take our position and berate us for it, but whatever. Now, we happen to know which OCR software was used, since the METADATA tells us this. That is what the discussion over the last 48 hours was all about – what the OCR software actually used can do, not what a higher-end software package that wasn’t used can do.

    But at least you’re starting to catch up to us.

  26. But at least you’re starting to catch up to us.

    Slowly, and he is not so much catching up as falling behind less quickly🙂

  27. So the bottom line is that the importance of your PDF Creator Tool: PFU ScanSnap Manager 5.0.21 #S1500 to our debate over OCR is nil.

    Au contraire… The Manager is integrated with the Adobe PDF Library, which provides all the PDF and OCR functionality.

    Sorry my friend, your batting average remains remarkably low.

  28. No new hidden elements. I think that’s from the scan itself.

    Wait! I missed a step. I was fiddling around with converting to Form, and realized that while I had used the Examine Document navigation panel to delete hidden items and hidden text, I hadn’t checked the Content navigation panel. So I went back and made sure to remove all of the extra stuff that showed in the Content panel as well, and then reran the OCR scan.

    I got black rectangles.

    In fact, I got several black rectangles in the same spots as in the original; others overlapped the original but were not identical.

    No hidden lines. That may be from the scanner, or it may be from the later version of Paper Capture.

    Also of note: running the experiment multiple times, I found that it doesn’t always select the exact same text. The latest one, for example, caught STANLEY (mother’s name), but not AFRICAN. Also, it’s lumping the OCR text in with the case label, which I didn’t delete, rather than into separate Container objects. Which suggests another experiment…
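
The “same spot” versus “overlapping but not identical” distinction for the black rectangles can be made precise with bounding boxes. This is a minimal sketch only: the coordinates are hypothetical PDF points rather than measurements from either document, and the 0.5 point tolerance is an arbitrary assumption.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x0: float
    y0: float
    x1: float
    y1: float


def classify(a: Rect, b: Rect, tol: float = 0.5) -> str:
    """Classify two axis-aligned rectangles as identical, overlapping, or disjoint."""
    if all(abs(p - q) <= tol for p, q in zip(a, b)):
        return "identical"
    overlap_w = min(a.x1, b.x1) - max(a.x0, b.x0)
    overlap_h = min(a.y1, b.y1) - max(a.y0, b.y0)
    return "overlapping" if overlap_w > 0 and overlap_h > 0 else "disjoint"


# Hypothetical coordinates in PDF points; none are measured from the filing.
original = Rect(100, 200, 180, 212)
print(classify(original, Rect(100, 200, 180, 212)))
print(classify(original, Rect(120, 204, 210, 218)))
print(classify(original, Rect(300, 400, 350, 410)))
```

Running this over every pair of rectangles from the original page and from a rerun would give a count of identical, overlapping, and unmatched rectangles instead of an eyeball comparison.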

  29. Back from experiment. When the case label was deleted, (just to clarify, I’m talking about the 35-1 label), the OCR put the text into Container objects.

    But wait! There’s more!

    This time, not only did I get the black rectangles, I also got hidden lines! As with the text and the rectangles, they are similar to the original, but not always identical, which would point to either a difference between versions and/or the weird effect other objects have on the OCR scan (ie, hidden lines added or different words being picked up because of the presence or absence of other objects).

  30. I got black rectangles.

    Well, Hermitian got you exactly where he wanted you to be… Debunking his follies🙂

    Well done

  31. This time, not only did I get the black rectangles, I also got hidden lines! As with the text and the rectangles, they are similar to the original, but not always identical,

    Poor Hermitian must really be happy now🙂

  32. Hey hermie, you missed a hell of a lot here:

    W. Kevin Vicklund permalink
    July 10, 2013 17:56

    Using Adobe Acrobat 9.0, with the PaperCapture 9.1 plugin, I just ran an OCR on page 11 of the 12 page combined document 35. This was without flattening or anything, so the existing OCR remained. It was able to pick up only two additional words, “Highway” (from the address) and “African” (from the father’s race). I will run a more complete experiment when I have more time tonight, but it looks like PaperCapture is not as capable an OCR program as the ones Hermie was using. I expect that after flattening, PaperCapture 9.1 will perform OCR similarly (but not identically, as it’s a slightly different version) to what we see in the page 4/11 document – picking up most of what was picked up, missing a couple of words that were picked up, picking up some words that were not picked up, and missing almost all of what was missed.

    W. Kevin Vicklund permalink
    July 10, 2013 18:36

    Results of the experiment after removing bookmarks, hidden text, and hidden objects:

    Differences from the original page 4/11:

    Words missed

    “STATE OF HAWAII” in upper left corner (form text)

    “Maternity & Gynecological Hospital” from Name of Hospital (typed text)

    “University” from father’s Type of Business (typed text)

    Words picked up

    “Highway” from mother’s Address (typed text)

    “African” from father’s Race (typed text)

    All other text picked up or missed by the original page 4/11 are the same

    This is substantial corroboration that using PaperCapture 9.51 was why only a handful of words were picked up by running OCR. The minor differences can be attributed to the difference in versions.

    W. Kevin Vicklund permalink
    July 10, 2013 18:47

    Changes to the METADATA:

    Modified: [today’s date and time]

    PDF Producer: Adobe Acrobat 9.2 Paper Capture Plug-in

    W. Kevin Vicklund permalink
    July 10, 2013 22:45

    No new hidden elements. I think that’s from the scan itself.

    W. Kevin Vicklund permalink
    July 11, 2013 13:14

    No new hidden elements. I think that’s from the scan itself.

    Wait! I missed a step. I was fiddling around with converting to Form, and realized that while I had used the Examine Document navigation panel to delete hidden items and hidden text, I hadn’t checked the Content navigation panel. So I went back and made sure to remove all of the extra stuff that showed in the Content panel as well, and then reran the OCR scan.

    I got black rectangles.

    In fact, I got several black rectangles in the same spots as in the original; others overlapped the original but were not identical.

    No hidden lines. That may be from the scanner, or it may be from the later version of Paper Capture.

    Also of note: running the experiment multiple times, I found that it doesn’t always select the exact same text. The latest one, for example, caught STANLEY (mother’s name), but not AFRICAN. Also, it’s lumping the OCR text in with the case label, which I didn’t delete, rather than into separate Container objects. Which suggests another experiment…

    W. Kevin Vicklund permalink
    July 11, 2013 13:25

    Back from experiment. When the case label was deleted, (just to clarify, I’m talking about the 35-1 label), the OCR put the text into Container objects.

    But wait! There’s more!

    This time, not only did I get the black rectangles, I also got hidden lines! As with the text and the rectangles, they are similar to the original, but not always identical, which would point to either a difference between versions and/or the weird effect other objects have on the OCR scan (ie, hidden lines added or different words being picked up because of the presence or absence of other objects).

  33. Hermitian really could benefit from reading through these postings and your comments. They help explain why I feel comfortable with my conclusions about both the WH PDF and document 35-1.

    So far Hermitian has done little to support his findings or explain his working hypothesis.

  34. I read through all this stuff when he posted it. But you promised to update his results on this blog.

    Thus far the page count for hard results that I have posted on Scribd is 39 pages for me and zip for you and Vicklund.

    And Vicklund claims to be a master printer. And you claim to be his mindreader.

    How about you and Vicklund backing up your bravado by posting your own sworn affidavits on Scribd?

    And then I can proceed to rip them to shreds.

    Oh! I see!

    You would prefer not to put your wild conjectures on the record?

    I thought so. All you Obots can dish out the criticism but you can’t stand to have any of your verbose braggadocio critiqued.

  35. I read through all this stuff when he posted it. But you promised to update his results on this blog.

    I did and failed…. But I am in the process of correcting that…

    As to posting an affidavit, I have no interest in doing so, my contribution is merely looking at your claims and the claims of others to show that under closer scrutiny they do not hold up.

    So far you have done little to rip them to shreds while Vicklund and I have laid to rest most of your “arguments”.

    Which is of course why you are now moving the goal posts.

    In the interest of the truth, why can we not explore where the data leads us?

  36. I read through all this stuff when he posted it.

    And yet you managed to get every detail wrong.

    Then again, you can’t even seem to remember your own claims.

Comments are closed.