Xerox/WH PDF – Part 13 Label Fonts

It was noted that ECF/CM relabeled the documents in Orly’s case in Mississippi. It will be interesting to see what iText did to achieve these changes. It seems that they decided to embed the font used, probably to ensure consistent rendering across readers. Below is a before-and-after comparison of the file to determine what changed.

Before ECF/CM relabeled the documents:

gov.uscourtsmssd.78493.35.1.pdf shows the following fonts. Only two font objects are embedded, and both are HiddenHorzOCR (objects 82 and 91). Interesting side note: the font name refers to both Hidden and OCR.

name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
Times-Italic                         Type 1            no  no  no      76  0
Times-Roman                          Type 1            no  no  no      77  0
Helvetica-Oblique                    Type 1            no  no  no      78  0
Helvetica                            Type 1            no  no  no      79  0
Helvetica                            Type 1            no  no  no      80  0
HiddenHorzOCR                        CID Type 0C       yes no  yes     82  0
Helvetica                            Type 1            no  no  no      83  0
Times-Italic                         Type 1            no  no  no      84  0
Helvetica-Oblique                    Type 1            no  no  no      85  0
Times-Roman                          Type 1            no  no  no      86  0
Helvetica                            Type 1            no  no  no      88  0
Times-Roman                          Type 1            no  no  no      89  0
HiddenHorzOCR                        CID Type 0C       yes no  yes     91  0
Times-Roman                          Type 1            no  no  no      92  0
Helvetica                            Type 1            no  no  no      93  0

obj 80 0
 Type: /Font
 Referencing:
<<
 /BaseFont /Helvetica
 /Type /Font
 /Encoding /WinAnsiEncoding
 /Subtype /Type1
 >>

After ECF/CM relabeled the documents, gov.uscourtsmssd.78493.35.1.pdf shows that the STCCFS+LiberationSans font is being used for obj 80:

obj 80 0
 Type: /Font
 Referencing: 95 0 R

  <<
    /LastChar 118
    /BaseFont /STCCFS+LiberationSans
    /Type /Font
    /Encoding /WinAnsiEncoding
    /Subtype /TrueType
    /FontDescriptor 95 0 R
    /Widths [277 0 0 0 0 0 0 0 0 0
             0 0 0 333 0 277 556 556 556 556
             556 556 556 0 556 0 277 0 0 0
             0 0 0 666 0 722 722 0 610 0
             722 0 0 0 556 0 0 0 666 0
             722 0 610 0 0 943 0 0 0 0
             0 0 0 0 0 556 0 500 556 556
             277 556 0 222 0 0 222 833 556 556
             0 0 0 500 277 556 500]
    /FirstChar 32
  >>
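
The /Widths array in the dictionary above hints at which characters the embedded subset covers. The following is a minimal Python sketch, not part of the original analysis: it assumes that a zero width means the character is not used by the label, and that codes 32 to 118 under WinAnsiEncoding map to the same ASCII characters. The function name used_characters is made up for illustration.

```python
# Widths copied from the /Widths entry of obj 80 0 (after relabeling).
WIDTHS = """
277 0 0 0 0 0 0 0 0 0
0 0 0 333 0 277 556 556 556 556
556 556 556 0 556 0 277 0 0 0
0 0 0 666 0 722 722 0 610 0
722 0 0 0 556 0 0 0 666 0
722 0 610 0 0 943 0 0 0 0
0 0 0 0 0 556 0 500 556 556
277 556 0 222 0 0 222 833 556 556
0 0 0 500 277 556 500
"""
FIRST_CHAR = 32   # /FirstChar
LAST_CHAR = 118   # /LastChar


def used_characters(widths_text, first_char):
    """Return the characters whose width entry is non-zero.

    Assumption: the subsetter writes 0 for characters it did not include.
    """
    widths = [int(w) for w in widths_text.split()]
    return "".join(
        chr(first_char + i) for i, w in enumerate(widths) if w > 0
    )


widths = [int(w) for w in WIDTHS.split()]
# The array must have exactly one entry per code in FirstChar..LastChar.
assert len(widths) == LAST_CHAR - FIRST_CHAR + 1

used = used_characters(WIDTHS, FIRST_CHAR)
print(len(used), repr(used))
```

Under those assumptions, the non-zero entries are a small set of digits, punctuation marks and letters, which is what one would expect from a subset that only has to draw a short label.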

name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
Times-Roman                          Type 1            no  no  no      76  0
Times-Italic                         Type 1            no  no  no      77  0
Helvetica-Oblique                    Type 1            no  no  no      78  0
Helvetica                            Type 1            no  no  no      79  0
STCCFS+LiberationSans                TrueType          yes yes no      80  0
HiddenHorzOCR                        CID Type 0C       yes no  yes     82  0
Times-Italic                         Type 1            no  no  no      83  0
Helvetica                            Type 1            no  no  no      84  0
Helvetica-Oblique                    Type 1            no  no  no      85  0
Times-Roman                          Type 1            no  no  no      86  0
Helvetica                            Type 1            no  no  no      88  0
Times-Roman                          Type 1            no  no  no      89  0
HiddenHorzOCR                        CID Type 0C       yes no  yes     91  0
Times-Roman                          Type 1            no  no  no      92  0
Helvetica                            Type 1            no  no  no      93  0
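
The before and after listings can also be compared mechanically instead of by eye. The sketch below is only an illustration in Python: it assumes the column layout shown in the listings above (name, type, emb, sub, uni, object number, generation), and the helper names parse_font_table and diff_fonts are hypothetical. Only three rows from each listing are included to keep it short.

```python
BEFORE = """
name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
Helvetica                            Type 1            no  no  no      79  0
Helvetica                            Type 1            no  no  no      80  0
HiddenHorzOCR                        CID Type 0C       yes no  yes     82  0
"""

AFTER = """
name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
Helvetica                            Type 1            no  no  no      79  0
STCCFS+LiberationSans                TrueType          yes yes no      80  0
HiddenHorzOCR                        CID Type 0C       yes no  yes     82  0
"""


def parse_font_table(text):
    """Map object number -> (name, type, embedded, subset)."""
    fonts = {}
    for line in text.strip().splitlines():
        if line.startswith(("name", "---")):
            continue  # skip the two header lines
        # The last five fields never contain spaces; the type may.
        head, emb, sub, _uni, obj, _gen = line.rsplit(None, 5)
        name, font_type = head.split(None, 1)
        fonts[int(obj)] = (name, font_type, emb == "yes", sub == "yes")
    return fonts


def diff_fonts(before, after):
    """Return {object number: (before entry, after entry)} for changed objects."""
    changed = {}
    for obj in sorted(set(before) | set(after)):
        if before.get(obj) != after.get(obj):
            changed[obj] = (before.get(obj), after.get(obj))
    return changed


changes = diff_fonts(parse_font_table(BEFORE), parse_font_table(AFTER))
for obj, (old, new) in changes.items():
    print(obj, old, "->", new)
```

On this excerpt the only object reported is 80, which goes from a non-embedded Type 1 Helvetica to an embedded TrueType subset. Run against the full listings, the same comparison would also report any other object whose entry differs between the two tables.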

11 thoughts on “Xerox/WH PDF – Part 13 Label Fonts”

  1. “NBC

    “obj 80 0
    Type: /Font
    Referencing:
    <<
    /BaseFont /Helvetica
    /Type /Font
    /Encoding /WinAnsiEncoding
    /Subtype /Type1"

    HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
    Now that "keeper of all the fonts for the King" NBC posted Helvetica as the official font for the added court case label (object 80 ??) for the unmodified document 35-1.pdf.

    And he pooh-poohed my findings that the font was Arial, which is substituted for Helvetica in Adobe Illustrator CS6, Adobe Acrobat XI, Adobe Reader XI and PDF XChange Viewer Pro because the Helvetica font is not an available font in any of these four programs.

    And I even posted my findings here from court Document 15-1.pdf (which was also altered by someone using a very old version (2.17) of iText). But then I obtained my results with PDF XChange Viewer Pro which NBC has outlawed as not having the right stuff.

    But, you see, what NBC didn't share with his readers is that the Helvetica font is one of the 13 font sets that are NEVER embedded in PDF files. Consequently all of my high-level PDF tools substituted an Arial font for the Helvetica font.

    And if you, the reader, buy NBC's distorted storyline, then this simple and routine font substitution is the crime of the century. So in NBC's greatly distorted view of reality it's perfectly OK for the forger to be the Xerox WorkCentre 7535, but it's not OK for the favorite PDF software tools (used by hundreds of millions of graphic artists, scientists, courts and attorneys) to make a routine font substitution for an unavailable font.

  2. And he pooh-poohed my findings that the font was Arial, which is substituted for Helvetica in Adobe Illustrator CS6, Adobe Acrobat XI, Adobe Reader XI and PDF XChange Viewer Pro because the Helvetica font is not an available font in any of these four programs.

    You missed the essential evidence in the original document and relied on the behavior of high level tools that try to make sense out of the document and as such your conclusion about Arial was flawed.

    As I said, you are looking at too high a level to allow you to properly understand the evidence.

    I have no idea what you are talking about other than that you now admit that the tools you use do substitutions which do not allow you to fully comprehend and appreciate the underlying pdf file.

    Note, for instance, that objects have names, and their names can vary between software. Note that all objects have numbers, and their order, etc., also depends on the software used.

    Objects can be rendered ‘inline’ or as ‘real’ objects, again pointing to different software.

    You can have a document that looks the same in Illustrator and hides many hints inside.

    Since I am aware of font substitution I decided to check your claims and found them to be false positives.

    That’s all. As I said, you are relying on ‘hearsay’ rather than on the raw data and thus your conclusions may not be too well supported.

    Do you need any pointers to the command line tools? They provide an in-depth view into the raw data where many hidden gems can be found that are helpful in understanding the software used.

    As I showed for example, the jpeg embedded in the WH PDF shows a comment string also found in the 7535 version.

    Good detective work requires persistence and the right tools.

  3. But, you see, what NBC didn’t share with his readers is that the Helvetica font is one of the 13 font sets that are NEVER embedded in PDF files.

    I thought that was common sense, and the reason why the Court likely chose to embed the sans font so that it would render properly.

    Just common sense…

    Instead of being confused by font substitution, I looked at the raw data which shows how the file was created, not how it was rendered and interpreted by a high level ‘editor/viewer’.

  4. NBC is a very slippery guy!

    “You missed the essential evidence in the original document and relied on the behavior of high level tools that try to make sense out of the document and as such your conclusion about Arial was flawed.”

    HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
    If you mean that I wasn’t totally consumed that you pointed out that whoever the Hell created this page 4 forgery picked the Helvetica font which is never embedded in PDF files and is not an available font in either Adobe Illustrator CS6, Adobe Acrobat XI Pro, Adobe Reader XI, and PDF XChange Viewer Pro. And consequently the Arial MT font was substituted for an unavailable font by all the PDF software tools. Why is an unused font totally essential evidence of anything?

    That is unless you are demanding that the gazillion users who use the Adobe PDF tools routinely should immediately uninstall them and download your free parser tool? If so then maybe you should provide the source of this indefensible tool which doesn’t render bitmap streams or screen or print images. And then these millions of experts on everything PDF can follow YOUR advice. That is if they have a Macintosh.

  5. And then maybe you can explain why the unknown person/entity, who later modified the 15-1.pdf and 35-1.pdf files with iText, picked the Liberation Sans font (which is the metric equivalent font for Arial) and replaced your precious Helvetica font. So why didn’t the unknown party pick the metric equivalent font for Helvetica?

    Never mind! You wouldn’t have a clue.

  6. That is unless you are demanding that the gazillion users who use the Adobe PDF tools routinely should immediately uninstall them and download your free parser tool?

    These users need the viewer for specific purposes, not for forensic research. For that, better tools exist.

    If you mean that I wasn’t totally consumed that you pointed out that whoever the Hell created this page 4 forgery picked the Helvetica font

    No evidence of forgery and no real understanding how Helvetica is ‘chosen’… Wow…

  7. And then maybe you can explain why the unknown person/entity, who later modified the 15-1.pdf and 35-1.pdf files with iText, picked the Liberation Sans font (which is the metric equivalent font for Arial) and replaced your precious Helvetica font. So why didn’t the unknown party pick the metric equivalent font for Helvetica?

    The ‘unknown’ person is the developer of the ECF/CM system. The reason why he replaced the font is a minor issue but probably because the font is open source and can thus be embedded.

    It’s so simple.

  8. “NBC

    “July 17, 2013 21:57

    “But, you see, what NBC didn’t share with his readers is that the Helvetica font is one of the 13 font sets that are NEVER embedded in PDF files.

    “I thought that was common sense, and the reason why the Court likely chose to embed the sans font so that it would render properly.

    “Just common sense…

    “Instead of being confused by font substitution, I looked at the raw data which shows how the file was created, not how it was rendered and interpreted by a high level ‘editor/viewer’.

    HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
    Alright! Let’s just briefly test the credibility of this new little nugget of yours.

    So your claim is that some court official first noticed that the font for the court labels was identified as Arial MT in his Adobe Reader and that the label text was rendered perfectly, so he decided he had to immediately modify all of the case labels for all the court documents using an old, outdated version of iText for the sole purpose of substituting the Liberation Sans font (the metric equivalent font for Arial) for the Arial font that was working fine in his Adobe Reader. And he picked a font that was unavailable in all the Adobe PDF tools and that had to be embedded within each PDF court document file to even work in his Adobe Reader.

    And then he did something that no court official with even a rudimentary knowledge of document systems and embedded fonts would ever do (to modify all the electronic documents in a system containing countless different documents) — HE EMBEDDED THE LIBERATION SANS FONT AS a SUBSET. Which means only the fonts that are actually used in each document (and no more) are embedded within each document. Even a novice would then know that there would then be as many embedded font subsets as documents.

    All of this done just to replace an available font (Arial MT) for a metric equivalent font for Arial that is not available in all the popular PDF software tools.

    And then afterward, anyone needing to modify the case labels would have to override the Liberation Sans embedded font subset within each PDF document and substitute Arial for Liberation Sans.

    Genius! Sheer Genius!

  9. And then afterward, anyone needing to modify the case labels would have to override the Liberation Sans embedded font subset within each PDF document and substitute Arial for Liberation Sans.

    The case labels shouldn’t need modification!

  10. Poor Hermitian still is suffering from some unfamiliarity with embedded fonts

    And then he did something that no court official with even a rudimentary knowledge of document systems and embedded fonts would ever do (to modify all the electronic documents in a system containing countless different documents) — HE EMBEDDED THE LIBERATION SANS FONT AS a SUBSET. Which means only the fonts that are actually used in each document (and no more) are embedded within each document

    Yes, there was no need to add more characters to the file. That’s the beauty of embedded font subsets: they allow you to support a font without having to include the complete font.

    The case labels are automatically added when you file a document through the ECF/CM and they may have realized that a better workflow would be to add the Liberation Sans font and wrote a simple script to do so. This is not rocket science.

    Hermitian may want to delve deeper into how the ECF/CM workflow happens and how these labels are added. I predict that they are simply added through a scripting function which invokes iText to add these labels.
    I also have the suspicion that they do more to ensure that the PDFs remain legible on a variety of readers. More on this later.

    Even a novice would then know that there would then be as many embedded font subsets as documents

    Yes, of course a lot of the font entries are the same, such as “Page x of y”, so all they need to do is add the 10 digits, and voilà. But this can be simply automated.

    I am surprised that Hermitian is not familiar with the advantages of embedded subsets of a font in PDF documents.

  11. “NBC

    “July 18, 2013 01:24

    “”And then maybe you can explain why the unknown person/entity, who later modified the 15-1.pdf and 35-1.pdf files with iText, picked the Liberation Sans font (which is the metric equivalent font for Arial) and replaced your precious Helvetica font. So why didn’t the unknown party pick the metric equivalent font for Helvetica?””

    “The ‘unknown’ person is the developer of the ECF/CM system. The reason why he replaced the font is a minor issue but probably because the font is open source and can thus be embedded.”

    What’s your proof that the developer of the PACER ECF/CM system made the modifications to many (or all) of the electronic court documents in the Obama Mississippi Ballot Challenge/RICO Federal lawsuit?

    [NBC: Common sense and the fact that ECF/CM is now using itext? Who else would modify the labels in these documents? Santa?]

    And just what was the “minor issue” that caused the developer of the ECF/CM system to modify these official court documents ?

    [NBC: Your reading abilities are quite poor. I never said this. I said that the change of labels is a minor and irrelevant issue.]

    And maybe you could also explain why the court official would be using an old outdated version of iText to modify official court records for a U.S. Federal court.

    [NBC: Another minor issue.]

    And while you are mulling over the answers to these three questions it might help to know that the original electronic court filing for PDF Document 15-1.pdf, when opened in Adobe Illustrator CS6, correctly identified the font of the case label as Arial MT. Also, Adobe Acrobat XI Pro substituted the available font Arial for the unavailable font Helvetica.

    [NBC: An irrelevant issue]

    Of course Arial is an available font set within all of the major word processing programs, Adobe Illustrator, Adobe Acrobat, Adobe Reader and Adobe Photoshop and all the PDF XChange PDF tools. So for any future modification of any of the case labels, any one of these readily available programs could be used to accomplish the modifications at no additional cost to the PACER/ECF/CM system.

    And given the fact that you possess inside information as to the internal activities of the developer of the PACER/ECF/CM system, you should be willing to swear that only the document case labels were modified but not the original PDF documents.

    [NBC: Sigh… you are really hilarious. A researcher would not rely on the words of others but would rather do the tests to see what the new labeling approach has changed. I am sure that you have already done the work? After all it should be trivial in Adobe tools ROTFL… Or do you need some help with the low level tools that have helped me debunk your many ‘hypotheses’?]

    It’s so simple.
