PDF-Hul: Bug in skipIISBytes and PdfModule.getObject #151

pmay · 2016-10-11T12:33:13Z

Many documents that return "Invalid Page Dictionary Object" appear to be hitting a bug in PdfFlateInputStream.skipIISBytes, which miscalculates the number of bytes to skip when the requested skip count is larger than the number of bytes remaining in the buffer.

In particular, this seems to relate to Page Trees encoded in stream objects where the root page starts beyond one buffer's worth of data.

The added InvalidPageDictionary.pdf file exemplifies this problem.
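To illustrate the kind of skip logic involved, here is a minimal sketch of a skip that spans several buffer refills. It is not the JHOVE code: `ChunkedReader`, its fields, and `fill()` are hypothetical names, and a plain byte array stands in for the inflated stream data.

```java
/** Hypothetical stand-in for a stream that exposes decoded data one buffer at a time. */
class ChunkedReader {
    private final byte[] source;   // pretend this is the fully decoded stream
    private int sourcePos = 0;     // how much of source has been loaded so far
    private final byte[] buf;      // fixed-size working buffer
    private int pos = 0;           // next unread index in buf
    private int limit = 0;         // number of valid bytes in buf

    ChunkedReader(byte[] source, int bufSize) {
        if (bufSize <= 0) {
            throw new IllegalArgumentException("bufSize must be positive");
        }
        this.source = source;
        this.buf = new byte[bufSize];
    }

    /** Loads the next chunk into buf; returns false once the source is exhausted. */
    private boolean fill() {
        int n = Math.min(buf.length, source.length - sourcePos);
        if (n <= 0) {
            return false;
        }
        System.arraycopy(source, sourcePos, buf, 0, n);
        sourcePos += n;
        pos = 0;
        limit = n;
        return true;
    }

    /** Returns the next byte as 0..255, or -1 at end of data. */
    int read() {
        if (pos >= limit && !fill()) {
            return -1;
        }
        return buf[pos++] & 0xFF;
    }

    /** Skips up to n bytes, refilling the buffer as often as needed. */
    long skip(long n) {
        long remaining = n;
        while (remaining > 0) {
            if (pos >= limit && !fill()) {
                break; // ran out of data before skipping everything
            }
            // Consume only what the current buffer holds, then loop to refill.
            int step = (int) Math.min(remaining, limit - pos);
            pos += step;
            remaining -= step;
        }
        return n - remaining; // bytes actually skipped
    }
}
```

The point of the loop is that the remaining count is reduced only by what was actually consumed from the current buffer, so a skip larger than one buffer lands on the right byte after any number of refills.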

Note that solving this problem then results in JHOVE returning "Improperly Constructed Page Tree". This appears to be caused by JHOVE not setting the object index on objects extracted from object streams, so when PageTreeNode.nextPageObject (line 197) checks whether it has already visited a node, the check gives the wrong answer (essentially it compares index -1 to index -1).
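The failure mode can be reproduced in isolation. The sketch below is an assumption-laden illustration, not the PageTreeNode code: `Node` and `VisitTracker` are hypothetical, and the only thing taken from the description above is that an unset object number defaults to -1 and that the visited check compares object numbers.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical minimal object; -1 means the object number was never set. */
class Node {
    private int objNumber = -1;

    int getObjNumber() {
        return objNumber;
    }

    void setObjNumber(int n) {
        objNumber = n;
    }
}

/** Hypothetical cycle guard that identifies nodes by object number only. */
class VisitTracker {
    private final Set<Integer> visited = new HashSet<>();

    /** Records the node and returns true if its number had been seen before. */
    boolean alreadyVisited(Node node) {
        // Set.add returns false when the value was already present.
        return !visited.add(node.getObjNumber());
    }
}
```

With two distinct nodes whose numbers were never set, the second call reports a revisit because both carry -1; once each node is given its own number, the same check passes for both.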

codecov-io · 2016-10-11T12:38:39Z

Current coverage is 3.43% (diff: 0.00%)

No coverage report found for integration at 144a26e.

Powered by Codecov. Last update 144a26e...0e3715d

david-russo

Looks good to me.

david-russo · 2016-11-04T10:12:56Z

jhove-modules/src/main/java/edu/harvard/hul/ois/jhove/module/PdfModule.java

- return ostrm.getObject (objIndex);
+ /* Need to ensure the object number is set */
+ PdfObject obj = ostrm.getObject (objIndex);
+ obj.setObjNumber (objIndex);


Setting the object number can be safely moved into the ObjectStream.getObject() method for the benefit of any other callers.
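A rough sketch of that suggestion follows, with the assignment done inside the lookup so every caller receives an object whose number is already set. `SketchObject` and `SketchObjectStream` are hypothetical stand-ins for PdfObject and ObjectStream; the map-backed storage and the null check are assumptions, since this thread does not show how the real method stores objects or whether it can return null.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for PdfObject; -1 means the number is unset. */
class SketchObject {
    private int objNumber = -1;

    int getObjNumber() {
        return objNumber;
    }

    void setObjNumber(int n) {
        objNumber = n;
    }
}

/** Hypothetical stand-in for ObjectStream, backed by a simple map. */
class SketchObjectStream {
    private final Map<Integer, SketchObject> objects = new HashMap<>();

    void put(int objIndex, SketchObject obj) {
        objects.put(objIndex, obj);
    }

    /** Returns the object stored under objIndex with its number set, or null if absent. */
    SketchObject getObject(int objIndex) {
        SketchObject obj = objects.get(objIndex);
        if (obj != null) {
            // Set the number here so callers do not have to remember to do it.
            obj.setObjNumber(objIndex);
        }
        return obj;
    }
}
```

Under this arrangement the caller in PdfModule would reduce to returning the result of the lookup directly.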

carlwilson · 2017-03-20T16:56:58Z

This is fixed by #188

carlwilson and others added 4 commits May 13, 2016 19:56

Merge pull request openpreserve#83 from openpreserve/fix-update-readm…

a576324

…e-#82 FIX - Update README post v1.14

Merge pull request openpreserve#85 from openpreserve/release-1.14

c244fe9

Release 1.14

Corrects bug in skipIISBytes which miss-skips a number of bytes. Adde…

8fd8021

…d test PDF to demonstrate this.

Updated gitignore to ignore .iml files

44a3dce

pmay added the in progress label Oct 11, 2016

Corrects bug where object indexes are not stored if they are extracte…

0e3715d

…d from an object stream. This resulted in Improperly constructed page tree errors

pmay changed the title ~~PDF-Hul: Invalid Page Dictionary Object - bug in skipIISBytes~~ PDF-Hul: Bug in skipIISBytes and PdfModule.getObject Oct 11, 2016

pmay added 3 commits October 11, 2016 21:17

Merge branch 'integration' into integration

daf3abb

Merge branch 'integration' into integration

1c287c2

Merge remote-tracking branch 'upstream/master' into integration

4eb5ee0

david-russo approved these changes Nov 4, 2016

BezrukovM mentioned this pull request Mar 1, 2017

Merging PRs and fixing issues #188

Merged

carlwilson closed this Mar 20, 2017

carlwilson removed the in progress label Mar 20, 2017

This was referenced Sep 30, 2017

Problem with PDF annotation dictionaries #113

Closed

PDF module error with TeX-created documents #112

Closed

rgfeldman added a commit to rgfeldman/jhove that referenced this pull request Apr 10, 2019

openpreserve#151

80f5eb3

rgfeldman added a commit to rgfeldman/jhove that referenced this pull request Apr 10, 2019

openpreserve#151

9430bc9
