Clarify when the general occurrence of Metadata and AF keys cannot actually occur due to other requirements #403

Closed
petervwyatt opened this issue May 4, 2024 · 4 comments
Assignees: petervwyatt
Labels: bug, documentation, proposed solution

Comments

@petervwyatt
Member

The Metadata (XMP metadata) and AF (associated files) keys are generally described as able to occur on any object in a PDF. However, certain objects define other specific restrictions that mean these keys cannot actually be present. It is worth highlighting these cases to resolve the apparent contradiction.

  • cross-reference streams require the values of all keys to be direct (https://pdf-issues.pdfa.org/32000-2-2020/clause07.html#H7.5.8.2), so any key whose value refers to a stream cannot be used

  • 7.12 extensions dictionary prose states "The extensions dictionary, all developer extensions dictionaries, as well as their entries, shall be direct objects (i.e., this information shall be nested directly within the catalog dictionary with no indirect objects used)."

  • DigSig prose states "When a byte range digest is present, all values in the signature dictionary shall be direct objects."

  • F.3.3 Linearization parameter dictionary (Part 2) prose states "All values in this dictionary shall be direct objects"

  • F.3.6 prose states "With one exception, the values of all entries in the hint streams’ dictionaries shall be direct objects and may contain no indirect object references."
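
The conflict in the list above can be sketched in a few lines. Streams are always indirect objects in PDF, so a Metadata key can only hold an indirect reference, and a dictionary that requires all values to be direct therefore cannot carry it. The following is a minimal illustration only: `IndirectRef` and `indirect_paths` are hypothetical names, the PDF dictionary is modelled as a plain Python dict, and the numeric values are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndirectRef:
    """Hypothetical stand-in for an indirect reference such as `99 0 R`."""
    obj_num: int
    gen_num: int = 0


def indirect_paths(value, path=""):
    """Yield the path of every indirect reference nested inside `value`."""
    if isinstance(value, IndirectRef):
        yield path
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from indirect_paths(item, f"{path}/{key}")
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            yield from indirect_paths(item, f"{path}[{index}]")


# Illustrative linearization parameter dictionary: every value is direct.
linearization = {"Linearized": 1, "L": 54567, "H": [475, 598],
                 "O": 43, "E": 5437, "N": 11, "T": 52786}

# Adding Metadata forces an indirect reference, because the value is a stream.
with_metadata = dict(linearization, Metadata=IndirectRef(99))

clean = list(indirect_paths(linearization))
offending = list(indirect_paths(with_metadata))
```

Here `clean` is empty while `offending` names the Metadata entry, which is the situation the quoted "shall be direct objects" requirements rule out.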

Is there anywhere else? Are there any other general keys that can supposedly occur anywhere?

PS. This issue was discovered as part of our "future PDF" work for supporting new compression filters.

@petervwyatt added the bug and documentation labels May 4, 2024
@petervwyatt self-assigned this May 4, 2024
@petervwyatt
Member Author

Proposed solution is to add a standardized informative note to the locations relevant to the above list:

  • 7.5.8.2 Cross-reference stream dictionary, append note to end of subclause
  • 7.12.1 General, add note after paragraph (end of subclause)
  • 12.8.1 General, add note after 1st bullet after NOTE 1 (where above quote is situated)
  • F.3.3 Linearization parameter dictionary (Part 2), add note after 1st para (where above quote is situated)
  • F.3.6 Hint streams (Parts 5 and 10), add note below paragraph below NOTE (where above quote is situated)

with the following:

NOTE: due to the above requirement for direct objects, Metadata streams (see 14.3.2 Metadata streams) and Associated Files (see 14.13 Associated Files) cannot be included.

@petervwyatt added the proposed solution label Jun 5, 2024
@petervwyatt
Member Author

PDF TWG agree

@mrbhardy

mrbhardy commented Jun 6, 2024

I don't think this is correct for cross-reference stream dictionaries (7.5.8.2). If we combine the errata linked here with the original text, it says:

  • The values of all entries shown in "Table 17 - Additional entries specific to a cross-reference stream dictionary" shall be direct objects; indirect references shall not be permitted. For arrays (the Index and W entries), all of their elements shall be direct objects as well. The values of all entries shown in "Table 5 - Entries common to all stream dictionaries" shall also be direct objects. For arrays, all array elements shall be direct objects and for dictionaries, all key values shall be direct objects as well. The F entry defined in Table 5 shall not be used.

  • Other cross-reference stream entries not listed in "Table 17 — Additional entries specific to a cross-reference stream dictionary" may be indirect; in fact, some (such as Root in "Table 15 — Entries in the file trailer dictionary") shall be indirect.

Given the second bullet clearly says that, if the entry isn't mentioned in Table 17, it may be indirect, that gives me permission to put indirect entries into the dictionary. The earlier statement in the first bullet that says:

"for dictionaries, all key values shall be direct objects as well"

is contextualized by the earlier part, which limits the requirement to the listed table entries. So a sensible reading is that it isn't illegal to have an indirect object as the value of a key that is not in Table 17 or Table 5.
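
The reading described above can be written down as a small classifier. This is only a sketch of that interpretation, not normative logic: `classify_xref_stream_key` and its return labels are hypothetical, and the key sets are assumed to match Table 5 and Table 17 of ISO 32000-2, so they should be checked against the specification before being relied on.

```python
# Assumed contents of "Table 17 - Additional entries specific to a
# cross-reference stream dictionary".
TABLE_17_KEYS = frozenset({"Type", "Size", "Index", "Prev", "W"})

# Assumed contents of "Table 5 - Entries common to all stream dictionaries".
TABLE_5_KEYS = frozenset(
    {"Length", "Filter", "DecodeParms", "F", "FFilter", "FDecodeParms", "DL"}
)


def classify_xref_stream_key(key: str) -> str:
    """Classify a cross-reference stream dictionary key under this reading."""
    if key == "F":
        # "The F entry defined in Table 5 shall not be used."
        return "prohibited"
    if key in TABLE_17_KEYS or key in TABLE_5_KEYS:
        # Values of listed entries shall be direct objects.
        return "direct-only"
    # Entries not listed may be indirect (Root shall be indirect).
    return "indirect-allowed"
```

Under this reading, `classify_xref_stream_key("Metadata")` returns `"indirect-allowed"`, the same as `"Root"`, which is why a general note saying Metadata cannot be included does not follow from the current wording of 7.5.8.2.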

@petervwyatt
Member Author

@mrbhardy precisely, and that is the problem in all of these locations. The intent is that everything needs to be direct, but the current wording only references the explicit keys in Tables 17 and 5 and neglects the global(!) generic exceptions that Metadata and AF state in their separate subclauses, so it is ambiguous whether those overrides apply or not.

In this specific case, I'd agree that the proposed NOTE could be reworded here to state simply and factually that Metadata and AF are not allowed.
