Fields and Widgets, Widgets and Fields. #28

Open
faceless2 opened this issue Sep 28, 2022 · 6 comments

@faceless2
Collaborator

faceless2 commented Sep 28, 2022

It was inevitable this was going to come up at some point.

First, I'm assuming a processing model in which a node in the PDF can be of more than one type. Traverse to a combined field+widget from Fields? It's validated as a Field. Traverse from a Page? It's also validated as a Widget. Everything below assumes that model; if that's not how you do it, I guess you can ignore the whole thing.


Currently there are three types: Field (an untyped field with no FT), FieldNNN (a typed field with FT) and AnnotWidget. There is a single type for a list of these items, ArrayOfFields, which is used for both Fields in the Form and Kids in the Fields. It's a list of [FieldTx,FieldBtn,FieldCh,FieldSig,Field,AnnotWidget] (I'm ignoring the predicate for FieldSig).

This means that we have the following allowed behaviour:

  1. The form can contain a Fields array that references a widget that has no field (either combined or as a parent)
  2. A widget can belong to a Field with no FT, or belong to no field at all.
  3. The form Fields array can point to elements with a Parent
  4. There is no requirement for consistency between the Parent and Kids arrays
  5. If a Field is combined with a widget, there is no check to ensure it has no Kids
  6. There is no requirement for a Field to have any Widgets.

I think all of those are disallowed (happy to justify if required), so here's a proposal to remedy this.

To fix the first two issues you could split ArrayOfFields into two types, ArrayOfFields and ArrayOfFieldsOrWidgets. Your types then look like

Form
  Fields [ArrayOfFields]

Field
  Parent [Field,FieldTx,FieldCh,FieldBtn,FieldSig]
  Kids [ArrayOfFields]

FieldTx, FieldCh etc
  Parent [Field,FieldTx,FieldCh,FieldBtn,FieldSig]
  Kids [ArrayOfFieldsOrWidgets]

AnnotWidget
  Parent [FieldTx,FieldCh,FieldBtn,FieldSig]
  Kids [none - it's currently defined as ArrayOfFields, but should be removed]

ArrayOfFields
  * [Field,FieldTx,FieldCh,FieldBtn,FieldSig]

ArrayOfFieldsOrWidgets
  * [FieldTx,FieldCh,FieldBtn,FieldSig,AnnotWidget]
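As a rough illustration of the split, here is a minimal Python sketch that encodes the proposed links as lookup tables and reports which element types a given array may contain. The table layout and the helper name `allowed_element_types` are hypothetical and are not part of the Arlington model or its tooling.

```python
# Hypothetical encoding of the proposed link types; names mirror the listing above.
TYPED_FIELDS = ["FieldTx", "FieldCh", "FieldBtn", "FieldSig"]

# The two array types after the proposed split.
ARRAY_TYPES = {
    "ArrayOfFields": ["Field"] + TYPED_FIELDS,
    "ArrayOfFieldsOrWidgets": TYPED_FIELDS + ["AnnotWidget"],
}

# (owner type, key) -> array type used for that key.
LINKS = {
    ("Form", "Fields"): "ArrayOfFields",
    ("Field", "Kids"): "ArrayOfFields",
    **{(t, "Kids"): "ArrayOfFieldsOrWidgets" for t in TYPED_FIELDS},
    # AnnotWidget deliberately has no Kids entry.
}


def allowed_element_types(owner_type, key):
    """Return the element types permitted in owner_type's array at key."""
    array_type = LINKS.get((owner_type, key))
    return ARRAY_TYPES.get(array_type, [])
```

With these tables a bare AnnotWidget can no longer appear directly in the Form's Fields array or under an untyped Field, which is the point of the first two fixes.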

The remaining issues can be handled with some magic in your SpecialCase field - we need to check

  • if we have a Parent, we're in the Parent's Kids
  • if we don't have a Parent, we're in the Fields array in the Form
  • if we are a terminal field and are not combined with a widget, we have one or more widgets
  • if we are a terminal field and are combined with a widget, we have no Kids

because the rules for Fields are:

Parent - (Required if this field is the child of another in the field hierarchy; absent otherwise) The field that is the immediate parent of this one (the field, if any, whose Kids array includes this field). A field can have at most one parent; that is, it can be included in the Kids array of at most one other field.

Kids - In a non-terminal field, the Kids array shall refer to field dictionaries that are immediate descendants of this field. In a terminal field, the Kids array ordinarily shall refer to one or more separate widget annotations that are associated with this field. However, if there is only one associated widget annotation, and its contents have been merged into the field dictionary, Kids shall be omitted.

and for Widgets:

Parent - (Required if this widget annotation is one of multiple children in a field; optional otherwise) An indirect reference to the widget annotation’s parent field. A widget annotation may have at most one parent; that is, it can be included in the Kids array of at most one field

I think we can represent all that with an fn:Eval that looks like this (expanded to make it a bit more legible):

(
 ((@Parent==null) && (fn:InArray(trailer::Root::AcroForm::Fields))) ||
 ((@Parent!=null) && (fn:InArray(parent::Kids)))
) && (
 ((@Subtype==Widget) && (Kids==null)) ||
 ((@Subtype==null) && (fn:ArraySize(Kids)>0))
)

It's using /Subtype/Widget as the test for "is a widget", which is not quite right, and I've also just invented fn:InArray, and presumed that ==null is the same as "field is not there" - which probably isn't the case. However I think the logic is correct.
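To make the intended logic easier to check by hand, here is a minimal Python sketch of the same expression. It assumes PDF dictionaries are modelled as plain Python dicts, that a missing key stands in for `==null`, and that array membership is by object identity; `in_array` stands in for the invented fn:InArray, and all function names are hypothetical.

```python
def in_array(node, array):
    """Identity-based membership, standing in for the invented fn:InArray."""
    return any(item is node for item in (array or []))


def placement_ok(node, acro_form):
    """No Parent: must be in AcroForm Fields. Parent: must be in Parent's Kids."""
    parent = node.get("Parent")
    if parent is None:
        return in_array(node, acro_form.get("Fields"))
    return in_array(node, parent.get("Kids"))


def kids_ok(node):
    """A widget (or combined field+widget) has no Kids; anything else has at least one."""
    kids = node.get("Kids")
    if node.get("Subtype") == "Widget":
        return kids is None
    return node.get("Subtype") is None and bool(kids)


def field_node_ok(node, acro_form):
    """Conjunction of the two halves of the expression."""
    return placement_ok(node, acro_form) and kids_ok(node)
```

Like the expression it mirrors, this uses /Subtype/Widget as the "is a widget" test, so it inherits the same imprecision.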

Finally, as an alternative if you don't want to go crazy with the special case field, I think we could capture the same logic by splitting FieldTx into lots of subtypes, e.g. FieldTxNonTerminal, FieldTxTerminal, FieldTxTerminalCombined etc., with the same for the other field types. It's more declarative but explodes the number of types.

Sorry, that's a rough one to start the day with.

@faceless2
Collaborator Author

faceless2 commented Sep 28, 2022

Incidentally I tried the first suggestion, the splitting of ArrayOfFields into ArrayOfFields and ArrayOfFieldsOrWidgets, and it tested well against some valid forms that mix combined and uncombined fields, and fields with name hierarchies.

EDIT 20221001 - the one complexity is Parent in AnnotWidget - if it contains FT, we also need to allow Field as an option for Parent

@faceless2
Collaborator Author

Another followup on this:

Field places restrictions on Ff - [fn:Eval(fn:BitsClear(4,32))]. But this is an intermediate type - it must have an entry in the Kids array which is a FieldNNN, and any restrictions on the Flags are checked there. So I don't think this restriction should be here.
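For readers unfamiliar with the predicate, here is a minimal Python sketch of what a check like fn:BitsClear(4,32) amounts to. It assumes bits are numbered from 1 at the least significant end (the PDF flag convention) and that both bounds are inclusive; the function name `bits_clear` is hypothetical.

```python
def bits_clear(value, low, high):
    """True when bits low..high (1-based, inclusive, bit 1 = least significant) are all 0."""
    width = high - low + 1
    mask = ((1 << width) - 1) << (low - 1)
    return (value & mask) == 0
```

Under that reading, an Ff value on an untyped Field passes only if nothing beyond the three common flags (ReadOnly, Required, NoExport in bits 1 to 3) is set, which is the restriction being questioned here.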

@faceless2
Collaborator Author

faceless2 commented Oct 3, 2022

At the very least on this one, even if none of the above changes are applied:

AnnotWidget Parent should go from Field to [FieldTx,FieldBtnPush,FieldBtnCheckbox,FieldBtnRadio,FieldChoice,fn:SinceVersion(1.3,FieldSig),Field] - otherwise Parent cannot be a terminal field, which is clearly not right.

On reflection this is an aspect of our implementation not the model.

@petervwyatt
Member

I'm starting to work on this.
Just did some tidy up and referencing of all Annot*.tsv.

petervwyatt added a commit that referenced this issue Jan 18, 2023
"Required if field contains variable text" is not yet captured via predicates. See also Issue #28
@bdoubrov
Collaborator

bdoubrov commented Dec 8, 2023

This issue still pops up regularly in the tests, as Field and AnnotWidget have different permitted entries. One of the solutions would be to have a separate type for the merged Field+Widget dictionary. This would lead in fact to the following new types:

AnnotWidgetField, AnnotWidgetFieldTx, AnnotWidgetFieldCh, AnnotWidgetFieldBtn, AnnotWidgetFieldSig

This looks a bit weird, but I don't immediately see any other nicer solutions.
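A minimal Python sketch of how a validator might pick one of these type names for a widget dictionary follows. The list of keys taken as evidence of merged field content (`FIELD_ONLY_KEYS`) and the function name are assumptions made for illustration; this is not how veraPDF or the Arlington tooling is known to do it.

```python
# Assumption: the presence of any of these keys marks a widget as merged with a field.
FIELD_ONLY_KEYS = ("FT", "T", "TU", "TM", "Ff", "V", "DV")

MERGED_TYPES = {
    "Tx": "AnnotWidgetFieldTx",
    "Ch": "AnnotWidgetFieldCh",
    "Btn": "AnnotWidgetFieldBtn",
    "Sig": "AnnotWidgetFieldSig",
}


def widget_type_name(d):
    """Return the proposed type name for a widget dict, or None if it is not a widget."""
    if d.get("Subtype") != "Widget":
        return None
    if not any(key in d for key in FIELD_ONLY_KEYS):
        return "AnnotWidget"  # plain widget, no merged field content
    # A missing (or unrecognised) FT falls back to the untyped merged type.
    return MERGED_TYPES.get(d.get("FT"), "AnnotWidgetField")
```

For example, a widget carrying /FT /Tx and /T would be checked as AnnotWidgetFieldTx, so the permitted entries of both Field and AnnotWidget can be listed in one place.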

@bdoubrov
Collaborator

Just as an update, the latest version of the veraPDF-based Arlington app implements the above suggestion when converting the .tsv files to the veraPDF Arlington profile. It seems to work as expected.

https://software.verapdf.org/develop/arlington/1.25/verapdf-arlington-1.25.236-installer.zip

@petervwyatt petervwyatt added this to the PDF Data Model milestone Mar 16, 2024