Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify that script and script resources are exempt #2655

Closed
wants to merge 4 commits into from

Conversation

mattgarrish
Copy link
Member

@mattgarrish mattgarrish commented Oct 3, 2024

The current prose about exempt resources including resources not referenced from the spine or embedded in content documents already excludes these resources, but this creates a formal exemption group to make this clear.

The pull request does not get into whether WASM should be a core media type or be supported by reading systems, so it only rises to a class 3 change. (For that reason, I'm not setting this to auto-close #2649.)


Preview | Diff

<div class="proposed correction" id="change_1">
<span class="marker">Proposed Correction 1</span>
<p>The exemption for resources not referenced from the spine or embedded in content documents
(refer to the paragraphs following note below), includes resources referenced from
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really true - resources referenced from the script element itself did require a fallback before this change, so we are in fact loosening the restriction here. I am not sure if that still falls in class 3, maybe it does? But it isn't really a required change at this time, wasm is not technically importable via the script element (yet). Happy leaving this if we do decide this is still class 3.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resources referenced from the script element itself did require a fallback before this change

I'd argue not this change but the change that added the exemption below. It was probably too broad in hindsight, as the script element doesn't render the text of its script (it's not embedding the script in the content document; it's there to execute). So if a script is not embedded content and not referenced from the spine, then it's exempt.

If we don't include this exemption explicitly, the only way to make script require core media types again is to do a different class 3 change that adds it along with the embeddable content elements. I'd rather just move ahead this way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I am getting lost in the reviews here. Which is the change that added the exemption? At least in the github UI I see "Any resource referenced from the html script element that is not already a core media type resource [snip] is an exempt resource." That seems to be a newly added change. Are you saying that it was already implied because referencing from the script element isn't really embedding? It's an interesting argument, and hard to dispute since we don't really define what it means to embed something.

Copy link
Member Author

@mattgarrish mattgarrish Oct 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an interesting argument, and hard to dispute since we don't really define what it means to embed something.

Right, in 3.3 we added the new exemption for data files and such. The first condition is the resource not be in the spine and the second that:

it is not embedded directly in EPUB content documents (e.g., via [html] embedded content and [svg] image and foreignObject elements).

Embedding has always been the term we use for rendering a resource within a content document (via object, img, iframe, etc.). We usually use some form of "referenced from" if we mean any kind of reference (script, link, a, etc.).

script and link both allow references to resources without embedding them. That's partly why we made explicit that resources referenced from link exempt in 3.3 (plus there's a stronger case that nothing uses them than with script).

It's also why, at least to me, we've already (accidentally) opened the door to the script element not needing backups in the current spec. If we opt not to keep it exempt, we'd need to make that second bullet clearer.

Another reason it probably makes more sense to exempt script is that the only way you can provide an alternative for a resource referenced from it is through a manifest fallback. The less of that we have the better. 😄

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did a review of embed use in the spec, and I can't say the use is as clear as you make it sound, It is certainly never defined that way. We do talk about embedded scripts (i.e scripts that are written directly in a script element), so those scripts are embedded, but if you move it to another file they are referenced? But we don't have that distinction for images (a data url embedded image does not become referenced when we move it to a new file). And what about CSS? If it is in another file is it referenced and therefore exempt?

FWIW, my understanding of embedded vs referenced is embedded content is used by the current document in the (continuing) process of rendering it, while referenced content does not participate in the rendering of the current document, it is simply mentioned (links, etc). Of course, that isn't defined anywhere either.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is certainly never defined that way.

No, that's a big part of the problem, but that might be a useful cleanup step for a future update.

But in this case we've selectively picked definitions and elements that are for rendering a resource directly. I can't square that with how script works.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you arguing against having the exemption for script, though, or is this just about whether the resources were exempt before or not? (I'll get on editing out some of that prose now, since it's arguable both ways.)

I don't see how making script resources exempt would go up to a class 4 change, for example, even if we disagree over the current exemptions. Exempt resources are not new and exempting resources only gets them out of fallbacks. It doesn't add a new feature to the specification.

<dt id="exempt-script">Scripts and script resources</dt>
<dd id="confreq-resources-cd-fallback-script">
<p>Any resource referenced from the [[html]] <a data-cite="html#the-script-element"
><code>script</code></a> element that is not already a core media type resource
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/already// for brevity, but you are the wordsmith so final authority is yours :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, I copied the text from exemption above but agree it doesn't add anything. Will take it out of both.

><code>script</code></a> element that is not already a core media type resource
(e.g., JavaScript) is an exempt resource.</p>
<p>In addition, any resources retrieved by a script (e.g., using an <code>import</code>
declaration [[ecmascript]] or the XmlHttpRequest [[xhr]] or Fetch [[fetch]] APIs) are
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not super happy with the examples here, since it seems to confuse exempt and foreign resources. At the very least it is a bit murky. Maybe just the import example? Or drop the examples since people seem to get very confused by them and decide it is an exhaustive list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose a script could read a foreign resource that is embedded somewhere else in the publication, which would make it bypass the exemption.

It might be better to leave the second paragraph out and let the exemption below continue to handle exempt files for processing. I can't think of how we make this clearer without duplicating the two exemption scenarios.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rereading it this is probably fine as is. Do people really have to list remote resources loaded via XmlHttpRequest in the manifest? I guess they do. Scripting is horrible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do people really have to list remote resources loaded via XmlHttpRequest in the manifest? I guess they do.

Yup, that's my take, too. Used in the rendering = publication resource = manifest listing required.

Scripting is horrible.

Ya, hoping we could get by in 3.0 underspecifying how scripting works wasn't the best of ideas.

What makes it more confusing is that you can't host most resources outside the container, unless, seemingly, you fetch them by script. So, if you want remote images, just fetch them and write them into a data url in an img tag. Spec rules circumvented.

@iherman
Copy link
Member

iherman commented Oct 8, 2024

(I comment on the version after 1784159; @bduga, @mattgarrish it would be better to officially "resolve" the conversations, to make any further discussion unambiguous.)

I am fine with the text as is now. I agree it also settles the original issue in #2655: after some cursory look at some examples out there, it seems that .wasm files are to be fetched in a JavaScript module, which is covered by the new text and firmly places these files as exempts. (Without even mentioning them, although it may be a good idea to add it as an example to the text to make things absolutely clear)

However, I continue to feel uneasy on declaring this new text as class 3, i.e., include it in the upcoming publishing round. I may buy the argument that this is what we always intended to happen. However, it looks like we simply forgot to make an explicit case for the script element alongside the link element, and this PR takes care of this mistake. It very much "smells" a new feature, which may lead to disagreeable discussions on process issues if we want to push it through the current PM WG route. I do not think we need a formal objection on our hands, even if it is, at the end, unjustified.

If we were at the beginning of the WG's life span, I would say "ok, let's try it". However, with the discussions on the new charter in the making, with the hopeful new incarnation of the PM WG aiming at EPUB 3.X, I would really prefer to consider this as a class 4. We should find a proper formulation to replace bullet point 4 in the charter scope, and keep this PR open until we begin to work on the new version. We could live without .wasm files up until now, I do not think that is a highly urgent issue.

cc @wareid @shiestyle

@mattgarrish
Copy link
Member Author

We could live without .wasm files up until now, I do not think that is a highly urgent issue.

Well, this change has no effect on the ability to include and use wasm files right now, which you can do.

Epubcheck doesn't inspect js code, so fetching/instantiating a wasm file isn't going to be flagged. And as it's not referenced from one of the embedding elements in the current exemption, the wasm file will fly under the radar in the manifest as just travelling in the container.

The only practical effect this would have is to allow any resource from script without a fallback, something that epubcheck can and will flag. But that's hardly a critical case since js is the only scripting language with broad support (and seems unlikely reading systems would support any of the more obscure possibilities given js support isn't even universal).

In other words, I don't think it really matters when we formally change anything. And as this is dragging on our timeframe to publish, I can get onboard with leaving it until we get the updated versions out.

@bduga
Copy link
Collaborator

bduga commented Oct 8, 2024

The only practical effect this would have is to allow any resource from script without a fallback, something that epubcheck can and will flag. But that's hardly a critical case since js is the only scripting language with broad support (and seems unlikely reading systems would support any of the more obscure possibilities given js support isn't even universal).

+100

In other words, I don't think it really matters when we formally change anything. And as this is dragging on our timeframe to publish, I can get onboard with leaving it until we get the updated versions out.

Agreed. I don't disagree with this change is principle, but it seems like extra credit for some future potential scripting spec and not worth blocking on.

@bduga
Copy link
Collaborator

bduga commented Oct 8, 2024

Epubcheck doesn't inspect js code, so fetching/instantiating a wasm file isn't going to be flagged. And as it's not referenced from one of the embedding elements in the current exemption, the wasm file will fly under the radar in the manifest as just travelling in the container.

I would make the stronger argument that it is explicitly allowed today, and called out in the spec with "The primary case for this exemption is to allow data files to travel with an EPUB publication, whether for scripts to use in their constituent EPUB content documents ..."

@mattgarrish
Copy link
Member Author

The primary case for this exemption is to allow data files to travel with an EPUB publication

And it ends with the case of a data file in a journal. That's the only case I remember us taking up, not that data file means any code file. It's also the only purpose noted in the use cases that were the basis of this exemption: https://docs.google.com/document/d/1msVuFxWk7ALQZ6PAaOy6MvO__s6L5r6vy-icqLWZRy4/edit

@bduga
Copy link
Collaborator

bduga commented Oct 8, 2024

The primary case for this exemption is to allow data files to travel with an EPUB publication

And it ends with the case of a data file in a journal. That's the only case I remember us taking up, not that data file means any code file.

But that is just an example, not an exhaustive list. And I don't think the journal file was meant to be related to scripting, it was just a bonus file users could find. So there are no examples of data files for scripts, but I don't see how that changes anything. A wasm file pretty clearly falls into the category of data file used by scripts.

@mattgarrish
Copy link
Member Author

mattgarrish commented Oct 8, 2024

And I don't think the journal file was meant to be related to scripting,

Except that the file I pointed to says:

Use case 3: As a journals publisher, I want to embed data files in the package. I don’t expect the RS to be able to render them natively, but I want them there either a)to be rendered by a JS-based renderer that I supply, or b) to be there for completeness/longevity, and/or for the user to retrieve and render in another application on the platform.

That's exactly the use case that the exemption talks about.

I could also throw out the definition of a data file from wikipedia:

A data file is a computer file which stores data to be used by a computer application or system, including input and output data. A data file usually does not contain instructions or code to be executed (that is, a computer program).

But it seems we're forever going to disagree until we revisit it and decide what we want.

(And for the record, I'm all on board with code being exempt, and accidentally being exempt now; I just don't agree it was intended to be captured in the current exemption. We're kind of fighting over nothing! 😄 )

@mattgarrish
Copy link
Member Author

Closing this off, as if we're not going ahead with it now then who knows what state the spec might be in when we return to it, or if we want to add other clarifications. It's easy enough to reopen it or lift the changes if we can use it.

@iherman
Copy link
Member

iherman commented Oct 9, 2024

Actually, also with regard to w3c/publ-maintenance-wg-charter#30, it may be better to leave this PR open and label it as status-deferred. @mattgarrish wdyt?

@mattgarrish
Copy link
Member Author

mattgarrish commented Oct 9, 2024

PR open and label it as status-deferred

Maybe it's my OCD kicking in, but I hate having open pull requests like #2527 that sit around. Plus it required the other proposed corrections to be renumbered, so I just think it's going to be a mess compared to starting over once we work out anything and everything that needs fixing.

It's probably better to refer to #2649 and #2656 than this pr.

@iherman
Copy link
Member

iherman commented Oct 10, 2024

ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

wasm support
4 participants