-
Notifications
You must be signed in to change notification settings - Fork 402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Verification of use of remote resources #869
Comments
So the spec says:
My understanding is that it only applies to foreign resources, correct? Maybe this could be clarified in the spec? |
No, a script can grab whatever type of resource it needs. They can be core media types or foreign resources. |
Oh OK. So a remote CMT would be allowed if not referenced directly (e.g. remote images allowed if not in an |
Right, if a script can grab a resource by an xhr call or fetch, it's all legitimate. The additional requirement that resources used by scripts in the rendering have to be CMTs or provide an adequate fallback was intended to ensure rendering regardless of which type of resource is retrieved, even though it's not really testable. But scripting generally undermines all content restrictions, as you can write and rewrite just about anything after the fact, including what the In theory, all these resources have to be listed in the manifest and the remote-resources flag applied to their content document, but given that we can't test what scripts actually do, you could even ignore doing any of that and still have a legit epub. Sorry, that was probably more rambling than you wanted... :) |
Yeah, exactly… which kinda makes me wonder why we have any restrictions on remote resources. I could use a remote image, as long as it is somehow added by a script. What's the point of disallowing using remote images then? And not even images, your EPUB could actually consist of empty-shell XHTML content documents downloading remote XHTML documents. |
To be clear: I thought this new exception was intended for scripts to be able to download external data resources, to be later rendered via dynamically-generated HTML or SVG. I wasn't aware it was intended to allow directly-rendered remote content (mostly images or HTML). Is there really a big diff between: <body>
<img id="img" src="https://www.w3.org/2008/site/images/logo-w3c" alt="w3c"/>
</body> and <body onload="document.getElementById('img').setAttribute('src','https://www.w3.org/2008/site/images/logo-w3c')">
<img id="img" src="#" alt="w3c"/>
</body> That the former is non-conforming and the latter conforming feels like an arbitrary restriction. |
Right.
It's not, but that's where I kind of went off on a rant. It's trivial to work around everything EPUB recommends or requires with a bit of script. It's only the general lack of support for network access that works as a deterrent. |
The latter isn't conforming, but it's not easily detectable so how do we prevent it? You can't reference remote scripts, but what stops you from fetching a script and writing it into the dom? You don't even have to declare what you're accessing is a script. Call it a text resource. But let's not go down this rabbit hole... |
I'm not sure I understand: why is it not conforming with the current wording?
If we want to be more restrictive we could for instance forbid referencing of remote |
The script rewrites the |
ok, right. |
Could you do this without executing the script, though? If you don't know what the script is doing, and I don't declare it in the manifest, then the epub is valid. I think we've opened an exploit here, for what was proposed as needed for remote data sets, but it is what it is. We really need a more detailed JS model for EPUB, but we can't solve that here. Reading systems can prevent remote access calls, so that's ultimately what will determine how usable this is. History suggests significant interop problems. |
And it's not all new to this version. You could always try to rewrite src attributes or use xhr to pull files. I just wonder sometimes what the point of the content restrictions are when they only exist until run time. |
Well, only if the resources are also declared in the OPF. It's true that if they're not even declared, there's absolutely nothing EPUBCheck can do (as was the case before).
Yeah, this exploit have always kinda been there, as it was never (and probably never will be) reported by EPUBCheck. Still, the current spec wording doesn't really express the intent I think. But that's probably not an issue for this tracker :-) |
Message changes: - message `OPF-018` (`WARNING`) is only reported when a Content Doc is declared in the Package Document with the property `remote-resource`, and neither remote references nor scripts are found. - new message `OPF-018b` (`USAGE`) reported when a Content Doc is declared in the Package Document with the property `remote-resource`, and no remote references are found but the Content Doc has scripted content. - message `RSC-006` (`ERROR`) is only reported when a remote publication resource is declared in the Package Document with a type that doesn't specifically allow it to be located outside the container, and when no scripted content is found. - new message `RSC-006b` (`USAGE`) is reported when a remote publication resource is declared in the Package Document with a type that doesn't specifically allow it to be located outside the container, and scripted content is found. Internal changes: - new utility static method `OPFChecker.isScriptType(String)` tells if a media type is a JavaScript MIME type essence match (see https://mimesniff.spec.whatwg.org/#javascript-mime-type-essence-match) - better identify scripted content: from event handler attributes, or script element with a type matching a JavaScript MIME type essence. - add a couple tests and fix existing tests Fix #869.
Message changes: - message `OPF-018` (`WARNING`) is only reported when a Content Doc is declared in the Package Document with the property `remote-resource`, and neither remote references nor scripts are found. - new message `OPF-018b` (`USAGE`) reported when a Content Doc is declared in the Package Document with the property `remote-resource`, and no remote references are found but the Content Doc has scripted content. - message `RSC-006` (`ERROR`) is only reported when a remote publication resource is declared in the Package Document with a type that doesn't specifically allow it to be located outside the container, and when it is directly referenced from a Content Document, or used in the spine, or else if there is no scripted content in the publication. - new message `RSC-006b` (`USAGE`) is reported when a remote publication resource is declared in the Package Document with a type that doesn't specifically allow it to be located outside the container, and scripted content is found. Internal changes: - new utility static method `OPFChecker.isScriptType(String)` tells if a media type is a JavaScript MIME type essence match (see https://mimesniff.spec.whatwg.org/#javascript-mime-type-essence-match) - better identify scripted content: from event handler attributes, or script element with a type matching a JavaScript MIME type essence. - add a couple tests and fix existing tests Fix #869. fixup: better check remote resources
Message changes: - message `OPF-018` (`WARNING`) is only reported when a Content Doc is declared in the Package Document with the property `remote-resource`, and neither remote references nor scripts are found. - new message `OPF-018b` (`USAGE`) reported when a Content Doc is declared in the Package Document with the property `remote-resource`, and no remote references are found but the Content Doc has scripted content. - message `RSC-006` (`ERROR`) is only reported when a remote publication resource is declared in the Package Document with a type that doesn't specifically allow it to be located outside the container, and when it is directly referenced from a Content Document, or used in the spine, or else if there is no scripted content in the publication. - new message `RSC-006b` (`USAGE`) is reported when a remote publication resource is declared in the Package Document with a type that doesn't specifically allow it to be located outside the container, and scripted content is found. Internal changes: - new utility static method `OPFChecker.isScriptType(String)` tells if a media type is a JavaScript MIME type essence match (see https://mimesniff.spec.whatwg.org/#javascript-mime-type-essence-match) - better identify scripted content: from event handler attributes, or script element with a type matching a JavaScript MIME type essence. - add a couple tests and fix existing tests Fix #869. fixup: better check remote resources
Scripts are allowed to reference remote resources, so whether a remote foreign resource listed in the manifest is in actual use or not may be complicated to check. It might make sense to only issue an informative message that no reference could be found, or emit nothing at all, if all resources are checked to see if they are in use.
Reference: https://w3c.github.io/publ-epub-revision/epub32/spec/epub-spec.html#sec-resource-locations-script
The text was updated successfully, but these errors were encountered: