Remote resources not on web? #2610

Open
mattgarrish opened this issue Apr 3, 2024 · 9 comments · May be fixed by #2616
Labels
Spec-EPUB3 The issue affects the core EPUB 3.3 Recommendation Topic-General The issue applies generally to listed specification

Comments

@mattgarrish
Member

I noticed the definition of remote resource is:

remote resource
A publication resource that is located outside of the EPUB container, typically, but not necessarily, on the web.

The "typically, but not necessarily, on the web" part looks like a relic from when we didn't ban the file:// protocol in URLs, or is there some other non-web hosting that we support? The recommendation now is to use https for anything not in the container.

I think we can make the "web" part a parenthetical clarification:

A publication resource that is located outside of the EPUB container (i.e., on the web).

Or even just say: "A publication resource that is located on the web." It kind of stands to reason that a web-based resource can't also be in the container.

@mattgarrish mattgarrish added Spec-EPUB3 The issue affects the core EPUB 3.3 Recommendation EPUB33 Issues addressed in the EPUB 3.3 revision Topic-General The issue applies generally to listed specification and removed EPUB33 Issues addressed in the EPUB 3.3 revision labels Apr 3, 2024
@iherman
Member

iherman commented Apr 3, 2024

What happens to a resource that has a URL but is not http(s)? There is a plethora of such things, for example doi:. Is it up to the RS whether that is considered valid?

@mattgarrish
Member Author

A doi wouldn't refer to a publication resource, would it? Publication resources are things in the manifest used in rendering the publication. They're fine in metadata, etc.

In any case, you could refer to a resource obtained via ftp, for example, but I'd still think that'd have to be web based, at least in the general sense of accessing via the internet. The likelihood of reading systems supporting any references that aren't http/https is probably approaching zero, if not zero (or of authors using them). We don't require reading systems to support remote resources or any protocols for obtaining them. The only thing we do is ban file:

@mattgarrish
Member Author

mattgarrish commented Apr 3, 2024

> We don't require reading systems to support remote resources or any protocols for obtaining them.

Scratch that. We don't require it, but we recommend that they only support remote resources via https.

https://www.w3.org/TR/epub-rs-33/#sec-epub-rs-conf-remote-res
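As a rough illustration of the distinctions discussed so far (resources in the container, the banned file: scheme, the recommended https, and everything else such as ftp: or did:), here is a minimal Python sketch. The function name `classify_href` and the four category labels are hypothetical and do not come from the specification or from epubcheck; the sketch only inspects the URL scheme and assumes that an href without a scheme is a relative reference into the container.

```python
from urllib.parse import urlsplit


def classify_href(href: str) -> str:
    """Classify a manifest href by its URL scheme (illustrative labels only)."""
    scheme = urlsplit(href).scheme.lower()
    if not scheme:
        # Assumption: no scheme means a relative reference inside the container.
        return "container"
    if scheme == "file":
        # The thread notes that the file: scheme is banned.
        return "banned"
    if scheme == "https":
        # The reading system spec recommends https for remote resources.
        return "remote-recommended"
    # Anything else (http, ftp, did, ...) is remote but not the recommended scheme.
    return "remote-other"


for example in (
    "OEBPS/chapter1.xhtml",
    "file:///tmp/style.css",
    "https://example.org/video.mp4",
    "ftp://example.org/video.mp4",
):
    print(example, "->", classify_href(example))
```

Whether a checker should treat the "remote-other" case as a warning, an error, or nothing at all is exactly the open question in this thread, so the sketch deliberately stops at labelling.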

@iherman
Member

iherman commented Apr 3, 2024

I did not remember the explicit recommendation on http(s)... But it does not ban non-https resources.

I realize doi: is a bit convoluted, because, in practice, doi: URLs usually redirect to a Web resource. So it is (eventually) on the Web, just in a roundabout way. But ftp: is a good example; we can also refer to did: (although the probability of a did being used in an ebook is certainly close to zero).

That being said, what is now in the spec is not wrong and, paired with the recommendation for the RS, is also reasonably secure. I do not think there is any harm in leaving it as is...

@bduga
Collaborator

bduga commented Apr 3, 2024

I agree with @iherman - there is no harm in leaving it, and changing it to "i.e." seems like a new requirement. Does "i.e. on the web" mean "MUST be on the web"? And then what does "on the web" mean? Is it a protocol thing (MUST use http(s)) or an access thing (MUST be available for public download)? I think that is a can of worms I would rather not open.

@mattgarrish
Member Author

> Is it a protocol thing (MUST use http(s))

Isn't that what we've turned it into by recommending https for remote resources, though? In a world where warnings are like requirements, we've already kind of cut people off from using anything else.

@bduga
Collaborator

bduga commented Apr 3, 2024

I am not sure I agree with the statement that warnings are essentially requirements. In a previous job I held, there was an explicit allowlist of outright errors that we accepted from epubcheck because they were so prevalent, and all warnings were allowed. But "MUST use https" seems like a new requirement, and not the same as "on the web". For instance, a document available only on a corporate intranet might be accessible via https, but it doesn't seem like it is "on the web." If we want to make the change, I think it should be done by changing the SHOULD to a MUST in https://www.w3.org/TR/epub-rs-33/#sec-epub-rs-conf-remote-res though I don't support that change at the moment, mainly because I am not sure what real world problem it is fixing.

@mattgarrish
Member Author

mattgarrish commented Apr 3, 2024

> But "MUST use https" seems like a new requirement

Sure, but I think we're drifting from why I opened this.

I can agree that "i.e." is perhaps too strong a wording, although this feels like we're accommodating scenarios that are theoretically possible but never done in practice. When I re-read the definition, the natural contrast to a resource being on the web is one that is available offline, which ties back to our having allowed the file: protocol for remote resources in the past.

Can we at least take out the "not necessarily" as redundant with "typically" and link this definition to the remote resources section for clarity:

A publication resource that is located outside of the EPUB container, typically on the web.

Refer to 3.6 Resource location for more information.

I'd also like to see a note in 3.6 that refers to the ban on the file protocol. There are a lot of jumps you have to make to connect the pieces together right now.

@iherman
Member

iherman commented Apr 4, 2024

> Can we at least take out the "not necessarily" as redundant with "typically" and link this definition to the remote resources section for clarity:
>
> A publication resource that is located outside of the EPUB container, typically on the web.
> Refer to 3.6 Resource location for more information.

I am fine with that.

@mattgarrish mattgarrish linked a pull request Apr 30, 2024 that will close this issue

3 participants