Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fetch: allow colons (:) in dataset URLs #702

Open
joverlee521 opened this issue Aug 10, 2023 · 2 comments
Open

fetch: allow colons (:) in dataset URLs #702

joverlee521 opened this issue Aug 10, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@joverlee521
Copy link
Contributor

Context

The colon (:) is reserved as separator for the dual tree display, so we explicitly split all paths by colon.

This results in 404 errors when loading URLs with colons through /fetch

For example, Nextclade dataset reference trees with timestamps in the URL cannot load since the URL gets truncated:
https://nextstrain.org/fetch/data.clades.nextstrain.org/datasets/flu_h3n2_ha/references/EPI1857216/versions/2023-04-02T12:00:00Z/files/tree.json

This behavior was reported by @corneliusroemer on Slack.

@joverlee521 joverlee521 added the enhancement New feature or request label Aug 10, 2023
joverlee521 added a commit to nextstrain/docs.nextstrain.org that referenced this issue Aug 10, 2023
Since the colon (:) is reserved as the separator for dual tree displays,
URLs that include colons result 404 errors as the nextstrain.org server
splits paths on the colon.¹ This results in a 404 error as the dataset
URL is truncated (e.g. when trying to load Nextclade dataset ref trees)²

We may update the nextstrain.org server to handle colons in dataset URLs
in the future, but better to note this behavior now in case others run
into the same issue.

¹ https://github.com/nextstrain/nextstrain.org/blob/221c67e5dad919edea65787bc11e4cc2227c42ce/src/endpoints/sources.js#L263
² nextstrain/nextstrain.org#702
joverlee521 added a commit to nextstrain/docs.nextstrain.org that referenced this issue Aug 10, 2023
Since the colon (:) is reserved as the separator for dual tree displays,
the nextstrain.org server splits paths on the colon.¹ This results in a
404 error as the dataset URL is truncated (e.g. when trying to load
Nextclade dataset ref trees)²

We may update the nextstrain.org server to handle colons in dataset URLs
in the future, but better to note this behavior now in case others run
into the same issue.

¹ https://github.com/nextstrain/nextstrain.org/blob/221c67e5dad919edea65787bc11e4cc2227c42ce/src/endpoints/sources.js#L263
² nextstrain/nextstrain.org#702
@tsibley
Copy link
Member

tsibley commented Aug 17, 2023

A workaround already exists via double-encoding, e.g.

https://nextstrain.org/fetch/data.clades.nextstrain.org/datasets/flu_h3n2_ha/references/EPI1857216/versions/2023-04-02T12%253a00%253a00Z/files/tree.json

However, Auspice (I believe) updates the URL after load and ends up only singly-encoding colons so the displayed URL isn't usable as is (e.g. reloading the page doesn't work). We could potentially update Auspice to DTRT here when it gets an encoded colon.

@jameshadfield
Copy link
Member

This results in 404 errors when loading URLs with colons through /fetch

To allow : in fetch paths whilst also supporting tanglegrams such as /fetch/data.nextstrain.org/measles_genome.json:fetch/data.nextstrain.org/measles_N450.json seems like it'd take a lot of error prone path checking code. Double encoding them is a nice solution.

Auspice (I believe) updates the URL after load and ends up only singly-encoding colons so the displayed URL isn't usable as is (e.g. reloading the page doesn't work). We could potentially update Auspice to DTRT here when it gets an encoded colon.

Yeah, Auspice is doing a single round of URL decoding. Happy for changes to Auspice to be proposed here. BTW Auspice logs this change, e.g. for your link above:

Pathname for "fetch/data.clades.nextstrain.org/datasets/flu_h3n2_ha/references/EPI1857216/versions/2023-04-02T12%253a00%253a00Z/files/tree.json" changing to "fetch/data.clades.nextstrain.org/datasets/flu_h3n2_ha/references/EPI1857216/versions/2023-04-02T12%3a00%3a00Z/files/tree.json".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
No open projects
Development

No branches or pull requests

3 participants