Feature: Formal URL API #1365

Open
NickCrews opened this issue Sep 13, 2023 · 7 comments

First, thanks for your work on this. I think Vega is brilliant.

I want to make it easy to share interactive versions of charts with coworkers. I am using Altair to make charts. I could export them as static HTML, but that involves uploading and downloading files.

In your web UI, you can get a shareable link to a chart by encoding the JSON spec into an LZ-compressed string, which is then stored directly in the URL. Brilliant! I can share this URL with coworkers and it's super easy.

Currently, I can go through this flow:

  1. Use chart.to_json() from Altair
  2. Copy that JSON string manually into the web UI
  3. Use the web UI to get the shareable link

I would like to make this more streamlined, so there is something like Chart.to_url() which gives me the link directly.

The problem is that the web UI uses a custom implementation of LZ compression, and while there is a Python port, it is outdated and doesn't work (it doesn't decode from URI). So from Python, I don't have a good way of encoding the JSON spec into the URL.

To make this work, I think we would need to:

  1. Get buy-in from you that this is worthwhile
  2. Switch to a more formal, portable URL encode/decode algorithm so that other languages can write to it
  3. Ideally, formalize the URL API even more so you can choose options like "enter in fullscreen"
  4. Ideally, have an option to make the viewed chart super clean and minimal, even more minimal than fullscreen
  5. Ideally, add this .to_url() method to altair.Chart. But I can write my own helper function for this, so it's not really needed.
@NickCrews
Author

NickCrews commented Sep 13, 2023

OK, I just did a quick prototype. After some googling, I found this article, which recommends Brotli for JSON. Brotli is well-defined, mature, and available in most languages.

Here is some Python that compresses the JSON with Brotli and then base64-encodes it so it is valid in a URI:

import base64

import altair as alt
import brotli
from vega_datasets import data

source = data.cars()
chart = alt.Chart(source).mark_point().encode(
    x='Horsepower',
    y='Miles_per_Gallon',
    size='Acceleration'
)

# Encode: JSON spec -> UTF-8 bytes -> Brotli -> base64
spec_json = chart.to_json()
json_bytes = spec_json.encode("utf-8")
compressed = brotli.compress(json_bytes)
encoded = base64.b64encode(compressed)

# Decode: reverse each step and check that the round trip is lossless
decoded = base64.b64decode(encoded)
decompressed = brotli.decompress(decoded)
json_restored = decompressed.decode("utf-8")
assert json_restored == spec_json

print(encoded)
# b'W8zWIYqypFoCrAu4w8rgT3h3YQLVJXA+/lZfywhGSDK7uVQegr....
print(len(encoded))
# 9156

This is that chart in the web viewer

If I share that URL, it is 19512 characters long, more than twice the length of the 9156 characters from the Brotli encoding (if I am understanding this right)!

So not only would we get a more portable API, we would also get shorter URLs! With the current lz-string implementation, if I have very many data points (> ~500??), the URL gets too long and many apps like Slack and Google Docs stop working with it, defeating the point.
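One detail in the prototype above is worth a sketch: standard base64 output can contain `+`, `/`, and `=`, which are awkward inside a URL path or fragment. The version below uses the URL-safe base64 alphabet from the Python standard library and strips the padding. It is only an illustration: the helper names `encode_for_url` and `decode_from_url` are hypothetical, and `zlib` is swapped in for Brotli so the example runs without third-party packages.

```python
import base64
import zlib
from typing import Callable


def encode_for_url(
    spec_json: str,
    compress: Callable[[bytes], bytes] = zlib.compress,  # stand-in for brotli.compress
) -> str:
    """Compress a JSON spec and return a token that is safe inside a URL."""
    raw = spec_json.encode("utf-8")
    token = base64.urlsafe_b64encode(compress(raw))  # uses '-' and '_' instead of '+' and '/'
    return token.rstrip(b"=").decode("ascii")  # drop '=' padding


def decode_from_url(
    token: str,
    decompress: Callable[[bytes], bytes] = zlib.decompress,  # stand-in for brotli.decompress
) -> str:
    """Reverse encode_for_url: restore padding, decode, decompress."""
    padded = token + "=" * (-len(token) % 4)
    return decompress(base64.urlsafe_b64decode(padded)).decode("utf-8")


spec = '{"mark": "point", "encoding": {"x": {"field": "Horsepower"}}}'
token = encode_for_url(spec)
assert decode_from_url(token) == spec
assert not set(token) & set("+/=")
```

With the `brotli` package installed, passing `compress=brotli.compress` and `decompress=brotli.decompress` should give the Brotli variant discussed in this comment.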

@domoritz
Member

domoritz commented Sep 14, 2023

Vega-Embed has an "open in the editor" action that uses an API in the editor to load the spec. Maybe that works as well (although you can only trigger it from js: https://github.com/vega/vega-embed/blob/880d55f7cf57e27716c510ad73715db877cd718c/src/post.ts#L29).

I am open to using Brotli, but we do need to be backwards compatible. We would need to use a different URL, I suppose. Would you like to send a pull request?

@NickCrews
Author

NickCrews commented Sep 15, 2023

I'm a little intimidated by the TypeScript; I have never worked with it before. I can try to take a stab at it, but I might give up.

First, we should figure out the rough shape of the API. To make it backward-compatible, we could move to parameters, something like:

https://vega.github.io/editor/#/url/vega-lite?
brotli={encoded}&
view={"edit"/"fullscreen"/something else}

but IDK, I am no expert in web APIs so I'm not sure what the conventions are here. I want to figure this out before I actually implement anything.

Like should it be this?

https://vega.github.io/editor/#/url?
format={"vega-lite"/"vega"/something else}&
encoding={"brotli"/"lzstring"/something else}&
data={encoded}&
view={"edit"/"fullscreen"/something else}
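To make the second shape concrete, here is a rough sketch of how a client could build and read such a URL with Python's `urllib.parse`. Nothing here is an existing editor API: the parameter names `format`, `encoding`, `data`, and `view` come from the proposal above, and `build_editor_url` and `parse_editor_url` are hypothetical helpers. Note that the parameters sit inside the `#` fragment, so they have to be split out of the fragment by hand.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

EDITOR_BASE = "https://vega.github.io/editor/"


def build_editor_url(data: str, fmt: str = "vega-lite",
                     encoding: str = "brotli", view: str = "edit") -> str:
    """Build a URL in the proposed '#/url?key=value&...' shape."""
    # urlencode percent-escapes characters such as '+', '/' and '='
    query = urlencode({"format": fmt, "encoding": encoding,
                       "data": data, "view": view})
    return f"{EDITOR_BASE}#/url?{query}"


def parse_editor_url(url: str) -> tuple[str, dict[str, str]]:
    """Split the fragment into its route and its parameters."""
    fragment = urlsplit(url).fragment          # e.g. '/url?format=...&data=...'
    route, _, query = fragment.partition("?")
    params = {key: values[0] for key, values in parse_qs(query).items()}
    return route, params


url = build_editor_url("W8zW+/abc=", view="fullscreen")
route, params = parse_editor_url(url)
assert route == "/url"
assert params["data"] == "W8zW+/abc="
assert params["view"] == "fullscreen"
```

A design point in favor of named parameters: new options (another encoding, another view mode) can be added without changing how existing links are parsed.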

@domoritz
Member

domoritz commented Sep 15, 2023

These are the current URLs.

https://vega.github.io/editor/#/examples/vega-lite/rect_heatmap
https://vega.github.io/editor/#/custom/vega
https://vega.github.io/editor/#/custom/vega-lite
https://vega.github.io/editor/#/gist/455e1c7872c4b38a58b90df0c3d7b1b9/bar.vl.json
https://vega.github.io/editor/#/url/vega-lite/N4IgJAzgxgFgpgWwIYgFwhgF0wBwqgegIDc4BzJAOjIEtMYBXAI0poHsDp5kTykBaADZ04JACyUAVhDYA7EABoQAEzjQATjRyZ289AEEABBBoIcguIaZJ1h2DcyGA7nRiHETOMtXLDypJhUiiAuyvRoAMwAbAAMSv6BaKDESIIMamgA2qAoBsFMaABMABwAvgo5aCAAQvloAKz15ZXoAMJ1qGIRzSC5IAAiHQCcAIw9fQCiHcVjFb1VAGId9d1zfQDiHSND41UAEtMA7LvoAJLLhaUAuuUgyOoA1lXW6sFwslBsyjSyZEkgAA9-gAzGhwQTKKooJSYACeODgVTY6m+slSIFusJBYIhz2CcIRVQAjgwkLIdIEdKQMTC2GxBDocNjwZD0AUYfDEegSWSKQEaNTSkKgA
https://vega.github.io/editor/#/url/vega-lite/N4IgJAzgxgFgpgWwIYgFwhgF0wBwqgegIDc4BzJAOjIEtMYBXAI0poHsDp5kTykBaADZ04JAKyUAVhDYA7EABoQAEzjQATjRyZ289AEEABBBoIcguIaZJ1h2DcyGA7nRiHETOMtXLDypJhUiioBKKigxEiCDGpoANqgYSD6wUxoAEwAHAC+ColoIABCqWhiYrn56ADCJagALADMFSBJACK1AJwAjM1JAKK1mT15LQUAYrViTSNJAOK1XR29BQASgwDsy+gAkpPp2QC6uSDI6gDWBdbqwXCyUGzKNLJkaKAAHq8gAGY0cILKBRQSkwAE8cHACrI2AgnlFgkg3jQIJ9BEhPIJ9M8LGgAAzZY4gz4-P4A9BpYFgiHoACODCQsh0gR0pBA+OyQA/view

Maybe the easiest would be to have a prefix brotli- before the encoded string. We could check for that prefix and then call the right decoder. It's not totally failsafe, since an lz-string encoded string might happen to start with the same characters, but that seems super unlikely.

https://vega.github.io/editor/#/url/vega-lite/brotli-XXXX

Alternatively, we could use https://vega.github.io/editor/#/url/vega-lite/brotli/XXXX or https://vega.github.io/editor/#/url-brotli/vega-lite/XXXX.
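The prefix check described above could look roughly like the following. This is a sketch of the dispatch logic only, written in Python for illustration (the editor itself is TypeScript); `decode_payload` is a hypothetical name, and the two decoders are passed in as placeholders rather than real Brotli or lz-string calls.

```python
from typing import Callable

BROTLI_PREFIX = "brotli-"
Decoder = Callable[[str], str]


def decode_payload(segment: str, decode_brotli: Decoder,
                   decode_lzstring: Decoder) -> str:
    """Pick a decoder based on the prefix of the last URL segment."""
    if segment.startswith(BROTLI_PREFIX):
        # New-style link: strip the prefix and use the Brotli decoder
        return decode_brotli(segment[len(BROTLI_PREFIX):])
    # Existing links carry no prefix and keep using lz-string
    return decode_lzstring(segment)


# Placeholder decoders that only tag their input, to show which branch runs
def fake_brotli(payload: str) -> str:
    return "brotli:" + payload


def fake_lzstring(payload: str) -> str:
    return "lzstring:" + payload


assert decode_payload("brotli-XXXX", fake_brotli, fake_lzstring) == "brotli:XXXX"
assert decode_payload("N4IgJAzg", fake_brotli, fake_lzstring) == "lzstring:N4IgJAzg"
```

The alternatives with a separate path segment (`/brotli/XXXX` or `/url-brotli/`) avoid the ambiguity entirely, at the cost of a new route.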

@domoritz
Member

Btw, I looked at lz-string and the replacement for uri encoding is pretty simple: https://github.com/pieroxy/lz-string/blob/35cdd797ae7415211add846e529669643e893904/src/main.ts#L136C16-L136C29. Maybe you can dig a bit more to see whether you can replicate it in Python. Brotli will add some overhead in terms of bundle size that I want to be careful with.
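If a Python port can already produce lz-string's Base64-style output, the URI-safe form may be only a character substitution away. The sketch below rests on assumptions that are not confirmed anywhere in this thread: that the URI-safe alphabet is the Base64 one with `/` replaced by `-`, that the `=` padding is omitted, and that spaces are turned back into `+` before decoding. Both function names are hypothetical, and the behavior should be checked against the lz-string source before relying on it.

```python
def base64_form_to_uri_safe(encoded: str) -> str:
    """Convert Base64-style lz-string output to the assumed URI-safe form."""
    # Assumption: URI-safe output has no '=' padding and uses '-' instead of '/'
    return encoded.rstrip("=").replace("/", "-")


def uri_safe_to_base64_form(encoded: str) -> str:
    """Convert the assumed URI-safe form back to padded Base64-style text."""
    # Assumption: a '+' may arrive as a space after URL handling, so undo that first
    restored = encoded.replace(" ", "+").replace("-", "/")
    # Pad to a multiple of four characters
    return restored + "=" * (-len(restored) % 4)


sample = "N4Ig/A+b"
assert uri_safe_to_base64_form(base64_form_to_uri_safe(sample)) == sample
```

If those assumptions hold, a port that only supports the Base64 variant could be wrapped with these two helpers instead of being rewritten.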

@NickCrews
Author

I suppose if lz-string is always going to be supported by the editor, then there is no harm in adding it to Altair; it should be a single Python file. Moving to Brotli could then be a later discussion.

What do you think about Brotli's better compression? Have you heard complaints about long URLs from any other users?

@domoritz
Member

I thought Brotli was optimized for fast compression, not for small output.

I've not heard about many issues with long URLs. If the spec is large, I always recommend a gist or the API I linked to above.
