Send WebAssembly binary over Jupyter WebSocket #461

Open
wants to merge 3 commits into base: main

kylebarron
Member

We use Parquet as the internal format for data transfer. For reasons linked, Parquet is great. But to read Parquet on the client, we need to use a Wasm-based Parquet reader like my own https://github.com/kylebarron/parquet-wasm. Wasm-based libraries need a sidecar binary .wasm file, which is usually distributed separately.

We currently fetch this file from a CDN, but in environments with a strict outbound firewall, a CDN may not be reachable. See #457. To get around this, we serialize the gzipped Wasm content on the anywidget model itself. We then decompress it on the client and pass it into parquet-wasm's initializer.

Closes #457
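
For illustration, the Python half of the gzip step described above might look roughly like the following. This is only a sketch under assumptions, not the code in this PR: `compress_wasm` and `fake_wasm` are hypothetical names, and a tiny stand-in buffer is used in place of the real parquet-wasm binary.

```python
import gzip


def compress_wasm(wasm_bytes: bytes) -> bytes:
    """Gzip the raw Wasm binary so fewer bytes cross the WebSocket."""
    return gzip.compress(wasm_bytes, compresslevel=9)


# Stand-in for the real binary (assumption): the 8-byte Wasm header
# (magic number plus version 1) followed by zero padding.
fake_wasm = b"\x00asm\x01\x00\x00\x00" + bytes(1024)

# These are the bytes that would be set on a synced trait of the widget model.
payload = compress_wasm(fake_wasm)
```

On the client, the payload would need to be gunzipped (for example with the browser's `DecompressionStream("gzip")`, where available) before being handed to the initializer.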

manzt commented Apr 10, 2024

One suggestion: you can use some indirection and create a separate model that holds just the static contents, so the contents are hoisted out of each model instance:

import ipywidgets
import anywidget
import traitlets


class StaticAsset(ipywidgets.Widget):
    # Holds only the static bytes, so they are not copied into every Widget.
    contents = traitlets.Any().tag(sync=True)


# A single shared instance, created once and referenced by every Widget.
asset = StaticAsset(contents=b"hello, world")


class Widget(anywidget.AnyWidget):
    _esm = """
    // Resolve the "IPY_MODEL_<id>" reference to the shared asset model.
    async function load_asset(model, name) {
        let model_id = model.get(name).slice("IPY_MODEL_".length);
        let asset_model = await model.widget_manager.get_model(model_id);
        return asset_model.get("contents");
    }
    async function render({ model, el }) {
        let asset = await load_asset(model, "asset");
        el.innerText = new TextDecoder().decode(asset);
    }
    export default { render };
    """
    # widget_serialization sends the asset as an "IPY_MODEL_<id>" reference.
    asset = traitlets.Any(asset).tag(sync=True, **ipywidgets.widget_serialization)


Widget()

Each widget instance will just have IPY_MODEL_xxxx, and reuse the asset in the front end. I am working on making anywidget hoist _esm and _css assets this way to avoid duplication in the HTML.

manzt commented Apr 10, 2024

Note that this is just a way to get the static assets into the front end; derived objects (e.g., the initialized parquet-wasm module) would still need to be cached somewhere. It's probably easiest to use a global for now, but it would be more elegant if anywidget provided a more idiomatic API.

Concretely,

import ipywidgets
import anywidget
import traitlets


class StaticAsset(ipywidgets.Widget):
    contents = traitlets.Any().tag(sync=True)

asset = StaticAsset(contents=b"hello, world")
    
class Widget(anywidget.AnyWidget):
    _esm = """
    async function load_asset(model, name) {
        let model_id = model.get(name).slice("IPY_MODEL_".length);
        let asset_model = await model.widget_manager.get_model(model_id);
        return asset_model.get("contents");
    }
    
    // Runs once per model, before render: fill the page-wide cache if empty.
    async function initialize({ model }) {
        if (!globalThis._TREVORS_DECODED_ASSET) {
            // Cache globally so later widgets skip the fetch and decode.
            let asset = await load_asset(model, "asset");
            globalThis._TREVORS_DECODED_ASSET = new TextDecoder().decode(asset);
        }
    }

    async function render({ model, el }) {
        el.innerText = globalThis._TREVORS_DECODED_ASSET;
    }
    export default { initialize, render };
    """
    asset = traitlets.Any(asset).tag(sync=True, **ipywidgets.widget_serialization)
    

Widget()

Successfully merging this pull request may close these issues.

import parquet-wasm from local bundle rather than CDN
2 participants