[🏆 Golden path scenario] Browsers can reliably retrieve content from any modern Kubo node providing content #255

Open
10 of 14 tasks
BigLep opened this issue Sep 7, 2023 · 2 comments
Labels
dif/hard Having worked on the specific codebase is important

Comments

@BigLep
Contributor

BigLep commented Sep 7, 2023

Done Criteria

A user can reliably author/provide content in a local Kubo node behind a NAT, advertise the content in the public IPFS DHT or an IPNI like cid.contact, and have it retrievable via any modern browser (desktop or mobile) via Helia running on a different local network without relying on pinning services.

Why Important

This is a common use case that users hit. Failure here feeds the narrative that "IPFS doesn't just work".

Content Routing

  1. For content routing (for both public IPFS DHT and IPNI), it’s acceptable to rely on delegated HTTP /routing/v1 from a public endpoint like routing.delegate.ipfs.tbd, cid.contact, etc.
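As a rough illustration of the delegated lookup described above, the sketch below builds a `/routing/v1/providers/{cid}` request and keeps only the peer records that carry addresses. The helper names (`providersUrl`, `usableProviders`, `findProviders`) are hypothetical, the record fields follow the Delegated Routing V1 HTTP API as best understood, and the fetch function is injected so nothing here depends on a particular runtime.

```typescript
// Subset of a "peer" record as returned by /routing/v1/providers.
interface PeerRecord {
  Schema?: string;
  ID?: string;
  Addrs?: string[];
  Protocols?: string[];
}

interface ProvidersResponse {
  Providers?: PeerRecord[] | null;
}

// Minimal fetch shape (an assumption) so the sketch does not need DOM typings.
type JsonFetch = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Build the providers lookup URL, tolerating a trailing slash on the endpoint.
function providersUrl(endpoint: string, cid: string): string {
  const base = endpoint.replace(/\/+$/, "");
  return `${base}/routing/v1/providers/${encodeURIComponent(cid)}`;
}

// Keep only records that name a peer and list at least one address.
function usableProviders(body: ProvidersResponse): PeerRecord[] {
  return (body.Providers ?? []).filter(
    (p) => typeof p.ID === "string" && (p.Addrs?.length ?? 0) > 0
  );
}

// Ask the endpoint for providers; any non-OK status is treated as "none found".
async function findProviders(
  endpoint: string,
  cid: string,
  fetchFn: JsonFetch
): Promise<PeerRecord[]> {
  const res = await fetchFn(providersUrl(endpoint, cid), {
    headers: { Accept: "application/json" },
  });
  if (!res.ok) return [];
  return usableProviders((await res.json()) as ProvidersResponse);
}
```

In a browser this could be called as `findProviders("https://cid.contact", cid, fetch)`, with the endpoint swapped for whichever public delegated router ends up being used.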

Node Connectivity

  1. A local Kubo node behind a NAT must be reachable from any modern browser (desktop or mobile), so we can’t rely on WebTransport here: WebTransport can’t be used to dial private nodes, and Safari (necessary for iOS) doesn’t have WebTransport support yet (but will at some point in the future).
    1. That said, using WebTransport as much as possible is encouraged and will undoubtedly be a stepping stone as this end-to-end use case is fleshed out.
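To make the transport constraint concrete, here is a minimal sketch of how a browser client might narrow a provider's multiaddrs down to ones it can plausibly dial. It is an assumption-heavy illustration: `dialableFromBrowser` and `BrowserCaps` are hypothetical names, the checks are plain string matching rather than real multiaddr parsing, private IPv4 ranges stand in for "private nodes", and other browser transports such as WebRTC are deliberately left out.

```typescript
// Capabilities of the current browser (hypothetical shape).
interface BrowserCaps {
  webTransport: boolean; // false on Safari today, per the note above
}

// Rough check for loopback and RFC 1918 IPv4 addresses.
function isPrivateIp4(addr: string): boolean {
  const m = addr.match(/^\/ip4\/([^/]+)/);
  if (!m) return false;
  const [a, b] = m[1].split(".").map(Number);
  return (
    a === 10 ||
    a === 127 ||
    (a === 192 && b === 168) ||
    (a === 172 && b >= 16 && b <= 31)
  );
}

// Keep addresses the browser could try: public WebTransport (when supported)
// and secure WebSockets, written either as /wss or as /tls/ws.
function dialableFromBrowser(addrs: string[], caps: BrowserCaps): string[] {
  return addrs.filter((addr) => {
    if (isPrivateIp4(addr)) return false;
    const parts = addr.split("/");
    if (parts.includes("webtransport")) return caps.webTransport;
    return parts.includes("wss") || (parts.includes("tls") && parts.includes("ws"));
  });
}
```

With `webTransport: false`, only the secure WebSocket addresses survive, which is the situation the Safari caveat describes.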

Reliability Notes

Reliability is critical here. We need to move beyond demos we scrape together. As an example, we want to get to a point where we could hand some instructions to Juan and not be praying in the audience hoping that it will work. We want to flush out the bugs that happen in a user’s browser under normal loads of multiple tabs, retrieving a range of file sizes, etc. (We know from putting together the IPFS Thing 2023 demos that there are reliability issues.) A key aspect is determining how we’re going to “stress test” this.

To really guarantee reliable retrieval, we should leverage trustless block-by-block fetch over HTTP as a feature, and make it the ultimate fallback enabled by default.

If a Helia node is unable to find a CID via delegated routing or to fetch it from a discovered provider peer, it should attempt a raw block fetch from trustless gateways defined in config (or discovered via /routing/v1 and announcing a recursive flag via HTTP OPTIONS).
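The last-resort step could look roughly like the sketch below: walk a list of trustless gateways, request the block as `application/vnd.ipld.raw`, and accept the first response that passes verification. This is not Helia's API. `fetchRawBlock` is a hypothetical helper, the fetch function is injected, and the `verify` callback is a placeholder for hashing the bytes and comparing them with the CID's multihash, which a real implementation must do for the fetch to be trustless.

```typescript
// Minimal fetch shape (an assumption) so the sketch does not need DOM typings.
type BlockFetch = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<{ ok: boolean; arrayBuffer(): Promise<ArrayBuffer> }>;

// Placeholder for "do these bytes hash to the requested CID?".
type VerifyBlock = (bytes: Uint8Array) => Promise<boolean>;

// Try each gateway in order and return the first verified block, or null.
async function fetchRawBlock(
  cid: string,
  gateways: string[],
  fetchFn: BlockFetch,
  verify: VerifyBlock
): Promise<Uint8Array | null> {
  for (const gateway of gateways) {
    const url = `${gateway.replace(/\/+$/, "")}/ipfs/${cid}?format=raw`;
    try {
      const res = await fetchFn(url, {
        headers: { Accept: "application/vnd.ipld.raw" },
      });
      if (!res.ok) continue; // gateway does not have the block, try the next one
      const bytes = new Uint8Array(await res.arrayBuffer());
      if (await verify(bytes)) return bytes; // never trust unverified bytes
    } catch {
      // Network error or timeout: fall through to the next gateway.
    }
  }
  return null;
}
```

Returning `null` rather than throwing leaves the caller free to decide whether an exhausted gateway list is a hard failure.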

How is this better than what we have with preloads?
It uses plain HTTP fetch and a plain block gateway: no special configuration and no need for a libp2p stack. It is very easy to deploy Kubo or implement your own backend, if needed.

Isn’t this a step back? We want p2p.
The idea is to define this as a “trustless gateway fallback feature”. It could be a separate project that wraps a Helia instance and has additional config. Helia would prefer p2p but, if that fails, would try asking a trustless gateway as a last resort. We’ve already done something like this for the SW gateway; this would be productizing it.

This allows us to iterate on raising the % of p2p retrievals while giving developers an escape hatch: a way to fall back to a self-hosted gateway instead of a hard failure for their users.

Caveat (1): to make sure “it just works”, the implicit default would be a trustless gateway provided by PL like we do with bootstrappers. Users MUST be able to override it via config with their own list of gateways, allowing self-hosting and scaling independent of PL infra.
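One possible shape for that override, sketched under assumptions: the config key `gateways`, the helper `resolveGateways`, and the default hostname are all hypothetical, and treating an explicit empty list as "fallback disabled" is a design choice of this sketch rather than anything decided in this issue.

```typescript
// Hypothetical config for the fallback wrapper.
interface FallbackConfig {
  gateways?: string[];
}

// Placeholder default; the real default hostname is not decided here.
const DEFAULT_GATEWAYS: readonly string[] = ["https://trustless-gateway.example"];

// A user-supplied list fully replaces the defaults instead of being merged,
// so self-hosters never hit the default infrastructure by accident.
function resolveGateways(config: FallbackConfig = {}): string[] {
  return config.gateways !== undefined
    ? [...config.gateways]
    : [...DEFAULT_GATEWAYS];
}
```

Replacing rather than merging keeps the behaviour predictable for anyone who wants to scale independently of the default gateway.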

Caveat (2): This one is tedious but also easy to do: we’d need to set up a trustless-raw-block-only gateway under a hostname other than ipfs.io, to avoid safebrowsing errors. This would be 1 line in nginx config similar to this, but for application/vnd.ipld.raw.

Testing

Below are some testing ideas (but this warrants its own discussion/doc). We should have a pulse on how much worse, in performance and reliability, loading content via Helia is compared with the trusted ipfs.io gateway. (We should pretend to wear the ipfs.io gateway hat: if we owned that and wanted to shed some of the traffic to be more p2p, what would the user impact be?)

  1. Hook into the existing Probelab website testing. Currently it is using headless Chromium to compare Kubo (via http://localhost:8080/ipns/<website>) vs default HTTP (via https://<website>) (read more). We could add an additional scenario that hits a service worker gateway.
  2. Get a service worker gateway hooked into an experimental Companion build that we start using ourselves and observe anecdotally how things work.
  3. Get a sample of ipfs.io gateway request CIDs and feed them to Helia to load in the browser.
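Whichever of these ideas is pursued, the comparison comes down to success rate and latency per retrieval path. The sketch below shows one way to summarize such measurements; the `Sample` shape and the `summarize` helper are hypothetical, and the median is computed over successful retrievals only.

```typescript
// One retrieval attempt for a CID over a given path (hypothetical shape).
interface Sample {
  cid: string;
  path: "helia" | "gateway";
  ok: boolean;
  ms: number; // wall-clock time until success or failure
}

interface Summary {
  total: number;
  successRate: number; // 0 when there are no samples for the path
  medianMs: number | null; // null when nothing succeeded
}

// Summarize success rate and median latency of successful retrievals.
function summarize(samples: Sample[], path: Sample["path"]): Summary {
  const mine = samples.filter((s) => s.path === path);
  const okMs = mine
    .filter((s) => s.ok)
    .map((s) => s.ms)
    .sort((a, b) => a - b);
  const mid = Math.floor(okMs.length / 2);
  const medianMs =
    okMs.length === 0
      ? null
      : okMs.length % 2 === 1
        ? okMs[mid]
        : (okMs[mid - 1] + okMs[mid]) / 2;
  return {
    total: mine.length,
    successRate: mine.length === 0 ? 0 : okMs.length / mine.length,
    medianMs,
  };
}
```

Running `summarize` once for `"helia"` and once for `"gateway"` over the same CID sample gives a side-by-side view of the user impact.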

Getting Started

While there are some specific Go tasks to fully wrap up this task listed below, there isn’t anything stopping the JS side from validating and hardening the browser happy path sooner (e.g., testing retrieving content from public WSS / WebTransport multiaddr).

General Notes

  1. Per above, this isn't a pure Helia issue. Tracking the use case needs to go somewhere though, so I'm putting it in Helia for now so we can link against it.

Tasks

Retrieval guarantees with using the trustless HTTP gateway spec

  1. 3 of 3
    P0 dif/expert effort/hours kind/architecture status/in-progress
    SgtPooki
  2. 10 of 11
    SgtPooki

Tasks for testing reliability

  1. 12 of 13
    SgtPooki

Tasks for browser-accessible delegated routing

  1. hacdias

Tasks for browser to private Kubo retrievability

  1. env:browser kind/bug need/triage
@SgtPooki
Member

SgtPooki commented Sep 11, 2023

There is still an issue with nodes in the network not providing public WebTransport addresses; we should have kubo/boxo/go-libp2p look into this more seriously. See libp2p/js-libp2p#2040 & libp2p/go-libp2p#2568 (comment)

@SgtPooki SgtPooki added the dif/hard Having worked on the specific codebase is important label Oct 9, 2023
@BigLep
Contributor Author

BigLep commented Oct 12, 2023

For anyone watching this issue, progress is being made. Please see the linked issues from the task lists.

Projects
No open projects
Status: 🏃‍♀️ In Progress
Status: No status

2 participants