
Add Estuary To IPFS Cluster #1804

Open
ohmpatel1997 opened this issue Nov 15, 2022 · 1 comment
Assignees
hsanjuan
Labels
effort/weeks (Estimated to take multiple weeks) · exp/expert (Having worked on the specific codebase is important) · kind/enhancement (A net-new feature or improvement to an existing feature)

Comments

@ohmpatel1997
Contributor

Describe the feature you are proposing

I am looking for a way to integrate Estuary as one of the remote nodes in my IPFS Cluster, alongside the normal local IPFS nodes.
Ideally I would interact with a single cluster API, and the cluster would be able to pin data across all configured nodes, one of which could be an Estuary node.

@ohmpatel1997 added the need/triage label on Nov 15, 2022
@ohmpatel1997 changed the title from "Adding Estuary To IPFS Cluster" to "Add Estuary To IPFS Cluster" on Nov 15, 2022
@hsanjuan added the kind/enhancement, exp/expert and effort/weeks labels and removed the need/triage label on Nov 16, 2022
@hsanjuan
Collaborator

Let's see:

  • The ipfsconn component in cluster allows interacting with a Kubo daemon using the Kubo RPC API.
  • A new implementation could be provided to interact with anything that exposes a Pinning Services API (e.g. Estuary).
  • This would allow making a cluster not only of Kubo nodes but also of other pinning services.
  • Including other clusters (cluster composition is already possible using the ipfs-proxy API).
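Very roughly, such a component could be scaffolded like this (just a sketch; the package, type and field names are made up and are not existing cluster code):

```go
// Hypothetical sketch: a connector that talks to a Pinning Services API
// endpoint (such as Estuary) instead of the Kubo RPC API.
package pinsvcconn

import (
	"net/http"
	"sync"
	"time"
)

// Connector would implement the same interface as the existing ipfshttp
// component, but forward requests to a Pinning Services API.
type Connector struct {
	baseURL     string        // e.g. the service's /pinning base endpoint
	accessToken string        // bearer token for the pinning service
	client      *http.Client  // HTTP client with sane timeouts
	pinTimeout  time.Duration // fixed timeout, since there is no pin progress

	mu         sync.Mutex
	requestIDs map[string]string // CID -> pinning-service request ID (local state)
}

// New returns a connector pointed at a Pinning Services API endpoint.
func New(baseURL, token string) *Connector {
	return &Connector{
		baseURL:     baseURL,
		accessToken: token,
		client:      &http.Client{Timeout: 60 * time.Second},
		pinTimeout:  2 * time.Hour,
		requestIDs:  make(map[string]string),
	}
}
```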

Providing a new implementation for this component requires re-implementing a few methods:

  • ID(context.Context) (api.IPFSID, error): This assumes that there is a single IPFS peer with an ID. The Pinning Services API does not give this information (one or multiple). One possible option is to allow responding with a slice of IPFSIDs, and to allow it to be empty. But this affects how cluster reports on pin status, as ClusterPeer -> IPFSID associations are cached etc. It would be very nice if the Pinning Services API provided a list of IPFS peer IDs or something similar for us.
  • Pin(context.Context, api.Pin) error: The Pinning Services API does async pinning, unlike Kubo. We would have to send the pin request and follow up with status requests until it is done (see the sketch after this list). The main problem here is that the Pinning Services API does not provide any pin progress updates, so we will not be able to tell if a pin is stuck, which means we will need fixed pinning timeouts. We also have limited semantics for direct vs recursive pins.
  • Unpin(context.Context, api.Cid) error: similar to above, but usually unpinning is less problematic and finishes fast.
  • PinLsCid(context.Context, api.Pin) (api.IPFSPinStatus, error): The Pinning Services API provides a status: we would have to decide how to translate a "queued" status, as we are expected to answer whether the pin exists and whether it is recursive or direct. Responding that something is not pinned might retrigger repinning, so we may have to deal with that in the Pin() method (i.e. not sending another pin request for something that is already queued and just watching until the timeout).
  • PinLs(ctx context.Context, typeFilters []string, out chan<- api.IPFSPinInfo) error: this should be possible but might require implementing pagination handling etc.
  • ConnectSwarms(context.Context) error: this could no-op I guess.
  • SwarmPeers(context.Context) ([]peer.ID, error): this would no-op too I guess.
  • ConfigKey(keypath string) (interface{}, error): this is implemented using default helpers.
  • RepoStat(context.Context) (api.IPFSRepoStat, error): This affects how cluster performs allocations, but the Pinning Services API does not give us any way to know how much space is available to the user. We would have to make it up somehow, either setting it to infinite or to 0. That results in either always allocating to the peer or never doing so (unless manually specified).
  • RepoGC(context.Context) (api.RepoGC, error): this can no-op
  • Resolve(context.Context, string) (api.Cid, error): this one is tricky. The Pinning Services API does not let us do this, which affects cluster's ability to pin paths (i.e. <cid>/path/to/something). I guess we can return an error, or we could try to find another peer in the cluster that does have a Kubo node behind it and redirect to it. It's OK to start by erroring.
  • BlockStream(context.Context, <-chan api.NodeWithMeta) error: adding is not supported by pinning services API, so we would error.
  • BlockGet(context.Context, api.Cid) ([]byte, error): we would error this too.
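For example, Pin() against a Pinning Services API could look roughly like this (a sketch with simplified types; the /pins and /pins/{requestid} endpoints and the status values come from the Pinning Services API spec, everything else is illustrative):

```go
package pinsvcconn

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

type pinStatusResponse struct {
	RequestID string `json:"requestid"`
	Status    string `json:"status"` // "queued" | "pinning" | "pinned" | "failed"
}

// Pin sends an async pin request and waits (with a fixed timeout) until the
// service reports the pin as done. There are no progress updates, so a stuck
// pin can only be detected via the timeout.
func (c *Connector) Pin(ctx context.Context, cid string) error {
	body, _ := json.Marshal(map[string]string{"cid": cid})
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, c.baseURL+"/pins", bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Bearer "+c.accessToken)
	req.Header.Set("Content-Type", "application/json")

	resp, err := c.client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	var st pinStatusResponse
	if err := json.NewDecoder(resp.Body).Decode(&st); err != nil {
		return err
	}

	// Remember the request ID: the Pinning Services API is keyed by it,
	// while cluster is keyed by CID.
	c.mu.Lock()
	c.requestIDs[cid] = st.RequestID
	c.mu.Unlock()

	// Poll until pinned, failed, or the fixed timeout expires.
	deadline := time.After(c.pinTimeout)
	tick := time.NewTicker(30 * time.Second)
	defer tick.Stop()
	for {
		switch st.Status {
		case "pinned":
			return nil
		case "failed":
			return fmt.Errorf("pinning service reported failure for %s", cid)
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-deadline:
			return fmt.Errorf("pin %s timed out after %s", cid, c.pinTimeout)
		case <-tick.C:
			st, err = c.pinStatus(ctx, st.RequestID)
			if err != nil {
				return err
			}
		}
	}
}

// pinStatus fetches the current status for a request ID.
func (c *Connector) pinStatus(ctx context.Context, requestID string) (pinStatusResponse, error) {
	var st pinStatusResponse
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.baseURL+"/pins/"+requestID, nil)
	if err != nil {
		return st, err
	}
	req.Header.Set("Authorization", "Bearer "+c.accessToken)
	resp, err := c.client.Do(req)
	if err != nil {
		return st, err
	}
	defer resp.Body.Close()
	err = json.NewDecoder(resp.Body).Decode(&st)
	return st, err
}
```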

There's another, deeper problem: cluster indexes everything by CID, while the Pinning Services API needs request IDs as inputs. This forces us to keep local state in the component implementation (the mapping between CIDs and request IDs), or to rebuild it on startup.
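Rebuilding that mapping on startup could look something like this (again a sketch; a real implementation would need to follow the API's pagination):

```go
package pinsvcconn

import (
	"context"
	"encoding/json"
	"net/http"
)

type pinList struct {
	Results []struct {
		RequestID string `json:"requestid"`
		Pin       struct {
			Cid string `json:"cid"`
		} `json:"pin"`
	} `json:"results"`
}

// rebuildRequestIDs repopulates the local CID -> requestID mapping by listing
// the pins known to the service. A real implementation would page through
// results using the spec's limit/before query parameters.
func (c *Connector) rebuildRequestIDs(ctx context.Context) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.baseURL+"/pins", nil)
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Bearer "+c.accessToken)
	resp, err := c.client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	var list pinList
	if err := json.NewDecoder(resp.Body).Decode(&list); err != nil {
		return err
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	for _, r := range list.Results {
		c.requestIDs[r.Pin.Cid] = r.RequestID
	}
	return nil
}
```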

The fact that a pinning service would have its own queue management system may mean we need to expand IPFSStatus to include things like "RemoteQueued", "RemotePinning" and similar, and deal with such statuses across the application (i.e. by translating a RemotePinning IPFS status into a pinning cluster status). This, however, has more implications and means touching at least the PinTracker component too.
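The translation itself would be simple; something like the following (the Remote* statuses are hypothetical and do not exist in cluster today):

```go
package pinsvcconn

// Hypothetical expanded status values; RemoteQueued and RemotePinning are
// only illustrative and would need support in the PinTracker.
type TrackerStatus string

const (
	StatusPinned        TrackerStatus = "pinned"
	StatusPinError      TrackerStatus = "pin_error"
	StatusRemoteQueued  TrackerStatus = "remote_queued"
	StatusRemotePinning TrackerStatus = "remote_pinning"
	StatusUnexpected    TrackerStatus = "unexpected"
)

// translateStatus maps a Pinning Services API status string onto a cluster
// tracker status.
func translateStatus(svcStatus string) TrackerStatus {
	switch svcStatus {
	case "queued":
		return StatusRemoteQueued
	case "pinning":
		return StatusRemotePinning
	case "pinned":
		return StatusPinned
	case "failed":
		return StatusPinError
	default:
		return StatusUnexpected
	}
}
```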

Another time-consuming task will be building a mock Pinning Services API for testing the whole thing.
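A minimal mock based on net/http/httptest could be a starting point (illustrative only; it accepts pins and reports everything as immediately pinned, while real tests would also want to simulate queued/pinning/failed states, listing and pagination):

```go
package pinsvcconn

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// newMockPinningService starts an in-memory server that accepts pin requests
// and reports them as "pinned" right away.
func newMockPinningService() *httptest.Server {
	var (
		mu   sync.Mutex
		pins = map[string]string{} // requestID -> cid
		next int
	)

	mux := http.NewServeMux()
	mux.HandleFunc("/pins", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		var body struct {
			Cid string `json:"cid"`
		}
		if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
			w.WriteHeader(http.StatusBadRequest)
			return
		}
		mu.Lock()
		next++
		id := fmt.Sprintf("req-%d", next)
		pins[id] = body.Cid
		mu.Unlock()
		w.WriteHeader(http.StatusAccepted)
		json.NewEncoder(w).Encode(map[string]string{"requestid": id, "status": "pinned"})
	})
	mux.HandleFunc("/pins/", func(w http.ResponseWriter, r *http.Request) {
		id := strings.TrimPrefix(r.URL.Path, "/pins/")
		mu.Lock()
		_, ok := pins[id]
		mu.Unlock()
		if !ok {
			w.WriteHeader(http.StatusNotFound)
			return
		}
		json.NewEncoder(w).Encode(map[string]string{"requestid": id, "status": "pinned"})
	})
	return httptest.NewServer(mux)
}
```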

All in all, I think it is worth prototyping this first. If we can limit the ripple effects on the cluster architecture and stay within the component implementation, we might find a good compromise that is shippable, but I'm not fully convinced it will be that easy.

@hsanjuan self-assigned this on Nov 16, 2022