Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: endless session routing #501

Closed
SgtPooki opened this issue Apr 17, 2024 · 10 comments · Fixed by #542 or #539
Closed

bug: endless session routing #501

SgtPooki opened this issue Apr 17, 2024 · 10 comments · Fixed by #542 or #539
Assignees

Comments

@SgtPooki
Copy link
Member

SgtPooki commented Apr 17, 2024

When looking at ipfs/helia-verified-fetch#50, I ran into an issue: ipfs/helia-verified-fetch#50 (comment)

It seems like a request for QmVP2ip92jQuMDezVSzQBWDqWFbp9nyCHNQSiciRauPLDg endlessly polls for providers of QmVP2ip92jQuMDezVSzQBWDqWFbp9nyCHNQSiciRauPLDg with the default delegated-ipfs.dev router. I think this is due to

.then(async () => {
this.log('finding %d-%d new provider(s) for %c', count, this.maxProviders, cid)
for await (const provider of this.findNewProviders(cid, options)) {
if (found === this.maxProviders || options.signal?.aborted === true) {
break
}
if (this.hasProvider(provider)) {
continue
}
this.log('found %d/%d new providers', found, this.maxProviders)
this.providers.push(provider)
// let the new peer join current queries
this.safeDispatchEvent('provider', {
detail: provider
})
found++
if (found === count) {
this.log('session is ready')
deferred.resolve()
// continue finding peers until we reach this.maxProviders
}
if (this.providers.length === this.maxProviders) {
this.log('found max session peers', found)
break
}
}
this.log('found %d/%d new session peers', found, this.maxProviders)
if (found < count) {
throw new CodeError(`Found ${found} of ${count} ${this.name} providers for ${cid}`, 'ERR_INSUFFICIENT_PROVIDERS_FOUND')
}
})

something like maxProviders++ is occurring (i.e. #495 (comment)), but much less obviously

Though the real fix in helia-verified-fetch is to fix tests to be offline only, we should probably provide a default timeout because i'm sure others not in test environments will also run into this.

references:

@SgtPooki
Copy link
Member Author

If we don't handle this automatically, we should probably expose routers on

export interface Routing {
so session and users of sessions can determine if routers exist

@SgtPooki
Copy link
Member Author

running in test env locally, it seems like the problem is no providers have http/https and so we're just pausing because trustless-gateway/session.ts is not yielding any providers.

In our test environment, we have a local test kubo configured with ipfsd-ctl that should be used in the session, or fallen back to somehow

@helia/verified-fetch-interop: 2024-04-17T23:25:14.316Z helia:trustless-gateway:session allow insecure true, allow local true, all options = { redirect: 'manual', signal: undefined, minProviders: 0, allowInsecure: true, allowLocal: true }
@helia/verified-fetch-interop: 2024-04-17T23:25:14.317Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q finding 0-5 new provider(s) for bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q
@helia/verified-fetch-interop: 2024-04-17T23:25:14.318Z delegated-routing-v1-http-api-client getProviders starts: bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q
@helia/verified-fetch-interop: 2024-04-17T23:25:14.396Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q got provider 12D3KooWAb1rjmJAGG3Vqk35QuYxxDfDrNXLWbJPdujE1pEtbZSd for cid bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q
@helia/verified-fetch-interop: returning false for multiaddr:  /ip4/212.248.62.42/tcp/17159/p2p/12D3KooWAb1rjmJAGG3Vqk35QuYxxDfDrNXLWbJPdujE1pEtbZSd
@helia/verified-fetch-interop: 2024-04-17T23:25:14.396Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q provider 12D3KooWAb1rjmJAGG3Vqk35QuYxxDfDrNXLWbJPdujE1pEtbZSd has no http-gateway addresses
@helia/verified-fetch-interop: 2024-04-17T23:25:14.396Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q got provider 12D3KooWSoSgVaUvoguDQZu1doytze9RgnnANwJoiLw7KUcAXq8i for cid bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q
@helia/verified-fetch-interop: returning false for multiaddr:  /ip4/76.219.232.45/tcp/24888/p2p/12D3KooWSoSgVaUvoguDQZu1doytze9RgnnANwJoiLw7KUcAXq8i
@helia/verified-fetch-interop: 2024-04-17T23:25:14.396Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q provider 12D3KooWSoSgVaUvoguDQZu1doytze9RgnnANwJoiLw7KUcAXq8i has no http-gateway addresses
@helia/verified-fetch-interop: 2024-04-17T23:25:14.397Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q got provider 12D3KooWJLDwovGo1rFfqTKGcLprqUKoAvwweJVXoexk7E5bh9BL for cid bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q
@helia/verified-fetch-interop: 2024-04-17T23:25:14.397Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q provider 12D3KooWJLDwovGo1rFfqTKGcLprqUKoAvwweJVXoexk7E5bh9BL has no http-gateway addresses
@helia/verified-fetch-interop: returning false for multiaddr:  /ip4/106.240.230.122/tcp/8888/p2p/12D3KooWJLDwovGo1rFfqTKGcLprqUKoAvwweJVXoexk7E5bh9BL
@helia/verified-fetch-interop: 2024-04-17T23:25:14.397Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q got provider 12D3KooWB6fUFjjLSYcWfM2iAd5ScgknqVP1Dbe6pff4Kqq64CLM for cid bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q
@helia/verified-fetch-interop: 2024-04-17T23:25:14.397Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q provider 12D3KooWB6fUFjjLSYcWfM2iAd5ScgknqVP1Dbe6pff4Kqq64CLM has no http-gateway addresses
@helia/verified-fetch-interop: returning false for multiaddr:  /ip4/109.199.233.7/tcp/8888/p2p/12D3KooWB6fUFjjLSYcWfM2iAd5ScgknqVP1Dbe6pff4Kqq64CLM
@helia/verified-fetch-interop: 2024-04-17T23:25:14.397Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q got provider 12D3KooWJWAwD6CbJnDm6tHcFuZxvWrjhp53hDjLHM1GiP1pYXmY for cid bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q
@helia/verified-fetch-interop: 2024-04-17T23:25:14.397Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q provider 12D3KooWJWAwD6CbJnDm6tHcFuZxvWrjhp53hDjLHM1GiP1pYXmY has no http-gateway addresses
@helia/verified-fetch-interop: returning false for multiaddr:  /ip4/64.16.239.0/tcp/11337/p2p/12D3KooWJWAwD6CbJnDm6tHcFuZxvWrjhp53hDjLHM1GiP1pYXmY
@helia/verified-fetch-interop: 2024-04-17T23:25:14.397Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q got provider QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC for cid bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q
@helia/verified-fetch-interop: returning false for multiaddr:  /dns4/elastic.dag.house/tcp/443/wss/p2p/QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC
@helia/verified-fetch-interop: 2024-04-17T23:25:14.397Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q provider QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC has no http-gateway addresses
@helia/verified-fetch-interop: 2024-04-17T23:25:14.397Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q got provider 12D3KooWPkHBZbbDMAYLeGavFrrxZzCrB31ifUvJWV6ZBZ5ikMQT for cid bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q
@helia/verified-fetch-interop: 2024-04-17T23:25:14.397Z helia:trustless-gateway:session:bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q provider 12D3KooWPkHBZbbDMAYLeGavFrrxZzCrB31ifUvJWV6ZBZ5ikMQT has no http-gateway addresses
@helia/verified-fetch-interop: returning false for multiaddr:  /ip4/210.119.240.2/tcp/8888/p2p/12D3KooWPkHBZbbDMAYLeGavFrrxZzCrB31ifUvJWV6ZBZ5ikMQT
@helia/verified-fetch-interop: 2024-04-17T23:25:14.403Z delegated-routing-v1-http-api-client getProviders finished: bafybeifq2rzpqnqrsdupncmkmhs3ckxxjhuvdcbvydkgvch3ms24k5lo7q

@SgtPooki
Copy link
Member Author

FYI I threw together https://github.com/SgtPooki/repro-kubo-not-announcing-self-as-provider which shows that kubo doesn't announce itself as a provider, so we need some solution for our tests

@SgtPooki
Copy link
Member Author

SgtPooki commented Apr 18, 2024

playing around with patch-package. I think we need some config option to allow using the gateways already provided to helia block-brokers

Here are the diffs that solved my problem:

@helia/block-brokers

diff --git a/node_modules/@helia/block-brokers/dist/src/trustless-gateway/broker.js b/node_modules/@helia/block-brokers/dist/src/trustless-gateway/broker.js
index c243453..d261d57 100644
--- a/node_modules/@helia/block-brokers/dist/src/trustless-gateway/broker.js
+++ b/node_modules/@helia/block-brokers/dist/src/trustless-gateway/broker.js
@@ -70,7 +70,7 @@ export class TrustlessGatewayBlockBroker {
         return createTrustlessGatewaySession({
             logger: this.logger,
             routing: this.routing
-        }, options);
+        }, {...options, providers: this.gateways});
     }
 }
 //# sourceMappingURL=broker.js.map
diff --git a/node_modules/@helia/block-brokers/dist/src/trustless-gateway/session.js b/node_modules/@helia/block-brokers/dist/src/trustless-gateway/session.js
index 4bc6b63..b80b07b 100644
--- a/node_modules/@helia/block-brokers/dist/src/trustless-gateway/session.js
+++ b/node_modules/@helia/block-brokers/dist/src/trustless-gateway/session.js
@@ -17,6 +17,7 @@ class TrustlessGatewaySession extends AbstractSession {
         this.routing = components.routing;
         this.allowInsecure = init.allowInsecure ?? DEFAULT_ALLOW_INSECURE;
         this.allowLocal = init.allowLocal ?? DEFAULT_ALLOW_LOCAL;
+        this.log('allow insecure %o, allow local %o, all options = %o', this.allowInsecure, this.allowLocal, init);
     }
     async queryProvider(cid, provider, options) {
         this.log('fetching BLOCK for %c from %s', cid, provider.url);
@@ -27,9 +28,11 @@ class TrustlessGatewaySession extends AbstractSession {
     }
     async *findNewProviders(cid, options = {}) {
         for await (const provider of this.routing.findProviders(cid, options)) {
+            this.log('got provider %p for cid %c', provider.id, cid);
             // require http(s) addresses
             const httpAddresses = filterMultiaddrs(provider.multiaddrs, this.allowInsecure, this.allowLocal);
             if (httpAddresses.length === 0) {
+                this.log('provider %p has no http-gateway addresses', provider.id);
                 continue;
             }
             // take first address?
@@ -52,13 +55,18 @@ function filterMultiaddrs(multiaddrs, allowInsecure, allowLocal) {
     return multiaddrs.filter(ma => {
         if (HTTPS.matches(ma) || (allowInsecure && HTTP.matches(ma))) {
             if (allowLocal) {
+                console.log('local is allowed, returning true for multiaddr: ', ma.toString());
                 return true;
             }
             if (DNS.matches(ma)) {
+                console.log('dns matches, returning true for multiaddr: ', ma.toString());
                 return true;
             }
-            return isPrivateIp(ma.toOptions().host) === false;
+            const result = isPrivateIp(ma.toOptions().host) === false;
+            console.log('isPrivateIp result: ', result);
+            return result;
         }
+        console.log('returning false for multiaddr: ', ma.toString());
         return false;
     });
 }

@helia/utils

diff --git a/node_modules/@helia/utils/dist/src/abstract-session.js b/node_modules/@helia/utils/dist/src/abstract-session.js
index 9ceee2a..490b4dc 100644
--- a/node_modules/@helia/utils/dist/src/abstract-session.js
+++ b/node_modules/@helia/utils/dist/src/abstract-session.js
@@ -23,7 +23,7 @@ export class AbstractSession extends TypedEventEmitter {
         this.requests = new Map();
         this.minProviders = init.minProviders ?? DEFAULT_SESSION_MIN_PROVIDERS;
         this.maxProviders = init.maxProviders ?? DEFAULT_SESSION_MAX_PROVIDERS;
-        this.providers = [];
+        this.providers = init.providers ?? [];
         this.evictionFilter = BloomFilter.create(this.maxProviders);
     }
     async retrieve(cid, options = {}) {
@@ -167,6 +167,12 @@ export class AbstractSession extends TypedEventEmitter {
             .then(async () => {
             this.log('finding %d-%d new provider(s) for %c', count, this.maxProviders, cid);
             for await (const provider of this.findNewProviders(cid, options)) {
+                this.log('abstract-session got provider %p for cid %c', provider.id, cid);
+                if (found >= count) {
+                    this.log('session is ready');
+                    deferred.resolve();
+                    // continue finding peers until we reach this.maxProviders
+                }
                 if (found === this.maxProviders || options.signal?.aborted === true) {
                     break;
                 }
@@ -180,11 +186,6 @@ export class AbstractSession extends TypedEventEmitter {
                     detail: provider
                 });
                 found++;
-                if (found === count) {
-                    this.log('session is ready');
-                    deferred.resolve();
-                    // continue finding peers until we reach this.maxProviders
-                }
                 if (this.providers.length === this.maxProviders) {
                     this.log('found max session peers', found);
                     break;

This issue body was partially generated by patch-package.

@SgtPooki
Copy link
Member Author

#501 (comment) also allows us to bypass the piping through of "allowLocal" and "allowInsecure" when we explicitly provide a local/insecure gateway in gateways: [] config.

@SgtPooki SgtPooki self-assigned this Apr 18, 2024
@achingbrain
Copy link
Member

I guess maybe we want to limit how many times we search for providers unsuccessfully before giving up.

Right now it'll give up if you pass a signal and that signal aborts (e.g. after a timeout) but maybe that's too subtle.

As an aside I do think we should pass signals into every function that accepts them in our examples, to try to normalise the practice.

I think we need some config option to allow using the gateways already provided to helia block-brokers

I was thinking about something like this but if we do that, you might as well not use the session?

@SgtPooki
Copy link
Member Author

I think we need some config option to allow using the gateways already provided to helia block-brokers

I was thinking about something like this but if we do that, you might as well not use the session?

I think it would still be valuable with sessions, because there's no way to know if there are valuable providers or not until you try. If we don't find https providers after a timeout (we can use signals here sure) we could use the gateways configured on the root TrustlessGateway instance

@SgtPooki
Copy link
Member Author

I think

await this.findProviders(cid, this.minProviders, options)
// keep trying until the abort signal fires
this.log('found new providers re-retrieving %c', cid)
this.requests.delete(cidStr)
deferred.resolve(await this.retrieve(cid, options))
is causing endless loop, because we're not ensuring that we don't get new providers.

We await this.findProviders and then delete the existing request, and start a new one, regardless of whether this.findProviders finds us new providers or not.

@SgtPooki
Copy link
Member Author

I am able to reproduce this consistently with the steps in ipfs/helia-verified-fetch#84

Here is a section of the log, that slowly increases in the duplicates we find, and yet we still keep querying them all:

2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a session is ready
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a comparing providers http://127.0.0.1:3440/ and http://127.0.0.1:3440/
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a found existing provider http://127.0.0.1:3440/
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a skipping existing provider http://127.0.0.1:3440/
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a comparing providers http://127.0.0.1:3440/ and http://127.0.0.1:3440/
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a found existing provider http://127.0.0.1:3440/
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a skipping existing provider http://127.0.0.1:3440/
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a comparing providers http://127.0.0.1:3440/ and http://127.0.0.1:3440/
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a found existing provider http://127.0.0.1:3440/
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a skipping existing provider http://127.0.0.1:3440/
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a comparing providers http://127.0.0.1:3440/ and http://127.0.0.1:3440/
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a found existing provider http://127.0.0.1:3440/
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a skipping existing provider http://127.0.0.1:3440/
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a found new providers re-retrieving bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a
2024-05-16T21:03:56.212Z helia:trustless-gateway:session:bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a fetching BLOCK for bafkreia32gpab5kxt6utjt623vchiblgyblfslhih76i2rg62ukgp6aw7a from http://127.0.0.1:3440/

I've truncated the log a lot (33M of repeatedly querying the same local gateway in less than 10 seconds) log.txt but you can see some of the issues.

I also modified abstract-session in node_modules with:

            if (found < count) {
                this.log('THROWING ERROR: found (%d) is < count (%d)', found, count);
                throw new CodeError(`Found ${found} of ${count} ${this.name} providers for ${cid}`, 'ERR_INSUFFICIENT_PROVIDERS_FOUND');
            } else {
              this.log('found (%d) is not < count (%d)', found, count);
            }

and don't see either of those lines in the log at all.

@SgtPooki
Copy link
Member Author

Repro test in #538

achingbrain added a commit that referenced this issue May 20, 2024
When new providers are found, check the eviction filter in `isEvicted`
rather than the current list of providers.

Fixes #501
achingbrain added a commit that referenced this issue May 20, 2024
When new providers are found, check the eviction filter in `isEvicted`
rather than the current list of providers.

Fixes #501
achingbrain added a commit that referenced this issue May 20, 2024
When new providers are found, check the eviction filter in `isEvicted`
rather than the current list of providers.

Fixes #501
achingbrain added a commit that referenced this issue May 20, 2024
When new providers are found, check the eviction filter in `isEvicted`
rather than the current list of providers.

Fixes #501
achingbrain added a commit that referenced this issue May 20, 2024
When new providers are found, check the eviction filter in `isEvicted`
rather than the current list of providers.

Fixes #501
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants