Tooling to retry failed contract renewals #138

Open
MeijeSibbel opened this issue Oct 7, 2020 · 5 comments

Comments

@MeijeSibbel

Quote: "Attempt renewal again if the transaction ends up being reorged. This doesn't strike me as too difficult, but currently I don't think there's any us tooling for it."

This tooling would help resolve the "no record of that host" errors that appear after a renewal.


@lukechampine
Owner

lukechampine commented Oct 13, 2020

On second thought, I'm not sure any special tooling is needed here; the old contract is not deleted from the muse server, so if you try to use a renewed contract and it fails, you should be able to simply attempt the renewal again (by calling the /renew endpoint with the same arguments).

Another thing you could try is to always attempt a second renewal 12 blocks after the first one (or 2 hours, if that's easier). If the first renewal succeeded and has not been reorged, you'll get a predictable error, e.g. "contract can not be revised." Otherwise, you'll get a different error, or nil, in which case you know that you should try again in another 2 hours.

@jkawamoto
Contributor

jkawamoto commented Oct 15, 2020

If I'm not mistaken, the old contract is already finalized, so we cannot renew it anymore, can we? The problem is that the host accepts the renewal but later removes the renewed contract. Does the host keep the old contract and fall back to it after removing the renewed one?

@lukechampine
Owner

Hmm. I'll have to look at the host code. This sounds like something that the host should handle (either by restoring the old contract, or by automatically resubmitting the new contract), but it may not be doing so.

I'd suggest at least attempting to renew again when this happens, just to see what sort of error you get. That could be helpful.

@MeijeSibbel
Author

Guys, @jkawamoto @lukechampine, what is the status of this issue? Looking at Kibana, I can still see endless messages like:

too many hosts did not supply their shard (needed 5, got 3): 
c31b05d3: no record of that host
1c96a10c: no record of that host
713bee98: no record of that host
2cb71a1e: no record of that host
d9cd1249: no record of that host
bd572d72: no record of that host
f8642258: NewUnlockedSession: connect: no route to host
d0ced087: no record of that host
c32db729: no record of that host
3c6952cf: no record of that host
c204cc3e: no record of that host
0dbd913b: NewUnlockedSession: connect: connection refused
a85816e6: no record of that host
d441be7b: Settings: couldn't read LoopSettings response: read tcp 10.244.0.86:54556->50.35.89.213:9982: i/o timeout
9d43f278: no record of that host
c99c8227: no record of that host
4b804458: NewUnlockedSession: connect: connection timed out
too many hosts did not supply their shard (needed 5, got 2): 
713bee98: no record of that host
a40bdf4a: no record of that host
1c96a10c: no record of that host
6f79ed6c: no record of that host
c32db729: no record of that host
9ba38c50: no record of that host
bd572d72: no record of that host
d9cd1249: no record of that host
57dbc08d: no record of that host
6c202b48: no record of that host
c99c8227: no record of that host
f5fe9ca9: no record of that host
b63fb3df: no record of that host
3c6952cf: no record of that host
d0ced087: no record of that host
f1200eea: NewUnlockedSession: lookup madbri.ddns.net on 10.245.0.10:53: no such host
0dbd913b: NewUnlockedSession: connect: connection refused
d441be7b: Lock: couldn't read LoopLock response: read tcp 10.244.0.86:53906->50.35.89.213:9982: i/o timeout

We're basically still permanently losing data because of this.

@lukechampine
Owner

lukechampine commented Mar 30, 2021

I'm skeptical that this is being caused by renewals being reorged, but I can't say for sure because I don't know how common reorgs are on mainnet. I'll try to get some stats on that. (UPDATE: According to SiaStats, there have been 10 reorgs in the past ~6000 blocks, so about one reorg per week. All of those reorgs were 1 block deep.)

If reorgs are the cause, then the impact could be minimized by "staggering" your renewals, i.e. renew one contract every 10 minutes instead of renewing all of the contracts together.

We should evaluate other potential causes as well, though.


3 participants