-
Notifications
You must be signed in to change notification settings - Fork 243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Domains: Bundle limit check breaks extrinsic ordering #2715
Comments
This is quite a surprise to me as I did consider this scenario when working on #2454, BTW have you reproduced this with integration test?
The extrinsic with nonce 4 (and any following extrinsic from the same sender) even can fit in the bundle will still fail later in the As for the second issue, the With a closer look, I'm suspected with the
|
Does this mean they will be slashed?
This is how Liu-Cheng implemented long time ago, is it not the case anymore?
IIRC 3h was supposed to be the last network before mainnet, is this still the case @dariolina? |
We can reset 3h if we need to, but this issue does not seem sufficient reason for reset. Is it? |
If the issue is the But in case we need to do such FP verification/ER derivation upgrade in the future, we should consider #2712 |
Yes, if we fix the second issue, operator running old version will get slashed due to change in Domain block extrinsic hash
I don't believe it was never changed. AFAIK, we never adjusted the order but will need to check the history if we broke it post that.
Turns out this is issue with pre_dispatch from EVM like @NingLin-P suggested but
|
Yeah, this is another issue we have 👍 I think the fix will be:
All these changes are in the evm domain runtime so we will need a domain runtime upgrade.
Also, this seems not related to the issue we met here, but is something we need to handle. |
I would first like to understand why frontier deviated in the first place. Maybe this should be fixed there if possible |
The issue is fixed in #2723, do you think we need to test it on devnet before closing this issue? @vedhavyas |
I think it should be deployed to devnet so @abhi3700 can test whether it fixes the issue he was facing with tx ordering. |
I have tested Initially, when I sent 32 txs using a foundry script, it kind of got stuck (also in some cases failed due to So, @NingLin-P suggested to do the testing on devnet (run locally) to avoid repetitive devnet reboot. I found some issues there:
Here is the Notion doc which details out everything (success/failures doing some targeted testing). |
@abhi3700 The issue talks about whether transaction ordering is fixed with the recent patch we made. We are aware of the OutOfGas issue and this is unrelated to this issue we are talking about. |
I tested on local devnet and I didn't find the tx ordering issue. But on testing over the current live devnet, txs are still stuck in the pool (as shown in the image above). |
@abhi3700 transactions being stuck in the tx-pool seems not related to the inclusion of transactions in order. Earlier transactions were included out of order and as a result txns are failing to be executed. Transactions getting stuck in tx-pool means either of the following
I would suggest you ensure there are no nonce-gaps created by your script. If that is not the case, since you mentioned running the node locally, is the operator submitting bundles? I have recently ran multiple scripts provided by you and @marc-aurele-besner which sends around 30 txns and all of them goes through fine given operator is submitting bundles. Now coming to the issues you found on your local node,
There is know issue with frontier to gas_estimate for a given transaction and we are working to fix it upstream. How was GasLimit supplied to these txns? Have you tried overriding gas limit and sending the txn and see if the error is fixed for now?
This seems like a contract issue rather than EVM issue. Does the same txn goes through in other test chains you have used before, if so we would need to see how EVM contract execution is different from ours. Also would be great if you can provide a script to reproduce this to debug further.
Where do you see this error? is this on the client submitting txns or on the node? I dont think we have such an error from operator node earlier. |
In the script, there is no such setting done to set nonce manually.
Yes it is.
I am not sure about the current live devnet. But as I saw the txs stuck in the pool on it. That's why I highlighted. Although while running devnet locally, txs didn't get stuck in the pool.
The gas limit was supplied via 2 main ways via
You can check these: If I missed any, please let me know on this.
I have run the script successfully on ETH Sepolia chain multiple times.
On the client side in CLI. It can be reproduced as well. |
okay seems unrelated to this. Need to debug devnet for this.
I'm confused on this. For me while running locally, with gas_limit overrided, everything seems to work from the contract scripts you have provided. I'm wondering if this is specific to that contract itself.
Alright, can you create a new issue with a script to reproduce this locally for further debugging ?
I'm not sure if this is actually node issue but lets create an issue tracker for this with details on github instead of notion. @abhi3700 it would be ideal to put all the details including scripts to reproduce on Github issue instead of Notion so we have one place to track them. Overall, transaction ordering issue is resolved. I wold like to close this issue and open any new issues found here seperately to keep it easier to track unless you disagree @dariolina |
If it's a separate issue in your opinion, then absolutely open a new issue |
Okay let's close this issue. I will open new issues for the unrelated ones. |
Recently we saw on gemini-3h for EVM chain where extrinsic ordering was broken within each bundles. I initially assumed this was due to either substrate upstream or frontier upstream. Turns out this was breaking the extrinsic ordering due to the regression introduced in this - https://github.com/subspace/subspace/pull/2454/files
We seems to be skipping any ready extrinsics coming from tx_pool. If there are multiple extrinsics from the same sender for example with nonce: 0, 1, 2, 3, 4, 5
If adding the extrinsic with nonce 3 makes either bundle limit or size more than the maximum, then we skip that and we move to next extrinsic, which is 4. If that fits in, we add that to the bundle there by breaking the order. With this, any extrinsic after nonce 2 will fail at pre_dispatch since txn with nonce 3 is missing during the domain block import.
If the extrinsic with nonce 3 is included in the next bundle within the same consensus block, the order is not restored since we order the final extrinsics list is defined by the order of the bundles included in the Consensus block.
This brings us to two issues at hand
a
was skipped, then all the proceeding extrinsics from the same sender should be skipped within a given bundle. This should be a trivial fix and I have tested this locally.Gemini-3h
Thoughts @nazar-pc @dariolina @NingLin-P ?
The text was updated successfully, but these errors were encountered: