Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Burn state handling in the Clarity VM #4789

Open
wants to merge 17 commits into
base: develop
Choose a base branch
from
Open

Burn state handling in the Clarity VM #4789

wants to merge 17 commits into from

Conversation

obycode
Copy link
Contributor

@obycode obycode commented May 14, 2024

Description

In Nakamoto, a transaction can access the current burn block, while in epoch 2.x, it can only access the burn block of its parent. This changeset updates the behavior, so that every block in a tenure, including the first block, can access the burn block information of the tenure's burn block.

Note, this is built off of #4745. I currently have this PR targeting that branch, so the new changes are clear. Once that one is merged, I will set this back to target develop.

Applicable issues

Checklist

  • Test coverage for new or modified code paths
  • Changelog is updated
  • Required documentation changes (e.g., docs/rpc/openapi.yaml and rpc-endpoints.md for v2 endpoints, event-dispatcher.md for new events)
  • New clarity functions have corresponding PR in clarity-benchmarking repo
  • New integration test(s) added to bitcoin-tests.yml

@obycode obycode force-pushed the fix/burn-state branch 2 times, most recently from 6ebdb01 to 822813f Compare May 15, 2024 17:38
@obycode obycode requested review from a team as code owners May 15, 2024 17:38
@obycode obycode changed the base branch from develop to feat/stacks-block-height May 15, 2024 17:39
@obycode obycode changed the base branch from feat/stacks-block-height to develop May 15, 2024 22:02
obycode and others added 2 commits May 15, 2024 22:34
Co-authored-by: Jeff Bencin <jeff.bencin@gmail.com>
jbencin
jbencin previously approved these changes May 16, 2024
Copy link
Member

@kantai kantai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly looks good to me -- I had a handful of comments, but I think the approach for SortitionDBConn needs to be changed.

.into()
})
// In epoch 2, we can only access the burn block associated with the last block
if self.get_clarity_epoch_version()? < StacksEpochId::Epoch30 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should prefer defining characteristics of epochs as methods of StacksEpochId:

/// Whether or not this epoch uses the tip for reading Clarity burn block information (3.0+ behavior)
/// or should use the parent block's burn block (behavior before 3.0)
pub fn clarity_uses_tip_burn_block(&self) -> bool

And then, self.get_clarity_epoch_version().clarity_uses_tip_burn_block().

I think the advantages to this are that (1) we don't repeat the same gating logic in multiple places, and (2) the StacksEpochId can become a pretty reliable place for seeing which features activated when.

Copy link
Contributor Author

@obycode obycode May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I now see the other similar methods there. I see most of these methods use a match to make it explicit for each epoch. Is that preferred over just using a >= so that it needs to be considered for each new epoch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added method in e29ab98.

Comment on lines 165 to 174
let mut db = self.0.as_clarity_db();
db.begin();
db.set_clarity_epoch_version(epoch).unwrap();
db.commit().unwrap();
if epoch >= StacksEpochId::Epoch30 {
db.begin();
db.set_tenure_height(1).unwrap();
db.commit().unwrap();
}
let mut owned_env = OwnedEnvironment::new(db, epoch);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these changes have happened in the previous PR? I thought memory env tests were passing okay in that PR without this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they should have, but I believe the problem wasn't hit in the previous PR because of the special case handling of block 0.

Comment on lines 582 to 590
fn get_tip_burn_block_height(&self) -> Option<u32> {
let tip = SortitionDB::get_canonical_burn_chain_tip(self.conn()).ok()?;
tip.block_height.try_into().ok()
}

fn get_tip_sortition_id(&self) -> Option<SortitionId> {
let tip = SortitionDB::get_canonical_burn_chain_tip(self.conn()).ok()?;
Some(tip.sortition_id)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure that this is safe. I think the SortitionDBConn implementation is used by the mining path, while the SortitionHandleTx is used by the block processor. I think that this leads to the "correct" behavior: the SortitionHandleTx has its chain_tip field set to the "burn_view", and when mining, the canonical tip should be the tip (but I'm not sure that it always will be -- maybe a sortition could be processed between the block starting mining and finishing mining?). Either way, this seems like something that could come back to bite us someday (I think we've been bitten before by a difference between the SortitionDBConn and SortitionHandleTx impls).

My suggestion is to replace impl BurnStateDB for SortitionDBConn with impl BurnStateDB for SortitionHandleConn (which would require updating any consumers of this impl to create handles). I think this is safer in the long run (and HandleConn and HandleTx are close enough to each other, that maybe at some point, we use either a macro or the SortitionHandle trait to specify the exact same impl for both).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I pushed a change for this, but it's possible that I may have gone overboard and let this extend out further than necessary. I'll re-review this change tomorrow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, SortitionDB::get_canonical_burn_chain_tip() is NOT safe to use in consensus code. Both SortitionDBConn and SortitionHandleTx have self.context.sortition_id, which represents their respective views of the burnchain tip. They're guaranteed not to change over the course of their respective lifetimes, whereas the result of SortitionDB::get_canonical_burn_chain_tip() can change from call to call from outside of the coordinator thread.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, even within the coordinator thread, SortitionDB::get_canonical_burn_chain_tip() can return arbitrarily different values between Nakamoto blocks, which would lead to a chain split If used in this manner. For example, suppose tenure T had 10 blocks, labeled T0 through T9. It's entirely possible that node A processes a new burnchain block in-between processing T3 and T4, while node B processes the same new burnchain block between processing T5 and T6. Now, nodes A and B may diverge due to different return values from this impl.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To further elaborate, the only safe place to call SortitionDB::get_canonical_burn_chain_tip() in consensus code is within code reachable from either ChainsCoordinator::handle_new_epoch2_burnchain_block() or ChainsCoordinator::handle_new_nakamoto_burnchain_block(). Only in these code paths is the value returned by SortitionDB::get_ccanonical_burn_chain_tip() guaranteed to be consistent, because these are the only two places where new sortitons can get created.

Use `SortitionHandleConn` instead of `SortitionDBConn`. This change
required propagation through many locations.
@@ -3274,7 +3274,7 @@ impl<
if let Some(ref mut estimator) = self.cost_estimator {
let stacks_epoch = self
.sortition_db
.index_conn()
.index_handle_at_tip()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you avoid using this in consensus code? It can panic instead of propagating an underlying DB error. Perhaps add a fallible variant of index_handle_at_tip?

})
})
.with_read_only_unconfirmed_clarity_tx(
&sortdb.index_handle_at_tip(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here -- let's use a fallible variant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it matter here, since this is in a test?

})
})
.with_read_only_clarity_tx(
&sortdb.index_handle_at_tip(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here

let readonly_marf = self
.index
.reopen_readonly()
.expect("BUG: failure trying to get a read-only interface into the sortition db.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just return an error here instead?

)
})
chainstate.maybe_read_only_clarity_tx(
&sortdb.index_handle_at_tip(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use something fallible here

.unwrap_or_else(|| (STXBalance::zero(), None))
};
chainstate.maybe_read_only_clarity_tx(
&sortdb.index_handle_at_tip(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use something fallible here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gets used all over the place in this PR, so I'm just going to summarize my feelings in a review comment instead.

Copy link
Member

@jcnelson jcnelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are at least two consensus bugs here:

  • You cannot call SortitionDB::get_canonical_burn_chain_tip() from chainstate-processing code, since it can change arbitrarily between two successive calls. Only the chains coordinator can safely call this, and only in specific places. This also applies to anything related to loading "canonical" data, including SortitionDB::get_canonical_sortition_tip() (see next point).

  • It is also not safe to call SortitionDB::index_handle_at_tip() or SortitionDB::tx_begin_at_tip(). Both of these call SortitionDB::get_canonical_sortition_tip() internally, and furthermore, both of these contain a needless .unwrap(). These two functions really need to be marked as #[cfg(any(test, feature = "testing"))], since they're not meant to be used in any other context.

@obycode
Copy link
Contributor Author

obycode commented May 29, 2024

I'm struggling trying to get an implementation of this SortitionDB method to work correctly:

pub fn index_handle_at_block<'a>(
        &'a self,
        stacks_block_id: &StacksBlockId,
    ) -> Result<SortitionHandleConn<'a>, db_error> { ... }

For example, this would be called in places like StacksChainState::eval_boot_code_read_only to get the iconn at the appropriate block. What would be the correct way to do this?

@obycode
Copy link
Contributor Author

obycode commented May 29, 2024

I currently am working with this version that adds a chainstate parameter. Will push an update soon. A few tests still fail after this change.

    pub fn index_handle_at_block<'a>(
        &'a self,
        chainstate: &StacksChainState,
        stacks_block_id: &StacksBlockId,
    ) -> Result<SortitionHandleConn<'a>, db_error> {
        let (consensus_hash, bhh) = match chainstate.get_block_header_hashes(stacks_block_id) {
            Ok(Some(x)) => x,
            _ => return Err(db_error::NotFoundError),
        };
        let snapshot = SortitionDB::get_block_snapshot_consensus(&self.conn(), &consensus_hash)?
            .ok_or(db_error::NotFoundError)?;
        Ok(self.index_handle(&snapshot.sortition_id))
    }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants