Current status of the Rust node

High Level Functionality Overview

  •  Consensus logic
  •  VRF evaluator
  • Block production logic
    • Without transactions and without proof
    • Full block with proof
    • Blocks with transactions: missing because we don't yet have the transaction pool logic.
  • Networking layer
    •  P2P layer in general along with serialization/deserialization of all messages
    • RPCs support
      • Get_some_initial_peers (this is not used by the OCaml node)
      • Get_staged_ledger_aux_and_pending_coinbases_at_hash
      • Answer_sync_ledger_query
      • Get_transition_chain
      • Get_transition_knowledge (I don't think this one is used at all, Get_transition_chain_proof is used instead)
      • Get_transition_chain_proof
      • Get_ancestry
      • Ban_notify
      • Get_best_tip
      • Get_node_status
    • Peer discovery/advertising
      •  Peer discovery through kademlia
      •  Advertising the node through kademlia so that OCaml nodes can see us
  • Trust system (to punish/ban peers): not implemented (and no equivalent)
  • Pools
    • Transaction pool: not implemented
      • No pool is maintained; transactions received over the gossip network are not processed or re-broadcast
    • SNARK pool
      • SNARK Verification
      •  Pool is implemented
      •  SNARK work production is implemented (through OCaml). Node can complete and broadcast SNARK work.
  •  Compatible ledger implementation
  •  Transition frontier
  • Bootstrap/Catchup process
    • Ledger synchronization
      • Snarked ledgers (staking and next epoch ledgers + transition frontier root)
        • Handling of peer disconnections, timeouts or cases when the peer doesn't have the data
        • Detecting ledger hash mismatches for the downloaded chunk
        • Handling ledger hash mismatches gracefully, without crashing the node
        • Optimized snarked ledger synchronization (reusing previous ledgers when constructing the next one during (re)synchronization)
      • Staged ledgers (transition frontier root)
        • Handling of peer disconnections, timeouts or cases when the peer doesn't have the data
        • Detection and handling of validation errors
      • Handling of the RPC requests from other nodes so they can sync up
    •  Moving root of the transition frontier
    •  Maintaining ledgers for transition frontier root, staking and next epoch ledgers
      • When the scan state tree gets committed, the snarked ledger of the block is updated. When that happens for the root block in the transition frontier, the new root snarked ledger is reconstructed
      • At the end of an epoch, the "next epoch" ledger becomes the new "staking" ledger, the old "staking" ledger is discarded, and the snarked ledger of the best tip becomes the new "next epoch" ledger (see the sketch after this overview list)
    • Best chain synchronization
      • Download missing blocks from peers
        • Handling of peer disconnections, timeouts or cases when the peer doesn't have the data
        • Downloaded block header integrity validation by checking its hash and handling any mismatch
        • Downloaded block body integrity validation by checking its hash and handling any mismatch
      • Missing blocks application
        • Graceful handling of block application error without crashing the node
    • Handling of reorgs (short/long-range forks) or best-chain extensions after or even during synchronization, by adjusting the synchronization target and reusing what we can from the previous synchronization attempt
  • Block application
    •  Transaction application logic
    •  Block application logic
    • Proof verification:
      • Block proof verification
      • Transaction proof verification (same as above)
      • Zkapp proof verification (same as above)
  • Client API (the node currently has only very partial support; full support is not planned at the moment)
  • Support for the archive node sidecar process (sending updates through RPC calls).
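
For illustration, the epoch ledger hand-off mentioned in the overview (the "next epoch" ledger becomes the "staking" ledger, and the best tip's snarked ledger becomes the new "next epoch" ledger) can be sketched as follows. The types and names are hypothetical, not the node's actual API; this is a minimal sketch only.

```rust
// Hypothetical, simplified types; in the node these ledgers live inside the
// transition frontier / ledger services.
struct Ledger(/* opaque snarked ledger handle */ u64);

struct EpochLedgers {
    staking: Ledger,
    next_epoch: Ledger,
}

impl EpochLedgers {
    /// At an epoch boundary: the "next epoch" ledger becomes the new "staking"
    /// ledger, the old staking ledger is discarded, and the snarked ledger of
    /// the best tip becomes the new "next epoch" ledger.
    fn rotate(&mut self, best_tip_snarked_ledger: Ledger) {
        self.staking = std::mem::replace(&mut self.next_epoch, best_tip_snarked_ledger);
        // The previous staking ledger is dropped here and can be freed.
    }
}

fn main() {
    let mut ledgers = EpochLedgers { staking: Ledger(1), next_epoch: Ledger(2) };
    ledgers.rotate(Ledger(3));
    assert_eq!(ledgers.staking.0, 2);
    assert_eq!(ledgers.next_epoch.0, 3);
}
```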

VRF Evaluator

  • VRF evaluator functionality:
    • Calculation of the VRF output
    • Threshold calculation determining if the slot has been won (a threshold sketch follows this section's list)
    • (Optional) Providing verification of the producer's VRF output (does not impact node functionality; just provides a way for delegates to verify their impact on winning/losing a slot)
  • Implement VRF evaluator state machine
    • Computation service
    • Collecting the delegator table for the producer
    • Integrate with the block producer
    • Handling epoch changes - starting a new evaluation as soon as new epoch data is available
    • Retention logic - cleaning up slot data that is in the past based on the current global slot (slight node impact otherwise: the won-slot map grows indefinitely)
  • Testing
    • Correctness test - Selecting the correct ledgers
      • (Edge case) In genesis epoch
      • In other (higher) epochs
    • Correctness test - Computation output comparison with the mina CLI
    • Correctness test - Start a new VRF evaluation on epoch switch for the next available epoch
    • Correctness test - Retaining the slot data only for future blocks
  • Documentation
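
For context on the threshold calculation item above: the slot-winning rule is Ouroboros-style, where a slot is won when the VRF output, mapped to a fraction in [0, 1), is below 1 - (1 - f)^(delegated_stake / total_stake). A rough sketch in floating point follows; the real evaluator works over exact fixed-point/bignum arithmetic, and the value f = 0.75 used here is an assumption for illustration.

```rust
/// Decide whether a slot is won given the VRF output mapped into [0, 1).
/// `f` is the active-slot coefficient; 0.75 is assumed here for illustration.
/// The production evaluator uses exact arithmetic rather than f64.
fn slot_won(vrf_output_fraction: f64, delegated_stake: u64, total_stake: u64, f: f64) -> bool {
    let stake_ratio = delegated_stake as f64 / total_stake as f64;
    let threshold = 1.0 - (1.0 - f).powf(stake_ratio);
    vrf_output_fraction < threshold
}

fn main() {
    // A delegate with 1% of the stake only wins slots for small VRF outputs.
    assert!(slot_won(0.001, 1_000, 100_000, 0.75));
    assert!(!slot_won(0.10, 1_000, 100_000, 0.75));
}
```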

Block Producer

  • Block producer
    • Integrate with VRF evaluator
    • Include coinbase transactions
    • Include fee transfers
    • Include simple transactions (transaction pool missing)
    • Include zkapp transactions (transaction pool missing)
    • Ledger diff creation
    • Integrate with transition frontier
    • New epoch seed calculation
    • Staking epoch ledger selection
    • Proof generation
  • Testing
  • Documentation

Ledger

  • Ledger/Mask implementation (a mask sketch follows this list)
  • Staged Ledger implementation
    • Scan state
    • Pending coinbase collection
    • Transaction application
      • Regular transaction (payment, delegation, coinbase, fee transfer)
      • Zkapps
  • Persistent database

Proofs

  • Proof verification
    • Block proof
    • Transaction/Merge proof
    • Zkapp proof
  • Proof/Witness generation
    • Block proof
    • Transaction/Merge proof
    • Zkapp proof
  • Circuit generation

P2P Implementation (State Machine Version)

Handshake

  • Create a service for low level TCP networking (mio, epoll).
  • DNS support.
  • Pnet protocol.
  • Multistream select protocol.
  • Handle simultaneous connect case.
  • Noise protocol for outgoing connections.
  • Noise protocol for incoming connections.
  • Yamux multiplexer (the overall upgrade order is sketched after this list).
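
The items above correspond to the libp2p-style connection upgrade pipeline. The sketch below only illustrates the order in which an outgoing connection is upgraded; the protocol identifiers shown are the standard libp2p ones and are an assumption here, not taken from the node's code.

```rust
/// Stages a connection passes through before application protocols can run.
/// Protocol ids in the comments are the standard libp2p identifiers (assumed).
#[derive(Debug)]
enum HandshakeStage {
    /// Raw TCP connection (driven by the mio/epoll service), possibly after DNS resolution.
    TcpConnected,
    /// Pnet: both sides exchange nonces and all further traffic is encrypted with the
    /// network's pre-shared key, so only peers of the same network can talk to each other.
    Pnet,
    /// Multistream-select negotiation of the security protocol ("/noise").
    SelectSecurity,
    /// Noise handshake authenticates the peers and establishes an encrypted channel.
    Noise,
    /// Multistream-select negotiation of the stream muxer ("/yamux/1.0.0").
    SelectMuxer,
    /// Yamux multiplexes many logical streams (RPC, gossipsub, ...) over one connection.
    Yamux,
}

fn main() {
    // The upgrade pipeline in order; incoming connections mirror the same sequence.
    let pipeline = [
        HandshakeStage::TcpConnected,
        HandshakeStage::Pnet,
        HandshakeStage::SelectSecurity,
        HandshakeStage::Noise,
        HandshakeStage::SelectMuxer,
        HandshakeStage::Yamux,
    ];
    for stage in &pipeline {
        println!("{:?}", stage);
    }
}
```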

Peer management

  • Create a connection scheduler to limit work for each peer.
  • Handle reconnection and exponential backoff (a backoff sketch follows this list).
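
A minimal sketch of the exponential backoff mentioned above: the reconnection delay doubles per consecutive failure up to a cap. The base and cap values are illustrative, not the node's actual constants.

```rust
use std::time::Duration;

/// Compute the delay before the next reconnection attempt.
/// Doubles the base delay per consecutive failure, capped at `max`.
/// The constants in `main` are illustrative, not the node's actual values.
fn backoff_delay(failed_attempts: u32, base: Duration, max: Duration) -> Duration {
    let exp = failed_attempts.min(16); // avoid overflow for large attempt counts
    let delay = base.saturating_mul(1u32 << exp);
    delay.min(max)
}

fn main() {
    let base = Duration::from_secs(1);
    let max = Duration::from_secs(60);
    for attempt in 0..8 {
        println!("attempt {attempt}: wait {:?}", backoff_delay(attempt, base, max));
    }
}
```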

Identify

  • Identify protocol implementation

Peer discovery

  • Implement the Kademlia algorithm (a distance/bucket sketch follows this list).
    • Implement Kademlia FIND_NODE (client/server).
    • Implement Kademlia Bootstrap process.
    • Update Kademlia routing table according to Identify protocol messages.
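
For reference, Kademlia is built on the XOR distance between keys: FIND_NODE returns the k closest known peers to a target, and the routing table groups peers into buckets by how long a prefix they share with our own id. A minimal sketch over 32-byte keys follows; it is illustrative only, not the node's routing-table code (k = 20 is Kademlia's usual replication factor).

```rust
/// 256-bit Kademlia key, e.g. a hash of the peer id.
type Key = [u8; 32];

/// XOR distance between two keys, the metric Kademlia is built on.
fn distance(a: &Key, b: &Key) -> Key {
    let mut d = [0u8; 32];
    for i in 0..32 {
        d[i] = a[i] ^ b[i];
    }
    d
}

/// Number of leading zero bits of the distance; the routing table keeps one
/// bucket per value of this count (peers sharing a longer prefix with us go
/// into finer-grained buckets).
fn leading_zero_bits(d: &Key) -> u32 {
    let mut zeros = 0;
    for byte in d {
        if *byte == 0 {
            zeros += 8;
        } else {
            zeros += byte.leading_zeros();
            break;
        }
    }
    zeros
}

/// FIND_NODE, conceptually: sort known peers by distance to the target and
/// return the k closest.
fn k_closest(target: &Key, mut known: Vec<Key>, k: usize) -> Vec<Key> {
    known.sort_by_key(|peer| distance(target, peer));
    known.truncate(k);
    known
}

fn main() {
    let us = [0u8; 32];
    let mut peer = [0u8; 32];
    peer[0] = 0b0001_0000;
    let d = distance(&us, &peer);
    assert_eq!(leading_zero_bits(&d), 3);
    println!("closest peers returned: {}", k_closest(&us, vec![peer], 20).len());
}
```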

RPC

  • Perform outgoing RPC requests.
  • Handle incoming RPC requests.

Gossipsub

  • Implement gossipsub compatible with libp2p.
  • Research how to use "expander graph" theory to make gossipsub robust and efficient.

Testing

  • Fix bootstrap sandbox record/replay for the latest berkeley network.
  • Fix network debugger for the latest berkeley network.
  • Test that the Openmina node can bootstrap from the replayer tool.
  • Test that the OCaml node can bootstrap from the Openmina node.
  • Test that the Openmina node can bootstrap from another instance of the Openmina node.

P2P Related Tests

  • P2p functionality tests
    • p2p messages
      • Binprot types (de)serialization testing/fuzzing
      • Mina RPC types testing (ideally along with OCaml codecs)
      • Hashing testing (ideally along with OCaml hash implementations)
    • Connection
      • Proper initial peers handling, like reconnecting if offline
      • Maintaining the number of peers, including edge cases where we are at the maximum but still allow peers to connect (e.g. for discovery), i.e. a connection-dropping strategy
      • Other connection constraints, like no duplicate connections to the same peer/peer_id, no self-connections, etc.
      • Connection quality metrics
    • Kademlia
      • Peer discovery, according to Kademlia parameters (a new node gets 20 new peers)
      • Kademlia routing table is up to date with the network (each peer's status, like connected/disconnected/can_connect/cant_connect, reflects the actual peer state)
    • Gossipsub
      • Reachability (all nodes get the message)
      • Non-redundancy (minimal number of duplicate/unneeded messages)
  • Interoperability with OCaml node
    • Bootstrap Rust node from OCaml and vice versa
    • Discovery using Rust node
    • Gossipsub relaying
  • Public network tests. This should be the only set of tests that involves publicly available networks, and it should be executed only if we're sure we won't disrupt them.
  • Attack resistance testing

Frontend

Pages

  • Nodes - Overview
  • Nodes - Live
  • Nodes - Bootstrap
  • State - Actions
  • Snarks - Work Pool
  • Snarks - Scan State
  • Resources - Memory
  • Network - Messages
  • Network - Blocks
  • Network - Connections
  • Network - Topology
  • Network - Node DHT
  • Peers - Dashboard
  • Testing Framework - Scenarios

Testing

  • Tests for Nodes Overview
  • Tests for Nodes Live
  • Tests for Nodes Bootstrap
  • Tests for State - Actions
  • Tests for Snarks - Work Pool
  • Tests for Snarks - Scan State
  • Tests for Resources - Memory
  • Tests for Network - Messages
  • Tests for Network - Blocks
  • Tests for Network - Connections
  • Tests for Network - Topology
  • Tests for Network - Node DHT
  • Tests for Peers - Dashboard
  • Tests for Testing Framework - Scenarios

Other

  • CI Integration and Docker build & upload
  • State management
  • Update to Angular v16
  • Ensure performant application (lazy load & standalone components)
  • Ensure reusable components/css/BL

Documentation

By module

By use-case

Experimental State Machine Architecture

Core state machine

  • Automaton implementation that separates action kinds into pure and effectful.
  • Callback (dispatch-back) support for action composition: enables us to specify in the action itself the actions that will be dispatched next (see the sketch after this list).
  • Fully serializable state machine state and actions (including descriptors to callbacks!).
  • State machine state management
    • Partitioning of the state machine state between models' sub-states (for pure models).
    • Forbid direct access to state machine state in effectful models.
    • Support for running multiple instances concurrently in the same state machine for testing scenarios: for example if the state machine represents a node, we can "run" multiple of them inside the same state machine.
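
A condensed sketch of the pure/effectful split and the dispatch-back callbacks described in this list (all names are illustrative, not the actual crate's types): pure actions only transform the state machine state, effectful actions are the only place IO happens, and an effectful action reports its result by dispatching the callback carried in the action itself.

```rust
use std::collections::VecDeque;
use std::time::{SystemTime, UNIX_EPOCH};

// Illustrative action set: a pure action updates state, an effectful action
// performs IO and "dispatches back" the callback carried inside it.
enum Action {
    // Effectful: read the system clock, then dispatch `callback(now_ms)`.
    GetTimeEffectful { callback: fn(u64) -> Action },
    // Pure: record the time in the state machine state.
    TimeUpdated { now_ms: u64 },
}

#[derive(Default)]
struct State {
    last_time_ms: u64,
}

struct Machine {
    state: State,
    queue: VecDeque<Action>,
}

impl Machine {
    fn dispatch(&mut self, action: Action) {
        self.queue.push_back(action);
        while let Some(action) = self.queue.pop_front() {
            match action {
                // Effectful handling touches the outside world but not the state.
                Action::GetTimeEffectful { callback } => {
                    let now_ms = SystemTime::now()
                        .duration_since(UNIX_EPOCH)
                        .unwrap()
                        .as_millis() as u64;
                    self.queue.push_back(callback(now_ms));
                }
                // Pure handling only transforms the state machine state.
                Action::TimeUpdated { now_ms } => self.state.last_time_ms = now_ms,
            }
        }
    }
}

fn main() {
    let mut machine = Machine { state: State::default(), queue: VecDeque::new() };
    // The follow-up action is named in the action itself (dispatch-back).
    machine.dispatch(Action::GetTimeEffectful {
        callback: |now_ms| Action::TimeUpdated { now_ms },
    });
    println!("last_time_ms = {}", machine.state.last_time_ms);
}
```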

Models

Each model handles a subset of actions; models are registered like plugins.

Effectful

Thin layer of abstraction between the "external world" (IO) and the state machine.

  • MIO model: provides the abstraction layer for the polling and TCP APIs of the MIO crate.
  • Time model: provides the abstraction layer for SystemTime::now()

Pure

Handle state transitions and can dispatch actions to other models.

  • Time model: the pure counterpart, which dispatches an action to the effectful time model to get the system time and updates the internal time in the state machine state.
  • TCP model: built on top of the MIO layer to provide all necessary features for handling TCP connections (it also uses the time model to provide timeout support for all actions).
  • TCP-client model: built on top of the TCP model, provides a high-level interface for building client applications.
  • TCP-server model: built on top of the TCP model, provides a high-level interface for building server applications.
  • PRNG model: unsafe, fast, pure RNG for testing purposes.
  • PNET models: implements the private network transport used in libp2p.
    • Server
    • Client
  • Testing models:
    • Echo client: connects to an echo server and sends random data, then checks that it receives the same data.
    • Echo server.
    • Echo client (PNET).
    • Echo server (PNET).
    • Simple PNET client: connects to berkeleynet and does a simple multistream negotiation.

Tests

  • Echo network
    • State machine with a network composed of 1 client and 1 server instance.
    • State machine with a network composed of 5 clients and 1 server instance.
    • State machine with a network composed of 50 clients and 1 server instance.
  • Echo network PNET (same tests as echo network but over the PNET transport).
  • Berkeley PNET test: runs the simple PNET client model.