Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prover SDK tracking issue #148

Open
slumber opened this issue Apr 22, 2024 · 7 comments
Open

Prover SDK tracking issue #148

slumber opened this issue Apr 22, 2024 · 7 comments
Assignees

Comments

@slumber
Copy link
Contributor

slumber commented Apr 22, 2024

The main prover functionality is gathered in nexus-prover crate, which does all sort of stuff: generate public parameters, print nice CLI output, compress proofs etc.

I propose to remove it in favor of nexus-sdk which provides abstracted interface like

pub trait Prover {
    /// Output type.
    type Proof;
    
    /// Public parameters needed for generating proofs, if any.
    type PublicParams;
    
    fn prove(&self, params: &Self::PublicParams) -> Self::Proof;
    
    /// or something like prove_step to facilitate usage of "stepped" cli output.
}

And then use it

pub struct NexusProver<P: Prover = CompressedNova> {
    /* private */
}

pub struct NexusVM { }

This SDK is then integrated into nexus-tools to be reused with CLI and pretty output.

@slumber
Copy link
Contributor Author

slumber commented Apr 22, 2024

@govereau @sjudson @danielmarinq

@sjudson
Copy link
Contributor

sjudson commented Apr 22, 2024

You propose doing this alongside or in place of the nexus-api approach (#146) @slumber? For dev I think it'll be important to expose VM interfaces alongside the prover interface so that users can develop without needing to invoke the proof apis.

@slumber
Copy link
Contributor Author

slumber commented Apr 22, 2024

You propose doing this alongside or in place of the nexus-api approach (#146) @slumber? For dev I think it'll be important to expose VM interfaces alongside the prover interface so that users can develop without needing to invoke the proof apis.

nexus-api doesn't look like proper sdk and is more of a short-term solution, as it just re-exports what we currently have.
This issue is about generically wrapping the proving system and the vm into an extendable interface which we ourselves can reuse.

@sjudson
Copy link
Contributor

sjudson commented Apr 22, 2024

I'm amenable to really taking the time to think through the api design in a more comprehensive way than the #146 approach, which is really an attempt to get programmatic usage "out the door" in an at least somewhat constrained and opinionated way.

I do think it makes sense though to have an independent public facing API from the prover API that is also consumed internally though.

@slumber
Copy link
Contributor Author

slumber commented Apr 22, 2024

I'm amenable to really taking the time to think through the api design in a more comprehensive way than the #146 approach, which is really an attempt to get programmatic usage "out the door" in an at least somewhat constrained and opinionated way.

I do think it makes sense though to have an independent public facing API from the prover API that is also consumed internally though.

This is relevant because we already have Nova(seq), Nova(pcd), Nova(pcd+spartan), HyperNova soon, maybe Jolt which is going to be different.
There's no convenient way of switching between these. We could also add things like curve choice or commitment scheme later.

@sjudson
Copy link
Contributor

sjudson commented May 7, 2024

Here's a rough early sketch of a proposal for the overall SDK architecture. In terms of code, the idea will be to have an API that is consumed by both the SDK and CLI, providing two routes to similar functionality (albeit with the CLI being a simplified, streamlined version).

Objects

The central object of the SDK/CLI will be an engine. Each engine will correspond to an ISA + arithmetization method, which takes in a complied program using the nexus-rt and outputs a sequence of circuits to be handed to the verifier.

Each engine will be configurable with respect to its memory model and prover backend -- as checked by a compatibility layer -- and will also expose a set of properties indicating whether it supports key features (e.g., a private input tape, 32-bit vs. 64-bit architecture, various precompiles when they become available, etc.). Each engine will also be opinionated about its configuration, so that the vast majority of users should need to do no more than {NameOf}Engine::default(), while expert users have access to all possible (valid) configurations.

The prover backend will be the secondary object of the SDK/CLI. It itself will be configurable, e.g., with choice of whether to operate in sequential or PCD mode, as well as with its commitment scheme (pair) and curve (pair). Similarly, it will have compatibility checks and be opinionated about what defaults should be. Backends will also support a set of helpful property checks (e.g., whether they are recursive, Ethereum verifiable, support compression, etc.).

Interface

As for the user API exposed by the SDK, the goal is to support operation chaining using the ? operator. Ideally, we want the user to be able to write something like:

let vm = {NameOf}Engine::default()
             .load('/path/to/binary/')?
             .set_input<T>(input)
             .execute()?
             
assert!(vm.started());
assert!(vm.halted());
assert!(!vm.proven());

check(vm.output)?; // some user function returning Result<(), Error> checking that the output looks right

vm.rewind(); // do not clear loaded binary or input tape, as compared to vm.reset()

assert!(!vm.started());
assert!(!vm.halted());
 
vm.enable_prover_mode();
vm.prover.enable_sequential(); // likely to be the default
assert!(vm.is_proving());
 
vm.generate()?
  .execute()?; // alternatively, .prove()?
  .verify()?
  
assert!(vm.started());
assert!(vm.halted());
assert!(vm.proven());
assert!(vm.verified());

if vm.prover.can_compress() {
    vm.compress()?
    assert!(vm.compressed());
}

vm.proof 

I/O

For I/O (into engines that support it), we will use postcard to support serializing/deserializing arbitrary structs that implement Serde::Serialize and Serde::Deserialize. Essentially, the API (and so SDK) will support two interfaces to each of the public and private tapes:

-- set_input
-- set_input_bytes
-- set_public_input
-- set_public_input_bytes

Internal to the engines that support I/O, these are mirrored:

-- read_input
-- read_input_bytes
-- read_public_input
-- read_public_input_bytes

The bytes entries just read in and out arbitrary sequences of bytes. The non-bytes entries are generic, single-use functions, trait bound by Serde::Serialize + Serde::Deserizialize, that take care of serializing and deserializing information accordingly so that users never need to touch the raw bytes.

Compile-time Configuration

The last major piece of the early design is a compilation hook. This will be an (entirely optional) function that allows the user to trigger compilation + linkage of their program programmatically, through cargo. The main use case of this function will be to dynamically support emitting the linker, thereby enabling setting the memory limit through the SDK (per #152). The goal here will be to allow the user to provide a callback to do arbitrary compilation steps, with a default callback that just initiates the compilation, similar to: https://github.com/a16z/jolt/blob/c9bfe3226b8be6d5012495a7e2c72605980c6ae1/jolt-core/src/host/mod.rs#L93

@mx00s
Copy link
Contributor

mx00s commented May 7, 2024

The proposed Prover trait and the sample interface code are both helping me get a taste of where we are and potential directions we can go.

Highlighting a couple points that resonate with me:

This is relevant because we already have Nova(seq), Nova(pcd), Nova(pcd+spartan), HyperNova soon, maybe Jolt which is going to be different.
There's no convenient way of switching between these.

Yes, there are so many potential configurations, both now and in the future, and it's not a simple matter to swap them in and out.

Each engine will also be opinionated about its configuration, so that the vast majority of users should need to do no more than {NameOf}Engine::default(), while expert users have access to all possible (valid) configurations.

Definitely, it makes sense that we'd want both.

General princples we seem aligned on

  1. Internally, we want the flexibility to evolve implementation details to support executing multiple VM configurations with few code changes.
  2. Externally, we want to expose an opinionated set of configurations that are well tested and offer a consistent and ergonomic UX.

To make that a bit more concrete, a likely implementation strategy is that (1) involves traits and generic parameters and (2) provides wrapper types that handle selecting generic type arguments and the construction of those internal values.

Additional proposed principles

  1. The aformentioned compositional pattern can sensibly be used at varying layers of abstraction, not solely as part of the API/SDK distinction.
  2. As much as reasonable, minimize API/SDK dependence by leaking 3rd party types, and even std, for types we consume and produce via items with pub visibility.

For example, workshopping on the I/O concept to make flexible internal abstractions:

extern crate alloc;

use alloc::vec::Vec;

enum Error {
    // TODO
}

trait RawInputTape {
    fn read_byte(&mut self) -> Option<u8>;
}

trait RawOutputTape {
    fn write_byte(&mut self, byte: u8) -> Result<(), Error>;
}

impl RawInputTape for Vec<u8> {
    fn read_byte(&mut self) -> Option<u8> {
        self.pop()
    }
}

impl RawOutputTape for Vec<u8> {
    fn write_byte(&mut self, byte: u8) -> Result<(), Error> {
        self.push(byte);
        // TODO: handle failed memory allocation
        Ok(())
    }
}

trait RawProof {}
trait RawVirtualMachine<I, O>
where
    I: RawInputTape,
    O: RawOutputTape,
{
    type Proof: RawProof;
    
    fn run(&mut self, input: I) -> Result<(O, Self::Proof), Error>;
}

This allows us to keep the engine's interfaces as minimal as possible to reduce the cost of adding more implementations.

For ergonomic conveniences like input serialization and output deserialization (which could happen outside proof machinery) we could wrap those abstractions in high-level ones. Implementations of these higher-level abstractions would automatically work with all of the engines implementing the core abstractions they consume:

trait InputValue {
    fn serialize<I: RawInputTape>(&self) -> Result<I, Error>;
}

trait OutputValue {
    fn deserialize<O: RawOutputTape>(tape: &O) -> Result<Box<Self>, Error>;
}

struct UserInput;
impl InputValue for UserInput {
    fn serialize<I: RawInputTape>(&self) -> Result<I, Error> {
        todo!();
    }
}

struct UserOutput;
impl OutputValue for UserOutput {
    fn deserialize<O: RawOutputTape>(tape: &O) -> Result<Box<Self>, Error> {
        todo!();
    }
}

trait VirtualMachine<I, O>
where
    I: InputValue,
    O: OutputValue,
{
    type Input: RawInputTape;
    type Output: RawOutputTape;
    type Proof: RawProof;
    type Core: RawVirtualMachine<Self::Input, Self::Output, Proof = Self::Proof>;
    
    fn core(&mut self) -> &mut Self::Core;
    
    fn run(&mut self, input: I) -> Result<(O, Self::Proof), Error> {
        let input: Self::Input = input.serialize()?;
        let (output, proof) = self.core().run(input)?;
        let output = O::deserialize(&output)?;
        Ok((*output, proof))
    }
}

There's a lot more we could do with traits to model valid machine state transitions through UX interactions @sjudson alludes to in their sample code and @slumber's mention of "stepped CLI output".

EDIT: I realize now, and confirmed with @sjudson, that the point of postcard is to incorporate input serialization and output deserialization in the proofs. That's likely still be possible with a trait design following the principles I suggested, but maybe it makes sense to incrementally refine the abstractions/implementations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants