[Feature] SDK Refactor + Enhanced Multithreading Support #790

Pauan · 2023-10-25T18:31:45Z

Replaces the process_inputs, execute_program, and execute_fee macros with methods.
This removes a lot of code duplication and makes the code simpler.
Adds a new thread_pool::spawn function which allows for running any Send + 'static code on the
Rayon threadpool.
All of the various methods (execute_function_offline, execute, deploy, split, etc.) are
now automatically run on the Rayon threadpool, which means everything runs in parallel.

This means the user no longer needs to use multiple Workers for parallelism; they can
run all of their JS code within a single Worker.
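To illustrate what a Send + 'static bound on a spawn function implies, here is a minimal sketch. It is not the SDK's actual thread_pool::spawn: the signature, the channel-based return value, and the sum_of_squares helper are assumptions made for illustration, and it uses std::thread instead of the Rayon threadpool so that it has no dependencies.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for a `spawn` function (NOT the SDK's real signature).
// It runs `f` on another thread and returns a receiver for the result.
fn spawn<F, T>(f: F) -> mpsc::Receiver<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        // Ignore the error if the receiver was dropped before we finished.
        let _ = sender.send(f());
    });
    receiver
}

// Hypothetical helper: sums the squares of `inputs` on a background thread.
fn sum_of_squares(inputs: Vec<u64>) -> u64 {
    // `inputs` is moved into the closure, so the closure owns all of its data
    // and satisfies the `'static` bound.
    let receiver = spawn(move || inputs.iter().map(|x| x * x).sum::<u64>());
    receiver.recv().expect("worker thread panicked")
}

fn main() {
    println!("{}", sum_of_squares(vec![1, 2, 3])); // prints 14
}
```

The bound is what forces the closure to own its data: a closure that borrowed a local variable would not be 'static and would be rejected by the compiler.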

Pauan · 2023-11-03T02:35:14Z

The old code created various variables (process, program, rng, etc.) and then used macros (such as execute_program!) which would mutate those variables.

The new code instead bundles those variables inside of a ProgramState struct, and then provides methods which accomplish the same thing as the macros.

Bundling variables inside of a struct is a very common Rust idiom with several advantages:

The variables are now logically organized, instead of spread out all over the place.
You can use a new method to create the variables and the struct, which removes code duplication.
You can attach helper methods to the struct.
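The three advantages above can be sketched with a small example. The State struct, its fields, and the describe method are hypothetical placeholders, not the PR's actual ProgramState or the SDK's real types.

```rust
// Hypothetical, simplified stand-in for an internal state struct.
// The field types are placeholders, not the SDK's real types.
struct State {
    program: String,
    imports: Vec<String>,
    seed: u64,
}

impl State {
    // One constructor replaces the setup code that each call site would
    // otherwise repeat.
    fn new(program: &str, imports: &[&str]) -> Self {
        Self {
            program: program.to_string(),
            imports: imports.iter().map(|s| s.to_string()).collect(),
            seed: 0,
        }
    }

    // A helper method attached to the struct; it can read every field
    // without the caller passing each variable separately.
    fn describe(&self) -> String {
        format!(
            "{} ({} imports, seed {})",
            self.program,
            self.imports.len(),
            self.seed
        )
    }
}

fn main() {
    let state = State::new("hello.aleo", &["credits.aleo"]);
    println!("{}", state.describe());
}
```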

The ProgramState struct is not public, it is only used inside of the implementation of the ProgramManager methods.

The struct is created inside of the ProgramManager methods, and then it is dropped when the method finishes. So it is temporary and ephemeral: it only exists for the duration of the method, and it is a purely internal implementation detail.

This pattern of creating temporary wrapper structs is very common and idiomatic in Rust.

Structs are not classes. In Rust they serve many different purposes, and creating wrapper structs is completely normal.

For example, Iterator, Future, and Stream all use a similar pattern of returning temporary wrapper structs:

let state = vec![...];

let state = state.into_iter();

let state = state.map(|x| ...);

let state = state.filter(|x| ...);

let state = state.collect();

The into_iter method consumes the Vec and returns a new IntoIter struct:

https://doc.rust-lang.org/std/vec/struct.IntoIter.html

The IntoIter struct wraps the original Vec, and it also holds some additional state which is needed for iteration.

And then when you call the map method, it consumes self and returns a new Map struct:

https://doc.rust-lang.org/std/iter/struct.Map.html

The Map struct wraps the IntoIter and adds some state of its own: the closure that was passed to map.

And then the filter method consumes self and returns a new Filter struct:

https://doc.rust-lang.org/std/iter/struct.Filter.html

The Filter struct wraps the Map and adds some state of its own: the predicate that was passed to filter.

Lastly, the collect method consumes self and returns a new collection, built from the items that the previous structs produce.

All of these structs are temporary: they exist only to describe the business logic, and they aren't intended as long-term storage of state. They are only used locally inside of a method, and when you are finished with them, the structs are simply thrown away.
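A concrete, runnable version of the iterator chain makes the intermediate types visible. The even_doubles function and its closures are made up for illustration; the types named in the comments are the standard library's.

```rust
// Hypothetical example function: doubles every value, then keeps the
// results that are divisible by 4.
fn even_doubles(values: Vec<i32>) -> Vec<i32> {
    // Consumes the Vec and returns a std::vec::IntoIter<i32>.
    let state = values.into_iter();

    // Consumes the IntoIter and returns a std::iter::Map wrapping it.
    let state = state.map(|x| x * 2);

    // Consumes the Map and returns a std::iter::Filter wrapping it.
    let state = state.filter(|x| x % 4 == 0);

    // Consumes the Filter and builds the final collection. Every wrapper
    // struct above is gone once this returns.
    state.collect()
}

fn main() {
    println!("{:?}", even_doubles(vec![1, 2, 3, 4])); // prints [4, 8]
}
```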

The structs in this pull request behave in the same way:

let state = ProgramState::new(program, imports).await?;

let (state, deploy) = state.deploy().await?;

deploy.check_fee(fee_microcredits)?;

let (state, fee) = state
    .execute_fee(
        deploy.execution_id()?,
        url,
        private_key.clone(),
        fee_microcredits,
        fee_record,
        fee_proving_key,
        fee_verifying_key,
    )
    .await?;

state.deploy_transaction(deploy, fee).await

The ProgramState wraps the process, program, and rng. It provides various methods, each of which returns a new state along with some additional data.

For example, the deploy method returns a Deploy struct, execute_fee returns an ExecuteFee struct, execute_program returns an ExecuteProgram struct, etc.

This is the same as how the iterator map method returns a Map struct, the iterator filter method returns a Filter struct, the into_iter method returns an IntoIter struct, etc.

All of these structs are temporary, and they are thrown away when the ProgramManager method finishes. They are just an ephemeral container for passing data around. This is very normal and idiomatic for Rust.

Why do the ProgramState methods return a new state? Code that runs on the thread pool must be Send + 'static, so it cannot hold borrowed references, which rules out &self and &mut self. That means everything must be owned.

That means that all of the ProgramState methods must take self by value, which consumes the ProgramState. However, we want to keep using the ProgramState after the method finishes, so each method returns self, which lets the caller continue to use it.

This pattern is very common in functional programming languages: they avoid mutation, so functions return new state instead. It is also common in Rust, for example with the builder pattern. Returning self from a method is very normal.
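As a minimal sketch of a method that consumes self and hands it back together with some extra data, consider the following. The State and Receipt types and the step method are hypothetical; they only mirror the shape of calls such as `let (state, deploy) = state.deploy().await?;` and are not the PR's code.

```rust
// Hypothetical state and result types, for illustration only.
struct State {
    counter: u32,
}

struct Receipt {
    id: u32,
}

impl State {
    fn new() -> Self {
        Self { counter: 0 }
    }

    // Takes `self` by value (consuming it) and returns it again alongside
    // the data that this step produced.
    fn step(mut self) -> (Self, Receipt) {
        self.counter += 1;
        let receipt = Receipt { id: self.counter };
        (self, receipt)
    }
}

// Rebinding `state` at each step keeps it usable even though every call
// consumes the previous value.
fn run_two_steps() -> (u32, u32) {
    let state = State::new();
    let (state, first) = state.step();
    let (_state, second) = state.step();
    (first.id, second.id)
}

fn main() {
    println!("{:?}", run_two_steps()); // prints (1, 2)
}
```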

The ProgramState is temporary because multi-threading requires it to be owned. The ProgramState struct is created, moved to another thread, used to run some code on that thread, and finally discarded when that code finishes.
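The create, move, run, and hand-back lifecycle can be sketched with std::thread. This is an assumption-laden illustration: the PR runs on the Rayon threadpool, and the State type, its log field, and the record method are hypothetical.

```rust
use std::thread;

// Hypothetical owned state, for illustration only.
struct State {
    log: Vec<String>,
}

impl State {
    // Consumes the state, updates it, and returns it.
    fn record(mut self, entry: &str) -> Self {
        self.log.push(entry.to_string());
        self
    }
}

fn run_on_thread(state: State) -> State {
    // `thread::spawn` requires a `'static` closure, so the closure cannot
    // borrow `state`; `move` transfers ownership into the new thread.
    let handle = thread::spawn(move || state.record("ran on worker thread"));

    // Joining hands the owned state back to the caller.
    handle.join().expect("worker thread panicked")
}

fn main() {
    let state = run_on_thread(State { log: Vec::new() });
    println!("{:?}", state.log);
}
```

Replacing the closure with one that calls a `&mut self` method on a borrowed `state` would fail to compile, because the borrow would not outlive the calling function.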

If the ProgramState were long-lived, then it could not be owned, and so it could not be used with multi-threading.

ProgramState is designed this way out of necessity: the restrictions of multi-threading require it.

The end result is that the new code is exactly the same as the old code, except:

It uses a struct with multiple fields instead of using multiple variables (it is normal and idiomatic in Rust to use structs to bundle multiple variables).
It uses methods instead of macros (e.g. the execute_program! macro becomes an execute_program method).
It is multi-threaded.

Perhaps the name ProgramState is confusing. It can easily be renamed to something else, such as MethodState or ExecutionState. The name is not important: it is an internal struct, purely an implementation detail.

Pauan added 6 commits October 22, 2023 14:41

Initial work on running code on Rayon threads

3a48077

Major refactoring, converting macros into methods

892a1ad

Fixing issue with PrivateKey

24d087e

Merge branch 'testnet3' into build-fixes

5a34b8f

rustfmt

d4491d0

Fixing unit tests

3398e30

iamalwaysuncomfortable changed the title ~~Major refactor~~ [Feature] SDK Refactor + Enhanced Multithreading Support Oct 26, 2023

iamalwaysuncomfortable linked an issue Oct 26, 2023 that may be closed by this pull request

[Feature] Parallel Executions within a single web worker #778

Open

3 tasks

Pauan mentioned this pull request Apr 27, 2024

Adding in thread_pool::spawn function #881

Merged
