Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: move to Nix (simpler, easy to maintain version) #1320

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

storopoli
Copy link
Contributor

@storopoli storopoli commented Feb 4, 2024

Open questions

  1. I still get some random failures (at least locally) because of the tests DDoS'ing the poor electrs/bitcoind local binaries. This can be fixed with -- --test-threads=2. Shall we add?
  2. I am pinning bitcoind to 0.25.1. 0.26.0 I am getting build errors in darwin systems (Apple stuff). Do we need 0.26.0 now? Or can we wait for nixpkgs 24.5?

TODO

  • If nix 24.5 get launched before this is merged, then pin nixpkgs to 24.5.

Description

This is a simpler version of #1257.
Although not fully reproducible (we are fetching Cargo dependencies instead of declaring all of them).
It is WAY easier to maintain and provides most of the benefits and Dev-Experience from Nix.

You can switch flawlessly from Rust stable, MSRV, and nightly using either:

  • nix develop .: latest stable
  • nix develop ".#msrv": Rust MRSV
  • nix develop ".#nightly": Rust Nightly

It handles all the annoyances of Cargo.locks and cargo updates.

Also you don't need to worry about installing binaries from bitcoind or electrs crates,
since we fetch these from nixpkgs (with pinned versions) and add them to your local environment.
Additionally, all the extra dependencies that you need for WASM are handled as well.
Finally, no need to set ENV vars: they are also set all by default to you.

If you are committing typos, or failing to adhere to our guidelines in CONTRIBUTING.md;
we won't let you commit or push.
This is handled by Nix's automated pre-commit hooks which gives helpful error messages.

Please check the NIX.md file to learn more about Nix and what we are aiming to do with the Nix integration into BDK.

I am taking the liberty of tagging certain people that are Nix enthusiast to help the discussion and review.
Cc @notmandatory, @dpc, @casey-bowman, @plebhash, @matthiasdebernardini, @gudnuf, @oleonardolima.

Notes to the reviewers

  • Pre-commit checks for everything that we enforce:
    • Checks for typos in the source code and documentation.
    • Checks if all the commits are GPG-signed and follow the conventional commits style.
      (again, please check CONTRIBUTING.md)
    • Runs cargo clippy in all workspace.
    • Runs cargo fmt in all workspace.
    • Runs nixpgs-fmt in all workspace (for .nix files)
  • Instructions and rationale in NIX.md

Closes #1162. Superseds #1122, #1156, #1165, #1257.

Pinneds Dependencies:

Changelog notice

  • We moved to Nix-based CI and Local development environment to better onboard new developers and enhance the productivity of developers.

Checklists

All Submissions:

  • I've signed all my commits
  • I followed the contribution guidelines
  • I ran cargo fmt and cargo clippy before committing

New Features:

  • I've added tests for the new feature
  • I've added docs for the new feature

Bugfixes:

  • This pull request breaks the existing API
  • I've added tests to reproduce the issue which are now passing
  • I'm linking the issue being fixed by this PR

@storopoli storopoli force-pushed the js/nix-simple branch 7 times, most recently from ab132c9 to d139091 Compare February 4, 2024 18:50
@dpc
Copy link

dpc commented Feb 4, 2024

This can be fixed with -- --test-threads=2. Shall we add?

Why not increase bitcoind's rpc threads and work queue from the potato-like defaults instead?

@dpc
Copy link

dpc commented Feb 4, 2024

So just the dev shells, no crane etc.? Seems reasonable. Dev shells are like 80% of benefits for very little Nix involvment.

crates/bdk/Cargo.toml Outdated Show resolved Hide resolved
@storopoli
Copy link
Contributor Author

This can be fixed with -- --test-threads=2. Shall we add?

Why not increase bitcoind's rpc threads and work queue from the potato-like defaults instead?

Good call I did this in a1997cf

bitcoind_conf.args.push("-rpcworkqueue=1024");
bitcoind_conf.args.push("-rpcthreads=64");

@evanlinjin
Copy link
Member

This can be fixed with -- --test-threads=2. Shall we add?

Why not increase bitcoind's rpc threads and work queue from the potato-like defaults instead?

Good call I did this in a1997cf

bitcoind_conf.args.push("-rpcworkqueue=1024");
bitcoind_conf.args.push("-rpcthreads=64");

64 threads is way too much. I think the ideal is to have 1 thread per CPU core. I think a good compromise is 4 (between desktop and CI).

@storopoli
Copy link
Contributor Author

This can be fixed with -- --test-threads=2. Shall we add?

Why not increase bitcoind's rpc threads and work queue from the potato-like defaults instead?

Good call I did this in a1997cf

bitcoind_conf.args.push("-rpcworkqueue=1024");
bitcoind_conf.args.push("-rpcthreads=64");

64 threads is way too much. I think the ideal is to have 1 thread per CPU core. I think a good compromise is 4 (between desktop and CI).

4 is the default. So no change then? I can easily delete a1997cf (that's why I did it in a separate commit 😄)

@dpc
Copy link

dpc commented Feb 4, 2024

64 threads is way too much. I think the ideal is to have 1 thread per CPU core. I think a good compromise is 4 (between desktop and CI).

Its a io bound workload. Absent async capabilities default 4 is laughable even for one core and I don't even know why bitcoind is doing it.

@evanlinjin
Copy link
Member

64 threads is way too much. I think the ideal is to have 1 thread per CPU core. I think a good compromise is 4 (between desktop and CI).

Its a io bound workload. Absent async capabilities default 4 is laughable even for one core and I don't even know why bitcoind is doing it.

I didn't realize it was io-bounded. However, in our test environment, I doubt there will be 64 concurrent calls. How about we meet in the middle and say 16?

@dpc
Copy link

dpc commented Feb 5, 2024

I was on my phone before, so let me elaborate.

I didn't realize it was io-bounded. However, in our test environment, I doubt there will be 64 concurrent calls. How about we meet in the middle and say 16?

From a perspective of most workloads e.g. asking for blockchain data, bitcoind (and electrs) acts very much like database software, e.g. posgresql. Postgresql is an interesting case because it utilizes process-based model - meaning it will spawn not just a thread, but a whole process for every client connection - waaay heavier than just a thread.

Even for that heavy model, the recommendations will be to limit it around around 500.

It takes many threads to saturate modern IO device. Just an example from a yt video I had on hand benchamarking filesystems. Each bar is number of workers. And these are dedicated IO-generating workers, generating heavy load in a tight loop. So in this benchmark, it will take like 4 workers. Possibly one worker is even more threads. But an rpc thread in bitcoind by necessity will be spending lots of time reading things and writing stuff to a tcp connection, doing some checks, re-encodings and what not. To actually get a maximum IO performance there will need to be a lot of them. I'd say 64 per core.

On my mostly idle laptop at this very moment there are 2k threads running:

image

On a modern Linux a thread is just a stack, which due to CoW optimizations will really amount to 1 or 2 pages (4K page size) of physically used memory plus a tiny (1K?) amount to track it all in the kernel.

There's really not much to save on thread counts in a grand scheme of things.

Personally, I like to run as lots and lots of tests in parallel, especially any sort of integration and e2e tests, which tend to have lots of dead time - waiting for stuff (like disk and network IO - latency-bound things). The usual limit to this is the available memory. Additional benefit is also that it tends to expose problems more often, as all operations tend to have more ... timing jitter.

So I'm really surprised that the default in bitcoind is 4. That's too little even for an old generation raspberry pi.

CONTRIBUTING.md Show resolved Hide resolved
CONTRIBUTING.md Show resolved Hide resolved
NIX.md Show resolved Hide resolved
NIX.md Show resolved Hide resolved
@matthiasdebernardini
Copy link

Can you have it print at the beginning, "First time takes a good long while..." or something like that. Some parts seem so slow that you'd think the shell crashed or something like that. Would be good for the user to know this.

Also I am having 8 tests fail, is that just me?

running 8 tests
test mempool_re_emits_if_tx_introduction_height_not_reached ... FAILED
test mempool_avoids_re_emission ... FAILED
test no_agreement_point ... FAILED
test mempool_during_reorg ... FAILED
test tx_can_become_unconfirmed_after_reorg ... FAILED
test test_sync_local_chain ... FAILED
test ensure_block_emitted_after_reorg_is_at_reorg_height ... FAILED
test test_into_tx_graph ... FAILED

failures:

---- mempool_re_emits_if_tx_introduction_height_not_reached stdout ----
Error: JSON-RPC error: transport error: Couldn't connect to host: Resource temporarily unavailable (os error 35)

---- mempool_avoids_re_emission stdout ----
Error: JSON-RPC error: transport error: Couldn't connect to host: Resource temporarily unavailable (os error 35)

---- no_agreement_point stdout ----
Error: JSON-RPC error: transport error: Couldn't connect to host: Resource temporarily unavailable (os error 35)

---- mempool_during_reorg stdout ----
Error: JSON-RPC error: transport error: Couldn't connect to host: Resource temporarily unavailable (os error 35)

---- tx_can_become_unconfirmed_after_reorg stdout ----
Error: JSON-RPC error: transport error: Couldn't connect to host: Resource temporarily unavailable (os error 35)

---- test_sync_local_chain stdout ----
Error: JSON-RPC error: transport error: Couldn't connect to host: Resource temporarily unavailable (os error 35)

---- ensure_block_emitted_after_reorg_is_at_reorg_height stdout ----
Error: JSON-RPC error: transport error: Couldn't connect to host: Resource temporarily unavailable (os error 35)

---- test_into_tx_graph stdout ----
getting new addresses!
got new addresses!
mining block!
Error: JSON-RPC error: transport error: Couldn't connect to host: Resource temporarily unavailable (os error 35)


failures:
    ensure_block_emitted_after_reorg_is_at_reorg_height
    mempool_avoids_re_emission
    mempool_during_reorg
    mempool_re_emits_if_tx_introduction_height_not_reached
    no_agreement_point
    test_into_tx_graph
    test_sync_local_chain
    tx_can_become_unconfirmed_after_reorg

test result: FAILED. 0 passed; 8 failed; 0 ignored; 0 measured; 0 filtered out; finished in 24.35s

error: test failed, to rerun pass `-p bdk_bitcoind_rpc --test test_emitter`
(nix:nix-shell-env) matthiasdebernardini@Matthiass-MacBook-Pro-2:~/Projects/bdk_nix$

@storopoli

This comment was marked as resolved.

flake.nix Outdated Show resolved Hide resolved
flake.nix Outdated Show resolved Hide resolved
@storopoli storopoli force-pushed the js/nix-simple branch 2 times, most recently from 1171285 to c126d49 Compare March 25, 2024 16:09
@storopoli storopoli requested a review from yanganto March 25, 2024 16:10
Comment on lines -23 to +37
let bitcoind = match std::env::var_os("BITCOIND_EXE") {
let bitcoind = match std::env::var_os("BITCOIND_EXEC") {
Some(bitcoind_path) => electrsd::bitcoind::BitcoinD::new(bitcoind_path),
None => {
let bitcoind_exe = electrsd::bitcoind::downloaded_exe_path()
.expect(
"you need to provide an env var BITCOIND_EXE or specify a bitcoind version feature",
"you need to provide an env var BITCOIND_EXEC or specify a bitcoind version feature",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were driving me crazy. It was not trivial to find.
These were wrong according to the bitcoind/electrsd docs.
Glad that I was able to fix it.

Cc @LagginTimes.

@storopoli
Copy link
Contributor Author

This rebasing is turning out to be not trivial.
@yanganto can you take a look at the CI errors when you have the time, please?

@yanganto
Copy link
Contributor

You can see the error

   Downloaded js-sys v0.3.68
error: package `powerfmt v0.2.0` cannot be built because it requires rustc 1.67.0 or newer, while the currently active rustc version is 1.63.0

Hi @storopoli
What is your base? It seems not rebased on main branch.
You can see the dependency in the main is

electrsd = { version= "0.25.0", features = ["bitcoind_25_0", "esplora_a33e97e1", "legacy"] }

But now your feature branch is
https://github.com/storopoli/bdk/blob/488cdc3b8004b0c9b77ce160675dfcddbfcf7c70/crates/esplora/Cargo.toml#L26

electrsd = { version = "0.26.0", features = ["legacy"] }

And this possibly needs 1.68 as MSRV, or make an optional feature in electrsd and avoid that. (base on electrsd 0.26 -> zip 0.6 -> time 0.3.7)

Currently, time msrv is 1.68, but I think it is not worthy to dig here, because the branch is rebased to something wrong place.


Hi @ValuedMammal @evanlinjin
Rust crate is possibly dependent on the Rust version. If we do not use msrv to develop, it will possibly upgrade the dependency, not suitable for msrv and we don't know. (Dev will always say it should work but actually, it does not) That is also the reason why we need to fix the environment and nix help on this.

This is what I did a similar task before storopoli#3 for the PR. If this is still in a not merged PR, then a similar thing will keep happening. So that is why the nix needs to be introduced in the early stage of a project.

@storopoli storopoli force-pushed the js/nix-simple branch 3 times, most recently from 6f567d3 to 864f78c Compare March 26, 2024 09:56
@storopoli
Copy link
Contributor Author

What is your base? It seems not rebased on main branch. You can see the dependency in the main is

electrsd = { version= "0.25.0", features = ["bitcoind_25_0", "esplora_a33e97e1", "legacy"] }

But now your feature branch is
https://github.com/storopoli/bdk/blob/488cdc3b8004b0c9b77ce160675dfcddbfcf7c70/crates/esplora/Cargo.toml#L26

electrsd = { version = "0.26.0", features = ["legacy"] }

@yanganto this is the exact change that we we're proposing with the Nix CI.
If you

See the esplora/Cargo.toml from previous iterations before the rebasing attempt.
This is an example from your last PR to fix devshells (https://github.com/yanganto/bdk/blob/361c81bfaf988c65df24394aa349efb22bf843c7/crates/esplora/Cargo.toml#L25):

electrsd = { version = "0.26.0", features = ["legacy"] }

If we include "bitcoind_25_0", "esplora_a33e97e1" they will trigger downloads.
This was working before.

I 0368051 and 864f78c fixes that in the testenv/Cargo.toml (and some other things) which is the major new moving part in this rebasing round.

I tried to fix the clippy warnings in 9373abb but now I am getting this error:

error[E0786]: found invalid metadata files for crate `bitcoin`
   --> crates/chain/src/spk_iter.rs:202:20
    |
202 |         let secp = bitcoin::secp256k1::Secp256k1::signing_only();
    |                    ^^^^^^^
    |
    = note: invalid metadata version found: /home/runner/work/bdk/bdk/target/debug/deps/libbitcoin-9e061ce2009267ea.rmeta

I cannot reproduce this locally with nix develop -L .#msrv --command cargo clippy --all-targets --all-features -- -D warnings

@yanganto
Copy link
Contributor

If we include "bitcoind_25_0", "esplora_a33e97e1" they will trigger downloads. This was working before.
Okay, I got it. Sorry about this.

error[E0786]: found invalid metadata files for crate `bitcoin`
   --> crates/chain/src/spk_iter.rs:202:20
    |
202 |         let secp = bitcoin::secp256k1::Secp256k1::signing_only();
    |                    ^^^^^^^
    |
    = note: invalid metadata version found: /home/runner/work/bdk/bdk/target/debug/deps/libbitcoin-9e061ce2009267ea.rmeta

I cannot reproduce this locally with nix develop -L .#msrv --command cargo clippy --all-targets --all-features -- -D warnings

The error is

error: could not compile `bdk_chain` due to 34 previous errors

The bdk crate can not be built when using clippy.
Your local crate is built before, so no fresh build again.
The error is under bdk crate nothing about nix.
When doing a fresh machine with msrv Rust, we will get this just like we see in CI.

Forgot the crate bug, I will just do clippy after the build, leave the hidden bug still in the crate, and focus on the ci itself here. Also, I found the pre-commit-check does not use msrv.
storopoli#4

@ValuedMammal
Copy link
Contributor

* current PR provided a reproducible environment (every dependency and version of the software in the system are locked) for developing, CI

* this will be the first step to make things robust, I believe you heard something only works on someone's machine, build server, or CI server in a lot of project.  Nix is a mature solution to solve this.

@yanganto Thank you for the insightful comments, and especially for the work you put into this. True I haven't given this the review time it deserves, though I'm willing to be open minded when it comes to improving the build system.

@storopoli
Copy link
Contributor Author

Alright, this is now once again ready to merge to master.
Since this is on the beta milestone, we'll wait for the alpha PRs to land.
I don't see any more painful Cargo.tomls changes like the ones that occurred in master in the future.
So rebasing will probably be easier now.

Again, thanks @yanganto for the amazing work.

@storopoli storopoli force-pushed the js/nix-simple branch 3 times, most recently from 677cac9 to 103bf17 Compare April 19, 2024 13:11
@storopoli storopoli force-pushed the js/nix-simple branch 3 times, most recently from f48c1b7 to 5ee7644 Compare May 12, 2024 11:13
You can switch flawlessly from Rust stable, MSRV, and nightly using either:

- `nix develop .`: latest stable
- `nix develop ".#msrv"`: Rust MRSV
- `nix develop ".#nightly"`: Rust Nightly

Also you don't need to worry about installing binaries from `bitcoind` or `electrs` crates,
since we fetch these from `nixpkgs` (with pinned versions) and add them to your local environment.
Additionally, all the extra dependencies that you need for WASM are handled as well.
Finally, no need to set ENV vars: they are also set all by default to you.

If you are committing typos, or failing to adhere to our guidelines in [CONTRIBUTING.md](CONTRIBUTING.md);
we won't let you commit or push.
This is handled by Nix's automated `pre-commit` hooks which gives helpful error messages.

Please check the [NIX.md](NIX.md) file to learn more about Nix and what we are aiming to do with the Nix integration into BDK.
@storopoli
Copy link
Contributor Author

Another painful rebase 😓.
But this is again ready to merge once we get to beta milestones.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Needs Review
Development

Successfully merging this pull request may close these issues.

ci: workflows failing due to "Connection timed out" errors
9 participants