Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: add a few more base32/64 calculation corner cases #29847

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

paplorinc
Copy link
Contributor

@paplorinc paplorinc commented Apr 10, 2024

Added tests for base32 and base 64 conversions.

@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 10, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.
A summary of reviews will appear here.

Conflicts

No conflicts as of last run.

@paplorinc paplorinc marked this pull request as draft April 10, 2024 12:53
@paplorinc paplorinc force-pushed the paplorinc/base-32-64-padding-simplification branch 2 times, most recently from 29491c6 to 0fde5c0 Compare April 10, 2024 13:20
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/23660462203

@paplorinc paplorinc force-pushed the paplorinc/base-32-64-padding-simplification branch 2 times, most recently from c3fe309 to bb2fb45 Compare April 10, 2024 14:16
@fanquake
Copy link
Member

fanquake commented Apr 10, 2024

https://github.com/bitcoin/bitcoin/pull/29847/checks?check_run_id=23664375509

Run base_encode_decode with args ['/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz', '-runs=1', PosixPath('/ci_container_base/ci/scratch/qa-assets/fuzz_seed_corpus/base_encode_decode')]INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 1375269970
INFO: Loaded 1 modules   (578208 inline 8-bit counters): 578208 [0x55e843d75228, 0x55e843e024c8), 
INFO: Loaded 1 PC tables (578208 PCs): 578208 [0x55e843e024c8,0x55e8446d4ec8), 
INFO:     1342 files found in /ci_container_base/ci/scratch/qa-assets/fuzz_seed_corpus/base_encode_decode
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1048564 bytes
util/strencodings.cpp:213:63: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'size_type' (aka 'unsigned long')
    #0 0x55e842a7e2c4 in DecodeBase32(std::basic_string_view<char, std::char_traits<char>>) src/util/strencodings.cpp:213
    #1 0x55e840ea6c48 in base_encode_decode_fuzz_target(Span<unsigned char const>) src/test/fuzz/base_encode_decode.cpp:34:19
    #2 0x55e841344a6d in std::function<void (Span<unsigned char const>)>::operator()(Span<unsigned char const>) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
    #3 0x55e841344a6d in LLVMFuzzerTestOneInput src/test/fuzz/fuzz.cpp:180:5
    #4 0x55e840c6cad4 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a62ad4) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd)
    #5 0x55e840c6dd01 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a63d01) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd)
    #6 0x55e840c6e2f7 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a642f7) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd)
    #7 0x55e840c5b7ef in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a517ef) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd)
    #8 0x55e840c85f16 in main (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a7bf16) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd)
    #9 0x7f3697ff01c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 232274c0019767b821da1c6ebc2df43e60503035)
    #10 0x7f3697ff028a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 232274c0019767b821da1c6ebc2df43e60503035)
    #11 0x55e840c507d4 in _start (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a467d4) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd)

SUMMARY: UndefinedBehaviorSanitizer: unsigned-integer-overflow util/strencodings.cpp:213:63 
MS: 0 ; base unit: 0000000000000000000000000000000000000000

https://github.com/bitcoin/bitcoin/actions/runs/8632800923/job/23664375055?pr=29847

 Assertion failed: (ToLower(encoded_string) == ToLower(TrimString(random_encoded_string))), function base_encode_decode_fuzz_target, file base_encode_decode.cpp, line 45.

@maflcko
Copy link
Member

maflcko commented Apr 10, 2024

Thank you for your contribution. While this stylistic change makes sense on its own, it comes at a cost and risk for the project as a whole. The weak motivation for the change does not justify the burden that it places on the project. A burden could be any of the following:

  • Time spent on review
  • Accidental introduction of bugs
  • (Silent) merge conflicts, either in the branch or a backport branch. Those conflicts demand further developer and reviewer time or introduce bugs.

For more information about refactoring changes and stylistic cleanup, see

Generally, if the style is not mentioned nor enforced by the developer notes, we leave it up to the original author to pick whatever fits them best personally and then leave it that way until the line is touched for other reasons.

Leave a comment below, if you have any questions.

@paplorinc paplorinc force-pushed the paplorinc/base-32-64-padding-simplification branch from bb2fb45 to 8345c93 Compare April 10, 2024 16:06
@paplorinc
Copy link
Contributor Author

paplorinc commented Apr 10, 2024

Thanks for the review @fanquake and @maflcko - though it was still in draft form because of the unsigned problems.

I didn't mean the change as a purely stylistic refactoring.

In the encodings the original str.reserve doesn't enforce the final size, but after the change it's obvious that it's an exact calculation, not just an upper bound, i.e. while (str.size() % 4) str += '='; vs str.append(size - str.size(), '=');.

In the second change the point was to unify the suffix removal with other trim algos (e.g. TrimStringView). This would unify the 32 and 64 bit algos and obviate the inconsistency between the 32 and 64 base conversions for paddings (i.e. roundtrip tests aren't working without padding).

If you think the changes are controversial, I can commit the test and the proposed changes separately.

@paplorinc paplorinc force-pushed the paplorinc/base-32-64-padding-simplification branch 2 times, most recently from 8345c93 to 9c1ae8a Compare April 10, 2024 21:39
@paplorinc
Copy link
Contributor Author

@maflcko, you were right, the changes didn't add enough value, I've reverted them.
I've kept the tests, since I think they're valuable and unified the style between base 32 and 64 in the source. If you think that's still too much, I'll revert it.

@paplorinc paplorinc changed the title refactor: Simplify base32/64 padding calculations test: add a few more base32/64 calculation corner cases Apr 10, 2024
@paplorinc paplorinc changed the title test: add a few more base32/64 calculation corner cases refactor/test: add a few more base32/64 calculation corner cases Apr 10, 2024
@paplorinc paplorinc marked this pull request as ready for review April 10, 2024 21:50
@maflcko
Copy link
Member

maflcko commented Apr 11, 2024

Please make sure to re-read the whole comment I shared and make sure to read all pages it links to. The burden to clearly motivate a change (and the review effort it requires) is on the pull request author (in this case here it would be you).

Open Source does not work by doing random unmotivated changes, opening a pull request, seeing it fail CI and tests, then adjust it in a ping-pong fashion to make CI green and go back and forth with reviewers until they have essentially written the whole changeset themselves.

At a minimum a (refactoring) pull request should have:

  • A clear motivation
  • Be correct and easy to review
  • Have passing tests and CI

If any of the above do not apply, a refactoring pull request should not be opened in the first place.

@paplorinc
Copy link
Contributor Author

@maflcko, which of the listed motivation/review/passing CI doesn't apply here?

@maflcko
Copy link
Member

maflcko commented Apr 11, 2024

All of it.

  • "unified ... code" is not a motivation, but a style choice
  • The CI didn't pass on several tries
  • Review doesn't seem worth it, unless there is a motivation

@@ -2,10 +2,11 @@
// Distributed under the MIT software license, see the accompanying
// file COPYING or http://www.opensource.org/licenses/mit-license.php.

#include <string>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the wrong style in any case, see the dev notes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can reorder them if you think it's worth it.
If you think these contributions are worthless I'll stop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can reorder them if you think it's worth it.

I don't understand what you are trying to say.

I am saying that the order was correct before and you changed it to be wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new import and the imports were reorganized as such. If you don't like the order, I can change it, I don't mind, though I would prefer the CI telling me these in the PR instead of wasting time reviewing styles.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you are trying to say.

I'm trying to find out if my contributions are welcome, I've only received hostility so far in return for my invested effort.
It's a decentralized, open source project, I'll contribute what I can and want. If that's not welcome, I'll find a different project where my optimizations are welcome.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to find out if my contributions are welcome, I've only received hostility so far in return for my invested effort.

Contributions are welcome and contributors are needed. However, refactoring changes are usually the wrong pick for a newcomer to tackle. This is explained https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#refactoring

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also https://bitcoincore.academy/

If you are looking for useful contributions to help out with, you can

  • Search through the good first issues or the ones that are up for grabs. Some of them might no longer be applicable. So if you are interested, but unsure, you might want to leave a comment on the issue first.
  • Write tests to improve the coverage. Any kind of test is welcome and coverage information can be obtained from a relatively recent coverage report. If you are unsure, don't hesitate to check back first.
  • Help with review and testing. There are easy ones such as the gui and rpc. However, review on any open pull request is welcome. Review will also help you understand the codebase better.
  • Help on meta projects related to Bitcoin Core, such as https://github.com/corecheck.
  • Join the Bitcoin Core IRC channel(s) and leave a comment to share what you are interested in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking for introductory issues, couldn't find any that was a good fit for me.
But just looking at the code I found several optimization opportunities in relevant parts of the code, so I contributed those instead, e.g. the 4x speedup of base58.

Copy link
Contributor Author

@paplorinc paplorinc Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you consider optimizations useless refactorings? Other projects I've worked on consider them features.
For me it's a gentle introduction to the codebase, isolated from other complexities, I poke it left-and-right and see how the code behaves.
Is this not a valuable contribution? Especially since I usually cover the corner cases I find with new test cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are generally a good way to start contributing.

Just generally for each pull request, it needs review before merge, and each reviewer will have to trade-off whether it is worth it to review. There are 316 other pull requests open waiting for review, so that is another thing for reviewers to trade-off on.

@paplorinc paplorinc force-pushed the paplorinc/base-32-64-padding-simplification branch from 9c1ae8a to c6ee976 Compare April 11, 2024 11:41
src/util/strencodings.cpp Outdated Show resolved Hide resolved
@paplorinc paplorinc force-pushed the paplorinc/base-32-64-padding-simplification branch from c6ee976 to 8ecae4d Compare April 12, 2024 07:26
@laanwj
Copy link
Member

laanwj commented Apr 16, 2024

Thanks for your first contribution!

As it is hard to review PRs that change things all over the place, this is unlikely to be merged like this. I'd suggest keeping the code change focused. For example, make it add test cases only, and remove the changes (refactoring and otherwise) to strecodings.cpp and other non-test code.

@laanwj laanwj added the Tests label Apr 16, 2024
@paplorinc
Copy link
Contributor Author

Hey @laanwj, thanks for your first review.
Please see the commits separately, don't worry, they're not all over the place.

str.begin(), str.end(),
[](char c) { return decode32_table[uint8_t(c)]; }
[](auto c) { return decode32_table[uint8_t(c)]; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? This style change makes the code harder to read.

Unless there is a reason to change the code and review it, it shouldn't be changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed it to unify it with the previous lambda. Why would I leave the type there when I've used auto 2 lines above?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment applies to all changes in this file. I am just trying to explain why such style changes do not make sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the 3/4 to 6/8 change, kept only the test.
I don't yet understand how we expect this code to become more and more maintainable, if we don't regularly leave the campground cleaner than we've found it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how using auto makes anything clearer.

There certainly are style and language features that make maintenance easier, but they need to be well motivated and explained. Also, if there is a good reason for them, they should be applied globally via CI and checks. Otherwise, they are again applied inconsistently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Is there any solution?

Is there any non-test change that I could delve into—preferably an optimization that doesn't require me to understand every part of the code—that would be welcome?
For example, like my other base58 or bech32 speedups.
Or am I just barking up the wrong tree and should move on to other projects?

@maflcko
Copy link
Member

maflcko commented Apr 16, 2024

@paplorinc If you are not being receptive to reviewers' feedback, it is unlikely that anyone will review the pull request and without review it will not be merged.

@paplorinc
Copy link
Contributor Author

@maflcko, I've responded to every single comment within minutes. I just don't appreciate the patronizing style I keep seeing in this repo.

@laanwj
Copy link
Member

laanwj commented Apr 16, 2024

Take it how you wish, i was just trying to be helpful, not patronizing.

@maflcko
Copy link
Member

maflcko commented Apr 16, 2024

Please understand that this software project requires review of all code changes. It is not a typical software project that auto-merges any change as soon as the CI is green.

When writing a patch as a pull request, please consider how reviewers are supposed to review it and how easy it will be for them to follow the changes and their motivation.

@paplorinc paplorinc force-pushed the paplorinc/base-32-64-padding-simplification branch from 8ecae4d to a6ad9b9 Compare April 16, 2024 09:49
@paplorinc paplorinc changed the title refactor/test: add a few more base32/64 calculation corner cases test: add a few more base32/64 calculation corner cases Apr 16, 2024
@fanquake
Copy link
Member

I don't yet understand how we expect this code to become more and more maintainable, if we don't regularly leave the campground cleaner than we've found it.

The issue here is that even if you've found the campground, and while maybe it looks like it could use a little maintenance, if you zoom out for a birds-eye view, you'll actually notice that everyone else is busy somewhere next door, trying to fight the forest fires. Even if the code here is (maybe arbitrarily) "better" or "faster" by some metric, the impact to the whole project is ~0. In fact, it could even be argued that it's a net negative, given the burden of changing and reviewing the code, probably doesn't outweigh distracting engineers working on more important things and/or actually breaking something that already works (and will likely continue to work for many many years) + all the time spent here.

I understand that as a new contributor, you're looking for things to do, and picking random utility functions/code, and trying to generically improve it may seem easiest / appealing, and potentially something that other open source projects will happily take, however that isn't how things work here.

I'd suggest looking through https://github.com/bitcoin/bitcoin/issues (you can filter for "good first issues"), and either picking a bug that needs fixing, and working on fixing it, or, looking through, https://github.com/bitcoin/bitcoin/pulls, and finding something that interests you, and leaving some meaningful review comments. These will be much more productive avenues for you to become a productive contributor to this project. You should also familiarise yourself with https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md and the https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md.

@paplorinc
Copy link
Contributor Author

Thanks for the detailed answer @fanquake.

everyone else is busy [...] trying to fight the forest fires

So how do we prevent future forest fires while extinguishing current ones?

I have looked through good first issue since the beginning, there's barely any, all of which were started by others, often years ago.
That's why I started working on what I can contribute instead.

@maflcko
Copy link
Member

maflcko commented Apr 16, 2024

Another great way to contribute would be to spend time on in-depth review of open pull requests and compare your review with the review done by others. Obviously this will be time-consuming, starting out, but if you care about the project, over time it will teach you relevant skills (for this project and for yourself to apply elsewhere).

On some pull requests, there remain small follow-ups that were found during review, so you'll naturally find stuff to work on if you start out by reviewing pull requests done by others. Another benefit of this is, that if you end up creating a follow-up, the code will be fresh in the minds of some developers, who may be willing to spend time on review.

The new randomized roundtrips for base32 and base64 encoding make sure encoding and decoding are symmetric.
Note that if we omit the padding in EncodeBase32 we won't be able to decode it with DecodeBase32.
Added dedicated padding tests to hardcode the failure behavior.
@paplorinc paplorinc force-pushed the paplorinc/base-32-64-padding-simplification branch from a6ad9b9 to 9967fb5 Compare May 11, 2024 19:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants