Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpc: avoid copying into UniValue #30115

Merged
merged 1 commit into from
May 23, 2024
Merged

Conversation

theuni
Copy link
Member

@theuni theuni commented May 15, 2024

These are the simple (and hopefully obviously correct) copies that can be moves instead.

This is a follow-up from #30094 (comment)

As it turns out, there are hundreds of places where we copy UniValues needlessly. It should be the case that moves are always preferred over copies, so there should be no downside to these changes.

willcl-ark, however, noticed that memory usage may increase in some cases. Logically this makes no sense to me. The only plausible explanation imo is that because the moves are faster, more ops/second occur in some cases.

This list of moves was obtained by changing the function signatures of the UniValue functions to accept only rvalues, then compiling and fixing them up one by one. There still exist many places where copies are being made. These can/should be fixed up, but weren't done here for the sake of doing the easy ones first.

I ran these changes through clang-tidy with performance-move-const-arg and bugprone-use-after-move and no bugs were detected (though that's obviously not to say it can be trusted 100%).

As stated above, there are still lots of other less trivial fixups to do after these including:

  • Using non-const UniValues where possible so that moves can happen
  • Refactoring code in order to be able to move a UniValue without introducing a use-after-move
  • Refactoring functions to accept UniValues by value rather than by const reference

@DrahtBot
Copy link
Contributor

DrahtBot commented May 15, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK achow101, ryanofsky, willcl-ark
Concept ACK TheCharlatan, hebasto, laanwj, stickies-v

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 35b5b0f. I confirmed this change is only adding std::moves, and also tried to look for cases where an object was being used after a move, since that seems like the most likely type of bug this PR could introduce. I didn't check exhaustively, though, since that would be a lot more work and I think we have a clang-tidy check that would catch those cases.

@theuni
Copy link
Member Author

theuni commented May 16, 2024

I didn't check exhaustively, though, since that would be a lot more work and I think we have a clang-tidy check that would catch those cases.

Indeed, I believe c-i should be doing that check. But just in case, from the PR description:

I ran these changes through clang-tidy with performance-move-const-arg and bugprone-use-after-move and no bugs were detected (though that's obviously not to say it can be trusted 100%).

@TheCharlatan
Copy link
Contributor

Concept ACK

Could this be a case for a clang-tidy plugin?

@theuni
Copy link
Member Author

theuni commented May 16, 2024

Could this be a case for a clang-tidy plugin?

I considered this, but I don't think so. UniValue copies are reasonable in many cases, but our use of them often lends itself to moving. So we can't detect and disable copies as a rule, and (I assume) if clang-tidy could detect and suggest possible moves as optimizations, it would be offering a generic version of that already.

Here's an upstream discussion about it that has apparently gone stale: llvm/llvm-project#53489

@hebasto
Copy link
Member

hebasto commented May 16, 2024

Concept ACK.

@ryanofsky
Copy link
Contributor

Could this be a case for a clang-tidy plugin?

I think the ideal thing to do for types that are potentially expensive to copy is to disable implicit copies, but provide explicit T Copy() const and void CopyFrom(const T&) methods so copies can be made where needed. (There was a DISABLE_IMPLICIT_COPIES macro proposed in #24641 that tried to do this, but it didn't seem possible to generalize the implementation so we never added it.) It think it would be probably be reasonable to allow univalue to be explicitly but not implicitly copied, but I'm sure @theuni would have a better intuition.

@laanwj
Copy link
Member

laanwj commented May 16, 2024

Concept ACK.

Agree with @ryanofsky that if copy is something expensive (or generally undesirable) to do, it would make sense to make copy explicit, so that it is always a deliberate choice and stands out in code review.

@stickies-v
Copy link
Contributor

Concept ACK

Indeed, I believe c-i should be doing that check.

It does:

bugprone-use-after-move,

@willcl-ark
Copy link
Member

Concept ACK.

The code changes all look correct to me too, however

@willcl-ark, however, noticed that memory usage may increase in some cases. Logically this makes no sense to me. The only plausible explanation imo is that because the moves are faster, more ops/second occur in some cases.

I still have not been able to explain why i see these type of results, but I think it's most likely user error somewhere along the line.

Here is one example of a puzzling result I have, in this test fetching every 5000th block via REST (previously largely addressed by #30094). In all screenshots this PR is on the right hand side:

We can see here this PR made 96m allocations, vs 137m, a strict improvement?

screenshot_20240517-131022

The unnecessary calls to allocator functions are clearly gotten rid of:

screenshot_20240517-131039

However despite that total resource usage increases, albeit only a little, ~10MB (also see first image) :

screenshot_20240517-131046

easy-univalue-moves-comparison

The changes in this PR which affect this test (since #30094) are in core_write.cpp

I guess this may just be allocator stuff I might never get to the bottom of, but I will continue to investigate this for a little bit before bringing a full ACK back :)

FWIW these changes did provide a measurable speedup on my test, roughly of this order:

  • this PR: ~ 1:43
  • master: ~ 1:55

I also see similar result just fetching a single block (in this case block 800,000). Many fewer allocations, but larger total usage:

screenshot_20240517-134351

screenshot_20240517-134513

I think the clues may lie in here, but I am too dumb to know what they are. For some reason after this PR a second ~30MB allocation is made. Without this PR there are 37.2MB and 24.0MB allocations made,, and with this PR those are 38.2MB and 30.9MB, and I don't understand why:

screenshot_20240517-135346

heaptrack dumps in case anyone else wants to take a look:

heaptrack_dumps.zip

@theuni
Copy link
Member Author

theuni commented May 17, 2024

@willcl-ark Thanks, that's very helpful. Out of curiosity, what build options are you using for these? Are optimizations on?

@maflcko
Copy link
Member

maflcko commented May 20, 2024

Maybe rebase and re-bench after 75118a6, which replaced some copies with moves as well?

@theuni
Copy link
Member Author

theuni commented May 20, 2024

Maybe rebase and re-bench after 75118a6, which replaced some copies with moves as well?

Ah, great, thanks for pointing me to that. I had pretty much this same commit locally to address the third point in this PR's description:

  • Refactoring functions to accept UniValues by value rather than by const reference

Though I think there are still other functions that need the same treatment.

These are simple (and hopefully obviously correct) copies that can be moves
instead.
@theuni
Copy link
Member Author

theuni commented May 22, 2024

I think it's not really worth worrying about benchmarks here, I think it's pretty clear that moving can only be better.

I'd like to get this in if there are no objections, as there's some more low-hanging fruit wrt UniValue moves.

@achow101
Copy link
Member

ACK d7707d9

Checked that only std::moves were added, and letting the compiler warn for use after move, which there does not appear to be any.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK d7707d9. No changes since last review other than rebase. I agree benchmarks showing increased peak memory usage and RSS are surprising, but number of allocations is down as expected, and runtime is also decreased.

The fact that "peak allocations" appear to be larger after the change seems interesting. Before the change, the largest allocations are 38.4MB, 26.7MB, and 8.4MB, but after the change they are 39.4MB, 33.7MB, and 10.8MB. I'm not sure what could account for this though.

In general, it seems like using std::move could increase memory usage with vectors, because if you std::move into a vector, the new vector has the same capacity as the existing vector, but when you copy into a vector, the library could choose a smaller capacity just matching the size, and overall memory usage would be lower after the first vector is destroyed.

@sipa
Copy link
Member

sipa commented May 22, 2024

It's unclear to me whether the UniValue type has a move constructor, actually. If not, then these std::moves have no effect.

@ryanofsky If you theory is correct, then having a way to invoke shrink_to_fit on UniValue's internal vector might be useful.

@theuni
Copy link
Member Author

theuni commented May 22, 2024

It's unclear to me whether the UniValue type has a move constructor, actually. If not, then these std::moves have no effect.

@sipa I worked up a quick bench to test exactly that: theuni@35d5ffc

The numbers look as you'd expect.

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        7,273,304.00 |              137.49 |    0.1% |      0.08 | `UniValueAssignCopy`
|        1,040,156.00 |              961.39 |    0.5% |      0.02 | `UniValueAssignMove`
|        9,336,776.00 |              107.10 |    0.4% |      0.10 | `UniValueConstructCopy`
|          991,626.00 |            1,008.44 |    0.7% |      0.02 | `UniValueConstructMove`
|       11,783,927.00 |               84.86 |    0.2% |      0.13 | `UniValuePushKVCopy`
|        2,344,899.00 |              426.46 |    0.3% |      0.03 | `UniValuePushKVMove`

@maflcko
Copy link
Member

maflcko commented May 23, 2024

It's unclear to me whether the UniValue type has a move constructor, actually. If not, then these std::moves have no effect.

Since this question keeps coming up (#24542 (comment)), what about adding static_assert(std::is_move_constructible_v<UniValue>); somewhere?

@maflcko
Copy link
Member

maflcko commented May 23, 2024

@ryanofsky If you theory is correct, then having a way to invoke shrink_to_fit on UniValue's internal vector might be useful.

Not sure if this is ideal. UniValue is a recursive structure, so calling it on the top level vector only shouldn't cause any difference? Similarly, if shirking is done recursively, the runtime overhead will be equal to that of a copy, so might as well just do a copy instead? Finally, whenever a UniValue would be ready to shrink, it is usually one step away from being deleted completely anyway. The only missing step is to call write() on it to convert it to bytes to send on the wire. However that leads to an alternative, where a new write_and_destroy function is added, which recursively shrinks (deletes) the UniValue as soon as it was written to the stream. From a runtime-overhead this should be free (as the univalue is destroyed afterward anyway). I haven't tested or implemented this, but this may be a start:

diff
diff --git a/src/common/args.cpp b/src/common/args.cpp
index c90eb0c685..e66f18277d 100644
--- a/src/common/args.cpp
+++ b/src/common/args.cpp
@@ -248,7 +248,7 @@ bool ArgsManager::ParseParameters(int argc, const char* const argv[], std::strin
         const common::SettingsSpan values{*includes};
         // Range may be empty if -noincludeconf was passed
         if (!values.empty()) {
-            error = "-includeconf cannot be used from commandline; -includeconf=" + values.begin()->write();
+            error = "-includeconf cannot be used from commandline; -includeconf=" /*+ values.begin()->write_const()*/;
             return false; // pick first value as example
         }
     }
@@ -811,7 +811,7 @@ void ArgsManager::logArgsPrefix(
         for (const auto& value : arg.second) {
             std::optional<unsigned int> flags = GetArgFlags('-' + arg.first);
             if (flags) {
-                std::string value_str = (*flags & SENSITIVE) ? "****" : value.write();
+                std::string value_str = (*flags & SENSITIVE) ? "****" : ""/*value.write_const()*/;
                 LogPrintf("%s %s%s=%s\n", prefix, section_str, arg.first, value_str);
             }
         }
@@ -825,7 +825,7 @@ void ArgsManager::LogArgs() const
         logArgsPrefix("Config file arg:", section.first, section.second);
     }
     for (const auto& setting : m_settings.rw_settings) {
-        LogPrintf("Setting file arg: %s = %s\n", setting.first, setting.second.write());
+        LogPrintf("Setting file arg: %s = %s\n", setting.first, ""/*setting.second.write_const()*/);
     }
     logArgsPrefix("Command-line arg:", "", m_settings.command_line_options);
 }
diff --git a/src/common/settings.cpp b/src/common/settings.cpp
index c1520dacd2..268ab39f1e 100644
--- a/src/common/settings.cpp
+++ b/src/common/settings.cpp
@@ -99,7 +99,7 @@ bool ReadSettings(const fs::path& path, std::map<std::string, SettingsValue>& va
     file.close(); // Done with file descriptor. Release while copying data.
 
     if (!in.isObject()) {
-        errors.emplace_back(strprintf("Found non-object value %s in settings file %s", in.write(), fs::PathToString(path)));
+        errors.emplace_back(strprintf("Found non-object value %s in settings file %s", in.write_and_destroy(), fs::PathToString(path)));
         return false;
     }
 
@@ -138,7 +138,7 @@ bool WriteSettings(const fs::path& path,
         errors.emplace_back(strprintf("Error: Unable to open settings file %s for writing", fs::PathToString(path)));
         return false;
     }
-    file << out.write(/* prettyIndent= */ 4, /* indentLevel= */ 1) << std::endl;
+    file << out.write_and_destroy(/* prettyIndent= */ 4, /* indentLevel= */ 1) << std::endl;
     file.close();
     return true;
 }
diff --git a/src/httprpc.cpp b/src/httprpc.cpp
index 3eb34dbe6a..91b1574afc 100644
--- a/src/httprpc.cpp
+++ b/src/httprpc.cpp
@@ -87,7 +87,7 @@ static void JSONErrorReply(HTTPRequest* req, UniValue objError, const JSONRPCReq
     else if (code == RPC_METHOD_NOT_FOUND)
         nStatus = HTTP_NOT_FOUND;
 
-    std::string strReply = JSONRPCReplyObj(NullUniValue, std::move(objError), jreq.id, jreq.m_json_version).write() + "\n";
+    std::string strReply = JSONRPCReplyObj(NullUniValue, std::move(objError), jreq.id, jreq.m_json_version).write_and_destroy() + "\n";
 
     req->WriteHeader("Content-Type", "application/json");
     req->WriteReply(nStatus, strReply);
@@ -273,7 +273,7 @@ static bool HTTPReq_JSONRPC(const std::any& context, HTTPRequest* req)
             throw JSONRPCError(RPC_PARSE_ERROR, "Top-level object parse error");
 
         req->WriteHeader("Content-Type", "application/json");
-        req->WriteReply(HTTP_OK, reply.write() + "\n");
+        req->WriteReply(HTTP_OK, reply.write_and_destroy() + "\n");
     } catch (UniValue& e) {
         JSONErrorReply(req, std::move(e), jreq);
         return false;
diff --git a/src/rest.cpp b/src/rest.cpp
index 9fc5d4af04..f83133e187 100644
--- a/src/rest.cpp
+++ b/src/rest.cpp
@@ -269,7 +269,7 @@ static bool rest_headers(const std::any& context,
         for (const CBlockIndex *pindex : headers) {
             jsonHeaders.push_back(blockheaderToJSON(*tip, *pindex));
         }
-        std::string strJSON = jsonHeaders.write() + "\n";
+        std::string strJSON = jsonHeaders.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -338,7 +338,7 @@ static bool rest_block(const std::any& context,
         DataStream block_stream{block_data};
         block_stream >> TX_WITH_WITNESS(block);
         UniValue objBlock = blockToJSON(chainman.m_blockman, block, *tip, *pblockindex, tx_verbosity);
-        std::string strJSON = objBlock.write() + "\n";
+        std::string strJSON = objBlock.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -472,7 +472,7 @@ static bool rest_filter_header(const std::any& context, HTTPRequest* req, const
             jsonHeaders.push_back(header.GetHex());
         }
 
-        std::string strJSON = jsonHeaders.write() + "\n";
+        std::string strJSON = jsonHeaders.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -564,7 +564,7 @@ static bool rest_block_filter(const std::any& context, HTTPRequest* req, const s
     case RESTResponseFormat::JSON: {
         UniValue ret(UniValue::VOBJ);
         ret.pushKV("filter", HexStr(filter.GetEncodedFilter()));
-        std::string strJSON = ret.write() + "\n";
+        std::string strJSON = ret.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -591,7 +591,7 @@ static bool rest_chaininfo(const std::any& context, HTTPRequest* req, const std:
         jsonRequest.context = context;
         jsonRequest.params = UniValue(UniValue::VARR);
         UniValue chainInfoObject = getblockchaininfo().HandleRequest(jsonRequest);
-        std::string strJSON = chainInfoObject.write() + "\n";
+        std::string strJSON = chainInfoObject.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -634,7 +634,7 @@ static bool rest_deploymentinfo(const std::any& context, HTTPRequest* req, const
         }
 
         req->WriteHeader("Content-Type", "application/json");
-        req->WriteReply(HTTP_OK, getdeploymentinfo().HandleRequest(jsonRequest).write() + "\n");
+        req->WriteReply(HTTP_OK, getdeploymentinfo().HandleRequest(jsonRequest).write_and_destroy() + "\n");
         return true;
     }
     default: {
@@ -685,9 +685,9 @@ static bool rest_mempool(const std::any& context, HTTPRequest* req, const std::s
             if (verbose && mempool_sequence) {
                 return RESTERR(req, HTTP_BAD_REQUEST, "Verbose results cannot contain mempool sequence values. (hint: set \"verbose=false\")");
             }
-            str_json = MempoolToJSON(*mempool, verbose, mempool_sequence).write() + "\n";
+            str_json = MempoolToJSON(*mempool, verbose, mempool_sequence).write_and_destroy() + "\n";
         } else {
-            str_json = MempoolInfoToJSON(*mempool).write() + "\n";
+            str_json = MempoolInfoToJSON(*mempool).write_and_destroy() + "\n";
         }
 
         req->WriteHeader("Content-Type", "application/json");
@@ -747,7 +747,7 @@ static bool rest_tx(const std::any& context, HTTPRequest* req, const std::string
     case RESTResponseFormat::JSON: {
         UniValue objTx(UniValue::VOBJ);
         TxToUniv(*tx, /*block_hash=*/hashBlock, /*entry=*/ objTx);
-        std::string strJSON = objTx.write() + "\n";
+        std::string strJSON = objTx.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -940,7 +940,7 @@ static bool rest_getutxos(const std::any& context, HTTPRequest* req, const std::
         objGetUTXOResponse.pushKV("utxos", utxos);
 
         // return json string
-        std::string strJSON = objGetUTXOResponse.write() + "\n";
+        std::string strJSON = objGetUTXOResponse.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -992,7 +992,7 @@ static bool rest_blockhash_by_height(const std::any& context, HTTPRequest* req,
         req->WriteHeader("Content-Type", "application/json");
         UniValue resp = UniValue(UniValue::VOBJ);
         resp.pushKV("blockhash", pblockindex->GetBlockHash().GetHex());
-        req->WriteReply(HTTP_OK, resp.write() + "\n");
+        req->WriteReply(HTTP_OK, resp.write_and_destroy() + "\n");
         return true;
     }
     default: {
diff --git a/src/rpc/util.cpp b/src/rpc/util.cpp
index f5a2e9eb63..4c1d64eb35 100644
--- a/src/rpc/util.cpp
+++ b/src/rpc/util.cpp
@@ -166,7 +166,7 @@ std::string HelpExampleCliNamed(const std::string& methodname, const RPCArgList&
     for (const auto& argpair: args) {
         const auto& value = argpair.second.isStr()
                 ? argpair.second.get_str()
-                : argpair.second.write();
+                : ""/*argpair.second.write()*/;
         result += " " + argpair.first + "=" + ShellQuoteIfNeeded(value);
     }
     result += "\n";
@@ -187,7 +187,7 @@ std::string HelpExampleRpcNamed(const std::string& methodname, const RPCArgList&
     }
 
     return "> curl --user myusername --data-binary '{\"jsonrpc\": \"2.0\", \"id\": \"curltest\", "
-           "\"method\": \"" + methodname + "\", \"params\": " + params.write() + "}' -H 'content-type: text/plain;' http://127.0.0.1:8332/\n";
+           "\"method\": \"" + methodname + "\", \"params\": " + params.write_and_destroy() + "}' -H 'content-type: text/plain;' http://127.0.0.1:8332/\n";
 }
 
 // Converts a hex string to a public key if possible
@@ -631,7 +631,7 @@ UniValue RPCHelpMan::HandleRequest(const JSONRPCRequest& request) const
         }
     }
     if (!arg_mismatch.empty()) {
-        throw JSONRPCError(RPC_TYPE_ERROR, strprintf("Wrong type passed:\n%s", arg_mismatch.write(4)));
+        throw JSONRPCError(RPC_TYPE_ERROR, strprintf("Wrong type passed:\n%s", arg_mismatch.write_and_destroy(4)));
     }
     CHECK_NONFATAL(m_req == nullptr);
     m_req = &request;
@@ -650,8 +650,8 @@ UniValue RPCHelpMan::HandleRequest(const JSONRPCRequest& request) const
         if (!mismatch.isNull()) {
             std::string explain{
                 mismatch.empty() ? "no possible results defined" :
-                mismatch.size() == 1 ? mismatch[0].write(4) :
-                mismatch.write(4)};
+                mismatch.size() == 1 ? ""/* mismatch[0].write_and_destroy(4) */:
+                mismatch.write_and_destroy(4)};
             throw std::runtime_error{
                 strprintf("Internal bug detected: RPC call \"%s\" returned incorrect type:\n%s\n%s %s\nPlease report this issue here: %s\n",
                           m_name, explain,
@@ -950,7 +950,7 @@ std::string RPCArg::ToDescriptionString(bool is_named_arg) const
     if (m_fallback.index() == 1) {
         ret += ", optional, default=" + std::get<RPCArg::DefaultHint>(m_fallback);
     } else if (m_fallback.index() == 2) {
-        ret += ", optional, default=" + std::get<RPCArg::Default>(m_fallback).write();
+        ret += ", optional, default=" ;//+ std::get<RPCArg::Default>(m_fallback).write();
     } else {
         switch (std::get<RPCArg::Optional>(m_fallback)) {
         case RPCArg::Optional::OMITTED: {
diff --git a/src/univalue/include/univalue.h b/src/univalue/include/univalue.h
index da12157555..cc741eb84f 100644
--- a/src/univalue/include/univalue.h
+++ b/src/univalue/include/univalue.h
@@ -73,7 +73,9 @@ public:
     void getObjMap(std::map<std::string,UniValue>& kv) const;
     bool checkObject(const std::map<std::string,UniValue::VType>& memberTypes) const;
     const UniValue& operator[](const std::string& key) const;
+    UniValue& operator[](const std::string& key);
     const UniValue& operator[](size_t index) const;
+    UniValue& operator[](size_t index);
     bool exists(const std::string& key) const { size_t i; return findKey(key, i); }
 
     bool isNull() const { return (typ == VNULL); }
@@ -94,8 +96,7 @@ public:
     void pushKV(std::string key, UniValue val);
     void pushKVs(UniValue obj);
 
-    std::string write(unsigned int prettyIndent = 0,
-                      unsigned int indentLevel = 0) const;
+    std::string write_and_destroy(unsigned int prettyIndent = 0,                       unsigned int indentLevel = 0) ;
 
     bool read(std::string_view raw);
 
@@ -107,8 +108,8 @@ private:
 
     void checkType(const VType& expected) const;
     bool findKey(const std::string& key, size_t& retIdx) const;
-    void writeArray(unsigned int prettyIndent, unsigned int indentLevel, std::string& s) const;
-    void writeObject(unsigned int prettyIndent, unsigned int indentLevel, std::string& s) const;
+    void writeArray(unsigned int prettyIndent, unsigned int indentLevel, std::string& s);
+    void writeObject(unsigned int prettyIndent, unsigned int indentLevel, std::string& s);
 
 public:
     // Strict type-specific getters, these throw std::runtime_error if the
diff --git a/src/univalue/lib/univalue.cpp b/src/univalue/lib/univalue.cpp
index 656d2e8203..8ceab1fc31 100644
--- a/src/univalue/lib/univalue.cpp
+++ b/src/univalue/lib/univalue.cpp
@@ -197,6 +197,13 @@ const UniValue& UniValue::operator[](const std::string& key) const
     return values.at(index);
 }
 
+UniValue& UniValue::operator[](const std::string& key)
+{
+    size_t index(-1);
+    if (!findKey(key, index)) index=-1;
+    return values.at(index);
+}
+
 const UniValue& UniValue::operator[](size_t index) const
 {
     if (typ != VOBJ && typ != VARR)
@@ -207,6 +214,11 @@ const UniValue& UniValue::operator[](size_t index) const
     return values.at(index);
 }
 
+UniValue& UniValue::operator[](size_t index)
+{
+    return values.at(index);
+}
+
 void UniValue::checkType(const VType& expected) const
 {
     if (typ != expected) {
diff --git a/src/univalue/lib/univalue_write.cpp b/src/univalue/lib/univalue_write.cpp
index 4a2219061a..df10467fb6 100644
--- a/src/univalue/lib/univalue_write.cpp
+++ b/src/univalue/lib/univalue_write.cpp
@@ -28,8 +28,8 @@ static std::string json_escape(const std::string& inS)
 }
 
 // NOLINTNEXTLINE(misc-no-recursion)
-std::string UniValue::write(unsigned int prettyIndent,
-                            unsigned int indentLevel) const
+std::string UniValue::write_and_destroy(unsigned int prettyIndent,
+                            unsigned int indentLevel) 
 {
     std::string s;
     s.reserve(1024);
@@ -59,6 +59,7 @@ std::string UniValue::write(unsigned int prettyIndent,
         break;
     }
 
+*this = {}; // delete
     return s;
 }
 
@@ -68,7 +69,7 @@ static void indentStr(unsigned int prettyIndent, unsigned int indentLevel, std::
 }
 
 // NOLINTNEXTLINE(misc-no-recursion)
-void UniValue::writeArray(unsigned int prettyIndent, unsigned int indentLevel, std::string& s) const
+void UniValue::writeArray(unsigned int prettyIndent, unsigned int indentLevel, std::string& s)
 {
     s += "[";
     if (prettyIndent)
@@ -77,7 +78,7 @@ void UniValue::writeArray(unsigned int prettyIndent, unsigned int indentLevel, s
     for (unsigned int i = 0; i < values.size(); i++) {
         if (prettyIndent)
             indentStr(prettyIndent, indentLevel, s);
-        s += values[i].write(prettyIndent, indentLevel + 1);
+        s += values[i].write_and_destroy(prettyIndent, indentLevel + 1);
         if (i != (values.size() - 1)) {
             s += ",";
         }
@@ -91,7 +92,7 @@ void UniValue::writeArray(unsigned int prettyIndent, unsigned int indentLevel, s
 }
 
 // NOLINTNEXTLINE(misc-no-recursion)
-void UniValue::writeObject(unsigned int prettyIndent, unsigned int indentLevel, std::string& s) const
+void UniValue::writeObject(unsigned int prettyIndent, unsigned int indentLevel, std::string& s)
 {
     s += "{";
     if (prettyIndent)
@@ -103,7 +104,7 @@ void UniValue::writeObject(unsigned int prettyIndent, unsigned int indentLevel,
         s += "\"" + json_escape(keys[i]) + "\":";
         if (prettyIndent)
             s += " ";
-        s += values.at(i).write(prettyIndent, indentLevel + 1);
+        s += values.at(i).write_and_destroy(prettyIndent, indentLevel + 1);
         if (i != (values.size() - 1))
             s += ",";
         if (prettyIndent)

Copy link
Member

@willcl-ark willcl-ark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK d7707d9

Whilst I still measure slightly increased resource usage vs master, even after adding some strategic calls to shrink_to_fit() on the Univalue.values member (which likely didn't work for the reasons @maflcko described above), I am now convinced this is not something to worry about in practice.

I've done more measurements over the last few days, and even when testing codepaths which create very large numbers of UniValues, I only ever see a few MB more RAM being used at peak. In the order of single digit percentage points. And, this excess does not appear to grow over time.

Therefore in my opinion this effect, where it's measurable/noticable at all, is more than offset by the reduced number of allocations (e.g. I've seen 106,000,000 allocations being reduced to 60,000,000 in larger tests using e.g. blockToJSON), and the speedups shown using @theuni's benchmark.

@ryanofsky ryanofsky merged commit 6300438 into bitcoin:master May 23, 2024
16 checks passed
@ryanofsky
Copy link
Contributor

Went ahead and merged this. It seems the increased peak memory usage is not a major concern, and other metrics like number of allocations and run time do seem improved.

Marco seems right that naively calling shrink_to_fit might not help much, or could be basically equivalent to copying, and there might be better ways to decrease memory usage, such as with a write_and_destroy method.

(Note: I also edited PR description to remove @ from @willcl-ark, so Will doesn't receive github spam when the merge commit is pulled into other repositories.)

@theuni
Copy link
Member Author

theuni commented May 23, 2024

@ryanofsky Thanks.

I'm going to continue working on the more complicated copies in follow-up PRs. I'll play around with your Copy/CopyFrom idea while I'm at it. If nothing else, that would be a good way to help me track down the remaining cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet