
Slow couchdb after many queries #5044

Open
luca-simonetti opened this issue May 2, 2024 · 29 comments

Comments

@luca-simonetti

luca-simonetti commented May 2, 2024

Description

CouchDB gets tremendously slow after many queries are made. The situation improves after a pause in the process, but as soon as the queries resume the system becomes slow again very quickly.

Steps to Reproduce

  1. Make loads of queries
  2. See the time go up really fast

Note: we are using CouchDB as part of our Hyperledger Fabric cluster. The queries are made through blockchain requests.

Expected Behaviour

We expect the same query to take roughly the same amount of time, regardless of how many queries were made in the preceding interval (say, the previous 5 minutes).

Your Environment

  • CouchDB version used: 3.3.3
  • Browser name and version: N/A
  • Operating system and version: Ubuntu 22.04.4 LTS

Additional Context

We're using Hyperledger Fabric. After the couchdb service is restarted with /etc/init.d/couchdb restart the situation goes back to normal, but it only takes a couple of minutes (something like 5 x 5 x 60 = 1500 queries) before performance starts to degrade again very quickly.

@rnewson
Member

rnewson commented May 2, 2024

Your expectation is well-founded, and couchdb does behave that way. So there must be something wrong somewhere. Could it be that the number of in-progress requests keeps climbing? That is, making more and more queries even though previous ones have not yet completed? If so, increased latency on the latter requests is expected due to internal queues.

I suggest using a tool like ApacheBench to confirm or refute this: set it up with some modest concurrency and point it at a view with appropriate query parameters (i.e., don't read the entire view on each request).

Note that a view query operates in two phases. The first phase ensures the view is up to date with respect to any database updates (by running any new updates through the map and reduce functions). The second phase is the actual query. It will help us to know if the database in question is being written to (and, if so, how intensely) during these queries. E.g., we would expect a view query to take longer if a large number of updates were made to the database just beforehand.
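
For illustration, the two phases can be separated with the update query parameter; the database, design document and view names below are placeholders, not anything from your setup:

# default: phase 1 (bring the index up to date) followed by phase 2 (the read)
curl 'http://127.0.0.1:5984/mydb/_design/myddoc/_view/myview?startkey="a"&endkey="b"&limit=100'

# update=false skips phase 1 and reads the view as it currently stands
curl 'http://127.0.0.1:5984/mydb/_design/myddoc/_view/myview?startkey="a"&endkey="b"&limit=100&update=false'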

@luca-simonetti
Author

Thanks for the reply!
I don't think the issue is with queries not being completed, since

a. we do them in series - i.e. we wait for one response before sending the next query
b. we even have a sleep(.5) after each request before sending another one
c. resources -- namely CPU and memory -- look fine (around 10% CPU and 10% memory usage)

From the outside it looks like there's some kind of short-term in-memory cache that keeps growing, and each query has to traverse this ever-growing object.

For the record: the requests right after the service restart take something like a few milliseconds, whereas requests after a few thousand reach 120,000 ms (which is the maximum amount of time for us before calling it a day and raising an exception).

we'll try benchmarking with ab

@rnewson
Member

rnewson commented May 2, 2024

there is no cache of query results inside couchdb, for sure.

Is the database growing during this? We'd expect longer query times for larger views, even if the view is up to date when queried (it's ultimately B+trees, so the more data the taller the tree, and the larger the database the more likely we have to read more nodes from disk).

At this point it would help to see your specific map/reduce functions and an example document.

@rnewson
Member

rnewson commented May 2, 2024

Also check /_active_tasks, as view compaction tasks may be running. You can still query while those run, but they'll obviously be using more I/O, and perhaps this is a factor in performance (less of an issue on an SSD, which, hopefully, you are already using).
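
For example (local address and admin credentials below are placeholders):

# returns a JSON array of running compaction, indexing and replication tasks
curl http://admin:password@127.0.0.1:5984/_active_tasks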

@luca-simonetti
Author

No, the database is not growing during the read operations (the actual platform is a QA environment where the only operation is the "mega read"). We're speaking of 100,000 reads.
Also: right after the restart the whole thing goes back to normal (200 ms) and starts to degrade again within a couple of minutes.

_active_tasks is empty ([])
I'm not sure what you mean by our map/reduce, since we're using the Hyperledger API to access CouchDB.

I can show you the Go code for the chaincode that does the reads, though:

func (c *xxxxx) Token(ctx contractapi.TransactionContextInterface, tokenId string) (string, error) {

	token, err := GetTokenData(ctx, tokenId)
	if err != nil {
		return "", err
	}

	tokenxxxxx := xxxxxx
	iterator, err := ctx.GetStub().GetStateByPartialCompositeKey(PrefixTokenxxxxxx, []string{tokenId})
	if err != nil {
		return "", fmt.Errorf("error creating asset chaincode: %v", err)
	}
	for iterator.HasNext() {
		queryResponse, err := iterator.Next()
		if err != nil {
			return "", err
		}
		// Retrieve the tokenId
		_, compositeKeyParts, err := ctx.GetStub().SplitCompositeKey(queryResponse.Key)
		if err != nil {
			return "", err
		}

		/// OTHER CODE REDACTED BUT OF NO CONCERN
	}
	_ = iterator.Close()
	resultByte, err := json.Marshal(tokenWithxxxxxxx)
	return string(resultByte), nil
}

@rnewson
Member

rnewson commented May 2, 2024

Can you express that as an http request?

@luca-simonetti
Author

luca-simonetti commented May 2, 2024

I think that gets translated into two kinds of queries:

  1. /xxxxxx_yyyyyy/_all_docs?attachments=true&endkey="\\u0000xxxxx\\u000014302\\uxxxxx�"&include_docs=true&inclusive_end=false&limit=1001&startkey="\\u0000xxxxx\\u000014302\\u0000"
  2. /xxxxxx_yyyyyy/\x00xxxxx\x0014520\x009892299\x00?attachments=true

The first is the one I'm benchmarking now (-c=10 -n=100000) and is sometimes really slow.
The second seems to take around 10-20 ms and is run over the 100 results of the first query.

But I still don't understand why, if I run the requests one after another rather than in parallel, they influence each other.

@luca-simonetti
Author

Here's the ab result:

Document Length:        47691 bytes

Concurrency Level:      10
Time taken for tests:   2219.996 seconds
Complete requests:      100000
Failed requests:        0
Total transferred:      4792500000 bytes
HTML transferred:       4769100000 bytes
Requests per second:    45.05 [#/sec] (mean)
Time per request:       222.000 [ms] (mean)
Time per request:       22.200 [ms] (mean, across all concurrent requests)
Transfer rate:          2108.19 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       45   54  41.0     52    1089
Processing:   147  168  28.6    163     854
Waiting:       47   57   7.1     56     361
Total:        193  222  51.2    216    1650

Percentage of the requests served within a certain time (ms)
  50%    216
  66%    220
  75%    223
  80%    227
  90%    238
  95%    250
  98%    291
  99%    349
 100%   1650 (longest request)

@luca-simonetti
Author

Browsing through old issues I found this one, which sounds a lot like our problem:
#1319 (comment)
At this point: could this be related to how Hyperledger manages connections?

@rnewson
Member

rnewson commented May 3, 2024

Hi, yes, it could be. A typical HTTP client would reuse connections (indeed, would manage a pool of them); check that you're doing that first. Otherwise you could use up all available ports and have to wait for one to become free.

Try your ab command again, but change it to run for longer (either more requests, or switch to -t) and compare a run with and without the -k parameter, which enables HTTP keepalive. The results with -k should have much more consistent durations.
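
For example, something along these lines, where the address, credentials and query parameters are placeholders to be replaced with one of your real queries:

# without connection reuse
ab -A admin:password -c 10 -t 300 'http://127.0.0.1:5984/mydb/_all_docs?limit=100&include_docs=true'

# with -k (HTTP keepalive), so the 10 connections are reused across requests
ab -k -A admin:password -c 10 -t 300 'http://127.0.0.1:5984/mydb/_all_docs?limit=100&include_docs=true'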

@nickva
Contributor

nickva commented May 3, 2024

With include_docs=true, tweaking [fabric] all_docs_concurrency = 10 might have an effect.

https://github.com/apache/couchdb/blob/main/src/fabric/src/fabric_view_all_docs.erl#L297

That spawns 10 background workers at the fabric (cluster) level to open 10 documents in parallel, but it puts more work on the cluster. You can try decreasing or increasing it a bit and see if it changes anything.
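
If it's easier than editing local.ini, the value can also be read and changed at runtime through the config API; the credentials and the _local node name below are placeholders:

# current value for this node (only present if it has been set explicitly)
curl http://admin:password@127.0.0.1:5984/_node/_local/_config/fabric/all_docs_concurrency

# set it to 5 for this node; the previous value is returned
curl -X PUT http://admin:password@127.0.0.1:5984/_node/_local/_config/fabric/all_docs_concurrency -d '"5"'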

@luca-simonetti
Author

> Hi, yes, it could be. A typical HTTP client would reuse connections (indeed, would manage a pool of them); check that you're doing that first. Otherwise you could use up all available ports and have to wait for one to become free.
>
> Try your ab command again, but change it to run for longer (either more requests, or switch to -t) and compare a run with and without the -k parameter, which enables HTTP keepalive. The results with -k should have much more consistent durations.

I tried with -k, which yielded results similar to yesterday's test. There isn't much we can do about the connections, since we're using CouchDB indirectly, through Hyperledger Fabric.

> With include_docs=true, tweaking [fabric] all_docs_concurrency = 10 might have an effect.
>
> https://github.com/apache/couchdb/blob/main/src/fabric/src/fabric_view_all_docs.erl#L297
>
> That spawns 10 background workers at the fabric (cluster) level to open 10 documents in parallel, but it puts more work on the cluster. You can try decreasing or increasing it a bit and see if it changes anything.

we'll definitely try this!!
thanks

@luca-simonetti
Author

> With include_docs=true, tweaking [fabric] all_docs_concurrency = 10 might have an effect.
>
> https://github.com/apache/couchdb/blob/main/src/fabric/src/fabric_view_all_docs.erl#L297
>
> That spawns 10 background workers at the fabric (cluster) level to open 10 documents in parallel, but it puts more work on the cluster. You can try decreasing or increasing it a bit and see if it changes anything.

We tried this, but with no luck.
One thing that might be of help: we see the memory increasing linearly as the requests are made.

This is the memory in the last seven days:

[Screenshot 2024-05-06 16:08: memory usage over the last seven days]

The points where the memory rapidly decreases are the restarts of the couchdb service, which makes the system fast again (for a couple of minutes).
I'm really wondering if there's a memory leak somewhere.

@nickva
Contributor

nickva commented May 7, 2024

@luca-simonetti thanks for trying the concurrency setting.

Yeah, a memory leak is plausible; we have debugged a number of those in the last few years, some in OTP 25 and 24.

Some memory usage increase is perhaps expected if it's the page cache that's using the memory, though that wouldn't explain the slowdown...

Another thing to tweak might be max_dbs_open. That helped in another issue related to memory usage: #4992; another interesting one might be #4988 (comment). Though, again, in those cases performance didn't seem to be much of an issue.

Are there any exceptions or errors showing in the logs?

To get more details about the internals, you can try getting the output from /_node/_local/_system and see whether the "memory" and "message_queues" values increase. #4992 (comment) has an example of calling that and what part of the response might look like. _local can be replaced with an Erlang (distribution) node name as well.
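
For example, with jq to pull out just those two sections (address and credentials are placeholders):

# full system metrics for this node
curl http://admin:password@127.0.0.1:5984/_node/_local/_system

# only the "memory" and "message_queues" objects
curl -s http://admin:password@127.0.0.1:5984/_node/_local/_system | jq '{memory, message_queues}'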

@rnewson
Member

rnewson commented May 7, 2024

which version of erlang are you using?

@luca-simonetti
Author

> which version of erlang are you using?

{
    "javascript_engine": {
        "version": "78",
        "name": "spidermonkey"
    },
    "erlang": {
        "version": "24.3.4.15",
        "supported_hashes": [
            "blake2s",
            "blake2b",
            "sha3_512",
            "sha3_384",
            "sha3_256",
            "sha3_224",
            "sha512",
            "sha384",
            "sha256",
            "sha224",
            "sha",
            "ripemd160",
            "md5",
            "md4"
        ]
    },
    "collation_driver": {
        "name": "libicu",
        "library_version": "70.1",
        "collator_version": "153.112",
        "collation_algorithm_version": "14"
    }
}

This is the full /_node/_local/_versions output.

As for the memory leaks mentioned: we tried to look into those, but they are on the order of a 50 GB leak, whereas in our case the memory used is around 200 MB, which is only 15% of the total available. For this reason I don't really think it's related to those problems.

Also: we noticed that CPU usage is around 25%, which is really low, and tweaking all_docs_concurrency to 100 didn't change much: the time taken still increases over time, and CPU and RAM usage are the same as before.

@luca-simonetti
Author

And this is the current state of /_node/_local/_system.
It doesn't look too bad, I guess...

{
    "uptime": 77078,
    "memory": {
        "other": 26982505,
        "atom": 631001,
        "atom_used": 605053,
        "processes": 170112432,
        "processes_used": 170076040,
        "binary": 474168,
        "code": 13917182,
        "ets": 2267344
    },
    "run_queue": 0,
    "run_queue_dirty_cpu": 0,
    "ets_table_count": 157,
    "context_switches": 851304076,
    "reductions": 3036158123783,
    "garbage_collection_count": 20109323,
    "words_reclaimed": 35778913031,
    "io_input": 144808316,
    "io_output": 1204370753,
    "os_proc_count": 0,
    "stale_proc_count": 0,
    "process_count": 444,
    "process_limit": 262144,
    "message_queues": {
        "couch_file": {
            "50": 0,
            "90": 0,
            "99": 0,
            "count": 38,
            "min": 0,
            "max": 0
        },
        "couch_db_updater": {
            "50": 0,
            "90": 0,
            "99": 0,
            "count": 38,
            "min": 0,
            "max": 0
        },
        // a bunch of keys with value: 0
    }
}

@luca-simonetti
Author

luca-simonetti commented May 8, 2024

For an extreme test we tried to set [fabric] all_docs_concurrency = 10000 (ten thousand) and it suddenly worked perfectly for the first 8,000 requests. Then it immediately started to slow down to the previous level.
Since we need to get to 200,000 requests, we need this behaviour to be consistent.
With this new information, what is the actual problem?

UPDATE: I restarted the process to verify that this value actually fixes the problem, and it does not.
When it was fast it was basically requesting the "container" of the subtokens, and somehow, somewhere, the values were returned very quickly. After restarting couchdb it went back to the previous state where it fetches the token and then the 100 subdocuments.
Could you explain what all_docs_concurrency does, since it's not documented?

UPDATE 2: I don't know if this is the case, but it could just be related to some Hyperledger internal caching. I'm really struggling to understand how the whole system works.

@nickva
Contributor

nickva commented May 8, 2024

> For an extreme test we tried to set [fabric] all_docs_concurrency = 10000 (ten thousand) and it suddenly worked perfectly for the first 8,000 requests.
>
> After restarting couchdb it went back to the previous state where it fetches the token and then the 100 subdocuments.
> Could you explain what all_docs_concurrency does, since it's not documented?

all_docs_concurrency limits how many parallel doc reads are done at the cluster level. That speeds up _all_docs with include_docs=true, but at the expense of consuming more resources per request.

Another factor to play with is Q. What is your Q value for the dbs? By default it's 2. You can try experimenting with higher values like 8, 16, 64.
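
For reference, the current q of a database shows up in its info document, and q can only be chosen when a database is created; the names and credentials below are placeholders:

# shard count of an existing database
curl -s http://admin:password@127.0.0.1:5984/mydb | jq .cluster.q

# q is fixed at creation time, e.g. a new database with q=16
curl -X PUT 'http://admin:password@127.0.0.1:5984/mydb_q16?q=16'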

> UPDATE 2: I don't know if this is the case, but it could just be related to some Hyperledger internal caching. I'm really struggling to understand how the whole system works.

If CPU and memory usage on your containers are not very high, it does seem like a slow-disk problem, but with a page cache or other limited, faster cache in front of the disk. Try using a faster local disk or another setup as an experiment with the same container and see what numbers you get.
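
As a rough, CouchDB-independent check of raw disk write speed, something like the following; the path assumes the default data directory of the CouchDB packages, so adjust it to your setup:

# write 1 GiB with O_DIRECT to bypass the page cache, then clean up
dd if=/dev/zero of=/opt/couchdb/data/ddtest bs=1M count=1024 oflag=direct
rm /opt/couchdb/data/ddtest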

@luca-simonetti
Author

I don't think the disk is the issue here:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 312512 478476 2741660    0    0     2    19    0    0  3  0 97  0  0

As you can see, the system doesn't spend any time waiting for I/O ops.

What I do see, though, is a memory cache increasing. After a restart that same cache shrinks a lot and then starts growing back again.

One more note: the requests that are slow are not the include_docs=true ones, but the attachments=true ones.
I reiterate just so we're on the same page: we make 10-20 million requests to download ALL documents with a key prefix in the database. For example we might have
piece0000001
piece0000002
...
piece1000000
...
piece9999999

and we make a request for each of those. So if something is cached it's of no use, or in this case even a problem, since each subsequent request accesses a different document from the one before.

Also: we cannot change the Q factor since the database is already created and we don't want to risk resharding and losing the whole thing...

@rnewson
Member

rnewson commented May 16, 2024

a memory leak around attachments is feasible, as they are handled as large binaries in erlang, and there have been some leaks in BEAM around that.

What version of erlang is couchdb using here?

@rnewson
Member

rnewson commented May 16, 2024

As for resharding, our advice for increasing q would be to make a new, empty database with the new q and replicate from the current db to the new one with admin credentials and no filter. This ensures everything gets across. You could leave a continuous:true replication running while switching clients over from the old URL to the new one, then cancel it and delete the old database once everything is switched (i.e., no more reads or writes arriving at the old db).
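
A sketch of that procedure, with placeholder database names and credentials:

# 1. create the new, empty database with the desired q
curl -X PUT 'http://admin:password@127.0.0.1:5984/mydb_q16?q=16'

# 2. continuous replication from the old db to the new one, as admin, no filter
curl -X POST 'http://admin:password@127.0.0.1:5984/_replicator' \
     -H 'Content-Type: application/json' \
     -d '{"source": "http://admin:password@127.0.0.1:5984/mydb",
          "target": "http://admin:password@127.0.0.1:5984/mydb_q16",
          "continuous": true}'

# 3. once clients point at the new database and no traffic reaches the old one,
#    delete the replication document and then the old database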

@luca-simonetti
Author

> a memory leak around attachments is feasible, as they are handled as large binaries in erlang, and there have been some leaks in BEAM around that.
>
> What version of erlang is couchdb using here?

the Erlang version inside couchdb is:
"version": "24.3.4.15"

@rnewson
Member

rnewson commented May 16, 2024

ok, let's get you onto 24.3.4.17, as we know there are bugs in earlier point releases of that series. Let's rule those out first.

@luca-simonetti
Author

ok, cool! How do I do that?

@rnewson
Member

rnewson commented May 16, 2024

you'll need to download or build 24.3.4.17 (https://www.erlang.org/patches/otp-24.3.4.17) and build couchdb against it.

How did you install couchdb initially? We should update our binary packages with 24.3.4.17 for the same reason.

@luca-simonetti
Author

We installed the current couchdb version (3.3.3) using the official repository for Ubuntu. I think Erlang came bundled with it.

@nickva
Contributor

nickva commented May 16, 2024

I just rebuilt all the 3.3.3 packages to use the latest Erlang patch version 24.3.4.17. If you update from the deb repo you should get the latest version from there: 3.3.3-2
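
Assuming the machine already has the Apache CouchDB apt repository configured (which it should, if 3.3.3 came from there), a normal upgrade should pick up the rebuilt package:

sudo apt-get update
sudo apt-get install --only-upgrade couchdb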

@luca-simonetti
Author

luca-simonetti commented May 17, 2024

thank you for the help! We updated couchdb as requested:

{
    "javascript_engine": {
        "version": "78",
        "name": "spidermonkey"
    },
    "erlang": {
        "version": "24.3.4.17",
        "supported_hashes": [
            "blake2s",
            "blake2b",
            "sha3_512",
            "sha3_384",
            "sha3_256",
            "sha3_224",
            "sha512",
            "sha384",
            "sha256",
            "sha224",
            "sha",
            "ripemd160",
            "md5",
            "md4"
        ]
    },
    "collation_driver": {
        "name": "libicu",
        "library_version": "70.1",
        "collator_version": "153.112",
        "collation_algorithm_version": "14"
    }
}

but unfortunately it didn't help. The problem is still the same 😢
