Performance advice #365

Open
igor-zacharov opened this issue Jun 10, 2020 · 1 comment

Comments

@igor-zacharov

Can you share recommendations on configuring the akumuli database for performance?

For example, for more performance and more parallelism on data ingest: is it better to increase nvolumes or volume_size?
Same question for queries. It seems that Grafana issues queries to akumulid with a lot of tags; the response is "long" (perceptibly so), and I often get "Data outside time range" in the graphs.

While I will solve this particular problem myself, what is the general guidance for the db settings?

@Lazin
Member

Lazin commented Jun 11, 2020

Both nvolumes and volume_size have little effect on overall performance. You can't have a volume_size greater than 4GB. The WAL parameters can affect performance if WAL.volume_size is too small. Try to set it depending on the workload. For instance, if you're writing 1MB/sec on average, a WAL.volume_size of around 200MB should be OK, but if volume_size is 1MB, Akumuli will reallocate WAL volumes every second and that will have a performance impact. The default value of 256MB is perfectly practical. WAL.nvolumes mostly affects crash recovery time.
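
To make that rule of thumb concrete, here is a minimal sketch (plain Python, not Akumuli code; the write rate is an assumed example value) of how often a WAL volume gets filled and reallocated at a steady ingest rate:

```python
# Sketch of the WAL.volume_size sizing rule described above.
# Assumption: a steady average write rate; the numbers are illustrative.

MB = 1024 * 1024

def wal_reallocation_interval_sec(volume_size_bytes, write_rate_bytes_per_sec):
    """Seconds between WAL volume reallocations at a steady write rate."""
    return volume_size_bytes / write_rate_bytes_per_sec

write_rate = 1 * MB  # ~1 MB/sec average ingestion, as in the example above

# A 1MB WAL volume fills every second -> constant reallocation overhead.
print(wal_reallocation_interval_sec(1 * MB, write_rate))    # 1.0

# The 256MB default is reallocated only about every four minutes.
print(wal_reallocation_interval_sec(256 * MB, write_rate))  # 256.0
```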

The parameter that affects ingestion performance a lot is pool_size. If it's too small, the database won't use all available cores for ingestion. If pool_size is 0, the size of the thread pool will be chosen depending on your hardware. An important note here is that Akumuli doesn't allow you to allocate all available cores for ingestion. If you have 1 CPU, Akumuli will be able to use it for ingestion. If you have 2 CPUs, it will use only 1 for ingestion, leaving the second one for queries. Going further, it will leave 2 CPUs for queries if you have up to 8 CPUs on your machine, and if you have more than 8 CPUs it will reserve 4 CPUs for queries. So, leaving pool_size equal to 0 makes sense performance-wise.
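
Here's a small sketch (plain Python) of that core-reservation rule exactly as described above; it is just the heuristic from this comment restated as code, not Akumuli's actual implementation:

```python
import os

def ingestion_threads(cpu_count: int) -> int:
    """Thread-pool size picked when pool_size=0, per the rule described above."""
    if cpu_count <= 1:
        reserved_for_queries = 0   # a single CPU is still used for ingestion
    elif cpu_count == 2:
        reserved_for_queries = 1   # one core left for queries
    elif cpu_count <= 8:
        reserved_for_queries = 2   # two cores left for queries
    else:
        reserved_for_queries = 4   # four cores left for queries
    return cpu_count - reserved_for_queries

print(ingestion_threads(os.cpu_count() or 1))
```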

Query performance depends on dataset cardinality, and with high cardinality some queries can be slower. For instance, autocomplete in the query editor will definitely be slower. Also, if you enable the 'Raw' checkbox, Akumuli will stop aggregating the data and will return a huge number of datapoints, and the plugin needs to parse all these datapoints using JS in the browser. If your series names are long and have a lot of tags in them, that can also slow down parsing in the browser. Having a lot of series on a single graph can be slow for the same reason. Using the pivot/group-by-tag functionality can slow things down in the database because it has to merge-join a lot of time series together.
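
To see why 'Raw' mode gets heavy quickly, here's a back-of-the-envelope estimate (all numbers are made-up assumptions, not measurements):

```python
# Hypothetical dashboard: 100 series on one graph, 1 datapoint/sec per series,
# a 6-hour time range, no aggregation because 'Raw' is enabled.
series_on_graph = 100
points_per_second = 1
range_seconds = 6 * 3600

raw_datapoints = series_on_graph * points_per_second * range_seconds
print(raw_datapoints)  # 2160000 datapoints for the browser to download and parse
```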

From a hardware standpoint, it makes sense to run on an SSD for both fast ingestion and fast querying. Basically, the more IOPS you have, the faster the queries; ingestion doesn't depend on IOPS much.

Also, you can send me a request that is perceived as slow, along with the dataset parameters (what the series look like and how many of them there are), and I'll try to find out how this can be helped. Things are a bit slow on my side because of WFH and all that, but I'll try to figure something out anyway.
