Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the performance of the postgres insertion. #4149

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

schorsch1976
Copy link
Contributor

  • The first version commited 7c5ce9f was working ok but as the tables grew the postgres worker for that connection hit constantly 100% cpu usage on my small machine and also data was lost. So the functions needed to be optimized.

  • The function collectd_insert() was split into 2 parts. One part just collecting the data, create the instances and realign the timestamps. Second part: move_data_to_table() that must be called periodically. It also creates the plugin tables and the associated indices.

This was the CPU Load development with my first version. As you can see, it increased steadily.
cpuload

This was one day with the high load. The postgres process of that connection hit 100% of the core usage.
cpuload1

This is the updated version with improves the performance drastically.
cpuload2

- The first version commited 7c5ce9f
  was working ok but as the tables grew the postgres worker for that
  connection hit constantly 100% cpu usage on my small machine and also
  data was lost. So the functions needed to be optimized.

- The function collectd_insert() was split into 2 parts. One part just
  collecting the data, create the instances and realign the timestamps.
  Second part: move_data_to_table() that must be called periodically. It
  also creates the plugin tables and the asociated indices.
@schorsch1976 schorsch1976 requested a review from a team as a code owner November 7, 2023 16:15
Copy link
Contributor

@eero-t eero-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just few comments on the doc typos.

(SQL code has such large changes, that I'm not commenting on them as last time I did anything more serious with SQL was more than two decades ago.)

contrib/postgresql/README.md Outdated Show resolved Hide resolved
contrib/postgresql/README.md Outdated Show resolved Hide resolved
contrib/postgresql/README.md Outdated Show resolved Hide resolved

The current structure creates 3 identifiers and 3 lines for each entry. The identifiers get reused. It reports "192.168.myping.ip" as type.
The current structure creates 3 identifiers and 3 lines for each entry. The identifiers get reused. It reports "192.168.200.123" as type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where the IP value comes from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just choose random IP.... because it seemed better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From RFC5737:

The blocks 192.0.2.0/24 (TEST-NET-1), 198.51.100.0/24 (TEST-NET-2), and 203.0.113.0/24 (TEST-NET-3) are provided for use in documentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: spell out what is being reported here and provide an example. The quotes make it look like this is the literal IP address that is being reported each time.

“The ping plugin reports an IP address in the type instance field, e.g. "192.0.2.42".”

contrib/postgresql/README.md Outdated Show resolved Hide resolved
contrib/postgresql/README.md Outdated Show resolved Hide resolved
contrib/postgresql/README.md Outdated Show resolved Hide resolved
contrib/postgresql/README.md Outdated Show resolved Hide resolved
contrib/postgresql/README.md Outdated Show resolved Hide resolved
*/1 * * * * root /opt/collectd/etc/move-data.sh
```

The procedure collectd_cleanup() is the maintenance function. It has as an argument the number of days where to keep the data. It can be called by pgagent or a similar mechanism like "CALL collectd_cleanup(180)". This delete all data that is older than 180 days.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The procedure collectd_cleanup() is the maintenance function. It has as an argument the number of days where to keep the data. It can be called by pgagent or a similar mechanism like "CALL collectd_cleanup(180)". This delete all data that is older than 180 days.
The procedure collectd_cleanup() is the maintenance function. It has as an argument the number of days to keep the data. It can be called by pgagent or a similar mechanism like "CALL collectd_cleanup(180)". This deletes all data that is older than 180 days.

@octo
Copy link
Member

octo commented Nov 24, 2023

This is good to merge with @eero-t 's suggestions. Thank you to the two of you!

@octo
Copy link
Member

octo commented Dec 21, 2023

Hi @schorsch1976, I'd love to merge this. Can you apply the suggested changes?

@octo octo added the Waiting for response - 1st time Waiting for contributor to respond - 1st call label Dec 21, 2023
schorsch1976 and others added 8 commits December 22, 2023 11:08
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>
@schorsch1976
Copy link
Contributor Author

Hi @schorsch1976, I'd love to merge this. Can you apply the suggested changes?

Commited :)

Copy link
Member

@octo octo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@octo octo added Maintenance A pull request without immediate user-observable impact and removed Waiting for response - 1st time Waiting for contributor to respond - 1st call labels Dec 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Maintenance A pull request without immediate user-observable impact
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants