Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to identify old and new primary database from repmgr? #820

Open
wasiualhasib opened this issue Jul 17, 2023 · 1 comment
Open

How to identify old and new primary database from repmgr? #820

wasiualhasib opened this issue Jul 17, 2023 · 1 comment

Comments

@wasiualhasib
Copy link

Basically, I am loading balance data using pgpool but I am fear that if primary node(node1) down then another node (node2) will be the new primary then how can I tell pgpool that now new primary node is node1?

When both primary is up then from the database side it is difficult to identify which one is the actual primary using the command
" select * from nodes;" in repmgr schema. Because old primary node1 show he is now primary and node2 is registered as standby but running as primary, but from new primary it says node1 and node2 both are primary but node2 is new primary but node1 !running as primary.

This type of issue happened when the node reboots and takes too long time to reboot. Then repmgr will promote another node as a primary. But when reboot is completed after long time, in that case database will up again as a old primary and keep running. So in that case it is necessary to inform pgpool that node1 is old primary don't connect with it please connect to node2 as new primary.

Is there any way to identify which one is now the new primary from the old primary node (node1)? it could be using a query or script or from a database level.

Need expert explanation/help on this. @ibarwick could you help me on this?

`[postgres@DB1 ~]$ repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+----------------------+----------+----------+----------+----------+-------------------------------------------------------------------------
1 | node1 | primary | * running | | default | 100 | 7 | host=10.16.10.17 user=repmgr dbname=repmgr connect_timeout=2 port=5678
2 | node2 | standby | ! running as primary | | default | 100 | 8 | host=10.16.10.18 user=repmgr dbname=repmgr connect_timeout=2 port=5678
3 | node3 | standby | running | ! node2 | default | 100 | 8 | host=10.16.10.19 user=repmgr dbname=repmgr connect_timeout=2 port=5678
4 | Witness | witness | * running | ! node2 | default | 0 | n/a | host=10.16.10.20 user=repmgr dbname=repmgr connect_timeout=2 port=5678

WARNING: following issues were detected

  • node "node2" (ID: 2) is registered as standby but running as primary
  • node "node3" (ID: 3) reports a different upstream (reported: "node2", expected "node1")
  • node "Witness" (ID: 4) reports a different upstream (reported: "node2", expected "node1")

[postgres@DB2 ~]$ repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+-------------------------------------------------------------------------
1 | node1 | primary | ! running | | default | 100 | 7 | host=10.16.10.17 user=repmgr dbname=repmgr connect_timeout=2 port=5678
2 | node2 | primary | * running | | default | 100 | 8 | host=10.16.10.18 user=repmgr dbname=repmgr connect_timeout=2 port=5678
3 | node3 | standby | running | node2 | default | 100 | 8 | host=10.16.10.19 user=repmgr dbname=repmgr connect_timeout=2 port=5678
4 | Witness | witness | * running | node2 | default | 0 | n/a | host=10.16.10.20 user=repmgr dbname=repmgr connect_timeout=2 port=5678

WARNING: following issues were detected

  • node "node1" (ID: 1) is running but the repmgr node record is inactive
    `
@KJlt77
Copy link

KJlt77 commented Aug 31, 2023

I'm familiar with this situation.

First of all I'd recommend not to reboot the primary node during planned maintenance, but switchover to another node first and avoid all of this.

This doesn’t solve the problem during outages, but the lower value of the Timeline column clearly shows node1 is obsolete in this case. I don’t know why the developers of repmgr do not utilize this to improve the functionality of the tool.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants