Replies: 1 comment
We found the problem: because we updated the replication number after creating the tables, some partitions of those tables still have only 1 replica, which causes some queries to fail whenever 1 BE is down.
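To illustrate why a single-replica partition breaks queries when one BE goes down, here is a minimal toy model. It is only a sketch: the function `unavailable_partitions`, the partition names, and the BE names are all hypothetical, and it does not reflect how Doris actually schedules scans.

```python
from typing import Dict, List, Set


def unavailable_partitions(placement: Dict[str, Set[str]], down: Set[str]) -> List[str]:
    """Return the partitions that have no replica left on a live BE.

    placement maps a (hypothetical) partition name to the set of BEs
    holding one of its replicas; down is the set of BEs currently offline.
    """
    return sorted(p for p, bes in placement.items() if not (bes - down))


# Assumed layout: one partition created before the replica count was raised,
# one created afterwards with 3 replicas spread over the 4 BEs.
placement = {
    "p_old": {"be1"},
    "p_new": {"be1", "be2", "be3"},
}

# Restarting be1 leaves p_old with no live replica, while p_new still has two.
print(unavailable_partitions(placement, {"be1"}))
```

In this model, restarting any single BE can only affect partitions whose sole replica lives on it, which matches what we saw: only the partitions left at 1 replica caused failures.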
In our production environment, we run 3 follower FE nodes and 4 BE nodes, and most tables have 3 replicas. Due to a memory leak issue we are experiencing (#34168), we have to restart our BE nodes often as a temporary fix. We expected that having 4 BE nodes and 3 replicas would minimize service downtime. However, queries often fail during a BE restart with errors like `No scannodes available` or, even worse, `...(BE IP) in blacklist`. We had to turn off the blacklist option in our config because the IP would stay in the blacklist even after the BE node was back up and running.

Is this behavior intended? How can we ensure high availability when some BE nodes may be down? Would simply adding more nodes solve the issue? Any advice from the experts would be much appreciated :D