Weblogic Operator-3.3.8 rolling restart not happening when the pod is not in Ready State #4189

Open
jakkoo opened this issue Apr 25, 2023 · 9 comments

@jakkoo

jakkoo commented Apr 25, 2023

We use WebLogic operator 3.3.8 with FMW 12.2.1.4. In our environment, we have set up the health check through the WebLogic ReadyApp framework. When the health check fails, the pod status sometimes changes to NotReady, or 1/2.

When a pod is not in the Ready state, the domain rolling restart does not work as expected. For example, assume a WebLogic domain with 1 admin server pod and 3 managed server pods:
Admin Server pod in Ready state
MS1 pod in Ready state
MS2 pod in NotReady state (NotReady because of a readiness failure)
MS3 pod in Ready state

In the above condition, if we update the WebLogic Domain resource with a new image, the rolling restart is kicked off.
The rolling restart works fine up to and including MS1. When it reaches the MS2 pod, the restart does not happen and the roll gets stuck there.

In the WebLogic operator logs, we can see that the rolling restart of the MS2 pod is attempted every 5 minutes, but it never happens.

Expectation: the rolling restart of a pod should occur irrespective of the pod status, even when the pod is not in the Ready state.

@xiancao
Member

xiancao commented Apr 25, 2023

What is your maxUnavailable set to? The default value for maxUnavailable is 1. If one server is already not Ready and therefore unavailable, the operator can't shut down other servers to perform the rolling restart. Can you increase the value of maxUnavailable and try again?
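The constraint described in this comment can be sketched as a small calculation. This is only a simplified model of the explanation given here, not the operator's actual code; the function name `restart_budget` and its parameters are hypothetical.

```python
def restart_budget(replicas: int, ready: int, max_unavailable: int = 1) -> int:
    """Number of Ready servers that may be shut down right now (simplified model)."""
    # Assumption: every server that is not Ready already counts as unavailable.
    already_unavailable = replicas - ready
    # Whatever is left of the allowance can be spent on restarting Ready servers.
    return max(0, max_unavailable - already_unavailable)


# Scenario from the report: 3 managed servers, MS2 is NotReady.
print(restart_budget(replicas=3, ready=2, max_unavailable=1))  # 0
# Same cluster with every server Ready.
print(restart_budget(replicas=3, ready=3, max_unavailable=1))  # 1
# MS2 still NotReady, but maxUnavailable raised to 2.
print(restart_budget(replicas=3, ready=2, max_unavailable=2))  # 1
```

Under this model, a single NotReady server uses up the whole default allowance of 1, so no Ready server can be taken down until either the server recovers or maxUnavailable is raised.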

@jakkoo
Author

jakkoo commented Apr 26, 2023 via email

@jakkoo
Author

jakkoo commented Apr 26, 2023

Even the maxUnavailable parameter takes effect only if the pod is in the Running state. Assume you set maxUnavailable to 2. In this setup, if a pod is in the Init:ImagePullBackOff state and you update the domain resource with a new image, only the admin server is restarted. The pods that were in "Init:ImagePullBackOff" are not restarted.

@xiancao
Member

xiancao commented Apr 26, 2023

That is by design.

@jakkoo
Author

jakkoo commented Apr 26, 2023 via email

@xiancao
Member

xiancao commented Apr 26, 2023

We support rolling even when the rolling starts while a server is not ready or not yet ready. The maxUnavailable constraint simply must be honored throughout the process.

@jakkoo
Author

jakkoo commented Apr 26, 2023

Hi

I have already documented that changing maxUnavailable from 1 to 2 doesn’t help either. The domain rolling restart doesn’t occur when a pod is not in the Ready state. This is what we observe even with the higher maxUnavailable value.

That’s the reason for raising this bug.

@xiancao
Member

xiancao commented Apr 27, 2023

@rjeberhard ^^^

@rjeberhard
Member

@jakkoo, I'm following up to see if this is still an issue for you. I apologize that I didn't see the ping from @xiancao.

If this is still an issue, can you please share your domain YAML? I see the discussion about the setting for maxUnavailable. My expectation is that the operator will wait for not-ready pods to return to ready so that the setting is honored; however, we may be able to do a better job selecting which pod is restarted.
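One possible reading of "a better job selecting which pod is restarted" is to pick pods that are already not ready first, since restarting them does not reduce availability any further. The sketch below illustrates that idea only; it is an assumption about what such a policy could look like, not the operator's behavior, and the names `Pod` and `pick_next` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    ready: bool
    needs_restart: bool


def pick_next(pods: list[Pod], max_unavailable: int = 1) -> list[str]:
    """Names of pods to restart now under a hypothetical not-ready-first policy."""
    pending = [p for p in pods if p.needs_restart]
    # Pods that are already not ready are unavailable anyway, so restart them first.
    not_ready = [p.name for p in pending if not p.ready]
    # Ready pods may only be taken down with whatever allowance is left.
    unavailable = sum(1 for p in pods if not p.ready)
    budget = max(0, max_unavailable - unavailable)
    ready = [p.name for p in pending if p.ready][:budget]
    return not_ready + ready


# Scenario from the report after MS1 has been rolled: MS2 is not ready, MS3 is ready.
pods = [
    Pod("ms1", ready=True, needs_restart=False),
    Pod("ms2", ready=False, needs_restart=True),
    Pod("ms3", ready=True, needs_restart=True),
]
print(pick_next(pods))  # ['ms2']
```

With this ordering, the not-ready MS2 pod is recreated with the new image instead of blocking the roll, while maxUnavailable is still honored for the pods that are ready.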
