Problem with Janus admin port sporadically not responding on clean start #474

brentgriffin · 2021-04-08T15:13:27Z

Running Janus with basic auth and Cassandra as the persistence mechanism: automated, scripted deployment of Janus sporadically comes up in a bad state. In this state, connections to the admin port are accepted but block until the client times out (no response is ever sent to the client). Requests through the API gateway port seem to work properly.

When the system comes up in this state, it never recovers. The only way that I can get it working is to undeploy Janus and to redeploy it.

Not having the admin port available prevents the loading of basic user credentials.

Frequency: no hard numbers, but I estimate it fails roughly once every five to six deployments.

Possible cause: the logs show a timeout when accessing Cassandra, and Janus does not appear to ever retry the Cassandra request.

Janus log when in bad state:

➜ kubectl logs janus-deployment-6bfccd676-v7qgd -c janus
time="2021-04-08T14:09:44Z" level=info msg="Janus starting..." version=dev-9fa15f6
[StatsGo] 2021/04/08 14:09:44 Stats counter incremented	metric=app.init.janus-deployment-6bfccd676-v7qgd.janus
[StatsGo] 2021/04/08 14:09:44 Stats counter incremented	metric=total.app
[StatsGo] 2021/04/08 14:09:52 Stats counter incremented	metric=error-log.error.-.-
[StatsGo] 2021/04/08 14:09:52 Stats counter incremented	metric=total.error-log
{"level":"error","msg":"error getting all definitions: gocql: no response received from cassandra within timeout period","time":"2021-04-08T14:09:52Z"}

Janus log when the admin port works correctly:

➜ kubectl logs janus-deployment-6bfccd676-qssw9 -c janus
time="2021-04-08T14:57:24Z" level=info msg="Janus starting..." version=dev-9fa15f6
[StatsGo] 2021/04/08 14:57:24 Stats counter incremented	metric=app.init.janus-deployment-6bfccd676-qssw9.janus
[StatsGo] 2021/04/08 14:57:24 Stats counter incremented	metric=total.app


jtesser · 2021-04-08T15:16:55Z

yea I was thinking retry logic on cassandra @tuxranger

brentgriffin · 2021-04-08T15:59:16Z

for whatever reason, this has happened to me three times already today :-(

tuxranger · 2021-04-08T16:25:26Z

In janus.toml, could you change the logging level from info to debug? I don't believe the info level shows warning messages, which is what the retry logic messages are logged as. I would like to double-check whether the retry logic is even running.

brentgriffin added the "type: bug" and "priority: High" labels on Apr 8, 2021
