
Liquibase leaves dangling lock on startup #517

Open
mook-as opened this issue Aug 22, 2019 · 4 comments


mook-as commented Aug 22, 2019

Hi there!

We're getting an error on startup: autoscaler-metrics can't start because Liquibase attempts to acquire a change log lock, but its locking implementation leaves a chance of the lock dangling with no live owner:

consul agent is not needed
Starting Liquibase at Fri, 16 Aug 2019 07:06:19 UTC (version 3.6.3 built at 2019-01-29 11:34:48)
Unexpected error running Liquibase: Could not acquire change log lock.  Currently locked by autoscaler-metrics-1.autoscaler-metrics-set.scf.svc.cluster.local (10.244.2.10) since 8/16/19, 5:26 AM
liquibase.exception.LockException: Could not acquire change log lock.  Currently locked by autoscaler-metrics-1.autoscaler-metrics-set.scf.svc.cluster.local (10.244.2.10) since 8/16/19, 5:26 AM
	at liquibase.lockservice.StandardLockService.waitForLock(StandardLockService.java:230)
	at liquibase.Liquibase.update(Liquibase.java:184)
	at liquibase.Liquibase.update(Liquibase.java:179)
	at liquibase.integration.commandline.Main.doMigration(Main.java:1220)
	at liquibase.integration.commandline.Main.run(Main.java:199)
	at liquibase.integration.commandline.Main.main(Main.java:137)

Steps to reproduce:

(This was on SCF, but as far as I can tell this is an issue with liquibase, not how you run it.)

  1. Start a CF cluster, and have autoscaler installed.
  2. Restart the metricscollector job, and watch its output. (In my case, the autoscaler-metrics pod.)
  3. Once it shows Starting Liquibase, forcibly terminate the process (one way to do this is sketched below the list).
  4. Let it restart, and see that it gets stuck.
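
For concreteness, here is one way to do the forced termination on an SCF/Kubernetes deployment. This is only a sketch; the pod and namespace names are taken from the log above and may differ in your environment:

    # Kill the pod with no grace period while Liquibase is mid-startup, then
    # watch the replacement pod get stuck on the dangling lock.
    kubectl delete pod autoscaler-metrics-1 --namespace scf --grace-period=0 --force
    kubectl logs --follow autoscaler-metrics-1 --namespace scf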

Details

It appears that Liquibase implements locking by writing a row to a database table, instead of taking an actual database lock (row-level or table-level). See this stackoverflow question for a similar situation; recovery there required manual intervention in the DB. I think it might actually be trying to use transactions on top of that; in any case, just removing the "lock" row got it to recover.
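
For reference, the manual recovery amounts to resetting that row. A sketch, assuming a PostgreSQL backend and placeholder credentials; the table and column names are the ones Liquibase's lock service maintains:

    # Clear the dangling lock row so the next Liquibase run can proceed.
    # The connection string is a placeholder; point it at the autoscaler DB.
    psql "postgres://user:password@autoscaler-db:5432/autoscaler" -c \
      "UPDATE databasechangeloglock
          SET locked = FALSE, lockgranted = NULL, lockedby = NULL
        WHERE id = 1;"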

I'm not sure what you could do to fix this (other than trying to fix it upstream in Liquibase; unfortunately my Java is terrible). But I figured I should file it at a minimum.
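
For anyone else who hits this in the meantime: Liquibase also ships a releaseLocks command that clears a dangling lock without hand-editing the DB. A sketch with placeholder connection details (the exact flags depend on how Liquibase is invoked in your deployment):

    # Equivalent to resetting DATABASECHANGELOGLOCK by hand; the URL and
    # credentials below are placeholders.
    liquibase --url=jdbc:postgresql://autoscaler-db:5432/autoscaler \
              --username=user --password=password \
              releaseLocks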

Thanks!

cf-gitbot (Collaborator) commented

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/168061254

The labels on this github issue will be updated when the story is started.

cdlliuy (Contributor) commented Aug 23, 2019

@mook-as, thanks! Yes, we noticed this problem and have been tracking it through cloudfoundry/app-autoscaler-release#207.
PR cloudfoundry/app-autoscaler-release#209 has been raised to resolve this issue, but it is still under review.

cdlliuy (Contributor) commented Aug 23, 2019

Furthermore, @mook-as, we also used force termination to reproduce the issue, but did you happen to notice any other way to trigger the failure?
The Liquibase-related pre-start job has been around for quite a long time, but it never failed in a BOSH or SCF deployment before...
If "force termination" is the only trigger, I am quite curious which change in SCF would kill the pre-start job... Any clues from your side?

mook-as (Author) commented Aug 26, 2019

Yeah, I'm not sure why we're seeing forced termination (actually, I think it's just that the DB connection died); I suspect it's just us bumping cf-deployment, leading to a larger footprint in the rest of the deployment, overloading the system slightly and making the autoscaler bits slower?
