Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix dropping schema cache reload notifications #2810

Merged
merged 4 commits into from
Jun 2, 2023

Conversation

steve-chavez
Copy link
Member

@steve-chavez steve-chavez commented Jun 1, 2023

Closes #2791

Tasks

  • Reproduce the failure by introducing a internal-schema-cache-sleep config that calls pg_sleep at the end of the schema cache load.
  • Fix by introducing a debouncer instead of a semaphore for the connection worker

Notes

  • To introduce the debounce(which has to store state in AppState), I had to move the connectionWorker(and others) to the AppState module. It had to be done this way otherwise it caused a dependency cycle and compiling failed.
    • I think this move makes sense. Workers.hs was filled with AppState logic.
    • In doing so the code shortens a bit bc the AppState prefix is removed.
    • State is easier to see and this will help to solve LISTEN/NOTIFY doesn't work on read replicas #2781.
    • A clearer Admin.hs module is introduced.

Comment on lines 951 to 984

# If DB_POOL=1, then the second request(/rpc/migrate_function) will just wait(PGRST_DB_POOL_ACQUISITION_TIMEOUT=10) for the schema cache reload to finish.
# This is bc the only pool connection will be busy with the PGRST_INTERNAL_SCHEMA_CACHE_SLEEP(does a pg_sleep)
# So this must be tested with a DB_POOL size of at least 2. That way the second request will pick the other pool connection and proceed.

env = {
**defaultenv,
"PGRST_INTERNAL_SCHEMA_CACHE_SLEEP": "1",
"PGRST_DB_CHANNEL_ENABLED": "true",
"PGRST_DB_POOL": "2",
}

internal_sleep = int(env["PGRST_INTERNAL_SCHEMA_CACHE_SLEEP"])

with run(env=env, wait_for_readiness=False) as postgrest:
time.sleep(internal_sleep + 0.1) # wait for readiness manually

response = postgrest.session.post("/rpc/create_function")
assert response.status_code == 204

time.sleep(
internal_sleep / 2
) # wait to be inside the schema cache reload process

response = postgrest.session.post("/rpc/migrate_function")
assert response.status_code == 204

time.sleep(
2 * internal_sleep
) # wait enough time to ensure the broken state remains

response = postgrest.session.get("/rpc/mult_them?c=3&d=4")
assert response.text == "12"
assert response.status_code == 200
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was easy to test manually but really tricky in the io tests 😓

@steve-chavez
Copy link
Member Author

@steve-chavez steve-chavez changed the title Fix bad schema cache reload Fix dropping schema cache reload notifications Jun 2, 2023
@steve-chavez
Copy link
Member Author

Codecov failing bc of all the missing tests mentioned on #1766

let oneSecond = 1000000 in
mkDebounce defaultDebounceSettings
{ debounceAction = logPgrstError appState SQL.AcquisitionTimeoutUsageError
, debounceFreq = 5*oneSecond
, debounceEdge = leadingEdge -- logs at the start and the end
}

return appState { debounceLogAcquisitionTimeout = deb }
debWorker <-
let decisecond = 100000 in
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is enough to ensure schema cache reloading is still fast(io tests pass) but we still debounce.

Also tried manually with a bash loop for ((;;)) { kill -s SIGUSR1 $(pgrep -n postgrest); } and it's working fine.

@steve-chavez steve-chavez merged commit a852b76 into PostgREST:main Jun 2, 2023
27 of 28 checks passed
@steve-chavez
Copy link
Member Author

Tests in main are flaky now. Will fix on a new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

"reload schema" notifications are dropped when a schema reload is already in progress
1 participant