Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes closing of Cache and Policy to be concurrent safe #286

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

josephschorr
Copy link

@josephschorr josephschorr commented Oct 8, 2021

Before this change, calling Close on the cache while a Set call is in progress could occasionally result in a panic when the Set method tried to write to the (now closed) setBuf channel. This commit changes Cache and Policy to keep references to the written-to channels in an internals struct, which is nil-ed out on close and checked in each call site, ensuring that nothing is attempting to write to a closed channel.


This change is Reviewable

@CLAassistant
Copy link

CLAassistant commented Oct 8, 2021

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@josephschorr
Copy link
Author

josephschorr commented Oct 8, 2021

One open question: Should we still try to close out the setBuf channel post Close, perhaps in a goroutine, for full cleanup?

@josephschorr
Copy link
Author

I can't see the TeamCity build results (no access), but go test ./... passes locally, for reference.

@danielmai
Copy link
Contributor

Thanks for the PR @josephschorr. CI is failing with a data race.

==================
WARNING: DATA RACE
Read at 0x00c0001df0d0 by goroutine 90:
  github.com/dgraph-io/ristretto.(*Cache).setInternal()
      /home/agent/work/7f32719a4b7aa00/cache.go:296 +0x5d
  github.com/dgraph-io/ristretto.(*Cache).SetWithTTL()
      /home/agent/work/7f32719a4b7aa00/cache.go:281 +0x18f
  github.com/dgraph-io/ristretto.(*Cache).Set()
      /home/agent/work/7f32719a4b7aa00/cache.go:273 +0x13e
  github.com/dgraph-io/ristretto.TestConcurrentSetAfterClose.func1()
      /home/agent/work/7f32719a4b7aa00/cache_test.go:240 +0x128

Previous write at 0x00c0001df0d0 by goroutine 79:
  github.com/dgraph-io/ristretto.(*Cache).Close()
      /home/agent/work/7f32719a4b7aa00/cache.go:427 +0xcf
  github.com/dgraph-io/ristretto.TestConcurrentSetAfterClose()
      /home/agent/work/7f32719a4b7aa00/cache_test.go:260 +0x2d1
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1050 +0x1eb

Goroutine 90 (running) created at:
  github.com/dgraph-io/ristretto.TestConcurrentSetAfterClose()
      /home/agent/work/7f32719a4b7aa00/cache_test.go:233 +0x20a
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1050 +0x1eb

Goroutine 79 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1095 +0x537
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:1339 +0xa6
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1050 +0x1eb
  testing.runTests()
      /usr/local/go/src/testing/testing.go:1337 +0x594
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:1252 +0x2ff
  main.main()
      _testmain.go:200 +0x223
==================
--- FAIL: TestConcurrentSetAfterClose (0.20s)
    testing.go:965: race detected during execution of test
FAIL
FAIL	github.com/dgraph-io/ristretto	64.962s
?   	github.com/dgraph-io/ristretto/contrib/demo	[no test files]
?   	github.com/dgraph-io/ristretto/contrib/memtest	[no test files]
ok  	github.com/dgraph-io/ristretto/sim	0.032s
ok  	github.com/dgraph-io/ristretto/z	95.954s
ok  	github.com/dgraph-io/ristretto/z/simd	0.029s
FAIL

Try running the tests with the -race flag (that's what ./test.sh runs).

go test -race ./...

@josephschorr
Copy link
Author

D'oh... I forgot the race detector doesn't run automatically. Will address now.

@josephschorr
Copy link
Author

josephschorr commented Oct 8, 2021

@danielmai So looking more deeply, the race is more or less the "expected behavior" on that variable, which is why I keep a local copy. I can easily fix this by adding a mutex guard around accessing it, but that does mean that there will be some slight slowdown on all the function calls.

Thoughts?

Edit: Decided to add it for safety

Before this change, calling Close on the cache while a Set call is in progress could occasionally result in a panic when the Set method tried to write to the (now closed) `setBuf` channel. This commit changes Cache and Policy to keep references to the written-to channels in an `internals` struct, which is nil-ed out on close and checked in each call site, ensuring that nothing is attempting to write to a closed channel.
@danielmai
Copy link
Contributor

I'll defer to @manishrjain to chime in for this PR.

@josephschorr
Copy link
Author

@manishrjain Thoughts on the use of a lock vs, say, a compare and swap?

In terms of timing, we're trying to decide whether to have SpiceDB make use of my branch or wait for this to merge and release, so guidance in that area would also be appreciated.

@manishrjain
Copy link
Contributor

Hey @josephschorr , for now, my recommendation would be to use your fork. Meanwhile, I'll try to find some time to review this PR and understand it's performance implications.

josephschorr added a commit to authzed/spicedb that referenced this pull request Oct 14, 2021
josephschorr added a commit to authzed/spicedb that referenced this pull request Oct 14, 2021
… the parser

Requires dgraph-io/ristretto#286 before it can be rebased and merged

Signed-off-by: Joseph Schorr <josephschorr@users.noreply.github.com>
josephschorr added a commit to authzed/spicedb that referenced this pull request Oct 14, 2021
…etto#286 is merged

Signed-off-by: Joseph Schorr <josephschorr@users.noreply.github.com>
josephschorr added a commit to authzed/spicedb that referenced this pull request Oct 14, 2021
…etto#286 is merged

Signed-off-by: Joseph Schorr <josephschorr@users.noreply.github.com>
josephschorr added a commit to authzed/spicedb that referenced this pull request Oct 14, 2021
… the parser

Depends upon dgraph-io/ristretto#286

Signed-off-by: Joseph Schorr <josephschorr@users.noreply.github.com>
josephschorr added a commit to authzed/spicedb that referenced this pull request Oct 14, 2021
…etto#286 is merged

Signed-off-by: Joseph Schorr <josephschorr@users.noreply.github.com>
@gaby
Copy link

gaby commented Dec 11, 2022

Any updates regarding this PR ?

@gaby
Copy link

gaby commented Jan 11, 2023

Friendly ping @josephschorr @manishrjain

@manishrjain
Copy link
Contributor

I'm no longer managing this repo. I have my own fork of ristretto that I manage and use.

https://github.com/outcaste-io/ristretto

@akon-dey akon-dey requested review from joshua-goldstein and removed request for manishrjain January 12, 2023 17:27
@akon-dey
Copy link

@josephschorr would you please accept the CLA and resolve the conflict. @joshua-goldstein, @pandeyshubham25, @matthewmcneely could one of you review the code and help with resolving the conflict. Also, let's make sure that the version is updated appropriately.

@josephschorr
Copy link
Author

@akon-dey I'd be happy to rebase my changes, but be aware we've moved to the github.com/outcaste-io/ristretto fork, which fixed the issue as well. This is partially why we stopped asking for updates on our fork containing this branch.

I'll see if I can get this rebased this evening

@gaby
Copy link

gaby commented Jan 14, 2023

@josephschorr I was wondering what's the reasoning for using the fork, is not like this repo was abandoned?

@josephschorr
Copy link
Author

@gaby This PR sat for quite a while, so we used our own fork during that time. Once github.com/outcaste-io/ristretto became available, we tested it for the fix, found it worked, and so moved to it rather than relying on a custom fork that would not get updates otherwise.

@gaby
Copy link

gaby commented Jan 14, 2023

@gaby This PR sat for quite a while, so we used our own fork during that time. Once github.com/outcaste-io/ristretto became available, we tested it for the fix, found it worked, and so moved to it rather than relying on a custom fork that would not get updates otherwise.

Makes sense, thanks for the feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants