
bigcache GetBig/SetBig methods correctness bug #79

Open
sivukhin opened this issue Mar 14, 2024 · 1 comment
Description

There exist distinct keys and distinct values such that the sequence of operations SetBig(k1, v1), SetBig(k2, v2), GetBig(k1) returns v2 for the final GetBig call. This is clearly unexpected cache behaviour, and it is purely a result of the bigcache implementation.

This behaviour should either be fixed or clearly documented in the code (and perhaps the public method names should explicitly reflect the guarantees).

Details

The fastcache methods for storing large payloads rely heavily on xxhash digests. More precisely, SetBig(k, v) puts the following entries into the underlying cache:

  • key: k, value: xxhash64(v) || len(v)
  • key: xxhash64(v) || 0, value: v_block_0
  • ...
  • key: xxhash64(v) || n, value: v_block_n

As xxhash is not a cryptographically strong hash function, it is fairly easy to craft several different byte sequences of the same length with the same digest. In that case the cache misbehaves badly: SetBig(k2, v2) can overwrite the value previously written by SetBig(k1, v1).
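The entry layout above can be sketched as follows. This is a minimal illustration of how a digest-derived block key collides for two different values with the same xxhash64 digest; the function name and byte order are hypothetical and do not mirror fastcache's internals exactly.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// subKey illustrates the per-block key layout described above:
// the 8-byte xxhash64 digest of the value followed by the block index.
// Hypothetical helper, not the actual fastcache implementation.
func subKey(digest uint64, blockIdx uint64) []byte {
	k := make([]byte, 16)
	binary.LittleEndian.PutUint64(k[:8], digest)
	binary.LittleEndian.PutUint64(k[8:], blockIdx)
	return k
}

func main() {
	// Two different values whose digests collide yield identical block
	// keys, so the second SetBig overwrites the first value's blocks.
	a := subKey(0xdeadbeef, 0) // blocks of v1
	b := subKey(0xdeadbeef, 0) // blocks of v2 with the same digest
	fmt.Println(string(a) == string(b))
}
```

Note that the user key k never appears in the block keys, which is exactly why a digest collision between unrelated values is enough to cross-contaminate entries.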

Consider the following test, which was created by reversing the xxhash function (more specifically, a single round can be reversed, so for any two different payloads b1 and b2 we can append 32 bytes to each of them to make the internal xxhash state identical in both cases):

func TestBigCacheCorrectness(t *testing.T) {
	// value1 and value2 are different byte slices of the same length,
	// crafted so that xxhash64(value1) == xxhash64(value2).
	key1 := []byte("1")
	value1 := []byte{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x91, 0x45, 0x91, 0x21, 0x81, 0x13, 0x26, 0x82, 0x72, 0x46, 0x9d, 0xa8, 0x3c, 0xc3, 0x6e, 0xbd, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xe1, 0x0, 0xc, 0x87, 0xbb, 0xaf, 0x48, 0x3b}

	key2 := []byte("2")
	value2 := []byte{0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x7, 0x16, 0x7f, 0xf7, 0x39, 0xb5, 0x33, 0xdd, 0xe8, 0x16, 0x8b, 0xfe, 0x99, 0x64, 0xec, 0x7b, 0x76, 0xd0, 0xed, 0xd5, 0xb8, 0xa1, 0xd, 0x5b, 0x57, 0xd1, 0xf9, 0x5c, 0x74, 0x51, 0x56, 0x96}

	cache := New(1)
	cache.SetBig(key1, value1)
	// Because the digests collide, this call overwrites value1's blocks.
	cache.SetBig(key2, value2)
	if value := cache.GetBig(nil, key1); value[0] != value1[0] {
		t.Fatalf("key=%v, got=%v, expected=%v", string(key1), value, value1)
	}
	if value := cache.GetBig(nil, key2); value[0] != value2[0] {
		t.Fatalf("key=%v, got=%v, expected=%v", string(key2), value, value2)
	}
}
/* 
=== RUN   TestBigCacheCorrectness
    fastcache_test.go:70: key=1, got=[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 7 22 127 247 57 181 51 221 232 22 139 254 153 100 236 123 118 208 237 213 184 161 13 91 87 209 249 92 116 81 86 150], expected=[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 145 69 145 33 129 19 38 130 114 70 157 168 60 195 110 189 0 0 0 0 0 0 0 0 225 0 12 135 187 175 72 59]
--- FAIL: TestBigCacheCorrectness (0.00s)
*/

Unfortunately, I don't know of an easy fix for this issue. One option is to use k || block_index as the block key instead of xxhash64(v) || block_index, which provides stronger guarantees (clashes between different keys become impossible under this scheme), but there would still be a problem with concurrent updates of a single key, which can produce a "mix" of values from different calls. This change may also degrade performance slightly, although keys are usually small, so the cost should be negligible.
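The alternative scheme suggested above can be sketched like this. The helper name is hypothetical; it only demonstrates that deriving block keys from the user key (rather than from the value's digest) makes cross-key clashes impossible by construction.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// blockKey derives the per-block key from the user key k plus the block
// index, instead of from the value's xxhash digest. Hypothetical helper,
// not part of the fastcache API.
func blockKey(k []byte, blockIdx uint64) []byte {
	out := make([]byte, len(k)+8)
	copy(out, k)
	binary.LittleEndian.PutUint64(out[len(k):], blockIdx)
	return out
}

func main() {
	// Distinct user keys now map to distinct block keys regardless of
	// what the values hash to.
	k1 := blockKey([]byte("1"), 0)
	k2 := blockKey([]byte("2"), 0)
	fmt.Println(string(k1) != string(k2))
}
```

As written, variable-length keys could still alias (a key that happens to end in index-like bytes can collide with a shorter key plus its index suffix); prefixing the key with its length would disambiguate. This sketch does not address the concurrent-update problem mentioned above.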

At the very least, I suggest adding an explicit warning to the documentation, which would definitely help users of the library make correct decisions about fastcache usage, because this issue can be critical when fastcache stores sensitive data, e.g. session information for a web application or something similar.

@sivukhin sivukhin changed the title fastcache SetBig correctness bigcache GetBig/SetBig methods correctness issue Mar 14, 2024
@sivukhin sivukhin changed the title bigcache GetBig/SetBig methods correctness issue bigcache GetBig/SetBig methods correctness bug Mar 14, 2024
@sivukhin (Author)

CC @valyala
