Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential corruption bug in Badger indexes #34

Open
KyleMaas opened this issue Jan 24, 2023 · 0 comments
Open

Potential corruption bug in Badger indexes #34

KyleMaas opened this issue Jan 24, 2023 · 0 comments

Comments

@KyleMaas
Copy link

So, overall, the Badger indexes seem to be the best guarded against race conditions of any I've seen. However, there's one spot where I can see a possibility for data corruption.

Set() uses a batch queue for writes:

func (idx *index) Set(ctx context.Context, addr indexes.Addr, v interface{}) error {
var (
raw []byte
err error
)
if m, ok := v.(encoding.BinaryMarshaler); ok {
raw, err = m.MarshalBinary()
if err != nil {
return fmt.Errorf("error marshaling value using custom marshaler: %w", err)
}
} else {
raw, err = json.Marshal(v)
if err != nil {
return fmt.Errorf("error marshaling value using json marshaler: %w", err)
}
}
idx.l.Lock()
defer idx.l.Unlock()
dbKey := append(idx.keyPrefix, addr...)
batchedOp := setOp{
addr: dbKey,
val: raw,
}
idx.nextbatch = append(idx.nextbatch, batchedOp)
if n := uint32(len(idx.nextbatch)); n > idx.batchFullLimit {
err = idx.flushBatch()
if err != nil {
return fmt.Errorf("failed to write big batch (%d): %w", n, err)
}
}
obv, ok := idx.obvs[addr]
if ok {
err = obv.Set(v)
if err != nil {
return fmt.Errorf("error setting value in observable: %w", err)
}
}
return nil
}

While SetSeq() does not:

func (idx *index) SetSeq(seq int64) error {
idx.l.Lock()
defer idx.l.Unlock()
dbKey := append(idx.keyPrefix, currentSeqAddr...)
raw := make([]byte, 8)
binary.BigEndian.PutUint64(raw, uint64(seq))
err := idx.db.Update(func(txn *badger.Txn) error {
err := txn.Set(dbKey, raw)
if err != nil {
return fmt.Errorf("error during setSeq update: %w", err)
}
return nil
})
if err != nil {
return err
}
idx.curSeq = seq
return nil
}

Since Set() keeps an in-memory state for the things it puts in the queue, which gets used first in Get() before attempting to read from Badger, it's unlikely this would cause a race condition. But if the program is terminated after a SetSeq() and before the batch is flushed, due to crash for example, the index could contain a sequence number which surpasses the actual data in the index.

And we would likely never see this in testing because of "nasty testing hack" #33 bypassing the write queue for the test suite. It would probably only show up downstream in actual use. So we should probably be doing some more testing with the nasty testing hack disabled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant