Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

commitlog recovery time #129

Open
tylertreat opened this issue Jun 5, 2018 · 7 comments
Open

commitlog recovery time #129

tylertreat opened this issue Jun 5, 2018 · 7 comments

Comments

@tylertreat
Copy link
Contributor

The recovery time of a commitlog is currently linear with the number of entries in the log. As a result, a log with many messages (e.g. millions) can take several seconds or more to recover. The reason is the index is rebuilt by reading back the log:

// SetupIndex creates and initializes an Index.
// Initialization is:
// - Sanity check of the loaded Index
// - Truncates the Index (clears it)
// - Reads the log file from the beginning and re-initializes the Index
func (s *Segment) SetupIndex() (err error) {

It's unclear to me what the purpose of truncating the index and rebuilding it is. AFAICT, it just initializes the segment's nextOffset and position. However, the index sanity check is already reading the last entry which we can recover nextOffset from. Is there another purpose for rebuilding the index?

@tylertreat
Copy link
Contributor Author

tylertreat commented Jun 6, 2018

So it appears one issue is this does not actually seem to do what it says:

jocko/commitlog/index.go

Lines 200 to 208 in 1c69c5c

//read last entry
entry := new(Entry)
if err := idx.ReadEntryAtFileOffset(entry, idx.position-entryWidth); err != nil {
return err
}
if entry.Offset < idx.baseOffset {
return ErrIndexCorrupt
}
return nil

The issue is that the memory-mapped index file is 10MB, so reading at idx.position-entryWidth actually reads an empty entry (since relEntry adds the base offset, the offset comparison passes). It seems this precludes the optimization I was thinking about. I wonder if we could just do a binary search to find the real last entry though...

@travisjeffery
Copy link
Owner

Yeah we shouldn't have to rebuild every time. Initially, I didn't and then someone hit a bug where the index wasn't setup properly and submitted a PR doing this to hack a fix and I left it to optimize later.

I can't quite remember what Kafka does here but binary searching for the last entry sounds good to me.

@tylertreat
Copy link
Contributor Author

I have it working with a binary search. Will submit a PR shortly.

@travisjeffery
Copy link
Owner

Awesome - you da man.

@tylertreat
Copy link
Contributor Author

@travisjeffery One snag I've run into: I'm using a fork of the commitlog package which I've modified index entries to also contain a Size field which is the size of the log message that I'm using for another project. This allowed me to find the first empty entry doing the following:

i := sort.Search(n, func(i int) bool {
	if err := idx.ReadEntryAtFileOffset(entry, int64(i*entryWidth)); err != nil {
		panic(err)
	}
	return entry.Position == 0 && entry.Size == 0
})

Of course, Jocko doesn't have the Size field. We could do something like the following:

i := sort.Search(n, func(i int) bool {
	if err := idx.ReadEntryAtFileOffset(entry, int64(i*entryWidth)); err != nil {
		panic(err)
	}
	return entry.Position == 0 && i > 0
})

However, the problem is handling the special case where the index has a single entry. We have no way of distinguishing between an empty index and an index with one entry in it since in both cases the first entry will have a Position of 0. In my case, the Size field is always non-zero for valid entries and 0 for "empty" entries, but we don't have that luxury here.

Any ideas?

@travisjeffery
Copy link
Owner

travisjeffery commented Jun 11, 2018

I think in NewIndex we can check if the index file exists and if it does assume the index is legit and that the length of the file marks the position of the next entry. And we can tell the number of entries based on this, cause if there is no entry the file will be 0 bytes, with 1 entry it'll be 8 bytes.

We gotta change the code in here: https://github.com/travisjeffery/jocko/blob/master/commitlog/index.go#L86-L89. Cause right now the index file itself is initialized to the full size, when we should leave index file as is and only initialize the mmap's length by passing the length into the MapRegion or MapAt apis.

@travisjeffery
Copy link
Owner

Not sure if the gommap lib which is pretty jank will support that though, which is what Kafka is able to do I think. Alternatively when the index is created we could write -1 as the first offset and overwrite that on the first entry.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants