perf(query): Read just the latest value for scalar types #8966

Open: harshil-goel wants to merge 17 commits into main from harshil-goel/single_key

@harshil-goel (Contributor) commented Aug 22, 2023

Description: Currently, every read loads the entire posting list and uses it to populate the in-memory List structure, which has two layers. These two layers are then merge-sorted, and all of this takes a lot of time. This diff updates the query processor to read only the latest value for scalar types.

LDBC test for IC09.
Before: 35 seconds
After: 26 seconds
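
As a rough illustration of the idea in the description, the sketch below contrasts a merge-and-sort read with a single pass that tracks only the newest value. It is a toy model under stated assumptions: the `version` type and the `readAll` and `readLatest` functions are hypothetical names invented for this example and do not reflect Dgraph's actual List structure or posting layout.

```go
package main

import (
	"fmt"
	"sort"
)

// version is a hypothetical stand-in for one stored value of a scalar
// predicate. It is NOT Dgraph's pb.Posting.
type version struct {
	commitTs uint64 // commit timestamp of the write
	value    string
}

// readAll models the slow path: combine both layers, sort by timestamp,
// and only then pick the newest entry.
func readAll(immutable, mutable []version) (string, bool) {
	merged := make([]version, 0, len(immutable)+len(mutable))
	merged = append(merged, immutable...)
	merged = append(merged, mutable...)
	if len(merged) == 0 {
		return "", false
	}
	sort.Slice(merged, func(i, j int) bool {
		return merged[i].commitTs < merged[j].commitTs
	})
	return merged[len(merged)-1].value, true
}

// readLatest models the fast path: one scan that remembers the newest
// entry, without building or sorting a merged list.
func readLatest(immutable, mutable []version) (string, bool) {
	var best version
	found := false
	for _, layer := range [][]version{immutable, mutable} {
		for _, v := range layer {
			if !found || v.commitTs > best.commitTs {
				best, found = v, true
			}
		}
	}
	return best.value, found
}

func main() {
	immutable := []version{{commitTs: 1, value: "a"}, {commitTs: 3, value: "b"}}
	mutable := []version{{commitTs: 5, value: "c"}}
	slow, _ := readAll(immutable, mutable)
	fast, _ := readLatest(immutable, mutable)
	fmt.Println(slow == fast, fast)
}
```

Both functions return the same value when timestamps are distinct; the second avoids the allocation and the sort, which is the kind of saving the description is aiming for.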

@dgraph-bot added the area/testing, area/core, and go labels Aug 22, 2023
@harshil-goel changed the title from "Read just the lastest value for atomic types" to "Read just the latest value for scalar types" Sep 6, 2023
@harshil-goel force-pushed the harshil-goel/single_key branch 3 times, most recently from 8e0dc79 to 191f210 on September 15, 2023 09:08
@harshil-goel changed the base branch from main to harshil-goel/algo-bin September 15, 2023 09:12

@jairad26 left a comment

Added inline comments.

posting/lists.go Outdated

pl, err := getPostings()
if err == badger.ErrKeyNotFound {
err = nil

return empty posting with nil error

worker/task.go Outdated
@@ -376,6 +376,9 @@ func (qs *queryState) handleValuePostings(ctx context.Context, args funcArgs) er

outputs := make([]*pb.Result, numGo)
listType := schema.State().IsList(q.Attr)
hasLang := schema.State().HasLang(q.Attr)
getMultiplePosting := q.DoCount || q.ExpandAll || listType || hasLang
//getMultiplePosting := true

remove commented line

@@ -195,6 +195,59 @@ func (lc *LocalCache) getInternal(key []byte, readFromDisk bool) (*List, error)
return lc.SetIfAbsent(skey, pl), nil
}

func (lc *LocalCache) GetSinglePosting(key []byte) (*pb.PostingList, error) {

Since it's a public function, could you add a description of what this function does and why it's necessary? Basically, what you explained yesterday, written down, would be helpful for future reference.

posting/lists.go Outdated
return pl, nil
}
}
lc.RUnlock()

Why not defer the unlock until the end of the function?
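
One way to reconcile `defer` with a short critical section is to move the locked lookup into its own small function, so the deferred unlock runs before any slower work begins. The sketch below assumes a toy cache; the `cache` type and the `lookup` and `get` methods are hypothetical and are not the PR's `LocalCache` code.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a hypothetical, simplified cache guarded by an RWMutex.
type cache struct {
	sync.RWMutex
	data map[string]string
}

// lookup holds the read lock only for the map access. The deferred
// RUnlock runs on every return path of this small function.
func (c *cache) lookup(key string) (string, bool) {
	c.RLock()
	defer c.RUnlock()
	v, ok := c.data[key]
	return v, ok
}

// get checks the cache first and falls back to the loader. No lock is
// held while the (potentially slow) loader runs.
func (c *cache) get(key string, load func(string) (string, error)) (string, error) {
	if v, ok := c.lookup(key); ok {
		return v, nil
	}
	return load(key)
}

func main() {
	c := &cache{data: map[string]string{"k": "cached"}}
	v, err := c.get("k", func(string) (string, error) { return "loaded", nil })
	fmt.Println(v, err)
}
```

Deferring the unlock at the top of a long function would instead hold the read lock across the fallback path, which may be why an explicit unlock was used in the diff; that is a guess, not something the thread confirms.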

@harshil-goel force-pushed the harshil-goel/single_key branch 2 times, most recently from d572cae to 29cb753 on September 29, 2023 14:25
worker/task.go Outdated
@@ -376,6 +376,9 @@ func (qs *queryState) handleValuePostings(ctx context.Context, args funcArgs) er

outputs := make([]*pb.Result, numGo)
listType := schema.State().IsList(q.Attr)
hasLang := schema.State().HasLang(q.Attr)
getMultiplePosting := q.DoCount || q.ExpandAll || listType || hasLang

add some explanation
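
A possible shape for the requested explanation, sketched as a standalone predicate with a doc comment. The `queryFlags` type and `needsFullList` function are hypothetical, and the rationale in the comments is an assumption drawn from the flag names, not something stated in the PR.

```go
package main

import "fmt"

// queryFlags is a hypothetical bundle of the four inputs that the
// diff combines into getMultiplePosting.
type queryFlags struct {
	doCount   bool // mirrors q.DoCount
	expandAll bool // mirrors q.ExpandAll
	isList    bool // mirrors schema.State().IsList(q.Attr)
	hasLang   bool // mirrors schema.State().HasLang(q.Attr)
}

// needsFullList reports whether the whole posting list must be read.
// Assumed rationale: counting and expand-all need every posting, a list
// predicate can hold several values, and a language-tagged predicate
// can hold one value per language. Only when none of these apply is it
// safe to read just the latest value.
func needsFullList(f queryFlags) bool {
	return f.doCount || f.expandAll || f.isList || f.hasLang
}

func main() {
	fmt.Println(needsFullList(queryFlags{}))
	fmt.Println(needsFullList(queryFlags{hasLang: true}))
}
```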

require.NoError(t, err)
checkValue(t, ol, string(k.Postings[0].Value), uint64(j))
}

nit: remove newline

@harshil-goel force-pushed the harshil-goel/single_key branch 2 times, most recently from 80102bb to e893711 on October 12, 2023 12:20
@harshil-goel changed the title from "Read just the latest value for scalar types" to "perf(query): Read just the latest value for scalar types" Oct 12, 2023
mangalaman93 previously approved these changes Oct 13, 2023
Base automatically changed from harshil-goel/algo-bin to main October 13, 2023 14:28
@harshil-goel dismissed mangalaman93’s stale review October 13, 2023 14:28

The base branch was changed.

Labels: area/core (internal mechanisms), area/testing (Testing related issues), go (Pull requests that update Go code)
4 participants