Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce critical section on inserts #150

Open
2 tasks
Ngalstyan4 opened this issue Sep 16, 2023 · 8 comments
Open
2 tasks

Reduce critical section on inserts #150

Ngalstyan4 opened this issue Sep 16, 2023 · 8 comments
Labels
core Core Database

Comments

@Ngalstyan4
Copy link
Contributor

Currently we hold a lock on the header page of the index for much much longer than necessary on insertions.

  • Benchmark this and see if concurrent inserts will achieve significantly higher throughput if we move to more granular locking
  • Implement more granular locking- lock the header page to only create necessary new pages and then use fine grained locking for the rest of modifications
@Ngalstyan4 Ngalstyan4 added the core Core Database label Sep 16, 2023
@ezra-varady
Copy link
Collaborator

Could this be at play in #221?

@broccoliSpicy
Copy link
Contributor

broccoliSpicy commented Mar 28, 2024

after reading a few postgres source code and documentations,
GenericXLogRegisterBuffer(),
GenericXLogFinish()
I think we can UnlockReleaseBuffer(hdr_buf); right after hdr_page = GenericXLogRegisterBuffer(state, hdr_buf, LDB_GENERIC_XLOG_DELTA_IMAGE); since GenericXLogRegisterBuffer will copy the hdr_buf contents.

code:
insert.c
code diff

However, I couldn't see any performance improvement in this modified code during the concurrent insertion.

postgres is just not using the lantern extension concurrently

the python script for concurrent insertion(modified using Varik's https://gist.github.com/var77/dbf9b8c9845356d51a5588f6c7020cd1)

some primitive benchmark results:
the original code:
Concurrent insertion (10 concurrent insertion tasks)
image

image

Concurrent insertion (4 concurrent insertion tasks)

image image

After critical section shorten:
Concurrent insertion (10 concurrent insertion tasks)

image image

Concurrent insertion (4 concurrent insertion tasks):

image image

my postgres setting:
Screenshot 2024-03-28 at 5 56 37 PM

my data set:
wget https://storage.googleapis.com/lanterndata/openai/openai_wikipedia_emb_arr_100k.csv -O ~/wiki100.csv

there is something I am missing here

@Ngalstyan4
Copy link
Contributor Author

You have to be careful here, as the header must remain under lock as long as there is a chance we will modify it.
When we release the lock, someone else may come and modify it, in which case our copy becomes stale.

I think there still is lock contention around modifying blockmap pages. Consecutive inserts will try to modify consecutive blockmap pages resulting in lock contention similar to the one you removed in the diff above.

I have not looked into optimizing those since the whole concept of blockmaps will go away in a future release.

@broccoliSpicy
Copy link
Contributor

I was under the impression that the header is also a log structure, I guess this was wrong.
I actually don't know much blockmaps in lantern and postgres.

@Ngalstyan4
Copy link
Contributor Author

Thanks so much for looking into this, though!
Feel free to reach out to me via email at my name at lantern dot dev directly, in case there are ways we can unblock you as you go through the codebase

re: blockmaps:
It is a lantern concept and is not very well documented. There is a lantern index diagram here that has some details around it, in case it is useful.

@broccoliSpicy
Copy link
Contributor

Thanks so much for looking into this, though! Feel free to reach out to me via email at my name at lantern dot dev directly, in case there are ways we can unblock you as you go through the codebase

re: blockmaps: It is a lantern concept and is not very well documented. There is a lantern index diagram here that has some details around it, in case it is useful.

I actually don't have access to this diagram haha, that's fine

@Ngalstyan4
Copy link
Contributor Author

inline image below:

image

@broccoliSpicy
Copy link
Contributor

inline image below:

image

thanks so much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core Database
Projects
None yet
Development

No branches or pull requests

3 participants