Save indexes directly to Lambda Function #1

Open
rlingineni opened this issue Sep 30, 2018 · 5 comments
Labels
good first issue Good for newcomers optimization Doing this might make search faster

Comments

@rlingineni
Owner

You will have to update the indexing function to store the indexes directly in the S3 bucket where the Lambda function is stored.

It can take almost 2 seconds to fetch all the virtual indexes from S3. Since each file is only about 1 MB, saving the indexes directly onto the Lambda function would let us shave that time off.

Of course, this could make the architecture a bit dirty, but performance gains will be great.

@rlingineni added the "good first issue" and "optimization" labels on Sep 30, 2018
@rlingineni changed the title from "In-Memory Lambda Function Caching" to "Save indexes directly to Lambda Function" on Sep 30, 2018
@seriousme

Hi,

Another way to speed up the search might be to use S3 Select:
https://aws.amazon.com/blogs/aws/s3-glacier-select/

This would reduce the need to fetch whole indexes.

Cheers,
Hans
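
For reference, a rough sketch of the request parameters an S3 Select query takes (the `SelectObjectContent` operation). It only builds the parameter object and does not call AWS. The `buildSelectParams` helper, the JSON Lines storage format, and the `title` field are all assumptions made for illustration; the indexes in this project are not necessarily stored that way.

```javascript
// Builds parameters for S3's SelectObjectContent operation.
// Assumption: documents are stored as JSON Lines, one object per line,
// each with a hypothetical "title" field.
function buildSelectParams(bucket, key, term) {
  // Standard SQL convention: a single quote inside a string literal is doubled.
  // LIKE wildcards (% and _) inside the term are not escaped in this sketch.
  const escaped = term.replace(/'/g, "''");
  return {
    Bucket: bucket,
    Key: key,
    ExpressionType: "SQL",
    Expression: `SELECT s.* FROM S3Object s WHERE s.title LIKE '%${escaped}%'`,
    InputSerialization: { JSON: { Type: "LINES" } },
    OutputSerialization: { JSON: { RecordDelimiter: "\n" } },
  };
}
```

This filters raw records on the S3 side, which is a different thing from loading a prebuilt search index, so it would only help if the query logic were changed to work on records rather than on an index.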

@rlingineni
Owner Author

We would still have to load an entire index into the function's memory since we don't want just a subset of an index.

@seriousme

It might work for larger datasets, but then you would need to alter the query algorithm as well. Standard lunrjs would not be able to work with that.

@seriousme

btw: if updates are infrequent (e.g. only during nightly batches) and the index does not need to be super current, then you might include the index with the Lambda bundle, so that every time the index is updated a new version of the Lambda is deployed.
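
One way a nightly batch could decide whether a redeploy is needed is to hash the serialized index and compare it with the hash from the previous run. This is only a sketch: `stageIndex`, the `.sha256` sidecar file, and the bundle directory layout are hypothetical, and the actual deploy step is left out.

```javascript
const crypto = require("node:crypto");
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical build-step helper: writes the serialized index into the
// bundle directory and reports whether its content changed since last run.
function stageIndex(bundleDir, name, serializedIndex) {
  const json = JSON.stringify(serializedIndex);
  const hash = crypto.createHash("sha256").update(json).digest("hex");
  const indexPath = path.join(bundleDir, `${name}.json`);
  const hashPath = path.join(bundleDir, `${name}.sha256`);

  // Compare against the hash recorded by the previous run, if any.
  const previous = fs.existsSync(hashPath) ? fs.readFileSync(hashPath, "utf8") : null;
  if (previous === hash) return { changed: false, hash };

  fs.mkdirSync(bundleDir, { recursive: true });
  fs.writeFileSync(indexPath, json);
  fs.writeFileSync(hashPath, hash);
  return { changed: true, hash };
}
```

A batch job would call `stageIndex(...)` for each index and trigger a deployment only when at least one call returns `changed: true`.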

@rlingineni
Owner Author

Right, yeah, that's what I was thinking: upload it with the Lambda bundle. Even if updates were frequent, I don't think it would matter. It doesn't cost us anything to update Lambda functions, and in my experience a new bundle usually doesn't mean downtime.

As far as lunrjs goes, I agree, changes should be made to the core. There should be a way in lunrjs to load multiple indexes for a server-side use case.
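
Until something like that exists in lunrjs itself, querying several loaded indexes could be approximated outside the library. The sketch below is an assumption, not a lunrjs feature: `searchAll` is a hypothetical helper that expects each index to expose `search(query)` returning `{ ref, score }` objects, which is the shape lunr's `Index.search` results have.

```javascript
// Hypothetical helper: runs one query against several indexes and merges
// the hits into a single list ordered by score, highest first.
// `indexes` maps an index name to an object with a search(query) method.
function searchAll(indexes, query, limit = 10) {
  const merged = [];
  for (const [name, index] of Object.entries(indexes)) {
    for (const hit of index.search(query)) {
      merged.push({ index: name, ref: hit.ref, score: hit.score });
    }
  }
  merged.sort((a, b) => b.score - a.score);
  return merged.slice(0, limit);
}
```

Scores are computed per index, so ranking hits from different indexes against each other is only approximate. That is part of why proper support would need to live in the lunrjs core.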
