DynamoDBContext.FromScanAsync scans more than the provided limit before returning a result #3054

Open
JManou opened this issue Sep 20, 2023 · 4 comments
Labels
bug This issue is a bug. dynamodb needs-investigation p2 This is a standard priority issue


JManou commented Sep 20, 2023

Describe the bug

I have the following code that should return the first 30 items, but it takes a few minutes to return the result.
Is there another way to specify the max items?

var search = DbContext!.FromScanAsync<DynamoPropertyRecord>(
    new ScanOperationConfig()
    {
        Limit = 30,
        AttributesToGet = new List<string>() { "ShortId", "Content" },
        Select = SelectValues.SpecificAttributes
    },
    _config);

var results = await search.GetRemainingAsync();

Expected Behavior

The result should be returned, and the scan should stop, once the limit is reached.

Current Behavior

It looks like it scans the whole table and returns the last 30 items. Looking at the code of DocumentSearch, the scan continues as long as there is a LastEvaluatedKey. Shouldn't the GetRemainingAsync method take the configured limit into account? Using GetNextSetAsync doesn't seem intuitive.

Reproduction Steps

var search = DbContext!.FromScanAsync<DynamoPropertyRecord>(
    new ScanOperationConfig()
    {
        Limit = 30,
        AttributesToGet = new List<string>() { "ShortId", "Content" },
        Select = SelectValues.SpecificAttributes
    },
    _config);

var results = await search.GetRemainingAsync();

Possible Solution

No response

Additional Information/Context

No response

AWS .NET SDK and/or Package version used

AWSSDK.DynamoDBv2 3.7.201.12

Targeted .NET Platform

.NET 7

Operating System and version

Windows 11

@JManou JManou added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Sep 20, 2023
@ashishdhingra ashishdhingra added needs-reproduction This issue needs reproduction. dynamodb and removed needs-triage This issue or PR still needs to be triaged. labels Sep 20, 2023
@ashishdhingra ashishdhingra self-assigned this Sep 20, 2023
Contributor

ashishdhingra commented Sep 25, 2023

@JManou Good afternoon.

Looking at the DynamoDB API Reference:

  • Scan, which mentions that "The Scan operation returns one or more items and item attributes by accessing every item in a table or a secondary index. To have DynamoDB return fewer items, you can provide a FilterExpression operation." Please confirm whether you have tried using the FilterExpression property of ScanOperationConfig.

  • The Limit parameter, which is described as the maximum number of items to evaluate (not necessarily the number of matching items).

Are you noticing different behavior when using other SDKs or the AWS CLI?

Thanks,
Ashish

@ashishdhingra ashishdhingra added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed needs-reproduction This issue needs reproduction. labels Sep 25, 2023
@ashishdhingra ashishdhingra removed their assignment Sep 25, 2023
Author

JManou commented Sep 25, 2023

@ashishdhingra When using the AWS CLI, I can specify --max-items to get a limited number of items without doing any filtering. Is it possible to achieve the same behavior using the SDK?

Contributor

ashishdhingra commented Sep 25, 2023

> @ashishdhingra When using the AWS CLI, I can specify --max-items to get a limited number of items without doing any filtering. Is it possible to achieve the same behavior using the SDK?

@JManou Thanks for your response. Kindly advise on the following:

  • How did you conclude that the AWS SDK for .NET is not stopping after evaluating the number of items specified by the Limit property?
  • Do you see the delay only the first time DynamoDBContext.FromScanAsync() is executed, or every time this call is made?
  • Have you tried using AmazonDynamoDBClient.Scan (ScanRequest)? Is the behavior different?

Thanks,
Ashish
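
For readers following along, the low-level comparison suggested above could be sketched roughly as follows. This is a minimal sketch, not code from the thread: the class name, method name, and `tableName` parameter are hypothetical, and the `#id`/`#content` placeholders are only an assumed way of projecting the two attributes from the report. It issues a single Scan request, so DynamoDB evaluates at most `limit` items for it and no further pages are fetched.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;

// Hypothetical helper, for illustration only.
public static class LowLevelScanSketch
{
    // Sends exactly one Scan request and returns whatever that single page contains.
    public static async Task<List<Dictionary<string, AttributeValue>>> ScanOnePageAsync(
        IAmazonDynamoDB client, string tableName, int limit)
    {
        var request = new ScanRequest
        {
            TableName = tableName,
            // Upper bound on the items evaluated for this one request.
            Limit = limit,
            // Project only the two attributes used in the report; placeholders
            // avoid any clash with DynamoDB reserved words.
            ProjectionExpression = "#id, #content",
            ExpressionAttributeNames = new Dictionary<string, string>
            {
                { "#id", "ShortId" },
                { "#content", "Content" }
            }
        };

        // No loop over LastEvaluatedKey here, so the scan stops after one page.
        ScanResponse response = await client.ScanAsync(request);
        return response.Items;
    }
}
```

If this call returns quickly while `GetRemainingAsync` does not, that would be consistent with the delay coming from `GetRemainingAsync` following `LastEvaluatedKey` across the whole table rather than from the first request itself.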

@ashishdhingra ashishdhingra added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. labels Sep 25, 2023
Author

JManou commented Sep 27, 2023

@ashishdhingra

From what I understand from the docs, Limit is used for batching/paginating the scan operation. Maybe a new property could be introduced in ScanOperationConfig to support --max-items, as in the CLI:
https://docs.aws.amazon.com/cli/latest/reference/dynamodb/scan.html
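
Until something like that exists, a `--max-items`-style cap can be approximated on the client side. The sketch below is an assumption-laden illustration, not an SDK feature: `AsyncSearchExtensions` and `TakeAsync` are hypothetical names, and it relies only on `AsyncSearch<T>.GetNextSetAsync` and `AsyncSearch<T>.IsDone` as exposed by the 3.7.x package named in the report. It pulls one page at a time and stops as soon as enough items have been collected.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Amazon.DynamoDBv2.DataModel;

// Hypothetical extension, for illustration only; not part of the AWS SDK.
public static class AsyncSearchExtensions
{
    // Collects at most maxItems results, fetching pages only while more are needed.
    public static async Task<List<T>> TakeAsync<T>(
        this AsyncSearch<T> search,
        int maxItems,
        CancellationToken cancellationToken = default)
    {
        if (maxItems < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxItems));
        }

        var results = new List<T>();

        // Stop when the cap is reached or when DynamoDB reports no more pages.
        while (results.Count < maxItems && !search.IsDone)
        {
            List<T> page = await search.GetNextSetAsync(cancellationToken);

            foreach (T item in page)
            {
                if (results.Count == maxItems)
                {
                    break; // Surplus items from the final page are dropped.
                }
                results.Add(item);
            }
        }

        return results;
    }
}
```

With the code from the report, usage would look like `var results = await search.TakeAsync(30);` in place of `GetRemainingAsync()`. Keeping `Limit = 30` in the `ScanOperationConfig` still bounds the size of each page, so without a filter the loop would normally finish after a single request.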

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Sep 28, 2023
@ashishdhingra ashishdhingra added p2 This is a standard priority issue needs-investigation labels Oct 16, 2023