Fetching paginated data may return truncated results #111

Open
necabo opened this issue Oct 10, 2023 · 0 comments

Describe the bug
Fetching paginated data, e.g. users = umapi_client.UsersQuery(conn).all_results(), may return only a subset of all users.

To Reproduce
Steps to reproduce the behavior:
I was not able to reproduce this issue. It has occurred only once in 4+ years of running the aforementioned line of code every 30 minutes.

However, I do have a theory of what happened that I'll explain in more detail below.

Expected behavior
Fetching data using .all_results() should always return all data or raise an appropriate exception in error cases.
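
A minimal sketch of that contract, not umapi-client's actual code: fetch_all, fetch_page and PaginationError are hypothetical names, and the page dictionaries only mimic the lastPage/users fields that appear in the test case further down.

```python
class PaginationError(Exception):
    """Raised when a paginated fetch stops before the last page (hypothetical)."""


def fetch_all(fetch_page):
    """Collect every page or raise; never return a silently truncated list.

    fetch_page(page_number) is a hypothetical callable returning
    (status_code, body), where body mimics the API's JSON payload.
    """
    results = []
    page = 0
    while True:
        status, body = fetch_page(page)
        if status != 200:
            # Any non-200 status means the collected data is incomplete.
            raise PaginationError(f"page {page} failed with status {status}")
        results.extend(body["users"])
        if body["lastPage"]:
            return results
        page += 1
```

With this shape, a caller either gets the complete list or an exception it can use to abort the sync run.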

Environment (please complete the following information):

  • Python version: 3.7.3
  • umapi-client version: 3.0.1

Additional context
I'm using this client to synchronize users/accounts between an internal identity management system and Adobe. The sync is bi-directional in the sense that if users vanish on the Adobe side, they are deleted on our side and vice versa.

For some reason, on 2023-09-13 at 6pm umapi_client.UsersQuery(conn).all_results() returned exactly 2000 users (the current pagination page size). This led to 6000+ users being deleted on our side. During the next scheduled run half an hour later, the same call correctly returned all 8000+ users. Since those 6000+ users were now missing on our side, the sync scheduled their deletion on the Adobe side. A rather unfortunate situation, as one might imagine. While this is apparently a very rare issue, I consider it potentially harmful.

Reading the code, I suspect that the first call to the API correctly returned the first 2000 users and indicated that there were more pages to fetch, but the API then responded with status code 404 when the client tried to fetch the second page. query_multiple() specifically handles 404 responses by returning an empty result and True to indicate that the response was the last page, so the iteration would end without raising an error.
Unfortunately, I don't have debug-level umapi-client logs or detailed HTTP request/response logs to confirm this.
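
The suspected mechanism can be sketched in isolation. This is a simplified model under my own assumptions, not the library's real implementation: query_page and all_results here are hypothetical stand-ins, and responses is a hand-written list of (status, body) pairs.

```python
def query_page(responses, page):
    """Hypothetical stand-in for a single page request.

    Mirrors the behavior described above: a 404 is reported as an
    empty result together with last_page=True.
    """
    status, body = responses[page]
    if status == 404:
        return [], True
    return body["users"], body["lastPage"]


def all_results(responses):
    """Iterate pages until one of them claims to be the last page."""
    results, page, last_page = [], 0, False
    while not last_page:
        items, last_page = query_page(responses, page)
        results.extend(items)
        page += 1
    return results


responses = [
    (200, {"lastPage": False, "users": ["n1", "n2"]}),
    (404, None),  # failure on the second page
]
print(all_results(responses))  # ['n1', 'n2'], with no exception raised
```

In this model the caller cannot distinguish a 404 on a later page from a genuine end of data.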

Here's a test case illustrating what I believe happened:

def test_qm_user_iterate_partial_404(mock_connection_params):
    with mock.patch("umapi_client.connection.requests.Session.get") as mock_get:
        mock_get.side_effect = [MockResponse(200, {"result": "success",
                                                   "lastPage": False,
                                                   "users": [{"name": "n1", "type": "user"},
                                                             {"name": "n2", "type": "user"}]}),
                                MockResponse(404, text="404 Object not found"),

                                # NOTE: this is the expected response fetching the second page
                                # MockResponse(200, {"result": "success",
                                #                    "lastPage": True,
                                #                    "users": [{"name": "n3", "type": "user"},
                                #                              {"name": "n4", "type": "user"}]})
                                ]
        conn = Connection(**mock_connection_params)
        qm = QueryMultiple(conn, "user")
        assert qm.all_results() == [{"name": "n1", "type": "user"},
                                    {"name": "n2", "type": "user"},

                                    # NOTE: these last two user entries are truncated
                                    # failing the test case
                                    {"name": "n3", "type": "user"},
                                    {"name": "n4", "type": "user"}]