Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution] Data Quality Dashboard new historical data retrieval API routes #181945

Closed
semd opened this issue Apr 29, 2024 · 10 comments
Closed

Comments

@semd
Copy link
Contributor

semd commented Apr 29, 2024

Description

Create new API routes for historical results so they can be retrieved in the UI:

We need 2 new routes:

  • Route to search results, it should allow filtering by a time range on the createdAtfield (confirm if other filters like pattern, index, createdBy are also needed). To render the results table.
  • Unitary historical "check result" retrieval by id. To render the "check result" detail page.

Current state

The GET /results route already exists, right now it returns the latest result of each index in a given pattern received in the request query parameters:

GET /internal/ecs_data_quality_dashboard/results?pattern=logs-*

This is needed to render the last "check result" on the main Data Quality Dashboard page. So, we'll need to cover the 3 types of historical "check result" retrieval.

Proposal

Add:

  • GET /results/id/:id -> Retrieves only one result by id
  • GET /results/search?startDate=[date]&endDate=[date] -> Retrieves multiple results. The time range parameters are required. If we have extra filters or the KQL bar on the historical result search screen, then we'll need to add optional parameters to this route.

Keep:

  • GET /results?pattern=[string] -> Retrieves the latest result of each index in the "pattern" parameter. Since this is an internal route, we can update the path to /results/latest?pattern=[string] to make it more consistent with the other two new routes.
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-threat-hunting (Team:Threat Hunting)

@elasticmachine
Copy link
Contributor

Pinging @elastic/security-threat-hunting-explore (Team:Threat Hunting:Explore)

@angorayc
Copy link
Contributor

angorayc commented Apr 29, 2024

Looking good! Thanks for writing these down. Some questions from my side:

  1. Does the current pattern in GET /internal/ecs_data_quality_dashboard/results?pattern=logs-* takes multiple patterns?
  2. Would we allow multiple patterns / indices for the new routes?
e.g. : 
patterns=logs-*,packetbeat-* 
indices: indices=my-logs-1,my-beats-1
  1. If the new api is going to take both patterns and indices as filters at the same time, how does it work when searching? is it and / or
e.g. :
filebeat-* and my-test-index
filebeat-* or my-test-index
  1. Would we allow filtering by incompatible field exists?

@semd
Copy link
Contributor Author

semd commented Apr 29, 2024

@angorayc

  1. No, the "latest" request only supports one index pattern, we trigger multiple simultaneous requests when the DQD page renders, one for each index pattern. We could allow multiple patterns but then we'd have to change the response format and group the results by index pattern.

  2. the pattern parameter won't make sense for the "id" request, but it would for the "search" request, which depends on the UI requirements, that would be part of the "extra filters" mentioned.

  3. We tend to use ands by default so we narrow the search with every filter, but if it makes more sense to use or because of a Product decision, that's easy to change.

  4. That's interesting, it would make a lot of sense from a UX perspective to be able to filter by "red" (incompatible), "yellow"(same family) or "green" (ok) results.

@kapral18
Copy link
Contributor

kapral18 commented Apr 29, 2024

@semd can you clarify

GET /results/search?startDate=[date]&endDate=[date] -> Retrieves multiple results. The time range parameters are required. If we have extra filters or the KQL bar on the historical result search screen, then we'll need to add optional parameters to this route.

Current /results returns a list of results, do you mean with this proposal that it will return a list of lists? basically every nested list being a date range list of results from a different index ? [{}, {}] -> [[{},{}],[{},{},{}]]?

@kapral18
Copy link
Contributor

@semd can you clarify

are pattern and index conceptually mutually exclusive in /results/search?

@semd
Copy link
Contributor Author

semd commented Apr 30, 2024

@kapral18

Current /results returns a list of results, do you mean with this proposal that it will return a list of lists? basically every nested list being a date range list of results from a different index ? [{}, {}] -> [[{},{}],[{},{},{}]]?

No, the current /results route will remain unchanged, I only suggested updating the path to /results/latest so it is more consistent with the 2 new routes paths (/results/search and /results/id/:id), and this way also denotes it returns the "latest" result of each index in the pattern parameter.

To clarify, this route returns a list of results, one (latest) result per index that matches the pattern, it is used by the existing page of the DQD, to display the latest check on each index, the other 2 routes will be needed for the new Historical results page.

The /results/search is a new API route, that will return a list of results that match the filter criteria. The only filter criteria @dhru42 confirmed is the time range, but we may also need to filter by other fields (index pattern, index name, result output...), it ultimately depends on the UI, but that's only speculation until we have the final design requirements, so let's focus on what we know (time range), and worry about the other filters later when we have the final specs (index pattern, or index name, or both). I hope this answers this:

are pattern and index conceptually mutually exclusive in /results/search?

About the nested list, I don't think the search API should make any decision on the field in which the search results need to be grouped, that decision is on the UI, we may need to show results grouped by a field other than the index name, so I think it's better to return all the results in a flat list and delegate the grouping to the UI.

Hope it makes sense.

@kapral18
Copy link
Contributor

kapral18 commented Apr 30, 2024

@semd thanks for your thorough response.

Sorry for confusing you with my initial question, after rereading it I noticed it didn't express what I had in mind. My question was actually not about /results but about shape of /results/search and whether it was implied to be of a specific shape. But you answered it already. And I agree with you, API by default should make little assumptions about response shape because UI requirements can change pretty often.

Got 2 questions and 1 suggestion:

  1. Just for my understanding new /results/search do you think conceptually we want to be able to search multiple patterns and filter down with every single new filter? like /results/search? pattern=index1*,index2*,index3&*&filter1=valu1&filter2=value2 ?

  2. Do we want search api to support pagination?

  3. What do you think about revamping /results to following format to be more RESTful and more intuitively understandable and encompass all aforementioned use cases:

  • use /results/:id instead of /results/id/:id because /results/id doesn't represent a resource, nor an operation, which means it's not cacheable and violates RESTful design.
  • make /results by default return all available historical results and not just the latest per index, because latest per index is already an application of a filter on top of a resource that's a collection of pure results, conceptually...
  • and finally use combination of query param filters to reproduce current behavior of latest per index, or range of historical results per index or whatever. For example with this proposed assumption to reproduce the current behavior of latest per index, we could do something like /results?index_limit=1&index_sort=latest while maintaining an ability to have sort and limit for the entire collection. Sort in this situation is ASC, DESC by date of result.

But of course this might not be applicable and I might miss a giant piece of picture, so take this with a jar of salt :D

@semd
Copy link
Contributor Author

semd commented Apr 30, 2024

@kapral18

  1. Just for my understanding new /results/search do you think conceptually we want to be able to search multiple patterns and filter down with every single new filter? like /results/search? pattern=index1*,index2*,index3&*&filter1=valu1&filter2=value2 ?

We still don't know what we want to be able to filter. I don't think right now we should assume we will have a multi-valuated index pattern input in the UI, and that we'll need to send those patterns to be filtered on the API side, that's a lot of assumptions. But, if we need to do it, I think the index pattern can be treated as any other field in the mapping, right?

  1. Do we want search api to support pagination?

Well, I don't discard paginating on the API as a possibility, but I won't do it from the start. Simply because we are querying DQD checks, they are done manually, and even if we implement the automatic check executions someday, customers probably won't have a significant number of executions/documents. It's much easier and faster to implement pagination on the UI for now, and if someday we see the need to improve performance we can consider API-side pagination.

What do you think about revamping /results to following format to be more RESTful and more intuitively understandable and encompass all aforementioned use cases

There are interesting ideas, but I don't think this issue is the correct place to discuss each one of them, let's discuss them synchronously with the team and come back here with the conclusions.

@semd
Copy link
Contributor Author

semd commented May 8, 2024

Close in favor of #182868

@semd semd closed this as completed May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants