-
-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a new index for BHA #34602
Add a new index for BHA #34602
Conversation
…t=5 -c case_search_bha:case-search-bha-2024-05-10
I'm particularly interested to see how this integrates with pillows and other places that write. We spoke offline about this a bit already, but I was imagining this to live within the standard case search adapter, so nothing else needs to know about writing to multiple places. The other thing I was thinking about while reviewing this is how this would work with more than one dedicated index. Like, in my mind it'd be ideal if there'd be a config option in DEDICATED_CASE_SEARCH_INDICES = {
# domain_name: index_name
'mydomain': 'case-search-mydomain',
'anotherdomain': 'case-search-anotherdomain',
} And then this could be used to direct writes as appropriate in the adapter. Otherwise it feels awfully boilerplate-y to have to make a replica adapter for each. We could also do likewise for reads (rather than the db based approach I floated in #34601), though I think we'll want the two independently configurable in any case. It would also be nice if there's a way to do likewise for the migrations framework. This index applies only to one environment, so we shouldn't build and write to it in other settings. I'm not certain how best to manage that, but one thing that sounds appealing is treating this as a second case search index, in the sense that all migrations applied to the main case search index are also applied to this one by default. Then you'd want an initial action (in |
…iplex deletes if required
this will be the case when we have called bulk_delete and passed it ids, bulk_delete calls bulk. We only use it at one place in HQ which is to delete case search data for a entire domain in HQ which I don't think we will do
Like @esoergel I was also expecting more of a settings approach rather than a new adapter but its possible that we would need a new adapter anyway because of how they are tied to migrations etc. Maybe we could create them dynamically though? |
Yeah that is correct atleast for the writes. ab2ba07 7013b42 7cb3b24 subsequently ovverrides index and bulk operations For deletes I ended up modifying pillows 813a664 5cb8706 as we already have domain information here.
I have kept a plug ab2ba07#diff-7ba2ac2cd9c42581b497ab0c7cb5ea77288d3a8d7e8583cfc2c35a4da9e063b6R196-R199 which can essentially return an adapter based on any condition that seem fit. I think we should use settings to figure out the adapter and And regarding having it for multiple dedicated indices, it should be quite possible by the approach but yes it will add the overhead of the boilerplate and I think it should be okay. My idea of these indices is as of SQL tables having their own separate model classes. As per @snopoke's point we can definitely add more tooling to auto-generate the adapter classes automatically but the question again is how often will it be used.
I have this on my radar and will implement it part of sepearte PR where our migration framework will support the option of having environment specific indices.
This is a great point. I have a vague idea of how it could be done without changing things in commcare-cloud. Will have to think more about it. |
@@ -70,20 +70,21 @@ def process_change(self, change): | |||
|
|||
if change.deleted and change.id: | |||
doc = change.get_document() | |||
domain = doc.get('domain') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How safe is this? Will domain
be None
sometimes? I thought we didn't get the docs on deletions and would have to pull from change meta.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not very sure about this change and I was also not very sure how to add tests for it. Any inputs here would be super helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the document that is to be indexed in ES, it should always have a domain. But based on your suggestion I have added 3417fef#diff-308591ed80f437c63d421731473742a6d8fec03d88c9410b46d1d39704dcc66dR73-R75
@@ -179,6 +179,8 @@ def _get_domain_from_doc(self, doc): | |||
`doc` can be CommcCareCase instance or dict. This util method extracts domain from doc. | |||
This will fail hard if domain is not present in doc. | |||
""" | |||
if not doc: | |||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to explicitly whitelist the the paths we know to be safe? This will result in skipping the multiplexing, so we should only do it if we know we can. Like if we know we're deleting the whole domain, then maybe we could skip the multiplexing logic (and delete the dedicated index through another means). That way we wouldn't accidentally skip writes if someone adds another code path that triggers this condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not actually sure if I can confidently say I am aware of all the paths. But if we delete a domain which is on sub-index then we will have a couple of manual steps to do first i.e stop writes on that index by updating the config and index deletion can be taken care of then rather than automating it?
Product Description
Initial work to create a new index for BHA based of case search index.
This might not be functional, have put the code out there to get initial feedback on the approach.
Technical Summary
Feature Flag
Safety Assurance
Safety story
Automated test coverage
QA Plan
Migrations
Rollback instructions
Labels & Review