Add in way to handle synonyms #37

Open
wants to merge 5 commits into
base: main

shjohnson
Contributor

@shjohnson shjohnson commented Mar 1, 2024

Jira link

https://transformuk.atlassian.net/browse/HOTT-4489

What?

This adds synonyms to the training data when generating the model. The synonym step runs after the search references and enriches the data, so that when we run queries against the model we have more data to match to commodity numbers.
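
To make the idea concrete, here is a minimal sketch of synonym-based enrichment. It is not the code in this PR: the `SYNONYMS` table, the function names `expand_description` and `enrich_training_data`, the `(description, commodity_code)` row shape, and the placeholder code `"0000000000"` are all assumptions made for illustration.

```python
import re

# Hypothetical synonym table; the PR's real synonym source is not shown here.
SYNONYMS = {
    "sofa": ["couch", "settee"],
    "trousers": ["pants"],
}


def expand_description(description, synonyms):
    """Return the original description plus one variant per synonym substitution."""
    variants = [description]
    for term, alternatives in synonyms.items():
        # Match the term as a whole word, ignoring case.
        pattern = re.compile(rf"\b{re.escape(term)}\b", flags=re.IGNORECASE)
        if not pattern.search(description):
            continue
        for alternative in alternatives:
            # A function replacement avoids escape handling in the replacement text.
            variants.append(pattern.sub(lambda _match, alt=alternative: alt, description))
    return variants


def enrich_training_data(rows, synonyms):
    """Expand each (description, commodity_code) row into one row per variant."""
    enriched = []
    for description, code in rows:
        for variant in expand_description(description, synonyms):
            enriched.append((variant, code))
    return enriched


if __name__ == "__main__":
    # "0000000000" is a placeholder, not a real commodity code.
    for row in enrich_training_data([("leather sofa", "0000000000")], SYNONYMS):
        print(row)
```

Each variant keeps the commodity code of the description it came from, so the model sees more phrasings per code without any new labels being introduced.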

@shjohnson shjohnson force-pushed the HOTT-4489-synonyms branch 3 times, most recently from b9e8d2e to b1cca0d Compare March 6, 2024 10:39
from training.synonym.synonym_expander import SynonymExpander


class EnhanceData:
Contributor

"Data" (in EnhanceData) sounds quite generic. What about EnhanceDescriptions or EnrichDescriptions?

Contributor Author

Yeah, that's fair, will have a think

Contributor Author

I still need to think about this actually 🤔

Contributor

@alexdesi alexdesi left a comment

It's really good.
To be honest, the expansion logic is not clear to me
(the code is clear; why we do it that way is not).
I've just added a few minor comments.

@shjohnson shjohnson force-pushed the HOTT-4489-synonyms branch 4 times, most recently from e6e94ac to 50c6a11 Compare March 8, 2024 16:10
@shjohnson shjohnson force-pushed the HOTT-4489-synonyms branch 2 times, most recently from 8a120bf to 0e12b73 Compare March 13, 2024 16:59
2 participants