New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature request: Option to disable cross encoder models #286
base: main
Are you sure you want to change the base?
Conversation
…t the models available need to be hosted on Sagemaker which increases cost significantly. Having an option to disable cross encoder models would be helpful while exploring the chatbot so that Sagemaker costs can be avoided. Added a config to enable/disable cross encoder models. Also added options to selected embedding models, so that sagemakerModels are not created automatically. Persisted enableSagemakerModels config so that it can be used directly instead of relying on sagemakerModels length.
Added basic feedback mechanism for responses generated by the chatbot. The feedbacks are stored in DynamoDB which can be queried to do analysis as required by admin users. In future we can add a UI page to display the feedbacks, but for now these are being stored and manual analysis would be required. The feedbacks are not adding to the learning of the chatbot.
This reverts commit 1c3b8ce.
@massi-ang could you please have a look? |
@bigadsoleiman I have resolved the merge conflicts in the latest commit. |
@azaylamba We have migrated to AppSync. This PR has conflicts. Could you please fix this? And than we can merge |
@spugachev I couldn't find any conflicts in the PR. I have merged the main branch. Note: I have not tested the changes after syncing with main due to some constraints with my AWS account. Any help in testing the changes is appreciated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On top of my comments on the changes, I would expect changes to the front end. If cross encoder models are not enabled the the menu should not be displayed.
Also, if cross encoder models are not selected, it should not be possible to enable hybrid search on the workspaces.
cli/magic-create.ts
Outdated
@@ -146,6 +146,18 @@ async function processCreateOptions(options: any): Promise<void> { | |||
}, | |||
initial: options.bedrockRoleArn || "", | |||
}, | |||
{ | |||
type: "multiselect", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be moved after the Enable RAG question and skipped if RAG is not enabled.
Modify to a boolean to enable embedding models via SM or not. There is no additional cost in serving all the models, hence choosing which ones to enable does not add any value.
cli/magic-create.ts
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer that Cross Encoders and Embedding models are enabled only if RAG has been enabled therefore questions about enabling and choosing these models are only asked if RAG has been enabled.
cli/magic-create.ts
Outdated
default: true, | ||
}; | ||
} | ||
config.rag.embeddingsModels = embeddingModels.filter((model) => answers.selectedEmbeddingModels.includes(model.name)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no additional cost in serving all the SM embedding models. Why should a user select a subset? Would prefer to have a flag to enable embedding models via SM Endpoints - if this is not selected, only embedding models via Bedrock will be available.
cross_encoder_model = genai_core.cross_encoder.get_cross_encoder_model( | ||
cross_encoder_model_provider, cross_encoder_model_name | ||
) | ||
if (config["rag"]["crossEncodingEnabled"]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this part of the code below and have a single if
statement
|
||
unique_items = sorted(unique_items, key=lambda x: x["score"], reverse=True) | ||
|
||
if (config["rag"]["crossEncodingEnabled"]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this if
statement
ret_items = list(filter(lambda val: val["score"] > threshold, unique_items))[ | ||
:limit | ||
] | ||
if len(ret_items) < limit: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this code has been removed?
lambda val: (val["vector_search_score"] or -1) > 0.5, | ||
unique_items, | ||
) | ||
filter(lambda val: val["vector_search_score"] > 0.5, unique_items) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this change related to making cross encoders optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regroup all the functionality related to cross encoders and have a single if
statement. Check also for changes that have nothing to do with making cross-encoders optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regroup all the functionality related to cross encoders and have a single if
statement. Check also for changes that have nothing to do with making cross-encoders optional.
@massi-ang It seems syncing with main has caused some unwanted changes, as the original changes were made prior to AppSync migration. I will work on this. |
Hybrid Search won't be available if cross encoding is not enabled.
@massi-ang I have addressed the review comments, please have a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why there are 2 settings (Cross Encoder / Embeddings) , since if you enable embeddings models via SM, you can get Cross Encoder for free.
cli/magic-create.ts
Outdated
}, | ||
}, | ||
{ | ||
type: "confirm", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see the point to have 2 different questions: if the users enables Embedding models via SM, then enabling Cross Encoders does not cost more. And it is not possible to enable Cross Encoders without enabling SM
cli/magic-create.ts
Outdated
default: true, | ||
}; | ||
config.rag.embeddingsModels = embeddingModels; | ||
if (answers.enableCrossEncoding && answers.enableSagemakerModels) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if enableSagemakerModels === true
you can deploy both embeddings and cross encoder models.
@massi-ang I understand your point that if we enable embeddings models via SM, we can get cross encoder for free. I kept the settings separate to have more control for cross encoding and to make the intent clear in the backend code. Excluding execution of cross encoding code based on sagemaker models can be a little confusing there. Please let me know if I am missing something. |
I think the configuration should be simple and meaningful for the user not the backend. You could think of an alternative naming for the parameter if you think that could create confusion, or better, you could create a function called |
@massi-ang I have removed the prompt for |
@massi-ang Would you be able to have a look at this again? |
Default embeddings models was not being set correctly. Also error was thrown related to suppression rules if sagemaker models were not enabled. Used props.config.llms.enableSagemakerModels config to add the NAG suppression rules.
@massi-ang Please let me know if more changes are required on this one. |
Issue #222
Description of changes: Currently cross encoder models are used to rank the search results but the models available need to be hosted on Sagemaker which increases cost significantly. Having an option to disable cross encoder models would be helpful while exploring the chatbot so that Sagemaker costs can be avoided.
Added a config to enable/disable embeddings via Sagemaker which in turn derives cross encoding models.
Persisted enableSagemakerModels config so that it can be used directly instead of relying on sagemakerModels length.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.