Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make RAG embedding model dimension configurable #1241

Merged
merged 4 commits into from May 15, 2024
Merged

Conversation

usamimeri
Copy link
Contributor

@usamimeri usamimeri commented May 1, 2024

User description

For #1213 and #1239. When using text-embedding-3-large(the dimension is 3072) from OpenAI, or some other embedding models deployed by fastchat(e.g. bce-embedding-base_v1 the dimension is 512), it will cause assertion error due to mismatch between query and index.

Consequently,I think it's reasonable to make dimension configurable.

# fastchat API
embedding:
  api_type: "openai"
  base_url: "http://127.0.0.1:8000/v1/embeddings"
  model: "bce-embedding-base_v1"
  dimensions: 512

@usamimeri usamimeri closed this May 1, 2024
@codiumai-pr-agent-pro codiumai-pr-agent-pro bot added documentation Improvements or additions to documentation enhancement New feature or request labels May 1, 2024
Copy link

PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

PR Description updated to latest commit (5d2bcbd)

Copy link

PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

PR Review

⏱️ Estimated effort to review [1-5]

4, because the PR involves multiple complex changes across various files including configuration updates, API integrations, and provider implementations. The addition of a new model provider (Bedrock) and the handling of different model parameters increase the complexity. The PR also touches on sensitive areas like token counting and cost management which require careful review to ensure accuracy and efficiency.

🧪 Relevant tests

Yes

🔍 Possible issues

Possible Bug: The use of hardcoded "YOUR_MODEL_DIMENSIONS" in the example configurations in 'embedding_config.py' might not be clear to users on what values should be actually provided.

Configuration Error: The 'dimensions' field is added as a string in the examples within 'embedding_config.py', whereas it is treated as an integer in other parts of the application. This inconsistency could lead to runtime errors or misconfigurations.

Error Handling: The method '_get_completion_from_dict' in various provider classes does not handle cases where the expected keys might be missing in the response dictionary, which could lead to KeyError exceptions.

🔒 Security concerns

No


✨ Review tool usage guide:

Overview:
The review tool scans the PR code changes, and generates a PR review which includes several types of feedbacks, such as possible PR issues, security threats and relevant test in the PR. More feedbacks can be added by configuring the tool.

The tool can be triggered automatically every time a new PR is opened, or can be invoked manually by commenting on any PR.

  • When commenting, to edit configurations related to the review tool (pr_reviewer section), use the following template:
/review --pr_reviewer.some_config1=... --pr_reviewer.some_config2=...
[pr_reviewer]
some_config1=...
some_config2=...

See the review usage page for a comprehensive guide on using this tool.

Copy link

PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

PR Code Suggestions

CategorySuggestions                                                                                                                                                       
Maintainability
Use a common dictionary to store repeated token cost values.

To avoid duplication and potential errors in the future, consider using a dictionary to
store the common token costs and reference this dictionary when defining the costs for
each model.

metagpt/utils/token_counter.py [205-207]

-"amazon.titan-tg1-large": {"prompt": 0.0008, "completion": 0.0008},
-"amazon.titan-text-express-v1": {"prompt": 0.0008, "completion": 0.0008},
-"amazon.titan-text-express-v1:0:8k": {"prompt": 0.0008, "completion": 0.0008},
+common_cost = {"prompt": 0.0008, "completion": 0.0008}
+"amazon.titan-tg1-large": common_cost,
+"amazon.titan-text-express-v1": common_cost,
+"amazon.titan-text-express-v1:0:8k": common_cost,
 
Bug
Correct a typo in a dictionary key.

Correct the typo in the dictionary key from 'NOT_SUUPORT_STREAM_MODELS' to
'NOT_SUPPORT_STREAM_MODELS' to improve code readability and prevent potential bugs.

metagpt/provider/bedrock/utils.py [4]

-NOT_SUUPORT_STREAM_MODELS = {
+NOT_SUPPORT_STREAM_MODELS = {
 
Add error handling for splitting the model_id.

To avoid potential runtime errors from incorrect model names, add error handling for the
case where the model name does not include a version or is improperly formatted.

metagpt/provider/bedrock/bedrock_provider.py [115]

-provider, model_name = model_id.split(".")[0:2]  # meta、mistral……
+parts = model_id.split(".")
+if len(parts) < 2:
+    raise ValueError(f"Invalid model_id format: {model_id}")
+provider, model_name = parts[0:2]
 
Correct a typo in a variable name.

Correct the typo in the variable name NOT_SUUPORT_STREAM_MODELS to
NOT_SUPPORT_STREAM_MODELS for clarity and to avoid potential bugs due to misspelling.

tests/metagpt/provider/test_bedrock_api.py [6]

-NOT_SUUPORT_STREAM_MODELS
+NOT_SUPPORT_STREAM_MODELS
 
Add a check to prevent potential KeyError when accessing a dictionary.

Add a check to ensure config.embedding.api_type is in _embedding_type_to_dimensions before
accessing it to prevent potential KeyError.

metagpt/rag/schema.py [53]

-self.dimensions = self._embedding_type_to_dimensions.get(config.embedding.api_type, 1536)
+if config.embedding.api_type in self._embedding_type_to_dimensions:
+    self.dimensions = self._embedding_type_to_dimensions[config.embedding.api_type]
+else:
+    logger.info(f"API type {config.embedding.api_type} is not recognized. Defaulting to 1536 dimensions.")
+    self.dimensions = 1536
 
Best practice
Use a constant for default max tokens value.

Instead of using a hardcoded value for unsupported models' max tokens, consider defining a
constant at the module level to improve maintainability and readability.

metagpt/provider/bedrock/utils.py [111]

-max_tokens = 2048
+DEFAULT_MAX_TOKENS = 2048
+max_tokens = DEFAULT_MAX_TOKENS
 
Add a warning log when configured max tokens exceed the model's limit.

To ensure that the 'max_tokens' value does not exceed the model's limit, add a check to
log a warning if the configured 'max_token' is higher than the model's maximum. This helps
in debugging and maintaining correct flow control.

metagpt/provider/bedrock_api.py [83-84]

 if self.config.max_token > model_max_tokens:
+    logger.warning(f"Configured max_token {self.config.max_token} exceeds model's max {model_max_tokens}. Using model's max.")
     max_tokens = model_max_tokens
 
Rename a function to enhance code readability.

Use a more descriptive function name than is_subset to clarify its purpose, such as
is_request_body_subset.

tests/metagpt/provider/test_bedrock_api.py [59]

-def is_subset(subset, superset) -> bool:
+def is_request_body_subset(subset, superset) -> bool:
 
Enhancement
Use a dictionary to manage conditional logic for provider responses.

Replace the hardcoded provider names in mock_invoke_model_stream with a more scalable
approach using a dictionary lookup to avoid repetitive code and improve maintainability.

tests/metagpt/provider/test_bedrock_api.py [38-47]

-if provider == "amazon":
-    response_body_bytes = dict2bytes({"outputText": "Hello World"})
-elif provider == "anthropic":
-    response_body_bytes = dict2bytes(
-        {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello World"}}
-    )
-elif provider == "cohere":
-    response_body_bytes = dict2bytes({"is_finished": False, "text": "Hello World"})
-else:
-    response_body_bytes = dict2bytes(BEDROCK_PROVIDER_RESPONSE_BODY[provider])
+provider_responses = {
+    "amazon": {"outputText": "Hello World"},
+    "anthropic": {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello World"}},
+    "cohere": {"is_finished": False, "text": "Hello World"}
+}
+response_body_bytes = dict2bytes(provider_responses.get(provider, BEDROCK_PROVIDER_RESPONSE_BODY[provider]))
 
Simplify list comprehension with a more direct generator expression.

Use a dictionary comprehension to simplify the messages_to_prompt method in the
BaseBedrockProvider class.

tests/metagpt/provider/test_bedrock_api.py [27]

-return "\n".join([f"{i['role']}: {i['content']}" for i in messages])
+return "\n".join(f"{msg['role']}: {msg['content']}" for msg in messages)
 

✨ Improve tool usage guide:

Overview:
The improve tool scans the PR code changes, and automatically generates suggestions for improving the PR code. The tool can be triggered automatically every time a new PR is opened, or can be invoked manually by commenting on a PR.

  • When commenting, to edit configurations related to the improve tool (pr_code_suggestions section), use the following template:
/improve --pr_code_suggestions.some_config1=... --pr_code_suggestions.some_config2=...
[pr_code_suggestions]
some_config1=...
some_config2=...

See the improve usage page for a comprehensive guide on using this tool.

@codiumai-pr-agent-pro codiumai-pr-agent-pro bot added bug_fix and removed documentation Improvements or additions to documentation labels May 1, 2024
Copy link

PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

PR Description updated to latest commit (5d2bcbd)

Copy link

PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

PR Review

⏱️ Estimated effort to review [1-5]

4, because the PR introduces a significant amount of new functionality across multiple files, including new configurations, provider implementations, and integration with external services like AWS Bedrock. The complexity and breadth of the changes require a thorough review to ensure correctness, security, and maintainability.

🧪 Relevant tests

Yes

🔍 Possible issues

Possible Bug: The handling of dimensions in FAISSRetrieverConfig relies on the external configuration being correctly set. If not, it defaults to 1536, which might not be suitable for all models.

Configuration Dependency: The system's operation now heavily depends on the correct configuration of dimensions in embedding_config.py. Any misconfiguration could lead to runtime errors or incorrect behavior.

🔒 Security concerns

No


✨ Review tool usage guide:

Overview:
The review tool scans the PR code changes, and generates a PR review which includes several types of feedbacks, such as possible PR issues, security threats and relevant test in the PR. More feedbacks can be added by configuring the tool.

The tool can be triggered automatically every time a new PR is opened, or can be invoked manually by commenting on any PR.

  • When commenting, to edit configurations related to the review tool (pr_reviewer section), use the following template:
/review --pr_reviewer.some_config1=... --pr_reviewer.some_config2=...
[pr_reviewer]
some_config1=...
some_config2=...

See the review usage page for a comprehensive guide on using this tool.

Copy link

PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

PR Code Suggestions

CategorySuggestions                                                                                                                                                       
Maintainability
Improve naming consistency and clarity in dictionary keys.

Consider using a more descriptive and consistent naming convention for the dictionary keys
to avoid confusion and potential errors in future code maintenance or expansion.

metagpt/utils/token_counter.py [204-246]

 BEDROCK_TOKEN_COSTS = {
-    "amazon.titan-tg1-large": {"prompt": 0.0008, "completion": 0.0008},
+    "amazon_titan_tg1_large": {"prompt": 0.0008, "completion": 0.0008},
     ...
-    "ai21.j2-ultra-v1": {"prompt": 0.0188, "completion": 0.0188},
+    "ai21_j2_ultra_v1": {"prompt": 0.0188, "completion": 0.0188},
 }
 
Use a constant for default token values to enhance maintainability.

Replace the hardcoded token values with a configurable setting or constant to facilitate
easier adjustments and maintainability.

metagpt/provider/bedrock/utils.py [4-10]

-NOT_SUUPORT_STREAM_MODELS = {
-    "ai21.j2-grande-instruct": 8000,
+DEFAULT_MAX_TOKENS = 8000
+NOT_SUPPORT_STREAM_MODELS = {
+    "ai21.j2-grande-instruct": DEFAULT_MAX_TOKENS,
     ...
-    "ai21.j2-ultra-v1": 8000,
+    "ai21.j2-ultra-v1": DEFAULT_MAX_TOKENS,
 }
 
Refactor hardcoded conditions into a scalable function.

Replace the hardcoded provider names in mock_invoke_model_stream with a more scalable
approach using a dictionary or a function to handle different providers, enhancing
maintainability and scalability.

tests/metagpt/provider/test_bedrock_api.py [38-47]

-if provider == "amazon":
-    response_body_bytes = dict2bytes({"outputText": "Hello World"})
-elif provider == "anthropic":
-    response_body_bytes = dict2bytes(
-        {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello World"}}
-    )
-elif provider == "cohere":
-    response_body_bytes = dict2bytes({"is_finished": False, "text": "Hello World"})
-else:
-    response_body_bytes = dict2bytes(BEDROCK_PROVIDER_RESPONSE_BODY[provider])
+response_body_bytes = dict2bytes(get_provider_response(provider))
 
Enhancement
Correct spelling and improve readability in dictionary name.

Refactor the NOT_SUUPORT_STREAM_MODELS dictionary name to correct the spelling error and
improve readability.

metagpt/provider/bedrock/utils.py [4-10]

-NOT_SUUPORT_STREAM_MODELS = {
+NOT_SUPPORT_STREAM_MODELS = {
     "ai21.j2-grande-instruct": 8000,
     ...
     "ai21.j2-ultra-v1": 8000,
 }
 
Add error handling to the model invocation method to improve robustness.

Implement error handling for the invoke_model method to manage exceptions that may occur
during the API call.

metagpt/provider/bedrock_api.py [67-72]

 def invoke_model(self, request_body: str) -> dict:
-    response = self.__client.invoke_model(modelId=self.config.model, body=request_body)
-    usage = self._get_usage(response)
-    self._update_costs(usage, self.config.model)
-    response_body = self._get_response_body(response)
-    return response_body
+    try:
+        response = self.__client.invoke_model(modelId=self.config.model, body=request_body)
+        usage = self._get_usage(response)
+        self._update_costs(usage, self.config.model)
+        response_body = self._get_response_body(response)
+        return response_body
+    except Exception as e:
+        logger.error(f"Failed to invoke model: {e}")
+        return {}
 
Security
Enhance security by avoiding hardcoded AWS credentials.

Use a more secure method to handle AWS credentials instead of hardcoding them in the
client initialization method.

metagpt/provider/bedrock_api.py [29-39]

 def __init_client(self, service_name: Literal["bedrock-runtime", "bedrock"]):
-    self.__credentital_kwargs = {
-        "aws_secret_access_key": self.config.secret_key,
-        "aws_access_key_id": self.config.access_key,
-        "region_name": self.config.region_name,
-    }
-    session = boto3.Session(**self.__credentital_kwargs)
-    client = session.client(service_name)
+    session = boto3.Session()
+    client = session.client(service_name, region_name=self.config.region_name)
     return client
 
Bug
Correct a typo in a variable name.

Correct the typo in the variable name NOT_SUUPORT_STREAM_MODELS to
NOT_SUPPORT_STREAM_MODELS to maintain consistency and avoid potential bugs due to
misspellings.

tests/metagpt/provider/test_bedrock_api.py [6]

-NOT_SUUPORT_STREAM_MODELS
+NOT_SUPPORT_STREAM_MODELS
 
Add checks for dictionary keys before method calls to prevent runtime errors.

Ensure that the usage dictionary keys used in _update_costs method calls are defined as
parameters in the method's signature to avoid runtime errors if the keys are not present.

tests/metagpt/provider/test_bedrock_api.py [27-50]

-self._update_costs(usage, self.config.model)
+if 'prompt_tokens' in usage and 'completion_tokens' in usage:
+    self._update_costs(usage, self.config.model)
+else:
+    logger.error("Required keys are missing in the 'usage' dictionary")
+    raise KeyError("Required keys are missing in the 'usage' dictionary")
 
Error handling
Add specific exception handling for JSON operations.

Use a more specific exception handling for the json.dumps and json.loads operations in
mock_invoke_model_stream and get_choice_text_from_stream to handle potential JSON
serialization and parsing errors.

tests/metagpt/provider/test_bedrock_api.py [34-22]

-return json.dumps(x).encode("utf-8")
-rsp_dict = json.loads(event["chunk"]["bytes"])
+try:
+    return json.dumps(x).encode("utf-8")
+except json.JSONEncodeError as e:
+    logger.error(f"Failed to encode JSON: {e}")
+    raise
 
+try:
+    rsp_dict = json.loads(event["chunk"]["bytes"])
+except json.JSONDecodeError as e:
+    logger.error(f"Failed to decode JSON: {e}")
+    raise
+
Performance
Optimize the is_subset function for better performance.

Replace the is_subset function with a more efficient implementation using set operations
to improve performance.

tests/metagpt/provider/test_bedrock_api.py [69-77]

-for key, value in subset.items():
-    if key not in superset:
+subset_keys = set(subset.keys())
+superset_keys = set(superset.keys())
+if not subset_keys <= superset_keys:
+    return False
+for key in subset_keys:
+    if isinstance(subset[key], dict) and isinstance(superset[key], dict):
+        if not is_subset(subset[key], superset[key]):
+            return False
+    elif subset[key] != superset[key]:
         return False
-    if isinstance(value, dict):
-        if not isinstance(superset[key], dict):
-            return False
-        if not is_subset(value, superset[key]):
-            return False
 return True
 

✨ Improve tool usage guide:

Overview:
The improve tool scans the PR code changes, and automatically generates suggestions for improving the PR code. The tool can be triggered automatically every time a new PR is opened, or can be invoked manually by commenting on a PR.

  • When commenting, to edit configurations related to the improve tool (pr_code_suggestions section), use the following template:
/improve --pr_code_suggestions.some_config1=... --pr_code_suggestions.some_config2=...
[pr_code_suggestions]
some_config1=...
some_config2=...

See the improve usage page for a comprehensive guide on using this tool.

@codecov-commenter
Copy link

codecov-commenter commented May 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.20%. Comparing base (f201b2f) to head (553702f).

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1241      +/-   ##
==========================================
- Coverage   70.22%   70.20%   -0.03%     
==========================================
  Files         316      316              
  Lines       18860    18864       +4     
==========================================
- Hits        13245    13244       -1     
- Misses       5615     5620       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

config/config2.example.yaml Outdated Show resolved Hide resolved
metagpt/rag/schema.py Outdated Show resolved Hide resolved
metagpt/rag/schema.py Outdated Show resolved Hide resolved
self.dimensions = config.embedding.dimensions or self._embedding_type_to_dimensions.get(
config.embedding.api_type, 1536
)
if config.embedding.api_type not in self._embedding_type_to_dimensions:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not config.embedding.dimensions and ...

@seehi
Copy link
Contributor

seehi commented May 15, 2024

LGTM

@geekan geekan merged commit 77c74fb into geekan:main May 15, 2024
0 of 2 checks passed
@usamimeri usamimeri deleted the debug branch May 17, 2024 09:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants