-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement /v3/chat/completions endpoint - forward to MP graph #2436
Conversation
a639178
to
17e5a86
Compare
+ if (!result) { | ||
+ NET_LOG(ERROR, "Failed to EventLoopSchedule PartialReplyWithStatus()"); | ||
+ Abort(); | ||
+ // TODO(wenboz): should have a forced abort that doesn't write back anything |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this comes from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mimicking existing parts of the code in net_http module. There are non-partial reply methods with similar comments: https://github.com/tensorflow/serving/blob/83c949cd1c997ada425ac2e30510884f16f62b62/tensorflow_serving/util/net_http/server/internal/evhttp_request.cc#L362
I don't think we should care to remove it, let me know if you think otherwise @atobiszei
|
||
namespace ovms { | ||
|
||
Status deserializeInputSidePacketsFromFirstRequestImpl( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's intentional to leave it empty or it will be implemented later on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now there is no known use case for side packets deserialization from requests, therefore it is enough implementation (returning ok and doing nothing)
|
||
const std::string& getRequestId( | ||
const std::string& request) { | ||
static const std::string _default{""}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this code is going to be executed multithreaded then this is not safe to use static variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is safe due to the fact we return const ref, which will never change. Is it correct? @atobiszei
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad you're right
const std::string& endpointName, | ||
const std::string& endpointVersion, | ||
stream_types_mapping_t& inputTypes) { | ||
return StatusCode::OK; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this complete implementation? Don't we need to do anything here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is complete implementation, the validation happens before reaching MP graph (in http parser) and will exist in MP calculator (for chat completion parameters)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, having this call is required because gRPC KServe path implements it...
src/http_rest_api_handler.cpp
Outdated
@@ -92,6 +103,9 @@ const std::string HttpRestApiHandler::kfs_serverliveRegexExp = | |||
const std::string HttpRestApiHandler::kfs_servermetadataRegexExp = | |||
R"(/v2)"; | |||
|
|||
const std::string HttpRestApiHandler::oai_chatCompletionRegexExp = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const std::string HttpRestApiHandler::oai_chatCompletionRegexExp = | |
const std::string HttpRestApiHandler::openai_chatCompletionRegexExp = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And what do you think about enum class member being named OAI_ChatCompletion? It is aligned with KFS_ TFS_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd go with OpenAI name all the way. It's explicit and doesn't close the path for other APIs starting with O 馃槃
9939a0d
to
159b525
Compare
馃洜 Summary
CVS-138032
Implementation of
/v3/chat/completions
endpoint and forwarding the HTTP message to MediaPipe graph.The data is std::string now, to be adjusted in following tasks (CVS-139240/CVS-140684).
馃И Checklist
Unit tests added.Minimal positive unit test added. Remaining tests will be added in next PR.The documentation updated.No documentation changes required for now.