Greetings from the WasmEdge runtime maintenance team,
The WASI-NN API currently supports the OpenVINO, PyTorch, and TensorFlow-Lite backends. However, the design of the `set-input` and `get-output` APIs is not in sync with standard TensorFlow C API usage.
TensorFlow C API Inference
The TensorFlow backend, like the others, employs the `TF_SessionRun` API for computation. However, in contrast to OpenVINO and PyTorch, this API requires the output tensors to be specified at execution time.
Whereas in PyTorch the sequence would look like this:
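The original code snippets here were lost in formatting. The following is a hedged reconstruction of the intended call sequences (function names follow the wasi-nn API; the tensor indexes and the compute-time comments are our reading of the issue, not the original snippets):

```text
# PyTorch-style sequence (index-based; outputs fetched after compute):
graph = load(model_bytes, encoding, target)
ctx   = init_execution_context(graph)
set_input(ctx, 0, input_tensor)
compute(ctx)                     # inference runs here
get_output(ctx, 0, out_buffer)   # copies an already-computed result

# In TensorFlow, the output tensors must already be known when the
# session runs, because TF_SessionRun takes inputs AND outputs:
graph = load(model_bytes, encoding, target)
ctx   = init_execution_context(graph)
set_input(ctx, 0, input_tensor)
# outputs must be declared before execution
compute(ctx)                     # TF_SessionRun(inputs, outputs) here
```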
Of course, the original invocation sequence works. At the implementation level, however, it causes repeated computation, since `TF_SessionRun` would have to be invoked during the `compute` phase.
Additionally, unlike OpenVINO, PyTorch, and TensorFlow-Lite, which support index-based input/output tensor selection, TensorFlow only offers the `TF_GraphOperationByName` API for obtaining input and output operations. Hence, the sequence in TensorFlow would need to use names rather than indexes:
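The snippet that followed was lost; as an illustration (the operation names below, such as `serving_default_input`, are hypothetical examples, not taken from the original), the contrast between the two styles might look like:

```text
# Index-based selection (OpenVINO / PyTorch / TensorFlow-Lite):
set_input(ctx, 0, tensor)
compute(ctx)
get_output(ctx, 0, buffer)

# Name-based selection (TensorFlow; each name is resolved internally
# via TF_GraphOperationByName):
set_input(ctx, "serving_default_input:0", tensor)
compute(ctx)
get_output(ctx, "StatefulPartitionedCall:0", buffer)
```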
Proposed Specification Changes

To incorporate the TensorFlow backend and balance developer/user experience with performance, we suggest considering the following functions:
set_input_by_name
Parameters:
ctx: handle
name: string
tensor: tensor-data
Expected result:
expected<(), error>
set_output_by_name
Parameters:
ctx: handle
name: string
tensor: tensor-data buffer for receiving output
Expected result:
expected<(), error>
unload
A function is required to release loaded resources. We have a FaaS use case in which loaded models are registered and de-registered.
Parameters:
graph: handle
Expected result:
expected<(), error>
finalize_execution_context
A function is needed to release execution contexts.
Parameters:
ctx: handle
Expected result:
expected<(), error>
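Putting the four proposed functions together, a sketch of the intended usage flow (the ordering and the placeholder names are our assumption, not part of the proposal text):

```text
graph = load(model_bytes, encoding, target)
ctx   = init_execution_context(graph)

set_input_by_name(ctx, "input_name", input_tensor)
set_output_by_name(ctx, "output_name", out_buffer)  # declared before compute
compute(ctx)            # single TF_SessionRun; outputs already known

finalize_execution_context(ctx)  # release the execution context
unload(graph)                    # de-register the model (FaaS scenario)
```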
Final Thoughts
While the existing WASI-NN API supports the backends mentioned above, we encountered challenges when implementing the TensorFlow backend, which is why we are advocating for this feature. We would appreciate any suggestions for refining these APIs, as our familiarity with TensorFlow may not be comprehensive. Thank you!
It is unfortunate that the TF API does not match the wasi-nn API here, but there is a workaround. @brianjjones, in his PR to add TF to Wasmtime, maps a given index to the sorted index of the key names (see here). In other words, for keys {a, b, c} one would use 0 = a, 1 = b, 2 = c. This is not ideal (the user has to be aware of this mapping), but it is better than the alternative: adding set_input_by_name and set_output_by_name would not make sense for other frameworks.
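As a hedged illustration of that workaround (a minimal sketch, not the actual PR code): sort the graph's operation names alphabetically and treat a wasi-nn index as a position in the sorted list.

```rust
/// Map a wasi-nn tensor index onto the alphabetically sorted list of a
/// graph's operation names, so that index 1 addresses the second name in
/// sorted order. Illustrative only; a real backend would then resolve
/// the chosen name with TF_GraphOperationByName.
fn name_for_index(mut names: Vec<String>, index: usize) -> Option<String> {
    names.sort();
    names.into_iter().nth(index)
}

fn main() {
    // For keys {a, b, c}: 0 = "a", 1 = "b", 2 = "c".
    let keys = vec!["c".to_string(), "a".to_string(), "b".to_string()];
    assert_eq!(name_for_index(keys.clone(), 0), Some("a".to_string()));
    assert_eq!(name_for_index(keys, 2), Some("c".to_string()));
    println!("index mapping ok");
}
```

The obvious drawback, as noted above, is that the user must know the sorted order of the names to pick the right index.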
> However, in contrast to OpenVINO and PyTorch, this API requires the specification of output tensors during execution.
I think this is also unfortunate, but this may be an artifact of the TF C API that can also be avoided. In @brianjjones' PR, the TensorFlow backend has to maintain more state between the wasi-nn API calls (see here) but is able to avoid recomputation. Take a look at that unfortunate "dance" and see what you think.
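For readers without the PR open, a hedged sketch of what that state-keeping "dance" amounts to (our paraphrase of the approach, not the PR's actual code): buffer the inputs, run the session once, and serve outputs from a cache.

```text
set_input(ctx, index, tensor):
    ctx.inputs[index] = tensor          # buffer only; nothing runs yet

compute(ctx):
    # one TF_SessionRun over all buffered inputs and known outputs
    ctx.results = TF_SessionRun(ctx.inputs, ctx.outputs)

get_output(ctx, index, buffer):
    copy(ctx.results[index], buffer)    # served from cache; no re-run
```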
> `unload` ... `finalize_execution_context`
It is not clear to me how these proposed methods differ. My impression was that resource deallocation would happen automatically as a part of the switch to WIT so I have just been waiting for that to happen instead of explicitly specifying those functions. It's been a while, but I think this issue answers that question: #22.