Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpmsg: notify the user when the remote address is received #246

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

xiaoxiang781216
Copy link
Collaborator

without this notificaiton, user has to call is_rpmsg_ept_ready in a busy loop

@xiaoxiang781216
Copy link
Collaborator Author

The alternative method to fix the issue reported here: #238

@arnopo
Copy link
Collaborator

arnopo commented Dec 1, 2020

Hi @xiaoxiang781216,

Seems that the callback can be called only in a particular case. Please could you detail your sequence that results in the call of this callback?
as rpmsg_send_ns_message is called only if the destination address is set to RPMSG_ADDR_ANY, seems that one of the processor has to know the remote ept address.

@xiaoxiang781216
Copy link
Collaborator Author

Hi @xiaoxiang781216,

Seems that the callback can be called only in a particular case. Please could you detail your sequence that results in the call of this callback?
as rpmsg_send_ns_message is called only if the destination address is set to RPMSG_ADDR_ANY, seems that one of the processor has to know the remote ept address.

Here is the typical sequence:

  1. Local call rpmsg_create_ept(src=RPMSG_ADDR_ANY, dest=RPMSG_ADDR_ANY) at the start up
  2. Remote call rpmsg_create_ept(src=RPMSG_ADDR_ANY, dest=RPMSG_ADDR_ANY) at the start up
  3. Both side receive RPMSG_NS_CREATE message, but can find the local ept by the channel name
		if (!_ept) {
			/*
			 * send callback to application, that can
			 * - create the associated endpoints.
			 * - store information for future use.
			 * - just ignore the request as service not supported.
			 */
			metal_mutex_release(&rdev->lock);
			if (rdev->ns_bind_cb)
				rdev->ns_bind_cb(rdev, name, dest);
		} else {
			_ept->dest_addr = dest;
			metal_mutex_release(&rdev->lock);
		}

In this case, ns_bind_cb in the above code(if statement) get skipped, so we have to notify the user in the else statement by the new ns_bound_cb callback.

@arnopo
Copy link
Collaborator

arnopo commented Dec 1, 2020

Thanks for the use case clarification
This PR seems a complement or a workaround to #160.
It solves part of the requirement, but it seems that it does not work in all cases. It is therefore difficult to trust the callback, as it will be called based on remote behavior.

For me your point is valid, but as previously suggested i would prefer that we find a generic solution common with Linux implementation, that would perhaps extend the RPMsg protocol.

From my point of view, the best way to proceed is to send an RFC or a patch to the Linux remoteproc mailing list to propose an extension of the protocol to support handshaking. This will trigger a discussion on a global solution.
If no common solution is found, we will try to find one for OpenAMP-OpenAMP communication.
As an entry point, I suggest taking a look at this PR that has been merged on the maintainer branch, so will probably be part of the 5.11 or 5.12 kernel. This series introduces a generic rpmsg_ns that could be used to extend the Rpmsg protocol with a new API to allow RPMsg services to send an asynchronous NS_ACK.

@xiaoxiang781216
Copy link
Collaborator Author

Thanks for the use case clarification
This PR seems a complement or a workaround to #160.

No, this PR isn't related to #160. #160 provide a mechanism to exchange the address automatically. As we disscuss in that PR, the server can send a dummy message to pass the address back to client manually too. But the client need some way to get the notification when the server address is known by either method, otherwise the client has to check the dest_addr in a busy loop like #238.

It solves part of the requirement, but it seems that it does not work in all cases. It is therefore difficult to trust the callback, as it will be called based on remote behavior.

No, it work in all cases. The key point is that OpenAMP need provide a better method to notify the peer address is ready if he/she create an endpoint by rpmsg_create_ept(dest=RPMSG_ADDR_ANY). Otherwise, user has to poll the dest_addr again and again.

For me your point is valid, but as previously suggested i would prefer that we find a generic solution common with Linux implementation, that would perhaps extend the RPMsg protocol.

This PR doesn't change the protocol between Linux and MCU like #160 and #155. It's an implementation detail that how to notify the user when the server address is available, so we don't need(shouldn't) sync so detailed info with Linux.

From my point of view, the best way to proceed is to send an RFC or a patch to the Linux remoteproc mailing list to propose an extension of the protocol to support handshaking. This will trigger a discussion on a global solution.
If no common solution is found, we will try to find one for OpenAMP-OpenAMP communication.
As an entry point, I suggest taking a look at this PR that has been merged on the maintainer branch, so will probably be part of the 5.11 or 5.12 kernel. This series introduces a generic rpmsg_ns that could be used to extend the Rpmsg protocol with a new API to allow RPMsg services to send an asynchronous NS_ACK.

Yes, #155 and #160 need collaborate with Linux, but this PR doesn't need.

@arnopo
Copy link
Collaborator

arnopo commented Dec 7, 2020

Thanks for the use case clarification
This PR seems a complement or a workaround to #160.

No, this PR isn't related to #160. #160 provide a mechanism to exchange the address automatically. As we disscuss in that PR, the server can send a dummy message to pass the address back to client manually too. But the client need some way to get the notification when the server address is known by either method, otherwise the client has to check the dest_addr in a busy loop like #238.

It solves part of the requirement, but it seems that it does not work in all cases. It is therefore difficult to trust the callback, as it will be called based on remote behavior.

No, it work in all cases. The key point is that OpenAMP need provide a better method to notify the peer address is ready if he/she create an endpoint by rpmsg_create_ept(dest=RPMSG_ADDR_ANY). Otherwise, user has to poll the dest_addr again and again.
This is my point. how to determine the method to use, callback or polling? no generic way as it depends on the remote behavior...
So how to implement something generic in projects such as Zephyr or NuttX, as the implementation would work only if the firmware running on the other processor send the RPMSG_NS_CREATE?

For me your point is valid, but as previously suggested i would prefer that we find a generic solution common with Linux implementation, that would perhaps extend the RPMsg protocol.

This PR doesn't change the protocol between Linux and MCU like #160 and #155. It's an implementation detail that how to notify the user when the server address is available, so we don't need(shouldn't) sync so detailed info with Linux.

Both seems linked as the #160 is how to notify the remote processor when available. This PR can be considered as an extension to propagate to the user. As the way to address the #160 could change the way to implement this. We can not 100% sure that this update would be still relevant after the #160 feature is implemented.I think here it would be more reasonable to take a step-by-step approach, first address the #160 in Linux, then address this PR if still relevant, With the objective of having a generic solution that would work in a maximum of use case.

From my point of view, the best way to proceed is to send an RFC or a patch to the Linux remoteproc mailing list to propose an extension of the protocol to support handshaking. This will trigger a discussion on a global solution.
If no common solution is found, we will try to find one for OpenAMP-OpenAMP communication.
As an entry point, I suggest taking a look at this PR that has been merged on the maintainer branch, so will probably be part of the 5.11 or 5.12 kernel. This series introduces a generic rpmsg_ns that could be used to extend the Rpmsg protocol with a new API to allow RPMsg services to send an asynchronous NS_ACK.

Yes, #155 and #160 need collaborate with Linux, but this PR doesn't need.

@xiaoxiang781216
Copy link
Collaborator Author

xiaoxiang781216 commented Dec 7, 2020

Even #160 is implemented, the address exchange is still an asynchronous process. One side who issue the first rpmsg_create_ept(and then has to pass dest=RPMSG_ADDR_ANY), need a way to know that the destination address available(either by a dummy message or #160 approach). one method is checking dest_addr in a busy loop(bad), another is call a function provided by user(better than first).

Copy link
Contributor

@edmooring edmooring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is probably a good approach. I'd like to see a bit more documentation as to the intent of the callback, and an update to one of the examples to show how to use it.

lib/rpmsg/rpmsg_virtio.c Outdated Show resolved Hide resolved
@xiaoxiang781216
Copy link
Collaborator Author

@arnopo and @edmooring we need a method to know when the dest address is got:

  1. The busy loop like rpmsg: wait endpoint ready in rpmsg_send #238
  2. The notification like rpmsg: notify the user when the remote address is received #246

Either method is workable, but we have to select one. What are you thinking?

@arnopo
Copy link
Collaborator

arnopo commented Jan 6, 2021

@arnopo and @edmooring we need a method to know when the dest address is got:

  1. The busy loop like rpmsg: wait endpoint ready in rpmsg_send #238
  2. The notification like rpmsg: notify the user when the remote address is received #246

Either method is workable, but we have to select one. What are you thinking?

Hi @xiaoxiang781216,
i think I already shared my view in my previous comment (#246 (comment)). This topic can depend on how is addressed the topic linked to #160.
Waiting a generic solution for #160, the is_rpmsg_ept_ready API can been used by application to know when the "dest" address is got.
I recognize that it's not perfect, but it's better, from my perspective, than implementing a solution that we may have to revisit because we haven't dealt with the issues in the right order.

Anyway let's @edmooring shares its point of view

@edmooring
Copy link
Contributor

@arnopo and @edmooring we need a method to know when the dest address is got:

  1. The busy loop like rpmsg: wait endpoint ready in rpmsg_send #238
  2. The notification like rpmsg: notify the user when the remote address is received #246

Either method is workable, but we have to select one. What are you thinking?

Hi @xiaoxiang781216,
i think I already shared my view in my previous comment (#246 (comment)). This topic can depend on how is addressed the topic linked to #160.
Waiting a generic solution for #160, the is_rpmsg_ept_ready API can been used by application to know when the "dest" address is got.
I recognize that it's not perfect, but it's better, from my perspective, than implementing a solution that we may have to revisit because we haven't dealt with the issues in the right order.

Anyway let's @edmooring shares its point of view

Hi all,
I'm in agreement with Arnaud. I'd like to see #160 move forward. It solves an important problem for both Linux and the library.

@xiaoxiang781216
Copy link
Collaborator Author

Ok, let's wait #160

@xiaoxiang781216
Copy link
Collaborator Author

xiaoxiang781216 commented May 10, 2022

Update the document like #376 . @arnopo like we discussed in #376, bind notification is also useful in some case:
https://github.com/apache/incubator-nuttx/blob/master/net/rpmsg/rpmsg_sockif.c#L450
https://github.com/apache/incubator-nuttx/blob/master/net/rpmsg/rpmsg_sockif.c#L420-L429
You can see rpmsg socket utilize this to sync the peer window size as soon as possible, which is important to improve the initial handshake performance.

@arnopo
Copy link
Collaborator

arnopo commented May 12, 2022

Update the document like #376 . @arnopo like we discussed in #376, bind notification is also useful in some case: https://github.com/apache/incubator-nuttx/blob/master/net/rpmsg/rpmsg_sockif.c#L450 https://github.com/apache/incubator-nuttx/blob/master/net/rpmsg/rpmsg_sockif.c#L420-L429 You can see rpmsg socket utilize this to sync the peer window size as soon as possible, which is important to improve the initial handshake performance.

Agree on the fact that something is needed for the handshake. The point here is that the solution needs to be compliant with Linux implementation.
FYI On Linux side there is a patchset from Qualcomm around the flow control signaling. I 'm currently implementing the service for the rpmsg virtio based on Software flow control. This could be an alternative to address the handshake, in addition of the availability to pause/resume endpoints communication.

  • POC Linux code:

https://github.com/arnopo/linux/commits/signalling

  • openamp library associated code:

https://github.com/arnopo/open-amp/commits/flow_ctrl

  • overview presentation

https://drive.google.com/file/d/1CLU3ybI3oSBGvor18AQ-HOzOJ2nOppEb/view

@xiaoxiang781216
Copy link
Collaborator Author

xiaoxiang781216 commented May 12, 2022

Update the document like #376 . @arnopo like we discussed in #376, bind notification is also useful in some case: https://github.com/apache/incubator-nuttx/blob/master/net/rpmsg/rpmsg_sockif.c#L450 https://github.com/apache/incubator-nuttx/blob/master/net/rpmsg/rpmsg_sockif.c#L420-L429 You can see rpmsg socket utilize this to sync the peer window size as soon as possible, which is important to improve the initial handshake performance.

Agree on the fact that something is needed for the handshake. The point here is that the solution needs to be compliant with Linux implementation. FYI On Linux side there is a patchset from Qualcomm around the flow control signaling. I 'm currently implementing the service for the rpmsg virtio based on Software flow control. This could be an alternative to address the handshake, in addition of the availability to pause/resume endpoints communication.

But the endpoint setup is part of the life time cycle management, which is different from follow control. It isn't good to mix them. If Linux side support to create rpmsg device dynamically(instead of waiting the remote creation), the mapping still probe callback:

  1. Create rpmsg_device and send RPMSG_NS_CREATE to remote
  2. Remote create rpmsg_endpoint and send back RPMSG_NS_CREATE_ACK
  3. rpmsg bus driver add the rpmsg_device created in step 1 to the bus
  4. Call rpmsg driver probe callback

You can see the probe callback is delayed until the handshake finish, it's wrong to:

  1. Call probe rpmsg driver in the step 1
  2. And provide a new callback to notify the driver

Since OpenAMP doesn't follow Linux driver model(bus/device/driver), the user may hold a rpmsg_endpoint which isn't capable to communicate with the remote, he has to:

  1. Get the ready notification from bind callback(PR rpmsg: notify the user when the remote address is received #246), equals to probe in Linux world
  2. Call rpmsg_try_send in some custom poll loop
  3. Check the status in rpmsg_send(PR rpmsg: wait endpoint ready in rpmsg_send #238)

All approach is valuable and can be used in the different case. The last two approach isn't needed in Linux since probe mechanism ensure them never happen.

  • POC Linux code:

https://github.com/arnopo/linux/commits/signalling

  • openamp library associated code:

https://github.com/arnopo/open-amp/commits/flow_ctrl

It's two different topic, please don't mix them. This just like nobody consider to replace driver probe callback with the new flow_control in Linux kernel.

  • overview presentation

https://drive.google.com/file/d/1CLU3ybI3oSBGvor18AQ-HOzOJ2nOppEb/view

Several questions:

  1. How to sned/recv FC message reliably with the shared virtio buffer? The normal rpmsg driver may exhaust the rpmsg buffer and then block FC message for a long time. We have saw the similar situation in the pressure test.
  2. What's benefit send XON to FC endpoint in NS announcement? Why the life cycle management need depend on the flow control?
  3. If the remote crash, the remoteproc driver need reload the firmware and start the remote again, why need inform remote through FC endpoint?

@arnopo
Copy link
Collaborator

arnopo commented May 12, 2022

Update the document like #376 . @arnopo like we discussed in #376, bind notification is also useful in some case: https://github.com/apache/incubator-nuttx/blob/master/net/rpmsg/rpmsg_sockif.c#L450 https://github.com/apache/incubator-nuttx/blob/master/net/rpmsg/rpmsg_sockif.c#L420-L429 You can see rpmsg socket utilize this to sync the peer window size as soon as possible, which is important to improve the initial handshake performance.

Agree on the fact that something is needed for the handshake. The point here is that the solution needs to be compliant with Linux implementation. FYI On Linux side there is a patchset from Qualcomm around the flow control signaling. I 'm currently implementing the service for the rpmsg virtio based on Software flow control. This could be an alternative to address the handshake, in addition of the availability to pause/resume endpoints communication.

But the endpoint setup is part of the life time cycle management, which is different from follow control. It isn't good to mix them. If Linux side support to create rpmsg device dynamically(instead of waiting the remote creation), the mapping still probe callback:

Linux supports that and since 5.18 (https://elixir.bootlin.com/linux/v5.18-rc1/source/drivers/rpmsg/rpmsg_ctrl.c#L98)
it is also possible to create the device from the user space.

  1. Create rpmsg_device and send RPMSG_NS_CREATE to remote
  2. Remote create rpmsg_endpoint and send back RPMSG_NS_CREATE_ACK

This part is missing in Linux and as it implies an evolution of the virtio rpmsg features. So If you want to promote this solution You need to send to the linux mailing list the associated patches

  1. rpmsg bus driver add the rpmsg_device created in step 1 to the bus
  2. Call rpmsg driver probe callback

The step 3. and 4. is mixed with step 1 when Linux creates the rpmsg device. even if RPMSG_NS_CREATE_ACK is implemented in Linux, the legacy behavior has to be supported.

You can see the probe callback is delayed until the handshake finish, it's wrong to:

  1. Call probe rpmsg driver in the step 1
  2. Add provide a new callback to notify the driver

This is the way it is implemented today in Linux. One advantage of this solution is that you announce only if your driver probe does not fail.

Since OpenAMP doesn't follow Linux driver model(bus/device/driver), the user may hold a rpmsg_endpoint which isn't capable to communicate with the remote, he has to:

  1. Provide the bind callback to notify user the endpoint is ready(PR rpmsg: notify the user when the remote address is received #246)
  2. Call rpmsg_try_send in some custom poll loop
  3. Check the status in rpmsg_send(PR rpmsg: wait endpoint ready in rpmsg_send #238)

To be able to discuss these points we need first to move forward on the the endpoint bindings

All approach is valuable and can be used in the different case.

[...]

It's two different topic, please don't mix them. This just like nobody consider to replace driver probe callback with the new flow_control in Linux kernel.

When the ns announcement or ns annoucement ack is called, the driver is already probed in Linux as the endpoint is already created for OpenAMP .

  • The RPMSG_NS_CREATE_ACK provides the endpoint address to the remote side to bind the service
  • The flow control informs the remote side that the endpoint is ready to communicate, but provide also the address. So by default could be used for the bind the endpoints.

I would prefer to have both that's why my #246 (comment) was only to inform you about flow control work. As I already mentioned several times, if you want that the RPMSG_NS_CREATE_ACK mechanism be taken into account, an implementation has to be sent on Linux mailing list.
Be aware about the risk that If the flow control gives an alternative to bind endpoints then you approach could not be accepted on Linux side.

@xiaoxiang781216
Copy link
Collaborator Author

xiaoxiang781216 commented May 16, 2022

But the endpoint setup is part of the life time cycle management, which is different from follow control. It isn't good to mix them. If Linux side support to create rpmsg device dynamically(instead of waiting the remote creation), the mapping still probe callback:

Linux supports that and since 5.18 (https://elixir.bootlin.com/linux/v5.18-rc1/source/drivers/rpmsg/rpmsg_ctrl.c#L98) it is also possible to create the device from the user space.

Thanks for pointing out.

  1. Create rpmsg_device and send RPMSG_NS_CREATE to remote
  2. Remote create rpmsg_endpoint and send back RPMSG_NS_CREATE_ACK

This part is missing in Linux and as it implies an evolution of the virtio rpmsg features. So If you want to promote this solution You need to send to the linux mailing list the associated patches

It is seldom needed to create the device proactively from Linux side from my experience:).

  1. rpmsg bus driver add the rpmsg_device created in step 1 to the bus
  2. Call rpmsg driver probe callback

The step 3. and 4. is mixed with step 1 when Linux creates the rpmsg device. even if RPMSG_NS_CREATE_ACK is implemented in Linux, the legacy behavior has to be supported.

Yes, if the peer doesn't support RPMSG_NS_CREATE_ACK, we have to make rpmsg_device ready immediately and then rpmsg driver has to wait the peer send the message first before it can start the talking.

You can see the probe callback is delayed until the handshake finish, it's wrong to:

  1. Call probe rpmsg driver in the step 1
  2. Add provide a new callback to notify the driver

This is the way it is implemented today in Linux. One advantage of this solution is that you announce only if your driver probe does not fail.

Yes, this is key point: rpmsg_create_ept return rpmsg_endpoint, but it can't work until the peer send RPMSG_NS_CREATE(or RPMSG_NS_CREATE_ACK), this patch provide the callback to notify the endpoint is ready to communicate just like rpmsg_driver:probe callback.

BTW, it's very useful even without RPMSG_NS_CREATE_ACK. For example, we have a SoC with four MCU/DSP run OpenAMP and want to setup a mesh network. Here is how we do:

  1. Each CPU call rpmsg_create_ept proactively on every remoteproc
  2. When OpenAMP receive RPMSG_NS_CREATE, call ept's ns_bind_cb

At this point, the service know it could talk to the peer.

Since OpenAMP doesn't follow Linux driver model(bus/device/driver), the user may hold a rpmsg_endpoint which isn't capable to communicate with the remote, he has to:

  1. Provide the bind callback to notify user the endpoint is ready(PR rpmsg: notify the user when the remote address is received #246)
  2. Call rpmsg_try_send in some custom poll loop
  3. Check the status in rpmsg_send(PR rpmsg: wait endpoint ready in rpmsg_send #238)

To be able to discuss these points we need first to move forward on the the endpoint bindings

All approach is valuable and can be used in the different case.

[...]

It's two different topic, please don't mix them. This just like nobody consider to replace driver probe callback with the new flow_control in Linux kernel.

When the ns announcement or ns annoucement ack is called, the driver is already probed in Linux as the endpoint is already created for OpenAMP .

  • The RPMSG_NS_CREATE_ACK provides the endpoint address to the remote side to bind the service
  • The flow control informs the remote side that the endpoint is ready to communicate, but provide also the address. So by default could be used for the bind the endpoints.

One question: why need a dedicated endpoint for the flow control? The flow controol is very protocol specific and better to integrate with the rpmsg driver self. For example, we have create a new socket family on top of rpmsg API, so the user could talk to peer with the standard socket API just like Unix Domain socket. To control the flow, we use a slide window internally which is very efficient since the window info insert into the normal data flow instead.

If your target is rpmsg_tty driver, it's better to build the flow control into rpmsg_tty protocol self. Yes, we also build a rpmsg tty driver which use a different but simple flow control than rpmsg socket.

I would prefer to have both that's why my #246 (comment) was only to inform you about flow control work. As I already mentioned several times, if you want that the RPMSG_NS_CREATE_ACK mechanism be taken into account, an implementation has to be sent on Linux mailing list.

Since I don't actively develop Linux kernel for a while, it's very hard to prepare the new patch without the test.

Be aware about the risk that If the flow control gives an alternative to bind endpoints then you approach could not be accepted on Linux side.

It's fine to use either method since the result is same: we can get the bind/probe callback.

Let's come back to this patch. I hope the mesh network example give you an example why ns_bind_cb is also useful even without RPMSG_NS_CREATE_ACK.

@github-actions
Copy link

This pull request has been marked as a stale pull request because it has been open (more than) 45 days with no activity.

@github-actions github-actions bot added the Stale label Oct 24, 2023
without this notificaiton, user has to call is_rpmsg_ept_ready in a busy loop

Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants