
Move RPMSG buffer address translation out of VirtIO layer #575

Open
wants to merge 2 commits into base: main

Conversation

@glneo (Contributor) commented Mar 19, 2024

The allocation and use of the buffers used for passing RPMSG contents are handled entirely by the RPMSG layer (rpmsg_virtio.c). The VirtIO layer should only need to deal with guest-physical addresses, as that is what the standard requires to be stored inside the vring descriptors.

Move the address translation steps out of the VirtIO layer and up into the RPMSG layer. This moves that handling to where it belongs and allows for later improvements to both layers that could not be done with this unneeded interdependence.
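
As a rough illustration of the intended split (not the exact diff in this PR; the helper names below are made up, while metal_io_virt_to_phys()/metal_io_phys_to_virt() are the existing libmetal calls), the RPMSG layer keeps the shared-buffer metal_io_region and does the translation itself, so the VirtIO layer only ever sees guest-physical addresses:

```c
#include <metal/io.h>

/* TX path: rpmsg_virtio translates the buffer VA before queueing it,
 * so the virtio layer only ever stores a guest-physical address. */
static metal_phys_addr_t rpmsg_buf_to_phys(struct metal_io_region *shbuf_io,
                                           void *buf)
{
	return metal_io_virt_to_phys(shbuf_io, buf);
}

/* RX path: translate the guest-physical address taken from a used
 * descriptor back to a local VA before parsing the RPMSG header. */
static void *rpmsg_phys_to_buf(struct metal_io_region *shbuf_io,
                               metal_phys_addr_t pa)
{
	return metal_io_phys_to_virt(shbuf_io, pa);
}
```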

This should hold a pointer to a metal_io_region, so make that the type.

Also fix the comment: this region holds the address of the message
buffers, not the vring's descriptor table nor available/used rings.
It is only used for virt-to-phys/phys-to-virt translation of the
buffers pointed to by these descriptors.

This comment seems to have caused an issue in virtio_mmio_drv where
this region was used to translate the address of the vring descriptors.
That may have worked if the vring descriptors were part of the same
IO space as the buffers they point to, but this is not guaranteed
to always be the case. Fix that here.

Signed-off-by: Andrew Davis <afd@ti.com>
Have the higher-level rpmsg_virtio handle the virt-to-phys and back
conversion. The rpmsg_virtio layer already has the shared buffer memory
metal IO. This removes the need to pass the buffer region metal_io
struct down to the virtio layer, which otherwise does nothing with the
contents of the buffers.

Signed-off-by: Andrew Davis <afd@ti.com>

@arnopo (Collaborator) commented Mar 29, 2024

Could you provide more details on what you are trying to fix here? Your update would impose the PA<->VA conversion on each virtio driver/device.

I don't understand why the virtio layer cannot deal with VAs and make the conversion before storing into the vring descriptors.
Looking at Linux, it seems that scatterlists are used at the virtio API, so a similar approach based on pages could be used.

@danmilea, any opinion on this proposal?

@glneo (Contributor, Author) commented Mar 30, 2024

> Could you provide more details on what you are trying to fix here?

Sure, the general idea is cleaner separation of responsibility for each layer. More details on the "why" below.

> Your update would impose the PA<->VA conversion on each virtio driver/device.

VirtIO never deals with VAs for the messages; the virtqueue descriptors store PAs, as described in the standard. So why should the VirtIO layer convert them to VAs, or even assume that they must be converted at all?
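
For reference, this is the split-virtqueue descriptor layout from the virtio spec (field names as in open-amp's virtio_ring.h; shown here only to illustrate that the ring itself carries a guest-physical address, never a VA):

```c
#include <stdint.h>

struct vring_desc {
	uint64_t addr;   /* guest-physical address of the buffer */
	uint32_t len;    /* buffer length in bytes */
	uint16_t flags;  /* VRING_DESC_F_NEXT / _WRITE / _INDIRECT */
	uint16_t next;   /* index of the chained descriptor, if F_NEXT is set */
};
```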

Yes, it means that the users of the VirtIO layer (which is just the RPMSG layer today) will have to do that conversion if they need to work with the contents of the buffer. But some users might not need that conversion at all; they may want the PA. For instance, in the case of a proxy, they might want to take that PA and put it right back into another virtqueue descriptor, with no translation wanted.

> I don't understand why the virtio layer cannot deal with VAs and make the conversion before storing into the vring descriptors.

So here is the actual problem I am solving with this: in the general case we cannot know how to do the PA<->VA conversion. Today the assumption made by OpenAMP's VirtIO layer is that all the message buffers will come from one single IO region (shm_io). This is a bad assumption. Only when the RPMSG layer on top has the HOST role do we usually allocate all the message buffers from one IO region that we know about up front (the shm_pool). When the RPMSG layer is the REMOTE, the other side does the allocations, and they can be from anywhere. Luckily this issue can be solved thanks to the HOST side putting all the buffer locations in the descriptor queues up front when building the tables. But I have to implement that solution in two parts; this is the first: move the translation out of the VirtIO layer (done here). The second part is to dynamically build the IO regions from these pre-set-up descriptor queues when we are the REMOTE side.
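
A very rough sketch of what that second part could look like (hypothetical, not part of this PR; it assumes the buffer range is identity-mapped or already mapped, and the function name is made up):

```c
#include <stdint.h>
#include <metal/io.h>
#include <openamp/virtio_ring.h>

/* Scan the descriptors the HOST pre-populated and build a metal_io_region
 * that covers the buffer area they point to. 'physmap' must outlive 'io'. */
static int build_buf_io_from_vring(const struct vring_desc *desc,
                                   unsigned int num,
                                   metal_phys_addr_t *physmap,
                                   struct metal_io_region *io)
{
	metal_phys_addr_t lo = (metal_phys_addr_t)-1, hi = 0;
	unsigned int i;

	for (i = 0; i < num; i++) {
		if (!desc[i].len)
			continue;
		if (desc[i].addr < lo)
			lo = desc[i].addr;
		if (desc[i].addr + desc[i].len > hi)
			hi = desc[i].addr + desc[i].len;
	}
	if (lo >= hi)
		return -1;

	*physmap = lo;
	/* Flat region: page_shift of -1, no special flags, default ops */
	metal_io_init(io, (void *)(uintptr_t)lo, physmap, hi - lo,
		      (unsigned int)-1, 0, NULL);
	return 0;
}
```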

> Looking at Linux, it seems that scatterlists are used at the virtio API, so a similar approach based on pages could be used.

Scatterlists do not deal with VAs either; they deal with DMA addresses (or device physical addresses), exactly like I do here. If we want to match more closely, then the VirtIO layer would deal with offsets (pages), which is okay too; anything but VAs works here.

static inline void *virtqueue_phys_to_virt(struct virtqueue *vq,
                                           metal_phys_addr_t phys)
{
        struct metal_io_region *io = vq->shm_io;
Collaborator commented:

should we remove shm_io from virtqueue too?

Contributor (author) replied:

Yes we should, I'm just waiting for this PR to go in first to remove the last (accidental) user: #570

@arnopo (Collaborator) commented Apr 2, 2024

>> Could you provide more details on what you are trying to fix here?

> Sure, the general idea is cleaner separation of responsibility for each layer. More details on the "why" below.

>> Your update would impose the PA<->VA conversion on each virtio driver/device.

> VirtIO never deals with VAs for the messages; the virtqueue descriptors store PAs, as described in the standard. So why should the VirtIO layer convert them to VAs, or even assume that they must be converted at all?

> Yes, it means that the users of the VirtIO layer (which is just the RPMSG layer today) will have to do that conversion if they need to work with the contents of the buffer. But some users might not need that conversion at all; they may want the PA. For instance, in the case of a proxy, they might want to take that PA and put it right back into another virtqueue descriptor, with no translation wanted.

Is a metal io_region with PA == VA not sufficient? For instance, Zephyr is working with this config.

>> I don't understand why the virtio layer cannot deal with VAs and make the conversion before storing into the vring descriptors.

> So here is the actual problem I am solving with this: in the general case we cannot know how to do the PA<->VA conversion. Today the assumption made by OpenAMP's VirtIO layer is that all the message buffers will come from one single IO region (shm_io). This is a bad assumption. Only when the RPMSG layer on top has the HOST role do we usually allocate all the message buffers from one IO region that we know about up front (the shm_pool). When the RPMSG layer is the REMOTE, the other side does the allocations, and they can be from anywhere. Luckily this issue can be solved thanks to the HOST side putting all the buffer locations in the descriptor queues up front when building the tables. But I have to implement that solution in two parts; this is the first: move the translation out of the VirtIO layer (done here). The second part is to dynamically build the IO regions from these pre-set-up descriptor queues when we are the REMOTE side.

The io_region is just a structure that was created to support complex memory mapping and ensure that you work with the expected memory region. So in theory, there is no problem initializing it with the entire memory range.
Do you really need to have several shm_io?
It is not clear to me... If you dynamically build the IO regions, how do you check the validity of these regions? I suppose that you will need to check validity against local memory regions. This is what metal_io is doing; the difference is that we have to declare the memory region first.
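
For example, declaring such a region up front typically looks something like this today (the addresses and size are made up for the example; metal_io_init() is the libmetal call):

```c
#include <metal/io.h>

#define SHM_PA    0x90000000UL   /* hypothetical shared-memory base */
#define SHM_SIZE  0x100000UL     /* hypothetical size: 1 MiB */

static metal_phys_addr_t shm_physmap = SHM_PA;
static struct metal_io_region shm_io;

static void shm_io_setup(void *shm_va)
{
	/* One flat region declared up front: page_shift of -1,
	 * default flags and ops. */
	metal_io_init(&shm_io, shm_va, &shm_physmap, SHM_SIZE,
		      (unsigned int)-1, 0, NULL);
}
```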

>> Looking at Linux, it seems that scatterlists are used at the virtio API, so a similar approach based on pages could be used.

> Scatterlists do not deal with VAs either; they deal with DMA addresses (or device physical addresses), exactly like I do here. If we want to match more closely, then the VirtIO layer would deal with offsets (pages), which is okay too; anything but VAs works here.

My point here was that it is not directly the physical address at the API level, but something else, with a conversion (page_to_phys()) done in the virtio ring.
Anyway, I think an important use case we have to deal with is the use of the open-amp lib in Linux user space: how to manage OpenAMP with the HOST role to communicate with the remote processor. Whatever the virtio type, the application will allocate buffers, so it will need the VA<->PA conversion.
As we have the ambition to extend the list of virtio services, we should find a solution that matches (as much as possible) all the virtio services, not just RPMSG. Having the conversion in the virtio ring would avoid conversion duplication in all virtio service drivers/devices.

@glneo (Contributor, Author) commented Apr 3, 2024

> The io_region is just a structure that was created to support complex memory mapping and ensure that you work with the expected memory region. So in theory, there is no problem initializing it with the entire memory range.
> Do you really need to have several shm_io?

One metal_io_region that covers the whole memory space might be workable; the root issue of building that region is the same.

> It is not clear to me... If you dynamically build the IO regions, how do you check the validity of these regions? I suppose that you will need to check validity against local memory regions. This is what metal_io is doing; the difference is that we have to declare the memory region first.

That is what I am solving: today we have to have this magic shm_io region that can handle any and all buffer addresses that might be sent. Currently this means we have to know the addresses up-front and hard-code them. No matter what we do to solve that, the problem should not be the VirtIO layer's problem, as it by design only deals with rings and descriptors. The standard even specifies this by stating the descriptors shall only store physical addresses when addresses are passed, making sure any needed address translations would not be a vring problem.

> Anyway, I think an important use case we have to deal with is the use of the open-amp lib in Linux user space: how to manage OpenAMP with the HOST role to communicate with the remote processor. Whatever the virtio type, the application will allocate buffers, so it will need the VA<->PA conversion.

Not sure what this PR changes about that. The application passes a shm_io that can handle those translations into rpmsg_init_vdev() just like before. The only difference is that the rpmsg_virtio layer doesn't pass this down into the virtio layer; instead the same call to metal_io_virt_to_phys() is done in the rpmsg_virtio layer. This PR is transparent to the application.

> As we have the ambition to extend the list of virtio services, we should find a solution that matches (as much as possible) all the virtio services, not just RPMSG. Having the conversion in the virtio ring would avoid conversion duplication in all virtio service drivers/devices.

If the concern is avoiding code duplication, there are plenty of ways to solve that. One would be to provide helper functions (not really needed IMHO, though, since it is just one function call: metal_io_virt_to_phys()). Another option would be a layer on top of the virtio layer that converts the message addresses if your use-case needs translation. A sketch of such a helper follows below.
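
For instance, such a helper could be as small as this (the name is hypothetical; the body is just the existing libmetal call):

```c
#include <metal/io.h>

/* A thin wrapper a virtio service could call when it actually needs the VA,
 * instead of the virtio core translating unconditionally for everyone. */
static inline void *virtio_buf_to_va(struct metal_io_region *io,
                                     metal_phys_addr_t pa)
{
	return metal_io_phys_to_virt(io, pa);
}
```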

The thing not to do is what we do today: assume that since the one current user (RPMSG) needs it, all users will always need translation, and make the virtio layer do that unconditionally. Which actually adds code (notice this PR removes more lines than it adds 😃), all to avoid code duplication that might exist some day.

@arnopo (Collaborator) commented Apr 9, 2024

>> The io_region is just a structure that was created to support complex memory mapping and ensure that you work with the expected memory region. So in theory, there is no problem initializing it with the entire memory range.
>> Do you really need to have several shm_io?

> One metal_io_region that covers the whole memory space might be workable; the root issue of building that region is the same.

>> It is not clear to me... If you dynamically build the IO regions, how do you check the validity of these regions? I suppose that you will need to check validity against local memory regions. This is what metal_io is doing; the difference is that we have to declare the memory region first.

> That is what I am solving: today we have to have this magic shm_io region that can handle any and all buffer addresses that might be sent. Currently this means we have to know the addresses up-front and hard-code them. No matter what we do to solve that, the problem should not be the VirtIO layer's problem, as it by design only deals with rings and descriptors. The standard even specifies this by stating the descriptors shall only store physical addresses when addresses are passed, making sure any needed address translations would not be a vring problem.

In vring descriptors we must have the physical address, but I don't see anything in the spec that imposes that we cannot have address translation in the virtio transport driver/device. Could you point it out to me?

More than that, we can need address translation in an AMP system between the main processor (physical address) and the coprocessor (device address). A virtio driver's physical address is written in the descriptor, and an address translation can be needed on the device side. So the notion of physical address in such a case needs to be clarified.
One concrete example is the RETRAM memory in stm32mp15. The coprocessor base address is 0x00000000 while the main processor base address is 0x38000000.
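
As an illustration only (not actual project code), that RETRAM case can be expressed with a metal_io_region on the coprocessor side, with the main-processor address in the physmap and the local address as the virtual base:

```c
#include <metal/io.h>

static metal_phys_addr_t retram_physmap = 0x38000000; /* main-processor view */
static struct metal_io_region retram_io;

static void retram_io_setup(void)
{
	/* Coprocessor view of the same memory starts at 0x00000000;
	 * 64 KiB RETRAM, flat mapping. */
	metal_io_init(&retram_io, (void *)0x00000000, &retram_physmap,
		      0x10000, (unsigned int)-1, 0, NULL);

	/* metal_io_phys_to_virt(&retram_io, 0x38000000) now returns the
	 * address the coprocessor can actually dereference. */
}
```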

>> Anyway, I think an important use case we have to deal with is the use of the open-amp lib in Linux user space: how to manage OpenAMP with the HOST role to communicate with the remote processor. Whatever the virtio type, the application will allocate buffers, so it will need the VA<->PA conversion.

> Not sure what this PR changes about that. The application passes a shm_io that can handle those translations into rpmsg_init_vdev() just like before. The only difference is that the rpmsg_virtio layer doesn't pass this down into the virtio layer; instead the same call to metal_io_virt_to_phys() is done in the rpmsg_virtio layer. This PR is transparent to the application.

>> As we have the ambition to extend the list of virtio services, we should find a solution that matches (as much as possible) all the virtio services, not just RPMSG. Having the conversion in the virtio ring would avoid conversion duplication in all virtio service drivers/devices.

> If the concern is avoiding code duplication, there are plenty of ways to solve that. One would be to provide helper functions (not really needed IMHO, though, since it is just one function call: metal_io_virt_to_phys()). Another option would be a layer on top of the virtio layer that converts the message addresses if your use-case needs translation.

> The thing not to do is what we do today: assume that since the one current user (RPMSG) needs it, all users will always need translation, and make the virtio layer do that unconditionally. Which actually adds code (notice this PR removes more lines than it adds 😃), all to avoid code duplication that might exist some day.

Some work has already started today around virtio services:
https://github.com/OpenAMP/open-amp/tree/virtio-exp-v2023.10/lib/virtio_mmio

Every virtio service has to manipulate buffers. It can be done internally with a copy, or externally by adding/getting allocated buffers in descriptors.

My concern is more at the application level. I want to be sure that for all virtio services, independently of the system, we have the same way of managing the buffers at the virtio service API. From my perspective the more generic/flexible option is the VA, and in this case we need virt-to-phys translations.

Anyway, this topic is not that simple. It would be nice to have more opinions to decide in which direction we move forward.

@glneo (Contributor, Author) commented Apr 16, 2024

> In vring descriptors we must have the physical address, but I don't see anything in the spec that imposes that we cannot have address translation in the virtio transport driver/device. Could you point it out to me?

How we implement our drivers is of course up to us. My point was that the spec was written to give us the freedom of implementing the driver with layers that split responsibilities, which is what this PR is suggesting we do.

> More than that, we can need address translation in an AMP system between the main processor (physical address) and the coprocessor (device address). A virtio driver's physical address is written in the descriptor, and an address translation can be needed on the device side. So the notion of physical address in such a case needs to be clarified.
> One concrete example is the RETRAM memory in stm32mp15. The coprocessor base address is 0x00000000 while the main processor base address is 0x38000000.

We have the same in our devices; for instance, the M4 cores in AM625 see their TCM at 0x0, but we program them at some higher address from the "main" processor's perspective. Even more complex, we have a RAT (Region-based Address Translator) and an IOMMU before DRAM, which means we might have 3(!) levels of translation before hitting the actual buffer in memory. So what we put in the descriptor table as the message's "physical address" is not well defined at all. Hence why I want to pull that messy translation logic out of the virtio driver.

> Every virtio service has to manipulate buffers. It can be done internally with a copy, or externally by adding/getting allocated buffers in descriptors.

> My concern is more at the application level. I want to be sure that for all virtio services, independently of the system, we have the same way of managing the buffers at the virtio service API. From my perspective the more generic/flexible option is the VA, and in this case we need virt-to-phys translations.

I would also like to see an application-level API that is independent of the underlying system. And yes, most users may need VA<->PA translations, but hiding these translations in the virtio layer as is done today only works for some systems. Applications still work with VAs; all that changes is where the translation happens after they pass the VA into the framework.

Right now at least, the change in this PR makes no difference to end applications. If someday an application does want to directly interact with the virtio layer, I would think it would also want the extra level of control over buffer allocations and translations provided by this PR.

@glneo (Contributor, Author) commented Apr 17, 2024

Continuing to think on this, I wonder if the responsibility for all these buffer allocations should actually belong to the transport layer, not the message layer. I'll have to think on what that looks like; there might be an even cleaner solution there than this PR. (Either way, it doesn't belong to the virtio layer; I just need to figure out which direction it should go and how to do that.)

@xiaoxiang781216 (Collaborator) commented Apr 18, 2024

> Continuing to think on this, I wonder if the responsibility for all these buffer allocations should actually belong to the transport layer, not the message layer. I'll have to think on what that looks like; there might be an even cleaner solution there than this PR. (Either way, it doesn't belong to the virtio layer; I just need to figure out which direction it should go and how to do that.)

We think so, and @CV-Bowen provided a patch here: #541
A more flexible buffer allocation scheme is required if we want to support non-remoteproc-based transports on top of the OpenAMP virtio layer (e.g. virtio and vhost).

@arnopo (Collaborator) commented Apr 19, 2024

>> Continuing to think on this, I wonder if the responsibility for all these buffer allocations should actually belong to the transport layer, not the message layer. I'll have to think on what that looks like; there might be an even cleaner solution there than this PR. (Either way, it doesn't belong to the virtio layer; I just need to figure out which direction it should go and how to do that.)

> We think so, and @CV-Bowen provided a patch here: #541 A more flexible buffer allocation scheme is required if we want to support non-remoteproc-based transports on top of the OpenAMP virtio layer (e.g. virtio and vhost).

Fully aligned with that. We should have some alloc/free/register/unregister ops: either the application allocates buffers, or it registers them if they are allocated by another entity. One constraint is to continue to support static allocation for some systems, but this can be solved by a pre-allocated memory area for allocations, as we have today with the shmem io.
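
Something along these lines, purely as a strawman (none of these names exist in open-amp today):

```c
#include <stddef.h>
#include <metal/io.h>

/* The application either provides an allocator or registers buffers that
 * were allocated by another entity; a statically carved region can back
 * 'alloc' on systems that need static allocation. */
struct virtio_buf_ops {
	void *(*alloc)(void *priv, size_t size, metal_phys_addr_t *pa);
	void  (*free)(void *priv, void *buf);
	int   (*reg)(void *priv, void *buf, metal_phys_addr_t pa, size_t size);
	void  (*unreg)(void *priv, void *buf);
	void  *priv;
};
```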

@glneo (Contributor, Author) commented Apr 25, 2024

No matter which way we solve the shared message buffer issue, this PR is still valid and needed.

@arnopo (Collaborator) commented May 14, 2024

> No matter which way we solve the shared message buffer issue, this PR is still valid and needed.

At this stage, the need for this code is not evident to me. With the discussions around the allocator, I am of the opinion that we should make sure we are moving in the right direction.

Having your Step 2 code would help to provide a more comprehensive picture:

> I have to implement that solution in two parts; this is the first: move the translation out of the VirtIO layer (done here). The second part is to dynamically build the IO regions from these pre-set-up descriptor queues when we are the REMOTE side.

We should also add this topic to the agenda of the next virtio meeting to gather more opinions.

@glneo (Contributor, Author) commented May 15, 2024

> Having your Step 2 code would help to provide a more comprehensive picture

Fair enough, I can post an RFC-style PR with what I am thinking for the dynamic allocations step.
