[libtapasco] Exclusive Access and DMA Buffer allocation wear off after one failed attempt #296

Open
zyno42 opened this issue Aug 16, 2021 · 7 comments

Comments

@zyno42
Contributor

zyno42 commented Aug 16, 2021

When one host application is running with exclusive access (in this case tapasco-debug in Debug Mode) and I start another runtime application that tries to acquire exclusive access to the same device, the first two attempts fail with the errors below, but from the third attempt on it succeeds.

$ cargo run --
    Finished dev [unoptimized + debuginfo] target(s) in 0.09s
     Running `target/debug/tapasco_runtime`
An error occurred: Failed to initialize TLKM object: Could not create device: DMA Error: Could not allocate DMA buffer EMFILE: Too many open files
$ cargo run --
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/tapasco_runtime`
An error occurred: Failed to decode TLKM device: Could not acquire desired mode TlkmAccessExclusive for device 0: EBUSY: Device or resource busy
$ cargo run --
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/tapasco_runtime`

The first error comes from libtapasco allocating all 32 DMA buffers from TLKM at initialization time, which also happens only once.

@jahofmann
Contributor

jahofmann commented Aug 16, 2021

Could you run the driver in debug mode and post the corresponding dmesg output?

@zyno42
Contributor Author

zyno42 commented Aug 17, 2021

Sure. This is the corresponding log:

dmesg_tapasco_issue_296.log

@jahofmann
Contributor

jahofmann commented Aug 17, 2021

It might be enough to add a new DMAControl implementation in https://github.com/esa-tu-darmstadt/tapasco/blob/master/runtime/libtapasco/src/dma.rs that simply does nothing, and to use it by default at

allocator.push(Arc::new(OffchipMemory {
and
dma: Box::new(DriverDMA::new(&tlkm_dma_file)),
and
dma: Box::new(VfioDMA::new(&tlkm_dma_file, &vfio_dev)),

Lastly, the correct DMA engine has to be loaded and unloaded in

pub fn change_access(&mut self, access: tlkm_access) -> Result<()> {

Otherwise the DMA engine is initialized even for monitor-only applications.
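
A minimal sketch of this idea, not the actual libtapasco code: the trait `DmaControl`, the types `NoopDma`, `BufferedDma`, `Device` and `Access`, and the 4096-byte buffer size are all hypothetical stand-ins, and `change_access` is simplified to return nothing. Only the count of 32 buffers comes from the report above. The device starts with an engine that allocates nothing, and the buffer-holding engine is swapped in only when exclusive access is requested.

```rust
// Hypothetical, simplified stand-in for the DMAControl trait in dma.rs;
// the real trait's methods and signatures are not reproduced here.
trait DmaControl {
    /// Number of DMA buffers this engine currently holds.
    fn buffer_count(&self) -> usize;
    /// Stage `data` for transfer and return the number of bytes accepted.
    fn copy_to(&mut self, data: &[u8]) -> usize;
}

/// Engine that allocates nothing and does nothing (used in monitor mode).
struct NoopDma;

impl DmaControl for NoopDma {
    fn buffer_count(&self) -> usize {
        0
    }
    fn copy_to(&mut self, _data: &[u8]) -> usize {
        0
    }
}

/// Stand-in for a real engine: it grabs all its buffers on construction.
struct BufferedDma {
    buffers: Vec<Vec<u8>>,
}

impl BufferedDma {
    fn new(count: usize, size: usize) -> Self {
        BufferedDma { buffers: vec![vec![0u8; size]; count] }
    }
}

impl DmaControl for BufferedDma {
    fn buffer_count(&self) -> usize {
        self.buffers.len()
    }
    fn copy_to(&mut self, data: &[u8]) -> usize {
        // Stage as much of `data` as fits into the first buffer.
        match self.buffers.first_mut() {
            Some(buf) => {
                let n = data.len().min(buf.len());
                buf[..n].copy_from_slice(&data[..n]);
                n
            }
            None => 0,
        }
    }
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Access {
    Monitor,
    Exclusive,
}

struct Device {
    access: Access,
    dma: Box<dyn DmaControl>,
}

impl Device {
    /// A new device starts in monitor mode and holds no DMA buffers.
    fn new() -> Self {
        Device { access: Access::Monitor, dma: Box::new(NoopDma) }
    }

    /// Load or unload the buffer-holding engine together with the access mode.
    fn change_access(&mut self, access: Access) {
        let dma: Box<dyn DmaControl> = match access {
            // 32 buffers as in the report; 4096 bytes is an assumed size.
            Access::Exclusive => Box::new(BufferedDma::new(32, 4096)),
            Access::Monitor => Box::new(NoopDma),
        };
        self.dma = dma; // the previous engine is dropped here
        self.access = access;
    }
}

fn main() {
    let mut dev = Device::new();
    println!("{:?}: {} buffers", dev.access, dev.dma.buffer_count());
    dev.change_access(Access::Exclusive);
    let staged = dev.dma.copy_to(&[1, 2, 3]);
    println!("{:?}: {} buffers, {} bytes staged", dev.access, dev.dma.buffer_count(), staged);
    dev.change_access(Access::Monitor);
    println!("{:?}: {} buffers", dev.access, dev.dma.buffer_count());
}
```

Because the old engine is dropped when the new one is assigned, going back to monitor mode also releases the buffers in this sketch. Whether the real engines free their driver-side buffers on drop is not something this thread confirms.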

@zyno42
Contributor Author

zyno42 commented Aug 30, 2021

Thank you for your suggestions. I've looked through them, and I think I haven't stated the problem clearly enough:

The problem is that if a device is exclusively acquired, another application receives the correct EBUSY error only once.

zyno42 added a commit to zyno42/tapasco that referenced this issue Sep 6, 2021
Implement suggestion of @jahofmann in:
esa-tu-darmstadt#296 (comment)

When a new device is created a dummy implementation of the DMA Engine is
used that simply does nothing. The actual DMA Engine is initialized
later when the access mode is changed from monitor to exclusive mode.

This is necessary to prevent a monitoring application like
`tapasco-debug` from allocating all DMA Buffers of the kernel driver.
@zyno42
Contributor Author

zyno42 commented Sep 6, 2021

I've implemented your suggestions. However, this produces another error message when tapasco-debug runs in monitor mode and another host application runs in exclusive mode:
Failed to initialize TLKM object: Could not create device: Scheduler Error: PE Error: Error during interrupt handling: Could not register eventfd with driver: EFAULT: Bad address

As in the previous implementation, this error wears off after one retry.

@jahofmann
Contributor

jahofmann commented Sep 6, 2021

I fear this is a similar problem. When the PEs are created, the runtime will also allocate and register eventfd for interrupt handling. This step needs to be postponed until the access mode is exclusive as monitoring apps should not receive the interrupts.

Some guards are also needed around the wait-for-PE functions to avoid deadlocks if the interrupts have not been set up.

This should make the runtime play fine with the driver, but all these checks should also be in the driver so it does not simply crash if "held wrong". In any case this is more for future reference than your work ;)
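
A rough sketch of the two points above, under stated assumptions: `Pe`, `Interrupt`, `PeError` and all method names are hypothetical, and a standard-library channel stands in for the eventfd that the runtime registers with the driver. The PE is created without an interrupt, registration happens only when it is explicitly enabled (for example on the switch to exclusive access), and the wait function returns an error instead of blocking when nothing is registered.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum PeError {
    /// Waiting was requested although no interrupt has been registered.
    InterruptNotRegistered,
    /// No completion event arrived (timeout or sender gone).
    NoCompletion,
}

/// Stand-in for an eventfd registered with the driver (hypothetical).
struct Interrupt {
    events: Receiver<()>,
}

struct Pe {
    id: usize,
    /// `None` while in monitor mode: nothing is registered with the driver.
    interrupt: Option<Interrupt>,
}

impl Pe {
    /// Creating a PE does not register anything.
    fn new(id: usize) -> Self {
        Pe { id, interrupt: None }
    }

    /// Meant to be called only when switching to exclusive access. The
    /// returned sender plays the role of the driver signalling completion.
    fn enable_interrupt(&mut self) -> Sender<()> {
        let (tx, rx) = channel();
        self.interrupt = Some(Interrupt { events: rx });
        tx
    }

    /// Meant to be called when leaving exclusive access.
    fn disable_interrupt(&mut self) {
        self.interrupt = None;
    }

    /// Guarded wait: fails fast instead of blocking forever when no
    /// interrupt is registered.
    fn wait_for_completion(&self, timeout: Duration) -> Result<(), PeError> {
        let irq = self
            .interrupt
            .as_ref()
            .ok_or(PeError::InterruptNotRegistered)?;
        irq.events
            .recv_timeout(timeout)
            .map_err(|_| PeError::NoCompletion)
    }
}

fn main() {
    let timeout = Duration::from_millis(10);
    let mut pe = Pe::new(0);

    // Monitor mode: the guard reports the missing interrupt.
    println!("PE {} in monitor mode: {:?}", pe.id, pe.wait_for_completion(timeout));

    // Exclusive mode: register, let the simulated driver signal, then wait.
    let driver = pe.enable_interrupt();
    driver.send(()).expect("receiver is alive");
    println!("PE {} in exclusive mode: {:?}", pe.id, pe.wait_for_completion(timeout));

    pe.disable_interrupt();
    println!("PE {} back in monitor mode: {:?}", pe.id, pe.wait_for_completion(timeout));
}
```

Holding the interrupt in an `Option` keeps the "not registered" state explicit, so every wait path has to handle it rather than assuming registration already happened.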

zyno42 added a commit to zyno42/tapasco that referenced this issue Mar 18, 2022
Also leave some instructions on how to enable it again.

Debug mode is currently the same as unsafe mode due to libtapasco
not handling this case of access mode.
Additionally, the DMA Buffers, Interrupts, etc. are also allocated in
Monitor Mode which leads to the problem that you have to start your
other runtime twice. For this case I've added a comment in the help
section of `tapasco-debug`.
@cahz
Member

cahz commented Mar 22, 2023

Might be fixed in #328?
