[Question] Waiting for multiple events to be signaled to protect a critical section #244

Open
Kojox opened this issue Apr 5, 2023 · 2 comments

Kojox commented Apr 5, 2023

In my project I have multiple resources (plain data objects) that must not be modified concurrently. Each task should wait until the specific set of resources it needs is available, lock them all at once, perform some work, and unlock them again.

With only one resource this is easy to do, since I can just wait for a marl::Event and defer the event's signal until the task has finished modifying the resource.
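
A minimal sketch of that single-resource pattern (the event name, its Auto reset mode, and the scheduler setup are assumptions):

#include "marl/defer.h"
#include "marl/event.h"
#include "marl/scheduler.h"

// The event starts signaled (resource free). In Auto mode, wait() resets the
// event when it returns, so waiting doubles as acquiring the resource.
marl::Event positionResourceAvailable(marl::Event::Mode::Auto, true);

marl::schedule([=] {
	positionResourceAvailable.wait();           // acquire the resource
	defer(positionResourceAvailable.signal());  // release it when the task finishes
	// modify the position resource ...
});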

With multiple resources, however, there is a danger of deadlock: two tasks can require the same resources, each acquire the locks for a subset of them, and then neither task can ever obtain the full set.
Example: Task A and Task B both need Resource A and Resource B. Task A acquires Resource A -> Task B acquires Resource B -> Task A now waits for Resource B while Task B waits for Resource A.

At first I was looking for something like marl::Event::all, similar to marl::Event::any, but there doesn't seem to be such a thing.
So instead I use marl::Event::test to try to acquire each resource, and if a test returns false I release all previously acquired resources again (by just calling marl::Event::signal).

At first this seemed to work, but strangely the frame time gets worse and worse after a while. This only happens when the code below is included.
In this test case, two such tasks are scheduled every frame, and as work they just count through a for loop (not even accessing the resources). The only change that results in ever-increasing frame times is having them try to acquire all resources as shown below. In the Tracy profiler it looks like the time between the tasks being picked up by the scheduler keeps increasing.

I wonder whether there is a better way to do this with marl, or whether there are still some errors in this code.

Here is the code I currently use (just rewritten for a fixed subset of two resources):

auto task = [=] {
	// Release both resources when the task finishes, whatever happens.
	defer(velocityResourceAvailable.signal());
	defer(positionResourceAvailable.signal());

	bool allResourcesAcquired = false;
	std::vector<marl::Event> requiredResources = { positionResourceAvailable, velocityResourceAvailable };

	while (!allResourcesAcquired)
	{
		// Wait until at least one of the required resources becomes available.
		marl::Event::any(requiredResources.begin(), requiredResources.end()).wait();
		if (positionResourceAvailable.test())      // try to take the position resource
		{
			if (velocityResourceAvailable.test())  // try to take the velocity resource
			{
				allResourcesAcquired = true;
			}
			else
			{
				// Could not get both: give the position resource back to avoid deadlock.
				positionResourceAvailable.signal();
			}
		}
	}

	// do work ...
};

Kojox commented Apr 6, 2023

OK, it turned out that the repeated construction of the marl::Event::any event caused the increasing frame time. Pulling that construction out of the task solved it.
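
Roughly, the change looks like this (a sketch reusing the names from the snippet above; the combined event is built once and captured by the task instead of being reconstructed on every loop iteration):

std::vector<marl::Event> requiredResources = { positionResourceAvailable, velocityResourceAvailable };
marl::Event anyResourceAvailable = marl::Event::any(requiredResources.begin(), requiredResources.end());

auto task = [=] {
	defer(velocityResourceAvailable.signal());
	defer(positionResourceAvailable.signal());

	bool allResourcesAcquired = false;
	while (!allResourcesAcquired)
	{
		anyResourceAvailable.wait();  // no per-iteration Event::any construction
		if (positionResourceAvailable.test())
		{
			if (velocityResourceAvailable.test())
			{
				allResourcesAcquired = true;
			}
			else
			{
				positionResourceAvailable.signal();  // back off to avoid deadlock
			}
		}
	}

	// do work ...
};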

But I would still like to get some feedback on whether there is a better solution to this problem.

Kojox closed this as not planned on Apr 6, 2023
Kojox reopened this on Apr 6, 2023
ben-clayton (Member) commented

Hi @Kojox,

Sorry for the slow reply. I've been busy with other things.

I don't know if this would work for you, but maybe a marl::WaitGroup could be used? The idea, sketched in code after this list, would be:

  • The initial WaitGroup count is set to the number of resources that you need to wait on.
  • When a resource becomes available you call WaitGroup::done().
  • If one of the resources is made unavailable again, you call WaitGroup::add().
  • To wait for all the resources to become available, you call WaitGroup::wait().
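
A minimal sketch of that protocol (the names and the fixed count of two resources are assumptions):

#include "marl/scheduler.h"
#include "marl/waitgroup.h"

marl::WaitGroup resourcesReady(2);  // one count per resource in this set

// When a resource becomes available:
resourcesReady.done();
// When a resource is taken / made unavailable again:
resourcesReady.add(1);

// A task that needs the whole set:
marl::schedule([=] {
	resourcesReady.wait();  // unblocks once every resource in the set is available
	// use the resources ...
});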

If resources can be taken at any point by another task, then you may need a new overload of WaitGroup::wait() that calls a callback function with the mutex held; otherwise, between the time wait() returns and the resources are actually used, one of them could have been taken again.

This solution does require a separate WaitGroup per set of resources. I don't know if that makes the approach unviable for you.

Cheers,
Ben
