FreeRTOS BLE "CordioM" task not receiving notifications #699

Open
khpeterson opened this issue Aug 6, 2023 · 3 comments
@khpeterson
Contributor

khpeterson commented Aug 6, 2023

I am working off of my fork, which is based on 136b2f7 + #685.

While running repeats of my BLE streaming application (based on BLE_FreeRTOS), I occasionally end up in a state where the CordioM task is no longer able to receive notifications. I traced this to inconsistent state in the FreeRTOS task control block, which looks like a race condition or synchronization bug. I don't know how to fix the problem, but I am certain that when the notification is attempted, the CordioM task's ucNotifyState is not taskWAITING_NOTIFICATION even though the task is on xSuspendedTaskList. As a result, the task is never moved to the ready list despite the notification.

I instrumented xTaskGenericNotify to check for this condition:

            traceTASK_NOTIFY( uxIndexToNotify );

            /* If the task is in the blocked state specifically to wait for a
             * notification then unblock it now. */
            /********************************************************************/
            /* Instrumentation (requires <string.h>): a task on the suspended
             * list is expected to be waiting for a notification. */
            if( ( strcmp( pxTCB->pcTaskName, "CordioM" ) == 0 ) &&
                ( listLIST_ITEM_CONTAINER( &( pxTCB->xStateListItem ) ) == &xSuspendedTaskList ) )
            {
                /* configASSERT( ucOriginalNotifyState == taskWAITING_NOTIFICATION ); */
                if( ucOriginalNotifyState != taskWAITING_NOTIFICATION )
                {
                    /* Record the unexpected state for the app's run loop to report. */
                    notify_out_of_sync_original_state = ucOriginalNotifyState;
                    notify_out_of_sync_count++;
                }
            }
            /********************************************************************/
            if( ucOriginalNotifyState == taskWAITING_NOTIFICATION )

and then detect and report the condition in my app's run loop:

		extern volatile uint32_t notify_out_of_sync_count;
		extern volatile uint8_t notify_out_of_sync_original_state;
		extern uint32_t last_notify_out_of_sync_count;
		if (last_notify_out_of_sync_count != notify_out_of_sync_count) {
			TRACE_ERR2("detected notify out of sync, notify_out_of_sync_count = %d, original_state = %d", notify_out_of_sync_count, notify_out_of_sync_original_state);
			last_notify_out_of_sync_count = notify_out_of_sync_count;
		}

While running repeats, sometimes after just a few minutes, sometimes after several hours, I see in my console log:

khp@Kevins-MacBook-Pro-M1 % tail -f logs/repeats_20230806_ble_hang.log | grep "out of sync"
2023-08-06T15:50:22.828292: runLoop: detected notify out of sync, notify_out_of_sync_count = 1, original_state = 0

Reviewing the code in Libraries/FreeRTOS/Source/tasks.c, it's not clear to me how a task can ever be in the taskNOT_WAITING_NOTIFICATION state and be suspended, but I don't have a lot of experience with this code. Once in this state, the CordioM task never responds to notifications and BLE is essentially hung.
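For anyone following along, the failure can be illustrated with a small host-side model of the notify path. This is a simplified sketch, not the FreeRTOS source: model_tcb_t, model_list_t, and model_notify are hypothetical names, and only the three state values (0, 1, 2) mirror the taskNOT_WAITING_NOTIFICATION, taskWAITING_NOTIFICATION, and taskNOTIFICATION_RECEIVED constants in tasks.c. It shows that a notification moves the task to the ready list only when the saved state was "waiting"; with original_state = 0, as in the log above, the task stays parked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* State values chosen to match the constants in FreeRTOS tasks.c. */
enum {
    MODEL_NOT_WAITING_NOTIFICATION = 0,
    MODEL_WAITING_NOTIFICATION     = 1,
    MODEL_NOTIFICATION_RECEIVED    = 2
};

/* Hypothetical stand-in for the kernel lists a task can be on. */
typedef enum { MODEL_LIST_READY, MODEL_LIST_SUSPENDED } model_list_t;

/* Hypothetical, heavily reduced task control block. */
typedef struct {
    uint8_t      notify_state;
    uint32_t     notified_value;
    model_list_t list;
} model_tcb_t;

/* Models a "set bits" notification: the value is always updated, but the
 * task is only unblocked if it was recorded as waiting for a notification.
 * Returns true when the task was moved to the ready list. */
static bool model_notify(model_tcb_t *tcb, uint32_t bits)
{
    assert(tcb != NULL);

    uint8_t original_state = tcb->notify_state;

    tcb->notify_state    = MODEL_NOTIFICATION_RECEIVED;
    tcb->notified_value |= bits;

    if (original_state == MODEL_WAITING_NOTIFICATION) {
        tcb->list = MODEL_LIST_READY;
        return true;
    }

    /* Out-of-sync case: the notification is recorded but nothing wakes up. */
    return false;
}
```

In this model, a task initialized with notify_state = 0 and list = MODEL_LIST_SUSPENDED is still on the suspended list after model_notify returns, which matches the hang described here.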

@EdwinFairchild
Contributor

@khpeterson Hey Kevin, is this related to #685? Does that PR fix the issue, or does it still persist? Do you have a minimal application you can share so I can try to reproduce it?

@khpeterson
Contributor Author

khpeterson commented Aug 21, 2023

Hi @EdwinFairchild, thanks for taking a look. This issue is definitely not related to #685: I am running with my fork, which includes that pull request.

Unfortunately, I do not have a pared down app that will easily reproduce the dropped notifications. From what I can tell the issue is timing dependent and given that it sometimes takes hours of continuous BLE streaming with my full app to see, it may be pretty hard to create one.

I have a lot on my plate at the moment, but my first thought would be to recreate the Wsf message queue/dispatch and then send and receive messages continuously between two tasks under varying load conditions (e.g. random durations via MXC_Delay).

My suspicion is that the critical section protection set up in the FreeRTOS port is not bulletproof, but I can't prove that.

My current workaround is to force both the msg task and the event task back into the waiting-notification state when I detect the out-of-sync condition. With this (and #685) in place I can stream BLE overnight without a problem.
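The workaround code itself is not shown in this thread. A rough sketch of the idea against a simplified host-side model might look like the following. All names here (sync_tcb_t, sync_out_of_sync, sync_force_waiting) are hypothetical, and this is not how tasks.c is structured. It also assumes the task only ever blocks waiting for a notification: a task that was suspended explicitly, or that is blocked indefinitely on a queue, can sit on the suspended list without waiting for a notification, and forcing its state would be wrong.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* State values chosen to match the constants in FreeRTOS tasks.c. */
enum {
    SYNC_NOT_WAITING_NOTIFICATION = 0,
    SYNC_WAITING_NOTIFICATION     = 1,
    SYNC_NOTIFICATION_RECEIVED    = 2
};

/* Hypothetical stand-in for the kernel lists a task can be on. */
typedef enum { SYNC_LIST_READY, SYNC_LIST_SUSPENDED } sync_list_t;

/* Hypothetical, heavily reduced task control block. */
typedef struct {
    uint8_t     notify_state;
    sync_list_t list;
} sync_tcb_t;

/* True when the task is parked on the suspended list but its notify state
 * does not say it is waiting, i.e. the condition the instrumentation counts. */
static bool sync_out_of_sync(const sync_tcb_t *tcb)
{
    assert(tcb != NULL);
    return (tcb->list == SYNC_LIST_SUSPENDED) &&
           (tcb->notify_state != SYNC_WAITING_NOTIFICATION);
}

/* Forces an out-of-sync task back into the waiting state so that the next
 * notification can unblock it. Returns true if a repair was made. */
static bool sync_force_waiting(sync_tcb_t *tcb)
{
    if (!sync_out_of_sync(tcb)) {
        return false;
    }
    tcb->notify_state = SYNC_WAITING_NOTIFICATION;
    return true;
}
```

In a real port, any such repair would touch kernel state and would presumably need to run inside a critical section; that part is left out of the model.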

@khpeterson
Contributor Author

Hi @EdwinFairchild, I was able to reproduce this with a minimal application (attached). Please build with this branch of my fork to pick up the change for the wsfTimer leak as well as an assertion that checks for (ucOriginalNotifyState == taskWAITING_NOTIFICATION) when the task being notified is on the suspended list.

I'm running on a vanilla MAX32666FTHR, logging the UART console from a shell, and connecting over BLE using ADI Attach (version 1.0.1+2 on an iPhone 10 running iOS 16.6). To reproduce, connect to DATS, enable notifications for the "ARM Proprietary Data Service", and let it run. Timing varies and it may take a couple of hours, but eventually the assertion I added to FreeRTOS/Source/tasks.c will fire.

BLE_FreeRTOS_v2023_06_notif_out_of_sync.zip
