Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When clearing an endpoint stall, reset DTOG #1537

Merged
merged 2 commits into from Apr 16, 2024

Conversation

pigrew
Copy link
Collaborator

@pigrew pigrew commented Jun 29, 2022

According to the USB spec, the DTOG of an endpoint should always be reset to DATA0 when receiving CLEAR_FEATURE(ENDPOINT_HALT), regardless of if the endpoint had been stalled (See citations in #1529).

I believe the attached changes should be made to USBD to facilitate this requirement.

However, some issues to discuss are:

  1. When stalling endpoints (even when not in response to a control request), do we need to purge the event queue from transfers on that endpoint?
  2. When SET_CONFIGURATION() is received with the active configuration, does DTOG need to be reset? Does the class driver need to be reset? (Currently the code treats SET_CONFIG(active config) as a NOOP.)
  3. Is there a better way to tell if an EP is a control endpoint, besides checking its address? (USB devices may have multiple control endpoints) Or perhaps we can remove the address check and unconditionally call usbd_edpt_stall?
  4. Will these changes cause deadlocks or extra bugs in existing class drivers? (e.g., class initiates OUT transfer, host clear(halt) on the EP, class doesn't react to the transaction being cancelled)? Though perhaps the class drivers would already not have properly handled these situations, so it wouldn't be any more broken than it is now?
  5. Or do we need to push this logic into class drivers?
  6. Are there error cases when a class driver should be able to force the endpoint to stay stalled, even when the host requests that it be cleared? (I don't need this at the moment, but this was mentioned in the USBTMC spec).
  7. Perhaps a simpler solution would be to make the usbd_edpt_clear_stall unconditionally clear the stall... but does that run into race conditions?

(If this is merged, then the workaround in #1531 can be removed from the USBTMC driver)

@pigrew
Copy link
Collaborator Author

pigrew commented Jul 4, 2022

(I've force-pushed this branch with a revision to clean up handling of default control pipe.)

I've convinced myself that this patch removes some bugs, but we still need to examine how endpoint halts interact with class libraries and in-progress transfers.

In particular, I'm worried about race conditions like:

  1. Host sends BULK OUT packets (packet 1 through 4 of 8 packet transfer), and device class handles these packets.
  2. Host sends BULK OUT packet (perhaps the 5th packet of an 8 packet transfer).
  3. Host sends CLEAR_FEATURE(endpoint_halt)
  4. DCD queues transfer events in order of EP within interrupt: first the clear feature and then the BULK OUT message
  5. USBD handles the CLEAR_FEATURE (class responds by resetting its state machine).
  6. USBD dispatches the bulk-out transfer to the class driver
  7. Class driver becomes confused because it received a packet without the proper header. (only the 1st packet of the transfer has the correct message header.)

A possible deadlock may be:

  1. Class starts transfer on BULK OUT (EP will ACK)
  2. Host sends CLEAR_FEATURE on EP
  3. Device sets endpoint to NAK, based on clear halt
  4. Class never receives the transfer, because EP is in NAK state, and waits forever.
    (Does class need to re-start the transfer? Should DCD have set EP to ACK when clearing halt since there was a pending transfer?)

Another issue is the case that the class driver wants to reject the endpoint halt from being cleared.

@WilheJo
Copy link

WilheJo commented Feb 7, 2024

@hathach could you possibly review this PR as well?

I process 1,2MSPS (yes, I know, the ADC is a little overclocked) in the background.
This data is streamed via a ISO transfer to the PC.
Besides this, I have usbtmc for other purposes.
Additionally, there's a MSC interface for writing calibration data, update,...
PR #2392 fixes most problems I had so far, but.....

This works ok, but I get timeouts under linux from py-usbtmc during stress-testing.

Applying this change fixes at least for me.

@hathach hathach marked this pull request as ready for review April 16, 2024 03:28
Copy link
Owner

@hathach hathach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for the Pr, I am so sorry for late response. I have rebase and update PR to skip the local endpoint stalled status, and just carry the request from host i.e clear stall even if it is not stalled previously. This will pass to the dcd which should reset data toggle. In case dt is not reset, it is bug from dcd and should be fixed there.

@hathach hathach merged commit fcb2df8 into hathach:master Apr 16, 2024
49 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

3 participants