Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Command acknowledgement deadlock inside the publish loop #11

Open
nik-markovic opened this issue Sep 17, 2021 · 0 comments · May be fixed by #15
Open

Command acknowledgement deadlock inside the publish loop #11

nik-markovic opened this issue Sep 17, 2021 · 0 comments · May be fixed by #15
Assignees

Comments

@nik-markovic
Copy link
Collaborator

nik-markovic commented Sep 17, 2021

When command is received while telemetry is being published, this deadlock could occur:

  • iotc_nrf_mqtt_publish(telemetry) [ sys_mutex_lock(&mutex_mqtt_pub, K_FOREVER); ]
  • iotc_nrf_mqtt_loop();
  • mqtt_handle_rx()
  • event_notify()
  • mqtt_evt_handler(MQTT_EVT_PUBLISH)
  • config->data_cb()
  • iotc_on_mqtt_data()
  • iotcl_process_event()
  • iotc_process_callback()
  • on_command()
  • iotconnect_sdk_send_packet(ack) [ attempts to get lock which is not released]

Somewhere this ends up releasing with message "get_event_payload: EAGAIN", but only after a long time.

I am also concerned about calling iotc_nrf_mqtt_loop() inside iotc_nrf_mqtt_publish(). We may end up using up too much stack because this could also result in an inbound mqtt message from broker. iotc_nrf_mqtt_loop() should probably only be called from main to be able to better estimate stack usage. This could maybe even cause recursion. I am not sure how much of an impact this change would have on MQTT confirmation code.

@syjen Assigning to you, but feel free to reassign this ticket to someone else on your team.

@nik-markovic nik-markovic changed the title Command acknowledgement deadlock inside a publish loop Command acknowledgement deadlock inside the publish loop Sep 17, 2021
@syjen syjen assigned alanlhc and unassigned syjen Sep 20, 2021
@alanlhc alanlhc linked a pull request Oct 4, 2021 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants