OpenThread Border Router stops working after 3 or 4 days - hard restart needed to recover #3582

Open
jmcollin78 opened this issue Apr 27, 2024 · 1 comment

Comments

@jmcollin78

Describe the issue you are experiencing

I regularly lose all my Thread devices (see logs). They become unresponsive, and I get errors each time I try to activate one of them.
Judging by the OTBR logs, the add-on no longer seems to be connected to the zdongle stick.

To recover, I have to do a hardware restart of the Pi 4 (stop everything, disconnect power, wait a few moments, restart everything).

Note: this may be a hardware issue, or rather a firmware issue, but I am not able to diagnose it.

What type of installation are you running?

Home Assistant OS

Which operating system are you running on?

Home Assistant Operating System

Which add-on are you reporting an issue with?

OpenThread Border Router

What is the version of the add-on?

2.6.0

Steps to reproduce the issue

  1. Start HA.
  2. Wait 3 or 4 days.
  3. All Thread devices become non-operational.

System Health information

System Information

version core-2024.4.3
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.12.2
os_name Linux
os_version 6.1.73-haos-raspi
arch aarch64
timezone Europe/Paris
config_dir /config
Home Assistant Community Store
GitHub API ok
GitHub Content ok
GitHub Web ok
GitHub API Calls Remaining 5000
Installed Version 1.34.0
Stage running
Available Repositories 1407
Downloaded Repositories 46
Home Assistant Cloud
logged_in false
can_reach_cert_server ok
can_reach_cloud_auth ok
can_reach_cloud ok
Home Assistant Supervisor
host_os Home Assistant OS 12.2
update_channel stable
supervisor_version supervisor-2024.04.0
agent_version 1.6.0
docker_version 25.0.5
disk_total 457.7 GB
disk_used 28.2 GB
healthy true
supported true
board rpi4-64
supervisor_api ok
version_api ok
installed_addons Home Assistant Google Drive Backup (0.112.1), Samba share (12.3.1), InfluxDB (5.0.0), Glances (0.21.1), Let's Encrypt (5.0.18), NGINX Home Assistant SSL proxy (3.9.0), SQLite Web (4.1.2), AppDaemon (0.16.5), Piper (1.5.0), Whisper (2.0.0), Mosquitto broker (6.4.0), Zigbee2MQTT (1.36.1-1), Studio Code Server (5.15.0), Silicon Labs Flasher (0.2.3), Zigbee2MQTT Edge (edge), Matter Server (5.5.1), OpenThread Border Router (2.6.0)
Dashboards
dashboards 8
resources 26
views 44
mode storage
Recorder
oldest_recorder_run April 20, 2024 at 05:39
current_recorder_run April 27, 2024 at 11:03
estimated_db_size 621.57 MiB
database_engine sqlite
database_version 3.44.2
Sonoff
version 3.7.2 (eb0a208)
cloud_online 9 / 9
local_online 9 / 9
debug failed to load:

Anything in the Supervisor logs that might be useful for us?

-

Anything in the add-on logs that might be useful for us?

otbr-agent[172]: 00:49:02.021 [N] MeshForwarder-: Dropping (reassembly queue) IPv6 UDP msg, len:1052, chksum:e78e, ecn:no, sec:yes, error:ReassemblyTimeout, prio:normal, rss:-51.0, radio:15.4
otbr-agent[172]: 00:49:02.021 [N] MeshForwarder-:     src:[fdcb:f913:457:1:f488:240e:dd6d:bbd0]:5540
otbr-agent[172]: 00:49:02.022 [N] MeshForwarder-:     dst:[fdcb:f913:457:1:2767:702e:bcc4:1db6]:46678
otbr-agent[172]: 00:49:02.384 [N] MeshForwarder-: Dropping rx frag frame, error:Drop, len:88, src:0x6400, dst:0xa000, tag:36930, offset:904, dglen:1052, sec:yes
otbr-agent[172]: 00:49:03.102 [N] MeshForwarder-: Dropping rx frag frame, error:Drop, len:60, src:0x6400, dst:0xa000, tag:36930, offset:992, dglen:1052, sec:yes
otbr-agent[172]: 01:18:02.067 [N] MeshForwarder-: Failed to send IPv6 UDP msg, len:129, chksum:f1ea, ecn:no, to:521afc403cb77285, sec:no, error:NoAck, prio:net, radio:15.4
otbr-agent[172]: 01:18:02.067 [N] MeshForwarder-:     src:[fe80:0:0:0:20f2:98c4:22bc:e356]:19788
otbr-agent[172]: 01:18:02.067 [N] MeshForwarder-:     dst:[fe80:0:0:0:501a:fc40:3cb7:7285]:19788
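
To make a longer capture of this log easier to compare between a working and a failing period, the `error:` field of each MeshForwarder line can be tallied. The following is only a minimal sketch: the function name `count_errors` and the regular expression are assumptions made for illustration, not part of OTBR, and the sample text is copied from the lines above (the `src:`/`dst:` continuation lines are left out because they carry no `error:` field).

```python
import re
from collections import Counter

# Sample lines copied from the add-on log above.
SAMPLE_LOG = """\
otbr-agent[172]: 00:49:02.021 [N] MeshForwarder-: Dropping (reassembly queue) IPv6 UDP msg, len:1052, chksum:e78e, ecn:no, sec:yes, error:ReassemblyTimeout, prio:normal, rss:-51.0, radio:15.4
otbr-agent[172]: 00:49:02.384 [N] MeshForwarder-: Dropping rx frag frame, error:Drop, len:88, src:0x6400, dst:0xa000, tag:36930, offset:904, dglen:1052, sec:yes
otbr-agent[172]: 00:49:03.102 [N] MeshForwarder-: Dropping rx frag frame, error:Drop, len:60, src:0x6400, dst:0xa000, tag:36930, offset:992, dglen:1052, sec:yes
otbr-agent[172]: 01:18:02.067 [N] MeshForwarder-: Failed to send IPv6 UDP msg, len:129, chksum:f1ea, ecn:no, to:521afc403cb77285, sec:no, error:NoAck, prio:net, radio:15.4
"""

# Assumed pattern: the error name is the word that follows "error:".
ERROR_FIELD = re.compile(r"\berror:(\w+)")


def count_errors(log_text: str) -> Counter:
    """Count how often each error name appears in otbr-agent log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_FIELD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for name, occurrences in count_errors(SAMPLE_LOG).most_common():
        print(f"{name}: {occurrences}")
```

On the excerpt above this counts two `Drop`, one `ReassemblyTimeout` and one `NoAck`. A sharp rise in `NoAck` over time would at least be consistent with the radio no longer reaching its neighbours, but that reading is a guess and not a diagnosis.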

Additional information

No response

@jmcollin78
Author

More errors, this time from the Matter Server add-on:

2024-04-27 11:09:08.810 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:127139758 on exchange 15321i with Node: <0000000000000010, 1> sendCount: 4 max retries: 4
2024-04-27 11:09:19.348 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:127139759 on exchange 15321i with Node: <0000000000000010, 1> sendCount: 4 max retries: 4
2024-04-27 11:09:22.390 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 15321i with Node: <0000000000000010, 1>
2024-04-27 11:11:10.341 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:127139769 on exchange 15324i with Node: <0000000000000010, 1> sendCount: 4 max retries: 4
2024-04-27 11:11:13.623 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 15324i with Node: <0000000000000010, 1>
2024-04-27 11:14:41.879 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:156904424 on exchange 15326i with Node: <0000000000000010, 1> sendCount: 4 max retries: 4
2024-04-27 11:14:44.905 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 15326i with Node: <0000000000000010, 1>
2024-04-27 11:18:52.639 (MainThread) ERROR [matter_server.server.device_controller.node_16] 

ATTENTION: Node 16 (S4) did not complete setup in 15 minutes.
This is an indication of a (connectivity) issue with this device. 
IP-address in use for this device: unknown
Try powercycling this device and/or relocate it closer to a Border Router or 
WiFi Access Point. If this issue persists, please create an issue report on 
the Matter channel of the Home Assistant Discord server or on Github:
https://github.com/home-assistant/core/issues/new?assignees=&labels=integration%3A%20matter&projects=&template=bug_report.yml

2024-04-27 11:19:35.009 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:111154929 on exchange 15328i with Node: <0000000000000010, 1> sendCount: 4 max retries: 4
2024-04-27 11:19:38.590 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 15328i with Node: <0000000000000010, 1>
2024-04-27 11:28:09.996 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:111154936 on exchange 15330i with Node: <0000000000000010, 1> sendCount: 4 max retries: 4
2024-04-27 11:28:12.853 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 15330i with Node: <0000000000000010, 1>
2024-04-27 11:30:55.783 (Dummy-2) CHIP_ERROR [chip.native.DMG] Subscription Liveness timeout with SubscriptionID = 0x0a784f9a, Peer = 01:000000000000000E
2024-04-27 11:30:55.784 (MainThread) INFO [matter_server.server.device_controller.node_14] Previous subscription failed with Error: 50, re-subscribing in 0 ms...
2024-04-27 11:31:09.497 (MainThread) INFO [root] Re-subscription succeeded!
2024-04-27 11:31:09.497 (MainThread) INFO [matter_server.server.device_controller.node_14] Re-Subscription succeeded
2024-04-27 11:33:52.641 (MainThread) ERROR [matter_server.server.device_controller.node_16] 

ATTENTION: Node 16 (S4) did not complete setup in 30 minutes.
This is an indication of a (connectivity) issue with this device. 
IP-address in use for this device: fdcb:f913:457:1:f488:240e:dd6d:bbd0
Try powercycling this device and/or relocate it closer to a Border Router or 
WiFi Access Point. If this issue persists, please create an issue report on 
the Matter channel of the Home Assistant Discord server or on Github:
https://github.com/home-assistant/core/issues/new?assignees=&labels=integration%3A%20matter&projects=&template=bug_report.yml
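
The `CHIP_ERROR` lines print the node ID as 16 hexadecimal digits, while the `matter_server` lines use decimal node numbers. Assuming both refer to the same identifier, `0000000000000010` is node 16 (S4) and `000000000000000E` is node 14. Below is a rough sketch that groups the errors per node under that assumption; `errors_per_node` and both regular expressions are hypothetical helpers written for this issue, not part of the Matter Server, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Three lines copied from the Matter Server log above.
SAMPLE_LOG = """\
2024-04-27 11:09:08.810 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:127139758 on exchange 15321i with Node: <0000000000000010, 1> sendCount: 4 max retries: 4
2024-04-27 11:09:22.390 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 15321i with Node: <0000000000000010, 1>
2024-04-27 11:30:55.783 (Dummy-2) CHIP_ERROR [chip.native.DMG] Subscription Liveness timeout with SubscriptionID = 0x0a784f9a, Peer = 01:000000000000000E
"""

# Assumed formats: "Node: <HEX16, n>" and "Peer = xx:HEX16".
# The prefix before the colon in "Peer" is ignored here.
NODE_FIELD = re.compile(r"Node: <([0-9A-Fa-f]{16}), \d+>")
PEER_FIELD = re.compile(r"Peer = [0-9A-Fa-f]+:([0-9A-Fa-f]{16})")


def errors_per_node(log_text: str) -> Counter:
    """Count CHIP_ERROR lines per node, keyed by decimal node number."""
    counts = Counter()
    for line in log_text.splitlines():
        if "CHIP_ERROR" not in line:
            continue
        match = NODE_FIELD.search(line) or PEER_FIELD.search(line)
        if match:
            # Convert the hexadecimal node ID to a decimal node number.
            counts[int(match.group(1), 16)] += 1
    return counts


if __name__ == "__main__":
    for node, occurrences in sorted(errors_per_node(SAMPLE_LOG).items()):
        print(f"node {node}: {occurrences} error line(s)")
```

Run over the full log, this should show whether the failures concentrate on node 16 or are spread across all Thread devices, which may help separate a single-device problem from a border router problem.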
