Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySensors entities are not properly set when activating scenes/automations in Home Assistant #1529

Open
feanor-anglin opened this issue Jul 8, 2022 · 43 comments

Comments

@feanor-anglin
Copy link

feanor-anglin commented Jul 8, 2022

I have the same problem as described here: home-assistant/core#59388

When sending multiple messages at the same time from HA to MySensors nodes I frequently experience some messages remain undelivered. The author of the issue I've linked to has managed to identify the problem is on MySensors side, but haven't provided any details. In my opinion this is a really serious disadvantage, probably the most important I experience using MySensors in my project. Can we do something about it? :)

Additional details: I use Ethernet Gateway and RS485; the problem is also present when using Serial Gateway and wireless communication.

@virtual-maker
Copy link
Contributor

See also issue #1512.

@feanor-anglin Hi, do you also use signing? And do you use NRF24 radios for wireless?

@feanor-anglin
Copy link
Author

No, I don't use signing, since I use RS485 wired communication. The guy from home assistant issue said he used wireless.

@virtual-maker
Copy link
Contributor

I assume it has to do with RS485.
Can you provide debug logs from gateway and node? I think best is to start a discussion in MySensors forum for this.

@feanor-anglin
Copy link
Author

Unfortunately, I can't provide debug from gateway, since there is too little memory to enable it. But, the author of the issue I've linked has provided some.

2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.transport] Sending 6;0;1;1;41;ffff60b8                                                                                                                           
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.transport] Sending 7;0;1;1;2;1                                                                                                                                   
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.transport] Sending 7;0;1;1;3;89
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.transport] Sending 7;0;1;1;41;ffff00d6
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736274 TSF:MSG:SEND,0-0-6-6,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=OK:
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736274 TSF:MSG:SEND,0-0-6-6,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=OK:
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736282 !MCO:PRO:RC=1
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736282 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736321 !TSF:MSG:SEND,0-0-7-7,s=0,c=3,t=16,pt=0,l=0,sg=0,ft=0,st=NACK:
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736321 !TSF:MSG:SEND,0-0-7-7,s=0,c=3,t=16,pt=0,l=0,sg=0,ft=0,st=NACK:
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736328 !TSF:MSG:SIGN FAIL
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736328 !TSF:MSG:SIGN FAIL
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736332 TSF:MSG:READ,6-6-0,s=255,c=3,t=17,pt=6,l=25,sg=0:<NONCE>
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736332 TSF:MSG:READ,6-6-0,s=255,c=3,t=17,pt=6,l=25,sg=0:<NONCE>
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736339 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736339 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736378 !TSF:MSG:SEND,0-0-7-7,s=0,c=3,t=16,pt=0,l=0,sg=0,ft=0,st=NACK:
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736378 !TSF:MSG:SEND,0-0-7-7,s=0,c=3,t=16,pt=0,l=0,sg=0,ft=0,st=NACK:
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736385 !TSF:MSG:SIGN FAIL
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736385 !TSF:MSG:SIGN FAIL
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736389 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736389 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736428 !TSF:MSG:SEND,0-0-7-7,s=0,c=3,t=16,pt=0,l=0,sg=0,ft=0,st=NACK:
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736428 !TSF:MSG:SEND,0-0-7-7,s=0,c=3,t=16,pt=0,l=0,sg=0,ft=0,st=NACK:
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736435 !TSF:MSG:SIGN FAIL
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736435 !TSF:MSG:SIGN FAIL
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736439 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736439 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736442 !MCO:PRO:RC=1
...
2021-11-08 18:56:36 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107741282 !TSF:MSG:SIGN FAIL                                                                                                  
2021-11-08 18:56:36 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107741282 !TSF:MSG:SIGN FAIL

@mfalkvidd
Copy link
Member

Those messages show signing failures, so the cause of those messages is probably not related to your setup.

@feanor-anglin
Copy link
Author

feanor-anglin commented Jul 10, 2022

Ok, so it seems there are 2 kinds of problems with sending messages with short time interval: signing problem and RS485 problem. Shortly I will try to check if the issue is also applicable to wireless setup with no signing and RS485 setup under domoticz (serial and Ethernet).

Do you guys think we should move this discussion to mysensors forum?

@mfalkvidd
Copy link
Member

Github is better if the problem is related to the library code.
The forum is better for general troubleshooting, as there are many more people with practical experience of MySensors in the forum.

Since we're really at the troubleshooting stage, we might get better help in the forum. Hopefully someone with a RS485 setup can chime in.

@chytreg
Copy link

chytreg commented Sep 21, 2022

I've debugged exact same problem, I did not use signing. I've used PJON with gateway-node communication, but it's not on that layer IMHO. I could reproduce the issue even on the serial gateway without any node too.

You can simply flood the gateway with a lot of messages, and only part of the messages gets executed and the others are lost. That's why in many places you recommend using wait() to not flood the gateway. Unfortunately, the Home Assistant sends messages in async nearly at the same time.

In comparison, I've noticed that OpenHab configuration has the sendDelay property https://github.com/tobof/openhab-addons/wiki/Configuration#configuring-the-gateway, but didn't manage to test it.

Does the gateway implement any buffer for received messages?

@feanor-anglin
Copy link
Author

@chytreg I came to the same conclusion about this issue, very nice to hear you support it. The sendDelay you are talking about seems very nice workaround. It should be proposed to HA-MySensors integration devs.

@chytreg
Copy link

chytreg commented Sep 21, 2022

@feanor-anglin Looks like Domoticz implements sendDelay too.

As for HomeAssistant they are using pymysensors with the asyncio.
I'm not sure if this send method is blocking or not, but it's worth to try put some sleep there and check.

@feanor-anglin
Copy link
Author

@chytreg I've reported this issue to HA and included your findings. Hope something will be finally done with this.

@feanor-anglin
Copy link
Author

It seems HA dev requires the sendDelay parameter to be put into the official MySensors documentation. @mfalkvidd, are you the right person to ask for it?

@mfalkvidd
Copy link
Member

@feanor-anglin sorry, I don't understand. MySensors does not have anything called sendDelay. According to a comment above, openHab does have such a property. But would it really make sense for MySensors to document a property that exists in openHab?

@chytreg
Copy link

chytreg commented Sep 25, 2022

I did some tests, with HA and Domowicz here are my findings, https://gist.github.com/chytreg/dcd1c7449fca25e9480c7c463d7ec031
My setup: ESP8266 as TCP Gateway + 1 Node on Arduino Mega, with 16 relays, I use PJON.

You can simply flood the gateway with a lot of messages, and only part of the messages gets executed and the others are lost. That's why in many places you recommend using wait() to not flood the gateway. Unfortunately, the Home Assistant sends messages in async nearly at the same time

It's not about flooding the gateway, it's about that node requires some time to process the messages. I think it's not even a HA vs Domowicz thing. In my simple node sketch when I remove wait(100) I receive a lot of NACK on node part.

The same happens in HA, because it sends all the command in the same time to the Node.
Domoticz on the other hand send command in the queue with some delay so the Node has time to process them.

There is one thing I don't understand why I need to use wait(100) is it hardware limitation or software one? Thus I believe Domowicz / OpenHub implemented send in synchronous / queued way. cc @mfalkvidd

I did some recordings:
Domoticz: Turn ON/OFF all relays => NO NACKs on Node

IMG_2361.1.mov

HA: Turn ON all relays => NACKs on Node

IMG_2362.1.mov

@chytreg
Copy link

chytreg commented Sep 25, 2022

@MartinHjelmare Could you take a look, I'm trying to understand where is the problem, is it hardware limitation of message processing or it's software issue with MySensors. Why other controllers like Domoticz or OpenHub implemented communication with MySensors a bit different than HA.

@virtual-maker
Copy link
Contributor

@mfalkvidd, @feanor-anglin
Hello, I'm also completely unclear who @mfalkvidd is supposed to ask for what! @feanor-anglin a little more explanation would be helpful.

I did some googling and found a parameter send_delay in the Pilight integration of HA. I think from HA point of view Pilight is a similar usecase as MySensors.

HA Pilight send_delay
@feanor-anglin Do you mean something like this?

@chytreg
Copy link

chytreg commented Sep 25, 2022

@virtual-maker Smth like that, could solve the problem. Domoticz and OpenHub, implemented it that way. My main question is why it's even needed? Why I have to wait(100) between send messages to avoid NACKs on the Node level.

@virtual-maker
Copy link
Contributor

@chytreg I have no experience with PJON transport. I'm using NRF24. And there is a problem when two or more nodes send at almost the same time which produces collisions of the telegrams with NACKs for both nodes.

But what I see from your video is you have a gateway and only one node with 16 relays connected to it. Is this right?

@chytreg
Copy link

chytreg commented Sep 25, 2022

@virtual-maker

I have no experience with PJON transport. I'm using NRF24

Imho transport doesn't matter, I was able to reproduce the same problem on USB serial and HA which trigger the scene.

But what I see from your video is you have a gateway and only one node with 16 relays connected to it. Is this right?

Yes

And there is a problem when two or more nodes send at almost the same time which produces collisions of the telegrams with NACKs for both nodes

Hmm, imho HA sends turn_on message to all 16 relays in the same time, maybe that's the problem? Domoticz does one by one message. The messages goes nicely though gateway, but the NACKs pop up on Node https://gist.github.com/chytreg/dcd1c7449fca25e9480c7c463d7ec031 all the logs in gist.

@virtual-maker
Copy link
Contributor

@chytreg Thank your for your logs. This is helpful!

This is from your HA log:

22:39:31.564 > 477355 TSF:MSG:READ,0-0-3,s=14,c=1,t=2,pt=0,l=1,sg=0:1
22:39:31.570 > 477360 TSF:MSG:ECHO REQ
22:39:31.573 > 477362 !PJON:SND:FAIL
22:39:31.576 > 477365 !TSF:MSG:SEND,3-3-0-0,s=14,c=1,t=2,pt=0,l=1,sg=0,ft=4,st=NACK:1

You can see there is a separate telegram from gateway to node for each relay with additional echo request. This echo tells HA the new state of the relay (to display this actual new state in HA screen). But this echo send fails often for your setup.

I assume that the gateway try to send the next telegram for the next relay and at same time the node tries to send the echo.
And I further assume that PJON transport layer is not able to handle telegram collisions but only says "NACK".

@virtual-maker
Copy link
Contributor

With "handle" I mean automatically resolve the collision.

@chytreg
Copy link

chytreg commented Sep 25, 2022

@virtual-maker This is part of the logs when I comment out wait from https://gist.github.com/chytreg/dcd1c7449fca25e9480c7c463d7ec031#file-arduino_mega_pjon_multi_realy_node-cpp-L58 & https://gist.github.com/chytreg/dcd1c7449fca25e9480c7c463d7ec031#file-arduino_mega_pjon_multi_realy_node-cpp-L27

22:28:28.877 > 5048 TSF:MSG:SEND,3-3-0-0,s=4,c=1,t=2,pt=2,l=2,sg=0,ft=0,st=OK:0
22:28:28.923 > 5083 !PJON:SND:FAIL
22:28:28.923 > 5085 !TSF:MSG:SEND,3-3-0-0,s=5,c=1,t=2,pt=2,l=2,sg=0,ft=0,st=NACK:0
22:28:28.948 > 5120 !PJON:SND:FAIL
22:28:28.948 > 5122 !TSF:MSG:SEND,3-3-0-0,s=6,c=1,t=2,pt=2,l=2,sg=0,ft=1,st=NACK:0
22:28:28.984 > 5156 TSF:MSG:SEND,3-3-0-0,s=7,c=1,t=2,pt=2,l=2,sg=0,ft=2,st=OK:0
22:28:29.022 > 5190 !PJON:SND:FAIL
22:28:29.022 > 5192 !TSF:MSG:SEND,3-3-0-0,s=8,c=1,t=2,pt=2,l=2,sg=0,ft=0,st=NACK:0
22:28:29.056 > 5227 !PJON:SND:FAIL
22:28:29.056 > 5229 !TSF:MSG:SEND,3-3-0-0,s=9,c=1,t=2,pt=2,l=2,sg=0,ft=1,st=NACK:1

In your opinion, is this also a collision?
Anyway, is it possible to handle it somehow with a queue on a Gateway level, or Node level to throttle messages? (@mfalkvidd)
HA does not have throttling option implemented (that's why we have collisions), I suppose Domoticz and OpenHub kinda have one, thus scene works.

@virtual-maker
Copy link
Contributor

No, this is not a collision. But when you send two or more messages one after other then you shall add a wait() to give the receiver some ms to process the message. With NRF24 I use 10ms but this depends from the data transfer rate and will differ with PJON for sure. In your code (without wait()) you flood the gateway from the node side. Basically to send a telegram takes less time than the gateway is able to process it.

@feanor-anglin
Copy link
Author

feanor-anglin commented Oct 4, 2022

@feanor-anglin sorry, I don't understand. MySensors does not have anything called sendDelay. According to a comment above, openHab does have such a property. But would it really make sense for MySensors to document a property that exists in openHab?

As we managed to determine, sendDelay exists in openHab and Domoticz and seems to provide smooth operation with MySensors, while Home Assistant usually fails when trying to send more than one message to MySensors gateway at once. My proposition was to put sendDelay (or however you call it) as a requirement to controllers into the official MySensors documentation.

@feanor-anglin
Copy link
Author

Hey, @mfalkvidd have you seen my answer above? Does it make sense to you now?

@mfalkvidd
Copy link
Member

Thanks for the reminder @feanor-anglin

Yes I have seen it, but I still have trouble understanding.

Is the sendDelay a setting configurable by the user, or is it a fixed value?

If it is a fixed value, what value do you recommend?
If it is a user setting, what would the default value be (maybe 0, to not cause unnecessary problems?) How would the user determine if they need to change the setting, and how does the user know what value to set?

@mfalkvidd
Copy link
Member

btw, is anyone aware of any existing MySensors requirement documentation for Controllers? I think it would make sense to put this new requirement in the same place, but I don't know where that place is.

@feanor-anglin
Copy link
Author

@mfalkvidd these are all very good questions, since sendDelay is more like a workaround, than a real solution.

As far as I understand it (please let me know if you agree), while there is no queue implemented in gateway, the real solution should be like this:

  1. Controller sends a message to node.
  2. Gateway passes the message.
  3. Node receives the messages and send back an ack.
  4. The ack is then passed to the controller.
  5. Only after receiving an ack the controller is allowed to send another message.*

*Clearly, with a defined (or definable) timeout period.

sendDelay is a more primitive solution in which the controller waits a given period of time between following messages hoping everything works as expected.

Referring the details you asked about:
sendDelay: configurable by the user in ms (it depends from many factors, e.g. transmission baud rate, how large the value should be), default may be 0 or few ms; 2 bytes should be enough I suppose, 0-65536.
The same rules could apply to the timeout I described as the preferred option.

If there is no such thing as 'MySensors requirements for controllers documentation', it should certainly come to existence, giving the limitations of gateway capabilities.

@mfalkvidd
Copy link
Member

Yes, the acknowledgement method could be better, if MySensors supported acknowledgement. Adding support for acknowledgement has been discussed extensively, but I don't think such support will be added in the near term so the sendDelay solution is the realistic way to go.

I have created https://www.mysensors.org/controller/requirements as a placeholder for controller requirements.

@feanor-anglin
Copy link
Author

I don't entirely understand what you mean by "MySensors doesn't support ack". I always use ack e.g. under MYSController and nodes always send it. Maybe you mean that ack is not implemented on transport layer? But this is not really an issue here.

@mfalkvidd
Copy link
Member

I have no knowledge of MYSController. But yes, MySensors does not support ack on the transport layer, which means that it is impossible for any layer to reliably acknowledge that a message reached its destination.

@mfalkvidd
Copy link
Member

There are some special cases where ack is possible, but that is not something a Controller can rely on.

@feanor-anglin
Copy link
Author

Hello again @mfalkvidd, could you please put the info about controller requirements to MySensors website?
home-assistant/core#79028 (comment)

@mfalkvidd
Copy link
Member

mfalkvidd commented Nov 1, 2022

@feanor-anglin sure. Have we determined what those requirements are? They are not clear to me, so I would need help understanding if I am supposed to write them.

@feanor-anglin
Copy link
Author

Until now, we determined that asynchronous method of sending commands to the gateway is bad. We also determined that acknowledgement method is not possible right now due to MySensors limitations. The workaround is to define a sendDelay parameter, which should be configurable by the user and which should work as a delay between subsequent messages from controller to gateway.

@mfalkvidd
Copy link
Member

@feanor-anglin how about the text present on https://www.mysensors.org/controller/requirements now? (added below in case the site has caching issues that prevent you from seeing the updated version)


From github discussion:
In some cases[1], asynchronous method of sending commands to the gateway is bad. MySensors does not support acknowledgement. The workaround is to define a sendDelay parameter, which should be configurable by the user. The default value should be 0, to not introduce unnecessary delays for users that do not need this feature. A controller should wait sendDelay milliseconds between each consecutive message sent from the controller to the gateway. The queue should be implemented per gateway. The controlley may allow the user to set separate sendDelay values for each gateway.

See https://github.com/tobof/openhab-addons/wiki/Configuration#configuring-the-gateway for how OpenHab has implemented this feature.

[1]: When (enough of) the following parameters are fulfilled:

  • There are multiple messages (how many depends on a lot of factors)
  • The target node needs some time to process the message (more time might make the problem worse)
  • Repeater nodes in the MySensors network might make the problem worse
  • Radio interference might make the problem worse (for wireless MySensors networks)
  • Competing traffic on the MySensors network (for example node to node traffic or echo messages) might make the problem worse (applies to wired and wireless MySensors networks)
  • Unstable power supply to nrf24 radios might make the problem worse

@MartinHjelmare
Copy link
Contributor

MartinHjelmare commented Nov 1, 2022

I'm not in favor of adding a user configurable option for the send delay to Home Assistant UI. If a delay is needed a recommendation for that time should be defined by MySensors.

Alternatively a config option should be added to the MySensors API, in I_CONFIG, so the user can configure this in the gateway sketch and the controller can query the gateway for the configured delay.

@mfalkvidd
Copy link
Member

@MartinHjelmare a generic recommendation is impossible. The appropriate delay will vary depending on all the bullet points listed in the recommendation.

Using I_CONFIG might make sense, but that is data FROM the controller, not from the gateway. isMetric (which is the only I_CONFIG setting) is set in the Controller. The gateway fetches this setting from the Controller. There is no way to set isMetric from the Gateway.

Documentation for I_CONFIG: https://www.mysensors.org/download/sensor_api_20#controller-configuration

@MartinHjelmare
Copy link
Contributor

Yeah, I know I_CONFIG is from the controller to gateway. Either extend it to make it go both ways or add a new type similar to I_CONFIG for the other way.

@feanor-anglin
Copy link
Author

I'm not in favor of adding a user configurable option for the send delay to Home Assistant UI

But why? Adding it to the same dialog box where user chooses gateway variant and library version would be a neat solution in my opinion.

@MartinHjelmare
Copy link
Contributor

Adding options increases maintenance and the complexity to set up the controller. This option isn't required for setting up the communication with the gateway so it should not be part of the controller options.

@feanor-anglin
Copy link
Author

This option isn't required for setting up the communication with the gateway

Actually it is, at least if we are talking about setting it up accurately and reliably. Let me be more descriptive though. Imagine you have a set of all possible combinations of messages from controller to gateway. It would be something like this:
{1 message at a time, 2 messages at a time, 3 messages at a time, ..., n messages at a time}

Ignoring the simple fact that MySensors gateway needs a sendDelay results in the controller would be able to handle only first few cases from this set. I've determined in my setup, that 3 messages at a time is maximum what HA can handle quite reliably. For full reliability 3 is too much though. This seems like quite poor performance to me.

@mfalkvidd
Copy link
Member

Yes, that is quite bad performance. See https://www.youtube.com/watch?v=Mq9ad47P9cI for a demo where thet keypad is a MySensors node sending keypresses to the MySensors gateway so the situation might not be 100% comparable but I don't think the direction of the messages matter that much - the radio link will probably be the bottleneck in both cases. This demo is also a good example of when setting a non-zero delay would completely destroy the user experience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants