Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wifi connection loss (possible memory leak?) #844

Open
alessiomiele opened this issue May 1, 2024 · 20 comments
Open

Wifi connection loss (possible memory leak?) #844

alessiomiele opened this issue May 1, 2024 · 20 comments

Comments

@alessiomiele
Copy link

Hi,

I have a problem using homespan. After some hours (~7/9 hours) wifi connection is lost. In order to narrow what the problem could be, I tried the example 1 unmodified and the issue persists. From the serial monitor I don't get the "*** WiFi Connection Lost!" message but I can't ping the device and, from the router page, I can't see the device so the connection is actually lost. I also noticed that free heap size decreases over time. I read somewhere in the internet that Wifi library uses heap someway, so could the problem be related to a memory leak somewhere?
During my debug I also noticed that the free heap size decreases when a paired Apple device is connected to wifi: if none of my Apple devices is connected to wifi the free heap size is constant over time.
So my idea is that heap is saturated after some hours causing wifi problems.
I'm currently using homespan 1.9.0, I've also tried version 1.9.0dev but same issue.
Rebooting the device solves the problem. Device runs for some hours and disconnects again.

@Liaco123
Copy link

Liaco123 commented May 8, 2024

I encountered the same issue as well.

@HomeSpan
Copy link
Owner

HomeSpan commented May 9, 2024

Can you provide a snapshot of some of the log (level L2) that is printed to the serial monitor showing what occurs during those 7 hours (not the whole 7 hours, just a sample). Is the device fully disconnecting from WiFi, such that it won't respond to a ping, to is it just disconnecting from HomeKit? One way to check is to enable the HomeSpan WebLog. If the WebLog does not respond to HTTP requests, then the device is likely not connected to WiFi at all.

@Liaco123
Copy link

Liaco123 commented May 9, 2024

  1. Through serial monitoring(using the command "m" in CLI ), I noticed that the free memory gradually decreases.
  • begin(21:36:47):
    2024-05-09 21_36_47-main cpp - HomeSpan-h-l - Visual Studio Code

  • end(21:47:32):

    2024-05-09 21_47_32-main cpp - HomeSpan-h-l - Visual Studio Code
  1. After losing the connection with HomeSpan, I am unable to access the WebLog page, which I could access before.

@HomeSpan
Copy link
Owner

HomeSpan commented May 9, 2024

And there is absolutely no other information provided in the log files with log level set to 2?

The decrease you have shown is very minimal. What version of the Arduino-ESP32 board manager are you using?

@alessiomiele
Copy link
Author

I can provide you a full log but I can't test it right now. All I can say is that the board loses wifi connection after some hours. I can't ping the device and, using WebLog, it's not reachable. My full code has many stuff to handle so I simply tried the example 1 untouched and I noticed the free memory decreasing I described in my first post. By the way, I'll provide you a full L2 log as soon as I can. As I said I noticed that memory seems to not increase if none of my Apple devices is connected to wifi. But I'll check this thing in my test.

@Liaco123
Copy link

Liaco123 commented May 9, 2024

=======================================
** Client #0 Connected: (7664 sec) 192.168.0.104 on Socket 4/16

<<<<<<<<< 192.168.0.104 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: Home\032Bridges._hap._tcp.local
Content-Length: 37
Content-Type: application/pairing+tlv8
------------ END BODY! ------------

PUBKEY(32) 922ED26D3E6B9332A9EDCAB3838FCFD02F90D4A38A519219713774BE8DDAB244
STATE(1) 01
------------ END TLVS! ------------
In Pair Verify #0 (192.168.0.104)...Found
------- ENCRYPTING SUB-TLVS -------
SIGNATURE(64) 0C9F3C417520A97918FBDA453044CFD594AA86D2E39932768E5371C7253E895323845472BB07DD46F71A3B7E0723A87868EA797FBD75D239E2636078A8CA300E
IDENTIFIER(17) 37433A42373A43423A31443A30453A4546
---------- END SUB-TLVS! ----------

192.168.0.104 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 140

PUBKEY(32) BBBC950A4A427DA3F83D2B1696A942A15BCD2BFDAB667477A2B8BD8514EEAE66
STATE(1) 02
ENC.DATA(101) 966F52530C7FF1479EE17992DE533E9F6C44E0471DF658D94EB31A2B64AB0EA642B8D674D24F5C214A048F044AB9D9EBEA9C792359B3EFC24AA55F9E22FD124CCD80B0E6832704C07C4B28CB273F43333448A28BC0CAA53B9781777232988A2ED8988E7C6A
------------ SENT! --------------

<<<<<<<<< 192.168.0.104 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: Home\032Bridges._hap._tcp.local
Content-Length: 125
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
STATE(1) 03
ENC.DATA(120) C718CBF3645653DEF8F2F387D2AF9043ADCA3ADBD811A516027671C39E03DA8912966765667EA7BBA0489F307B1F47C7CFE9197D3B0F4E5B29F92ECB04FDE543DD56976F1EBB08A4BFC97DCA31AF40E2A02DFEE719DFE6096032EC058B67F6227CC95B30CFAC376964C77CC95EAC3F474799F0B352DAC678
------------ END TLVS! ------------
In Pair Verify #0 (192.168.0.104)...Found
------- DECRYPTING SUB-TLVS -------
SIGNATURE(64) 9526D8C7AF6A3E20FAA8B65EB09DA62D2664111283343BF60FB1DDDFA4274429144F7CB596604D06A96EC65571A701D1424D2CDCD3C7A5DC1141569FC2145305
IDENTIFIER(36) 46343641333739362D343342412D343341392D394341322D304645353343383746413732
---------- END SUB-TLVS! ----------

*** Verifying session with Controller ID: F46A3796-43BA-43A9-9CA2-0FE53C87FA72...

192.168.0.104 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 3

STATE(1) 04
------------ SENT! --------------

*** SESSION VERIFICATION COMPLETE ***

<<<< #### 192.168.0.104 #### <<<<
GET /accessories HTTP/1.1
Host: Home\032Bridges._hap._tcp.local
------------ END BODY! ------------
In Get Accessories #0 (192.168.0.104)...

192.168.0.104 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/hap+json
Content-Length: 2134

{"accessories":[{"aid":1,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]}]}]},{"aid":2,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]},{"iid":3,"type":"23","value":"LED1","format":"string","perms":["pr"]}]},{"iid":4,"type":"43","characteristics":[{"iid":5,"type":"25","value":1,"format":"bool","perms":["pr","pw","ev"]},{"iid":6,"type":"13","value":30,"format":"float","perms":["pr","pw","ev"]},{"iid":7,"type":"2F","value":76,"format":"float","perms":["pr","pw","ev"]},{"iid":8,"type":"8","value":100,"format":"int","minValue":5,"maxValue":100,"minStep":1,"perms":["pr","pw","ev"]}]}]},{"aid":3,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]},{"iid":3,"type":"23","value":"LED2","format":"string","perms":["pr"]}]},{"iid":4,"type":"43","characteristics":[{"iid":5,"type":"25","value":1,"format":"bool","perms":["pr","pw","ev"]},{"iid":6,"type":"CE","value":500,"format":"uint32","perms":["pr","pw","ev"]},{"iid":7,"type":"8","value":100,"format":"int","perms":["pr","pw","ev"]}]}]},{"aid":4,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]},{"iid":3,"type":"23","value":"Sensor","format":"string","perms":["pr"]}]},{"iid":4,"type":"8A","characteristics":[{"iid":5,"type":"11","value":21.2402,"format":"float","perms":["pr","ev"]},{"iid":6,"type":"75","value":1,"format":"bool","perms":["pr","ev"]}]},{"iid":7,"type":"82","characteristics":[{"iid":8,"type":"10","value":66.013,"format":"float","perms":["pr","ev"]},{"iid":9,"type":"75","value":1,"format":"bool","perms":["pr","ev"]}]}]},{"aid":5,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]},{"iid":3,"type":"23","value":"风扇","format":"string","perms":["pr"]}]},{"iid":4,"type":"B7","characteristics":[{"iid":5,"type":"B0","value":0,"format":"uint8","perms":["pr","pw","ev"]},{"iid":6,"type":"29","value":0,"format":"float","minValue":0,"maxValue":100,"minStep":50,"perms":["pr","pw","ev"]}]}]}]}
-------- SENT ENCRYPTED! --------

The above is a part of the LOG2.
And the free memory is decreasing now.

        Allocated      Free   Largest       Low
        --------- --------- --------- ---------

Total Heap: 106552 151536 110580 143288
Internal: 106552 151536 110580 143288
PSRAM: 0 0 0 0

Lowest stack level: 3756 bytes (loopTask)
NVS Flash Partition: 200 of 504 records used

And this is the info showing in the WebLog of my device,
20240509-234634

By the way, the device have 4 accessories. And i will test the example 1 later.

@HomeSpan
Copy link
Owner

HomeSpan commented May 9, 2024

All that looks fine. The memory should not continually decrease just by being on. If HomeKit is not sending continuous updates, nothing in HomeSpan actually does anything. And having WiFi on does not drain memory on a cumulative basis.

But if HomeKit continually connects and disconnects, this could leave a memory leak. Hence, running just example 1 we need to see if there is some sort of repeated HomeKit connect/disconnect over the course of a few hours that gradually eroded all the memory.

@Liaco123
Copy link

2024-05-10 09_06_57-main cpp - HomeSpan-h-l - Visual Studio Code

The num of free memory when lossed the connecting showed at above

And the Device is trying to connect to Wi-Fi. but it does not work.

LOG2:

** Client #1 Connected: (24024 sec) 192.168.0.104 on Socket 3/16

<<<<<<<<< 192.168.0.104 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: Home\032Bridges._hap._tcp.local
Content-Length: 37
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
PUBKEY(32) 475C28506C0A10B87C4F0C76907DC1AA17D5AFF559EC5F1CFB4B0F1C7A21D73A
STATE(1) 01
------------ END TLVS! ------------
In Pair Verify #1 (192.168.0.104)...Found
------- ENCRYPTING SUB-TLVS -------
SIGNATURE(64) 6B4DCD16AEEF550BBCCD4CD3AFBF0AB071FFA535A36C90C0E6B647210E98BD65748046F160AE7328A39318D8F54CA56E7C5C9B09F3974F2DE3579BCB107DC606
IDENTIFIER(17) 37433A42373A43423A31443A30453A4546
---------- END SUB-TLVS! ----------

192.168.0.104 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 140

PUBKEY(32) F6404515CCA4980E248AD6FBD66C95365110FFF6E71C42C3EDD2142E34E48E75
STATE(1) 02
ENC.DATA(101) 7C474AEEE5D9F333A635BE422088DA298B78F863A4A5E3832C17A21425EC4D51115609AA3F9D6DF379FFBB3F3EB3C95667A68F25A37E44EF74247CB2A560A26E2CA97A00C9DD633BAD25E928886DEECDACAB4F7C4AF0ECA1ED39A3B5E7D8ADBD33F398A8E0
------------ SENT! --------------

<<<<<<<<< 192.168.0.104 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: Home\032Bridges._hap._tcp.local
Content-Length: 125
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
STATE(1) 03
ENC.DATA(120) E1230093830E015920C4F43E04A684DB1A2A1B2E52BDC0EBD0324602235009F5993D26A8FF21CFCCA297DE7AC4DBDF1833DEAB91F268D257CCEA93C27E2405058C94483EE9817765330E19908F375CA26F48D7E7BDCEBB4BE5324C690D9733CD0F0369F96F341887D7AA49D29EE7FD0AFAA0CF774DBD9256
------------ END TLVS! ------------
In Pair Verify #1 (192.168.0.104)...Found
------- DECRYPTING SUB-TLVS -------
SIGNATURE(64) 114460B36EF8B89369F3D660A8D063903828E40494D167D4187D973E1D5634BE846C87FF648DA01E361A79F493E42323E4BF47DF87AF5590E1825F883AACBB04
IDENTIFIER(36) 46343641333739362D343342412D343341392D394341322D304645353343383746413732
---------- END SUB-TLVS! ----------

*** Verifying session with Controller ID: F46A3796-43BA-43A9-9CA2-0FE53C87FA72...

192.168.0.104 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 3

STATE(1) 04
------------ SENT! --------------

*** SESSION VERIFICATION COMPLETE ***

<<<< #### 192.168.0.104 #### <<<<
GET /accessories HTTP/1.1
Host: Home\032Bridges._hap._tcp.local
------------ END BODY! ------------
In Get Accessories #1 (192.168.0.104)...

192.168.0.104 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/hap+json
Content-Length: 2136

{"accessories":[{"aid":1,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]}]}]},{"aid":2,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]},{"iid":3,"type":"23","value":"LED1","format":"string","perms":["pr"]}]},{"iid":4,"type":"43","characteristics":[{"iid":5,"type":"25","value":1,"format":"bool","perms":["pr","pw","ev"]},{"iid":6,"type":"13","value":160,"format":"float","perms":["pr","pw","ev"]},{"iid":7,"type":"2F","value":100,"format":"float","perms":["pr","pw","ev"]},{"iid":8,"type":"8","value":10,"format":"int","minValue":5,"maxValue":100,"minStep":1,"perms":["pr","pw","ev"]}]}]},{"aid":3,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]},{"iid":3,"type":"23","value":"LED2","format":"string","perms":["pr"]}]},{"iid":4,"type":"43","characteristics":[{"iid":5,"type":"25","value":0,"format":"bool","perms":["pr","pw","ev"]},{"iid":6,"type":"CE","value":500,"format":"uint32","perms":["pr","pw","ev"]},{"iid":7,"type":"8","value":100,"format":"int","perms":["pr","pw","ev"]}]}]},{"aid":4,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]},{"iid":3,"type":"23","value":"Sensor","format":"string","perms":["pr"]}]},{"iid":4,"type":"8A","characteristics":[{"iid":5,"type":Trying to connect to TP-LINK_2581. Waiting 1 sec...
Trying to connect to TP-LINK_2581. Waiting 2 sec...
Trying to connect to TP-LINK_2581. Waiting 4 sec...
Trying to connect to TP-LINK_2581. Waiting 8 sec...
Trying to connect to TP-LINK_2581. Waiting 16 sec...

*** Can't connect to TP-LINK_2581. You may type 'W ' to re-configure WiFi, or 'X ' to erase WiFi credentials. Will try connecting again in 60 seconds.

@HomeSpan
Copy link
Owner

The lack of memory you indicate above is very likely the reason why the WiFi is failing. However, we need to figure out why the memory is eroding. This can either be because of something in your sketch that using memory but not freeing it up, or an edge-case in HomeSpan itself. To determine the cause, you must use example 1 since that is a known sketch.

Before pairing example 1, make sure to unpair any existing sketches you used from the Home App and then erase all the stored info on the ESP32 using the 'E' command before you re-pair the sketch with Example 1.

Once you verify Example 1 is working as expected, note the memory usage using the 'm' command and set log level to L2. Then monitor the serial log output for the full day until the device fails. You should seed almost nothing in the log file with the exception of some potential periodic calls from HomeKit to the the device to check version numbers (sometimes the Home app does this, other times it does not). What we need to look for are unexpected requests. Assuming you are running a Home Hub (which is required for HomeKit and HomeSpan), you should not see a lot of new Verifications as you see above. Once a connection is completed and verified, it should stay open indefinitely. However, it could be that your router is dropping the connection (or the MDNS records, which is VERY common for some routers), which causes HomeKit to re-connect to the ESP32 with a fresh verification process. I have found the re-verification process has a small memory leak of a few hundred bytes, which eventually is reclaimed. If the Home App tries to re-verify a connection every 20 minutes, you may see a slow drain on the memory, but after a full day it will only be something like 10K. However, if the Home App is re-verifying connections every minute or less, this might explain the issue.

That's what we need to find out to identify the problem.

@Liaco123
Copy link

I tried the method you suggested. The free memory is constantly decreasing, and the serial output verifies the message about every 2-5 minutes.

21:31:07

2024-05-10 21_31_07-PIO Home - Memory_Test - Visual Studio Code

21:32:15

2024-05-10 21_32_15-PIO Home - Memory_Test - Visual Studio Code

As shown in the snapshots, free memory decreased 252 in only 1 minute and 8 seconds.

A part of LOG2:

** Client #1 Connected: (2010 sec) 192.168.0.100 on Socket 2/16

<<<<<<<<< 192.168.0.100 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
Content-Length: 37
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
PUBKEY(32) 9A4D0EA02BBCFAD44D4DF74372FEDBB9FE77E00848170F9A744F657C6052793C
STATE(1) 01
------------ END TLVS! ------------
In Pair Verify #1 (192.168.0.100)...Found
------- ENCRYPTING SUB-TLVS -------
SIGNATURE(64) 2D37197B19CAE887DFDC48A65AA397CAEEFB89110F65C854220F17C2FFB135F1532E4112ED64F15463DCEEF8C96F015C121C8EA4EDA8FFB646DA173DD6C31605
IDENTIFIER(17) 35373A34393A35353A38453A33443A4244
---------- END SUB-TLVS! ----------

192.168.0.100 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 140

PUBKEY(32) 1223B640505D63E86F7A1FAA97C2554FC9CA34A87738A5DCC896AF5FB720A05E
STATE(1) 02
ENC.DATA(101) FBB73D1897EB15C031C8FB0DFDEC35D95B27AFB726DE76C9A19F0D95498E5B20B865F4879663E68679EE38A62071B502671E25A1E0F6A99EE903B07DCE8E2E5064B8688FDF781D658EE359E40C2941BD2781AC0B80046AD6F90E8646BE9D8228696C0BF90E
------------ SENT! --------------

<<<<<<<<< 192.168.0.100 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
Content-Length: 125
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
STATE(1) 03
ENC.DATA(120) 484B44EC616A012684D96F84BD4750C2B44A7DA7077C0B02C7A30452A353F4B3E5EDAB5987C36A506CFBCA0A227AE719A74687C33BB3FB757E040BBF2B7A48EDB1DDA1CD3E998EFCD9E4681BB16F09B749F794B07A6C8142CD9C003C52A28FF69001CF123BEFAD7CB0CE414317695EB7AF9BB0F094ECD2BD
------------ END TLVS! ------------
In Pair Verify #1 (192.168.0.100)...Found
------- DECRYPTING SUB-TLVS -------
SIGNATURE(64) 99449378F2534410C105CAD80162714F2C9C2D28332739364E1529CA220DABFA0E4A3B160D3504298D1B70248D51BA0F6493F3EA39B9E156B6DCC7C552D8A107
IDENTIFIER(36) 46343641333739362D343342412D343341392D394341322D304645353343383746413732
---------- END SUB-TLVS! ----------

*** Verifying session with Controller ID: F46A3796-43BA-43A9-9CA2-0FE53C87FA72...

192.168.0.100 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 3

STATE(1) 04
------------ SENT! --------------

*** SESSION VERIFICATION COMPLETE ***

<<<< #### 192.168.0.100 #### <<<<
GET /accessories HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
------------ END BODY! ------------
In Get Accessories #1 (192.168.0.100)...

192.168.0.100 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/hap+json
Content-Length: 250

{"accessories":[{"aid":1,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]}]},{"iid":3,"type":"43","characteristics":[{"iid":4,"type":"25","value":0,"format":"bool","perms":["pr","pw","ev"]}]}]}]}
-------- SENT ENCRYPTED! --------

<<<< #### 192.168.0.104 #### <<<<
PUT /characteristics HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
Content-Length: 50
Content-Type: application/hap+json
------------ END BODY! ------------
{"characteristics":[{"aid":1,"iid":4,"ev":false}]}
------------ END JSON! ------------
In Put Characteristics #0 (192.168.0.104)...
Notification Request for aid=1 iid=4: false

192.168.0.104 >>>>>>>>>>
HTTP/1.1 204 No Content

-------- SENT ENCRYPTED! --------

=======================================
** Client #0 Connected: (2032 sec) 192.168.0.104 on Socket 3/16

<<<<<<<<< 192.168.0.104 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
Content-Length: 37
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
PUBKEY(32) DB6EA3E5BF65EE8E763B1353B72EC73B43F82A015EEF2D7355A94AF425DC9534
STATE(1) 01
------------ END TLVS! ------------
In Pair Verify #0 (192.168.0.104)...Found
------- ENCRYPTING SUB-TLVS -------
SIGNATURE(64) BCF16C985187673E132F9EFB25CE9AD245157044DF3451AFACDC112170426C45F5F6DF4785D064116993EA172C7F5711C67C8BDD28ACFF9293DCB94BC9280804
IDENTIFIER(17) 35373A34393A35353A38453A33443A4244
---------- END SUB-TLVS! ----------

192.168.0.104 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 140

PUBKEY(32) 85E133F270146DE4AD4297D8BC04BEF34E6C3FDC6E0A696C3F5AD0899AF40076
STATE(1) 02
ENC.DATA(101) F72D5BEDEBB101D01E24A95BA526CA4DDA0D079F53766390BE13A9208CA3FBA12492B821A43708EFAD3412AEEAB36BD44BCA0460D5E89FFFF1BB661B829C96938EBAED48867ABA38FB2CFC0D608DC02D47F574E2FEDB18FA3585513DE34916D976ECD6A300
------------ SENT! --------------

<<<<<<<<< 192.168.0.104 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
Content-Length: 125
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
STATE(1) 03
ENC.DATA(120) E9F7768BCFBCC5B257E5DD22C2ED8A52D49A6277B347C824FA7FB92024100006FFE1812C06A7F09C68FD2D2EE95680F88206CBCCD542DB806B41FC6659CE7F1C789C0E01E1FDFF7451EE471779E788A03C69878D955FBBF0EDF4F99BC3DF4CDDA28D09604180D428ACFEDAF722EA7076A2CB8DD42B6989DC
------------ END TLVS! ------------
In Pair Verify #0 (192.168.0.104)...Found
------- DECRYPTING SUB-TLVS -------
SIGNATURE(64) DB39AE03061FBD23C0DA44A9C7B14B16B701AD1A513C1A39906AFA69C9673F3FC82D852F08B5FE980C83D2622DA9151ADA4A983A16361E8FD5180C3609709902
IDENTIFIER(36) 46343641333739362D343342412D343341392D394341322D304645353343383746413732
---------- END SUB-TLVS! ----------

*** Verifying session with Controller ID: F46A3796-43BA-43A9-9CA2-0FE53C87FA72...

192.168.0.104 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 3

STATE(1) 04
------------ SENT! --------------

*** SESSION VERIFICATION COMPLETE ***

<<<< #### 192.168.0.104 #### <<<<
GET /accessories HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
------------ END BODY! ------------
In Get Accessories #0 (192.168.0.104)...

192.168.0.104 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/hap+json
Content-Length: 250

{"accessories":[{"aid":1,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]}]},{"iid":3,"type":"43","characteristics":[{"iid":4,"type":"25","value":0,"format":"bool","perms":["pr","pw","ev"]}]}]}]}
-------- SENT ENCRYPTED! --------

=======================================
** Client #0 Connected: (2055 sec) 192.168.0.104 on Socket 4/16

<<<<<<<<< 192.168.0.104 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
Content-Length: 37
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
PUBKEY(32) 5302E4D61D39AB4A48A69A503D2820AF27F8572AA43782E71169B0BF70A91377
STATE(1) 01
------------ END TLVS! ------------
In Pair Verify #0 (192.168.0.104)...Found
------- ENCRYPTING SUB-TLVS -------
SIGNATURE(64) 3F76107BCCA401E0A86EB01EB8B4BE4F4E64C3AD171F4071300BE5EA31261DECAA00392776D699E5691A08EF7386C4430308B46507AF09A21DA56A99BC18A204
IDENTIFIER(17) 35373A34393A35353A38453A33443A4244
---------- END SUB-TLVS! ----------

192.168.0.104 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 140

PUBKEY(32) EBD5E27E03E1312B065C896F4FA76A6EF6A2661D3B10281610DFDC78CD5D9767
STATE(1) 02
ENC.DATA(101) CF1B41DBA3B31E1B084001B14F52C7860A449EA9D7A925BAE3B6CAE0E5CFBBE8035051FBFC750207122FD2DEEB5DFAB3BD985CB63EC27C07EFA83B7AC6B922AD4DE0C60A1ADAAB87AE75760B34BCEFF5471FE1782696F4329CDBF4965BFF505D2D5486A80E
------------ SENT! --------------

<<<<<<<<< 192.168.0.104 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
Content-Length: 125
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
STATE(1) 03
ENC.DATA(120) F726C841637B525FFF60FB092DF3DEFD1A4573D57FCE54DD5E0AB28105391A0359B09128014C05E4761EAC65575457354D098EF7D620B95B0297BD4344C0DF4EA0B568A59C3D8BC989C1377E58060BF9CB6EABA6A6BD32C2DF0C54822E33CFEA26FA99711B31586E291CE3F9BA13C7E51C0826930E2ADD97
------------ END TLVS! ------------
In Pair Verify #0 (192.168.0.104)...Found
------- DECRYPTING SUB-TLVS -------
SIGNATURE(64) C239A473C2B6FE976C38AEF98CC6884E229FBA83936052CCBECE40C28718320D10F548A5B73D879BA377F97DDA08D1BB5BC4DFC706B0D85CFC4F490C04690304
IDENTIFIER(36) 46343641333739362D343342412D343341392D394341322D304645353343383746413732
---------- END SUB-TLVS! ----------

*** Verifying session with Controller ID: F46A3796-43BA-43A9-9CA2-0FE53C87FA72...

192.168.0.104 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 3

STATE(1) 04
------------ SENT! --------------

*** SESSION VERIFICATION COMPLETE ***

<<<< #### 192.168.0.104 #### <<<<
GET /accessories HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
------------ END BODY! ------------
In Get Accessories #0 (192.168.0.104)...

192.168.0.104 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/hap+json
Content-Length: 250

{"accessories":[{"aid":1,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]}]},{"iid":3,"type":"43","characteristics":[{"iid":4,"type":"25","value":0,"format":"bool","perms":["pr","pw","ev"]}]}]}]}
-------- SENT ENCRYPTED! --------

=======================================
** Client #1 Connected: (2058 sec) 192.168.0.100 on Socket 3/16

<<<<<<<<< 192.168.0.100 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
Content-Length: 37
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
PUBKEY(32) B636965FB9B74ABF09D7DE66F3A9F304CE746A7F7D23D11C33FFACCE19085933
STATE(1) 01
------------ END TLVS! ------------
In Pair Verify #1 (192.168.0.100)...Found
------- ENCRYPTING SUB-TLVS -------
SIGNATURE(64) 9F5B6F966BA9E772915D8FE20C83814342030C91D999DB3993F309E81F3094B5C6FF2F860E6BC878B6F8EF4CBB4091BF8BB372B89A28FDAD906215517E814B0D
IDENTIFIER(17) 35373A34393A35353A38453A33443A4244
---------- END SUB-TLVS! ----------

192.168.0.100 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 140

PUBKEY(32) F00556F77492B029A722FC889BF5AA6F5CADFAF7CB6FBE564EC509FEADC20B0D
STATE(1) 02
ENC.DATA(101) D5ADEDAA63B76CDD246E6720E235C78F90B2313DC1B57F5D8C981017318FF110B22625BD76B2A5F979D0660BC965E06925A42FF8FE564245E7A79599B09BE74DECD815EC7CE908F155F35D3ADD00B4029F6679ADE2CC154FBC4064B7B343172CEF632E7D85
------------ SENT! --------------

<<<<<<<<< 192.168.0.100 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
Content-Length: 125
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
STATE(1) 03
ENC.DATA(120) F2351DE7789093808A3888E49FF2A0E6139AB64A471BA6B735C43F5539AA82DB6AE45CB6DA2FB7FDFD07C10231CB56059A1D26477A69CED935122BEC3A02EA91EB5DF30DEDF25DFFB3466B5014B5841274FF1F9B4A78D162176F4D09EBB794E6D8430A6CB048AC4A1A361138499C974059C22EA9D7E889E3
------------ END TLVS! ------------
In Pair Verify #1 (192.168.0.100)...Found
------- DECRYPTING SUB-TLVS -------
SIGNATURE(64) 77963D890081E8F6F0EF4909D6A6194745E8E81098294A8E8FDD6A3E09077E01B30028F3B5156F4AC9856A1CC011C4052B79D07A272DB258EE42E192D2B8870D
IDENTIFIER(36) 46343641333739362D343342412D343341392D394341322D304645353343383746413732
---------- END SUB-TLVS! ----------

*** Verifying session with Controller ID: F46A3796-43BA-43A9-9CA2-0FE53C87FA72...

192.168.0.100 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 3

STATE(1) 04
------------ SENT! --------------

*** SESSION VERIFICATION COMPLETE ***

<<<< #### 192.168.0.100 #### <<<<
GET /accessories HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
------------ END BODY! ------------
In Get Accessories #1 (192.168.0.100)...

192.168.0.100 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/hap+json
Content-Length: 250

{"accessories":[{"aid":1,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]}]},{"iid":3,"type":"43","characteristics":[{"iid":4,"type":"25","value":0,"format":"bool","perms":["pr","pw","ev"]}]}]}]}
-------- SENT ENCRYPTED! --------

=======================================
** Client #0 Connected: (2097 sec) 192.168.0.100 on Socket 2/16

<<<<<<<<< 192.168.0.100 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
Content-Length: 37
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
PUBKEY(32) DBDD1025099044EC2E4406521602A077803D364170886EBCB75C27F459D7AF32
STATE(1) 01
------------ END TLVS! ------------
In Pair Verify #0 (192.168.0.100)...Found
------- ENCRYPTING SUB-TLVS -------
SIGNATURE(64) C0CF0D713C6B234A35EE74B308C1154C5B1870F83EF3FB1F1585E084F04DE85CA4E28B113862E8506137BF1C816B4D138C0921A21833FBB9246DA7731D821F03
IDENTIFIER(17) 35373A34393A35353A38453A33443A4244
---------- END SUB-TLVS! ----------

192.168.0.100 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 140

PUBKEY(32) C3FC485112CC3393F4E57C75A5121658ECDC1BEA644FC968D6240483056E0A5C
STATE(1) 02
ENC.DATA(101) 7A4027378B9E784211ED1E44641D42DEAC684C73168A8E82EBC855F1ED9E0D976B611782DC118C65FAA0888F53F871D8C2D9F3D0EB4859A7D6511CADDD658900FDB539317DF1421DE7B04D2F78BA0010B06D7D11C4886A2E343EAB3645300F24DA8E950EE6
------------ SENT! --------------

<<<<<<<<< 192.168.0.100 <<<<<<<<<
POST /pair-verify HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
Content-Length: 125
Content-Type: application/pairing+tlv8
------------ END BODY! ------------
STATE(1) 03
ENC.DATA(120) DA9902B2CCC9601F2045ADF02348787A80A3BC525BEA9B32C6EF0D6EF47666521C65C698CC50067B07FAAADC039967B1949F9F8206820811E2BA375FED6244F2C227FDD1EBF50D66BE63F3BE91675CB180F1E8E3E44FBA030770A5FE99AC2D9A52A6CDDA9391E45ACC2B7665F2C72416114900CBA321795D
------------ END TLVS! ------------
In Pair Verify #0 (192.168.0.100)...Found
------- DECRYPTING SUB-TLVS -------
SIGNATURE(64) EAF34BBA808A73242DD683F4F2DCD6A40E93943C1CEA8010CB838DAA0FC3BA85A0A4B54C124D17FFA7D81B6F4B562484AECA5F405D7AF21DBF65DB7773336604
IDENTIFIER(36) 46343641333739362D343342412D343341392D394341322D304645353343383746413732
---------- END SUB-TLVS! ----------

*** Verifying session with Controller ID: F46A3796-43BA-43A9-9CA2-0FE53C87FA72...

192.168.0.100 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/pairing+tlv8
Content-Length: 3

STATE(1) 04
------------ SENT! --------------

*** SESSION VERIFICATION COMPLETE ***

<<<< #### 192.168.0.100 #### <<<<
GET /accessories HTTP/1.1
Host: HomeSpan\032LightBulb._hap._tcp.local
------------ END BODY! ------------
In Get Accessories #0 (192.168.0.100)...

192.168.0.100 >>>>>>>>>>
HTTP/1.1 200 OK
Content-Type: application/hap+json
Content-Length: 250

{"accessories":[{"aid":1,"services":[{"iid":1,"type":"3E","characteristics":[{"iid":2,"type":"14","format":"bool","perms":["pw"]}]},{"iid":3,"type":"43","characteristics":[{"iid":4,"type":"25","value":0,"format":"bool","perms":["pr","pw","ev"]}]}]}]}
-------- SENT ENCRYPTED! --------

@alessiomiele
Copy link
Author

The lack of memory you indicate above is very likely the reason why the WiFi is failing. However, we need to figure out why the memory is eroding. This can either be because of something in your sketch that using memory but not freeing it up, or an edge-case in HomeSpan itself. To determine the cause, you must use example 1 since that is a known sketch.

Before pairing example 1, make sure to unpair any existing sketches you used from the Home App and then erase all the stored info on the ESP32 using the 'E' command before you re-pair the sketch with Example 1.

Once you verify Example 1 is working as expected, note the memory usage using the 'm' command and set log level to L2. Then monitor the serial log output for the full day until the device fails. You should seed almost nothing in the log file with the exception of some potential periodic calls from HomeKit to the the device to check version numbers (sometimes the Home app does this, other times it does not). What we need to look for are unexpected requests. Assuming you are running a Home Hub (which is required for HomeKit and HomeSpan), you should not see a lot of new Verifications as you see above. Once a connection is completed and verified, it should stay open indefinitely. However, it could be that your router is dropping the connection (or the MDNS records, which is VERY common for some routers), which causes HomeKit to re-connect to the ESP32 with a fresh verification process. I have found the re-verification process has a small memory leak of a few hundred bytes, which eventually is reclaimed. If the Home App tries to re-verify a connection every 20 minutes, you may see a slow drain on the memory, but after a full day it will only be something like 10K. However, if the Home App is re-verifying connections every minute or less, this might explain the issue.

That's what we need to find out to identify the problem.

I'm running the test right now but I can see I have lots of verification requests. Is this related to me not using a Home Hub? A Home Hub is only required to communicate with HomeKit from a connection that is external to the local one, is it correct? Or am I missing something?

By the way, the board is running example 1 and I will post the full log when wifi fails. I added just a few lines of code just to print the memory usage every 60 seconds in order to not type 'm' every time, so you will see how the memory usage increases over time. But I think we found out it is related to verification process.

@VictorrFang
Copy link

I also conducted a similar test. I created two homes, each with a Homespan device. One of them was connected to a HomePod mini. After one night (7-9 hours), the Homespan device connected to the HomePod mini operated normally. However, the Homespan device in the home without the connected HomePod mini showed a Wi-Fi connection in progress LED indicator, but it remained unable to connect. Manual reconnection was required for it to operate normally.

@alessiomiele
Copy link
Author

log_L2.log

Here's my full log. After less than 4 hours the wifi failed. The log shows many verification requiests, so I think we found the issue. Is it related to the fact that I am not using a Home Hub?

@HomeSpan
Copy link
Owner

Thanks @VictorrFang, I think this confirms the issue. The problem stems from using HomeSpan without a Home Hub. Though Apple says it is possible to use a HomeKit device without a Home Hub (though with much more limited functionality), it seems Apple changed something in the Home App that causes near-continuous requests for a connection. This should never occur as HomeKit connections are supposed to stay open indefinitely. It used to be that if you did not use a Home Hub the Home App would disconnect after it was closed and only reconnect when opened again. But if you are seeing continued requests for a connection, Apple likely changed something.

It is also possible that the problem lies in the particular router being used, but your test strongly suggests that it is the Home App making the requests not the router.

I'll update the main HomeSpan page to indicate that a Home Hub is now required for any use of HomeSpan.

As for the memory leak itself, Ive been able to debug the issue using my own kit and it seems like the Home App is not properly disconnecting from the ESP32 interface, which leave the socket open. I've looked at the ESP32 code in detail, and though it recognizes the connection has been terminated, it does not seem to reclaim the memory used for the socket. I can check for this and ensure the socket is completely closed whenever HomeSpan detects a connection has abruptly terminated, which I think should solve the memory leakage (though I need to test this to confirm).

But regardless, even without the memory problem caused by abrupt disconnects, HomeKit is not supposed to operate with continued requests for a connection, so as noted above, I think this means a Home Hub needs to be used.

@alessiomiele
Copy link
Author

Thanks @HomeSpan. If you can solve the memory leakage let us know. I think this should allow to use HomeSpan even without a Home Hub even if the Home app keeps sending continuous requests.

@HomeSpan
Copy link
Owner

Sure thing - If I can resolve I'll add to next release and preview in the dev branch for you to test.

@Nezaemmy
Copy link

Thanks @VictorrFang, I think this confirms the issue. The problem stems from using HomeSpan without a Home Hub. Though Apple says it is possible to use a HomeKit device without a Home Hub (though with much more limited functionality), it seems Apple changed something in the Home App that causes near-continuous requests for a connection. This should never occur as HomeKit connections are supposed to stay open indefinitely. It used to be that if you did not use a Home Hub the Home App would disconnect after it was closed and only reconnect when opened again. But if you are seeing continued requests for a connection, Apple likely changed something.

It is also possible that the problem lies in the particular router being used, but your test strongly suggests that it is the Home App making the requests not the router.

I'll update the main HomeSpan page to indicate that a Home Hub is now required for any use of HomeSpan.

As for the memory leak itself, Ive been able to debug the issue using my own kit and it seems like the Home App is not properly disconnecting from the ESP32 interface, which leave the socket open. I've looked at the ESP32 code in detail, and though it recognizes the connection has been terminated, it does not seem to reclaim the memory used for the socket. I can check for this and ensure the socket is completely closed whenever HomeSpan detects a connection has abruptly terminated, which I think should solve the memory leakage (though I need to test this to confirm).

But regardless, even without the memory problem caused by abrupt disconnects, HomeKit is not supposed to operate with continued requests for a connection, so as noted above, I think this means a Home Hub needs to be used.

For me this issue started when I changed my old iPhone to the new iPhone running IOS 17 because it support new Homekit, my old iPhone was using IOS 15 with old Homekit I didn't get any issue. It seems like this change came with new Homekit on IOS 16.

@HomeSpan
Copy link
Owner

Are you using a Home Hub?

@Nezaemmy
Copy link

Nezaemmy commented May 11, 2024

Are you using a Home Hub?

yes now I'm using home hub everything is ok, but without home hub even reflesh of accessories take time, you need to close the home app and open it again. old homekit with ios 15 was running good without home hub

@HomeSpan
Copy link
Owner

Yes, Apple completely changed the way communication works around iOS 16.3 or so. But they changed it for usage with a Home Hub as well as without a Home Hub. So if you did not use a Home Hub you lost a lot of functionality as well as needed to often re-open the Home App to get an update.

In addition, there is a memory leak issue, which I've identified and will be fixing for the next release. This leak only impacts the case where you don't use a Home Hub under the new HomeKit architecture (this was not an issue prior to iOS 16).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants