Reconnect Problem. #1047

Open
tmsd2001 opened this issue Mar 21, 2024 · 11 comments

Comments

@tmsd2001

I've been seeing this problem for a long time, and I'm surprised it hasn't been reported yet, but maybe I'm misunderstanding something.
If the MQTT connection is lost, the reconnect routine loops until the connection is back. But if the connection dropped because the WLAN connection is gone, you can never get out of that loop.
My workaround: inside the reconnect routine I check whether the WiFi connection is established, and if it is not, I jump to the WiFi reconnect first.

@jescarri

I've seen it happen too. I assumed it was a check the sketch has to do itself, in the loop() function:

  1. Check whether WiFi is active, then carry on with network operations

if wireless is connected then
    connect_to_mqtt with X retries
    if mqtt_connected then
        do all sketch operations that require network and MQTT
    end if
else
    deep_sleep for some time
end if
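
The check described above could be sketched roughly like this. It is plain C++ rather than a full Arduino sketch, so it compiles anywhere. `NetworkHooks`, `LinkState` and `ensureMqtt` are hypothetical names, and the callbacks stand in for whatever your sketch actually calls (for example `WiFi.status() == WL_CONNECTED` and `client.connect(...)`). The only point is that the retry loop is bounded and bails out as soon as WiFi is down.

```cpp
#include <cassert>
#include <functional>

// Hypothetical hooks standing in for the real network calls in a sketch.
struct NetworkHooks {
    std::function<bool()> wifiConnected;    // e.g. WiFi.status() == WL_CONNECTED
    std::function<bool()> mqttConnect;      // e.g. client.connect(clientId)
    std::function<void()> waitBeforeRetry;  // e.g. a delay between attempts
};

enum class LinkState { WifiDown, MqttDown, Ready };

// Try to bring MQTT up, at most maxRetries times.
// Returns right away when WiFi is down so the caller can deal with that first.
LinkState ensureMqtt(const NetworkHooks& net, int maxRetries) {
    assert(maxRetries >= 0);
    for (int attempt = 0; attempt < maxRetries; ++attempt) {
        if (!net.wifiConnected()) {
            return LinkState::WifiDown;  // no point retrying MQTT without WiFi
        }
        if (net.mqttConnect()) {
            return LinkState::Ready;
        }
        net.waitBeforeRetry();
    }
    // Retries exhausted: report which layer is still down.
    return net.wifiConnected() ? LinkState::MqttDown : LinkState::WifiDown;
}
```

In loop() the caller would then branch on the result: reconnect WiFi (or deep sleep) on `WifiDown`, try again later on `MqttDown`, and do the normal work on `Ready`.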

@PhySix66

PhySix66 commented Apr 4, 2024

I have a similar issue.
When I change my MQTT settings via HTTP and re-enable/restart the service, my ESP8266 reboots.

I've traced the source of the crash to this:

boolean PubSubClient::connected() 
{
	...
	else {	
		rc = (int)_client->connected(); // ERROR here when reconnecting -> (?) Client::connected()
		// this is the line where it crashes
		
		// Fix test 1
		// rc = (int)_client[0].connected(); // Tried this as well, no good
		// Fix test 2							// No good either, fails
		/*
		if((int)_client->connected() > 0){
			rc = true;
			//Serial.println(F("_client[0].connected() == true"));
		}else{
			rc = false;
			//Serial.println(F("_client[0].connected() == false"));
		}*/
	...
	}
	return rc;
}

Specifically:
rc = (int)_client->connected();

If my deductions are correct, then this points to this library:

In Client.h

class Client: public Stream {
	public:
	...
	virtual uint8_t connected() = 0;	//	No idea where this points to
	// this is the end of my rabbit hole
	...

};

But I was unable to get any further than this.

I thought I might have used up the available TCP/UDP connections, so I disabled DDNS and my RUDP connection, but nothing changed.

@abdosn

abdosn commented Apr 16, 2024

@PhySix66
In my sketch I use this:

WiFiClient G_espClient; 
PubSubClient G_PubSubClient(G_espClient);

When you create the PubSubClient object, you pass it the WiFiClient object.

So

virtual uint8_t connected() = 0; // No idea where this points to

is not the function that gets called.
The WiFiClient class has a function uint8_t WiFiClient::connected() that overrides the one in the Client class,

which is

uint8_t WiFiClient::connected()
{
    if (!_client || _client->state() == CLOSED)
        return 0;
    return _client->state() == ESTABLISHED || available();
}
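
To illustrate the point about the override, here is a small self-contained sketch of the same pattern. `ClientLike`, `FakeWiFiClient` and `MiniPubSub` are hypothetical, simplified stand-ins and not the real library classes. They only show that a call made through a base-class pointer ends up in the derived class's `connected()`.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for Arduino's Client: connected() is pure virtual,
// so the base class has no body of its own to "point to".
class ClientLike {
public:
    virtual ~ClientLike() = default;
    virtual uint8_t connected() = 0;
};

// Stand-in for WiFiClient: supplies the override that actually runs.
class FakeWiFiClient : public ClientLike {
public:
    explicit FakeWiFiClient(bool established) : established_(established) {}
    uint8_t connected() override { return established_ ? 1 : 0; }
private:
    bool established_;
};

// Stand-in for PubSubClient: it only stores a base-class pointer.
class MiniPubSub {
public:
    explicit MiniPubSub(ClientLike& client) : client_(&client) {}
    bool connected() { return client_->connected() != 0; }
private:
    ClientLike* client_;
};

// Usage: the call goes through ClientLike* but lands in FakeWiFiClient.
inline void dispatchDemo() {
    FakeWiFiClient link(true);
    MiniPubSub mqtt(link);
    assert(mqtt.connected());
}
```

This also shows why a crash on that line says more about the object the pointer refers to than about the base class declaration.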

@PhySix66

PhySix66 commented Apr 16, 2024 via email

@abdosn

abdosn commented Apr 17, 2024

I think something happens to _client inside the PubSubClient object when you restart MQTT (maybe it gets destroyed or becomes null).
Try checking that variable.

If you can, share the code where you restart MQTT.

I suggest using a stack dump tool to find out which line your code stopped on before the reset.

@PhySix66

PhySix66 commented Apr 18, 2024 via email

@abdosn

abdosn commented Apr 18, 2024

It doesn't have to be null to raise an exception.
It could be memory that was malloc'ed and then freed, or something similar.

Also, according to the Kolban book (a very good reference if you're working with the ESP), Exception 9 is LoadStoreAlignmentCause, so I guess it's a memory issue.

The problem is _client, as you commented.
I think the pointer to the WiFiClient object gets changed or freed when you restart the service.

So give this a try:

create a new WiFiClient object and pass it to PubSubClient::setClient() in your reconnect function.
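
A rough sketch of that suggestion, again with hypothetical stand-in types (`Transport`, `FakeTransport`, `MqttLike`) instead of the real WiFiClient and PubSubClient. The assumption being illustrated is that the wrapper only keeps a pointer, so whatever it points to has to stay alive for as long as the wrapper uses it.

```cpp
#include <cassert>

// Stand-in for the network client interface.
struct Transport {
    virtual ~Transport() = default;
    virtual bool connected() = 0;
};

struct FakeTransport : Transport {
    bool up = false;
    bool connected() override { return up; }
};

// Stand-in for the MQTT client: keeps a pointer, never a copy.
class MqttLike {
public:
    explicit MqttLike(Transport& t) : transport_(&t) {}
    MqttLike& setClient(Transport& t) { transport_ = &t; return *this; }
    bool connected() { return transport_->connected(); }
private:
    Transport* transport_;
};

// Globals live for the whole program, so this pointer cannot dangle.
FakeTransport g_transport;
MqttLike g_mqtt(g_transport);

// Re-point the wrapper at a fresh transport; the caller must keep
// `fresh` alive for as long as g_mqtt refers to it.
bool restartWithFreshTransport(FakeTransport& fresh) {
    g_mqtt.setClient(fresh);
    return g_mqtt.connected();
}

// Usage: swap in a second long-lived transport, then swap back.
inline void restartDemo() {
    static FakeTransport spare;
    spare.up = true;
    assert(restartWithFreshTransport(spare));
    g_mqtt.setClient(g_transport);
}
```

If the new client object were a local variable in the reconnect function, the wrapper would be left pointing at a destroyed object once the function returns, which is the kind of failure being suspected here.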

@PhySix66

PhySix66 commented Apr 22, 2024 via email

@PhySix66

PhySix66 commented Apr 26, 2024 via email

@PhySix66

PhySix66 commented Apr 27, 2024 via email

@PhySix66

Problem solved!
Problem source: USER (a.k.a. ME)

I made a bad, untested/unverified assumption based on the Arduino examples, namely:
example/home-assistaint-integration/multi-switch

Using this as the basis:

void setup() {
    // you don't need to verify return status
    Ethernet.begin(mac);

    switch1.setName("Pretty label 1");
    switch1.setIcon("mdi:lightbulb");
    switch1.onCommand(onSwitchCommand);

    switch2.setName("Pretty label 2");
    switch2.setIcon("mdi:lightbulb");
    switch2.onCommand(onSwitchCommand);    

    mqtt.begin(BROKER_ADDR);
}

I made these functions:

void InitMQTT_Device()
{
          device.setUniqueId(esp_mac.b, sizeof(esp_mac));
          Serial.print(F("MQTT UniqueID :")); Serial.println(device.getUniqueId());
          
          // set device's details (optional)
          //device.setName(ESP_HostName.c_str());
          device.setName(ESP_HostName);
          Serial.print(F("ESP_HostName :")); Serial.println(ESP_HostName);
          device.setManufacturer("PhySix66");
          device.setModel(esp_model_name);
          device.setSoftwareVersion("0.0.1");
          
          // This method enables availability for all device types registered on the device.
          // For example, if you have 5 sensors on the same device, you can enable
          // shared availability and change availability state of all sensors using
          // single method call "device.setAvailability(false|true)"
          //device.enableSharedAvailability();
      
          // Optionally, you can enable MQTT LWT feature. If device will lose connection
          // to the broker, all device types related to it will be marked as offline in
          // the Home Assistant Panel.
          //device.enableLastWill();
}

void InitMQTT_Switches()
{        
          // handle switch state (multi-switch) - OutPut via MCP23x17
          switch0.setName("Switch0");
          //switch0.setIcon("mdi:lightbulb");
          switch0.setIcon("mdi:toggle-switch");
          //switch0.setRetain(true);              //  Sets retain flag for the switch command. If set to true the command produced by Home Assistant will be retained.
          switch0.onCommand(onSwitchCommand);
          
          switch1.setName("Switch1");
          switch1.setIcon("mdi:toggle-switch");
          switch1.onCommand(onSwitchCommand);    
          //....
          switch7.setName("Switch7");
          switch7.setIcon("mdi:toggle-switch");
          switch7.onCommand(onSwitchCommand);
}

void startMQTT()
{
     if(mqtt_flags & MQTT_FLAG_EN)
     {      
       if(!(AreBitSet(mqtt_flags, (MQTT_FLAG_SERVER_AV | MQTT_FLAG_INIT))))
       {
         Serial.println(F("mqtt.h: startMQTT() Not Inited()"));
         
         IPAddress tempIP(0,0,0,0);
         if(!WiFi.hostByName(BROKER_ADDR, tempIP, 800)) { // Resolve the MQTT broker's IP address via DNS
          Serial.print(F("MQTT Req DNS lookup failed for "));   Serial.println(BROKER_ADDR);
          mqtt_flags &= ~MQTT_FLAG_SERVER_AV;
         }
         else
         {
            Serial.print(F("MQTT Req DNS lookup Success for "));   Serial.println(BROKER_ADDR);
            Serial.print(F("IP Addr is "));                        Serial.println(tempIP);
            mqtt_flags |= MQTT_FLAG_SERVER_AV;
         }
  
        if(mqtt_flags & MQTT_FLAG_SERVER_AV)
        {        
		  InitMQTT_Device();	// <<-	One of these caused the ERROR.
		  InitMQTT_Switches();	// <<-	One of these caused the ERROR.
		  
		  if(mqtt_flags & MQTT_FLAG_USE_CREDENTIALS)
          {
            // use this for mqtt-with-credentials
            mqtt.begin(MQTT_ServerName, mqtt_port, mqtt_server_user_name, mqtt_server_password);
          }
          else
          {
            if(mqtt.begin(MQTT_ServerName) == true)
            {
              Serial.println(F("mqtt.begin == true"));
            }
            else
            {
              Serial.println(F("mqtt.begin == false"));
            }
          }
     
          mqtt_flags |= MQTT_FLAG_INIT;
          Serial.print(F("mqtt_flags: "));    Serial.println(mqtt_flags);
          Serial.println(F("MQTT is Inited"));
        }
      }
    }
    else if(mqtt.isConnected())
    {
      mqtt_flags &= ~(MQTT_FLAG_SERVER_AV | MQTT_FLAG_INIT);
      mqtt.disconnect();
      Serial.println(F("MQTT DisConnected"));
    }
}

And every time I re-enabled MQTT via startMQTT(), I also "reinitialized" its device and switches... or so I thought.
I'm not sure about the details, but I assume that during the second call of InitMQTT_Device() and InitMQTT_Switches() some memory allocation/corruption occurs that messes up the Client* _client pointer in PubSubClient.cpp.
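
One possible way to avoid the repeated calls is a one-time guard, sketched below. The thread does not show the final fixed code, so this is only an assumption about the shape of a fix. `MqttService` is a hypothetical wrapper, and its counters stand in for InitMQTT_Device(), InitMQTT_Switches() and mqtt.begin(...). Whether the real mqtt.begin() may safely be called more than once is not covered here.

```cpp
#include <cassert>

// Hypothetical wrapper: counters replace the real library calls.
struct MqttService {
    bool entitiesConfigured = false;
    bool running = false;
    int configureCalls = 0;  // stands in for InitMQTT_Device() + InitMQTT_Switches()
    int beginCalls = 0;      // stands in for mqtt.begin(...)

    void start() {
        if (!entitiesConfigured) {
            // Device and switch setup happens only on the first start.
            ++configureCalls;
            entitiesConfigured = true;
        }
        ++beginCalls;
        running = true;
    }

    void stop() { running = false; }  // stands in for mqtt.disconnect()
};

// Usage: restarting the service does not reconfigure the entities.
inline void restartGuardDemo() {
    MqttService svc;
    svc.start();
    svc.stop();
    svc.start();
    assert(svc.configureCalls == 1);
}
```

In startMQTT() this would amount to keeping a separate flag that is never cleared when MQTT is disabled, unlike MQTT_FLAG_INIT, which is reset on disconnect.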

Note to noobs like me:
To find the problem, I reinstalled the libraries I had poked around in (where I had added some extra code), home-assistant-integration and pubsubclient, and went back to square one. After reflashing with basically the same code as in the example, it still crashed. This got me thinking and led me to the problem.
