Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic HTTP class: Thingspeak and Async OTA #1909

Open
wants to merge 13 commits into
base: dev
Choose a base branch
from

Conversation

mcspr
Copy link
Collaborator

@mcspr mcspr commented Sep 12, 2019

Reuse thingspeak http enhancements with ota module: handle lwip2 low memory, handle status errors, handle lifetime (WIP)

It is pretty basic, only "real" HTTP parsing is for status and body. Expected way to properly do that is by using llparse // llhttp instead of rolling our own http parsing: #1835 (comment)
But I want to try to integrate it with async web server first.

Thingspeak should work, OTA was not tested at all. Using draft PR as a means to track this.

@mcspr mcspr changed the title WIP Generic HTTP Generic HTTP class: Thingspeak and Async OTA Sep 18, 2019
@mcspr mcspr marked this pull request as ready for review September 18, 2019 14:57
@mcspr mcspr force-pushed the util/http branch 2 times, most recently from 0880125 to 1653786 Compare September 19, 2019 04:45
@mcspr
Copy link
Collaborator Author

mcspr commented Sep 19, 2019

3076bc9 addresses the continuous sending, but buffer is held by the api user.
https://www.nongnu.org/lwip/2_1_x/group__tcp__raw.html#ga6b2aa0efbf10e254930332b7c89cd8c5
notes that the pointer needs to live until ack is received, but I am a bit fuzzy how this works without it 🤔 , but because Arduino lwipopts.h enables LWIP_NETIF_TX_SINGLE_PBUF for both lwip 1 and 2, FLAG_COPY is always added to the apiflags argument: lwip-src/src/core/tcp_out.c:tcp_write(...)

Anyhow, maybe it is better to follow the ESPAsyncTCP buffered client and keep window-sized / mss-sized buffer until ack is received and only then call the client's callback for the next chunk. At least that way client does not need to allocate anything and directly write to such storage (maybe even as a Stream input)

edit: Current Core also has a more up-to-date code regarding buffering:
https://github.com/esp8266/Arduino/blob/master/libraries/ESP8266WiFi/src/include/DataSource.h
https://github.com/esp8266/Arduino/blob/5d609fd29475c44c7b752b6cbf4128527f4977d3/libraries/ESP8266WiFi/src/include/ClientContext.h#L478-L533
edit2: Another thing is API... Condition for client == nullptr is looking really wonky. Perhaps some other data structure can manage body callbacks. Main issue is chunked encoding, which is not hard to implement now, but in that case it needs to handle 0 as a success return value.

Tested with Thingspeak API on low-mem LWIP locally and connecting to "api.thingspeak.com"
ota <url> finally works with python -mhttp.server

@mcspr
Copy link
Collaborator Author

mcspr commented Sep 19, 2019

@mhightower83 sorry to "summon" you and bringing up other pr, ref me-no-dev/ESPAsyncTCP#115
As far as I understood, error-related changes there are only required for the SyncClient operation i.e. intermixing loop & sys contexts? It feels like I am silently avoiding this specific issue by not touching asynclient object at all and keeping it part of the parent object as a simple member (and not cleaning that promptly too)
But I don't think I've seen tcp_err callback crashing like me-no-dev/ESPAsyncTCP#94 describes

And also thanks for the comment there regarding LWIP_NETIF_TX_SINGLE_PBUF, finally that explains no need for data copy flag.

@mhightower83
Copy link

mhightower83 commented Sep 22, 2019

@mcspr Sorry, I missed your "summon" :) I have xoseperez/espurna in my watch list so your message got drowned out by the other topics. I am surprised there is no additional annotation for the @ reference.

me-no-dev/ESPAsyncTCP#94 was interesting. I think I was just lucky :) I was using the fauxmoESP (wemo version) with around 10 devices configured. The fewer the devices the better it worked. The other element was a Windows system with some additional software that wanted to do UPnP discovery, a lot! With the Windows system off, my sketch stayed up most of the day. With it on, a few hours at most. The crashes appeared to be related to TCP RSTs sent to idle TCP connections on the ESP8266. Back then fauxmoESP did not have client->setRxTimeout(), so there were more lingering connections.

With this fixed, the problem with NULL _client emerged in SyncClient type interfaces that may have done a yield, etc.

Another corner case was in fauxmoESP::_onTCPClient when all of the _tcpClients were used up and control passed down to client->close(true);. This could result in generating an ERR_ABORT which must be returned to lwIP by the callback that did the abort. However, there was no place in ESPAsyncTCP to preserve this result and pass it back. Because, in the client's onDisconnect callback handler, it would delete AsyncClient.

While my modified version of faumoESP diverged a bit from its roots, I don't remember anything specific I had to change for these issues. These issues were dealt with in the ESPAsyncTCP. It was some effort to preserve the API calling interface and make things work.

Correction I remembered one bug: You may want to change these lines in fauxmoESP:
https://bitbucket.org/xoseperez/fauxmoesp/src/f60c46d80f9b8a400319ab8af9af2957eeb276aa/src/fauxmoESP.cpp#lines-345,346
Reverse lines 345 and 346, i in the DEBUG_MSG_FAUXMO statement becomes invalid after delete c;

It feels like I am silently avoiding this specific issue by not touching asynclient object at all and keeping it part of the parent object as a simple member (and not cleaning that promptly too)

One thought, if you are not ready to clean the AsyncClient object from the parent object, don't delete AsyncClient in the onDisconnect callback. Keep it so you can reference the object's state. I would then think it would be safe. IMHO deleting AsyncClient in onDisconnect is tricky and makes my head hurt. :)

Edited: Added @

@mcspr
Copy link
Collaborator Author

mcspr commented Sep 23, 2019

Thank you! Telnet server uses the same technique btw, but I hope that causes less problems than more aggressive WeMo / Hue discovery. I would probably re-do it though, checking for connection state later in loop, sort of like wifiserver does. fauxmo stuff is more tricky, because I am both not an active user and not really sure which one emulation layer should've stayed at all (if not both at the same time).

Using updated ESPAsyncTCP bumps .bin size by ~6kb and is currently failing for ssl builds, however those broken, https://travis-ci.org/mcspr/espurna/jobs/587857201#L522 so this still bums me out :/ (why it wasn't esp/netconn underneath in the first place...)

@mhightower83
Copy link

mhightower83 commented Sep 23, 2019

@mcspr I am really sorry about the broken SSL builds. There are so many build permutations. I really thought I had done that one; however, I can tell by the remaining unused variable warnings that I had missed it. I have to get rid of those things or I cannot see my errors. I didn't have any examples for ESPAsyncTCP with SSL to build against; however, I thought I had at least set the flag and ran compile. Again I am really sorry.

Here is a drop-in replacement for the function that call is in.

err_t AsyncServer::_poll(tcp_pcb* pcb){
  err_t err = ERR_OK;
  if(!tcp_ssl_has_client() && _pending){
    struct pending_pcb * p = _pending;
    if(p->pcb == pcb){
      _pending = _pending->next;
    } else {
      while(p->next && p->next->pcb != pcb) p = p->next;
      if(!p->next) return 0;
      struct pending_pcb * b = p->next;
      p->next = b->next;
      p = b;
    }
    ASYNC_TCP_DEBUG("### remove from wait: %d\n", _clients_waiting);
    AsyncClient *c = new (std::nothrow) AsyncClient(pcb, _ssl_ctx);
    if(c){
      c->onConnect([this](void * arg, AsyncClient *c){
        (void)arg;
        _connect_cb(_connect_cb_arg, c);
      }, this);
      if(p->pb) {
        auto errorTracker = c->getACErrorTracker();
        c->_recv(errorTracker, pcb, p->pb, 0);
        err = errorTracker->getCallbackCloseError();
      }
    }
    // Should there be error handling for when "new AsynClient" fails??
    free(p);
  }
  return err;
}

With this replacement, it should at least compile. If you know of any examples that use ESPAsyncTCP with SSL, that you can point me to? I can do a little more.

@mcspr
Copy link
Collaborator Author

mcspr commented Sep 23, 2019

No need to be sorry :) While we track down that it builds, idk if I would really advice using it right now since it only supports axtls builds (and as the result tls1.1). I have tried to compile someone else's work regarding bearssl (ref: me-no-dev/ESPAsyncTCP#95 (comment)), which does include a small SyncClient example, but I wanted this PR implemented first to have at least something else to test with.

Not really small examples, but there are also 2 other users of ESPAsyncTCP & SSL that I know about:
https://github.com/esphome/esphome
https://github.com/homieiot/homie-esp8266/tree/develop-v3
Which in turn depend on async-mqtt-client and it does have a small example:
https://github.com/marvinroger/async-mqtt-client/blob/master/examples/FullyFeaturedSSL/src/main.cpp

nvm my comment about the size, I was looking at travis01 .bin wrong because it includes influxdb support there. real increase is +928 .irom0.text, +224 .rodata, compared on local build

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants