callbacks on natsConnection are called after natsConnection_destroy #735

thierryba · 2024-03-20T08:04:58Z

Observed behavior

callbacks, especially ClosedCB and DisconnectedCB are called after the call to natsConnection_Destroy

Expected behavior

Either those should not get called or there should be a way to silence those.

Server and client version

I used the nats.c 3.8.0 connecting to demo.nats.io (see the code).

Host environment

No response

Steps to reproduce

I wrote a little c++ example program. It has a class called MessagingController. I took advantage of the fact that we ca set a closure on the callbacks to pass a pointer to that object so that in the callbacks it can redirect the calls to it. That works really well except when I destruct my MessagingController instance.
In that case, some callbacks are called after the destructor of that object. I believe my program uses a very common pattern for encapsulating the nats connection functionality.
But there is no easy way to prevent callbacks on my MEssagingController instance after the destructor.
In my real code, I ended up putting all the "valid" MessagingController instance in a static set so that I can check for validity there.

I would be very grateful for any insight on this one.

Here is the code (c++17):

#include <nats/nats.h>
#include <memory>
#include <iostream>


template<class T> using UniquePtr = std::unique_ptr<T, void (*)(T*)>;

namespace {
void closedCB(natsConnection*, void* closure)
{
    std::cout << __func__ << ": " << closure << std::endl;
}

void connectedStatusChangedCB(natsConnection*, void* closure)
{
    std::cout << __func__ << ": " << closure << std::endl;
}
}


class MessagingController
{
public:
    MessagingController()
    {
        std::cout << __func__<< ": " << this << std::endl;

        //connect
        UniquePtr<natsOptions> opts{nullptr, nullptr};
        {
            natsOptions* opts_ = nullptr;
            if (natsStatus ns = natsOptions_Create(&opts_); ns == NATS_OK) {
                opts = {opts_, natsOptions_Destroy};
            } else {
                throw 1;
            }

            natsOptions_SetClosedCB(opts.get(), closedCB, this);
            natsOptions_SetDisconnectedCB(opts.get(), connectedStatusChangedCB, this);


            static const char *server = "demo.nats.io";
            natsOptions_SetServers(opts.get(), &server, 1);

            natsConnection *conn = nullptr;

            if (natsConnection_Connect(&conn, opts.get()); conn) {
                connection_ = {conn, natsConnection_Destroy};
            } else {
                throw 2;
            }
        }
    }

    ~MessagingController()
    {
        std::cout << __func__<< ": " << this << std::endl;
    }

private:
    UniquePtr<natsConnection> connection_{nullptr, nullptr};
};


int main()
{
    try {
        MessagingController controller;
    } catch(int err) {
        std::cerr << "ERROR: " << err << std::endl;
    }
    std::cout << "Here the controller is clearly already deleted (any callback after this will crash for sure)" << std::endl;
}

The text was updated successfully, but these errors were encountered:

thierryba · 2024-03-20T08:14:14Z

an example output for the execution of this program is:

MessagingController: 0x16ef0e4c0
~MessagingController: 0x16ef0e4c0
connectedStatusChangedCB: 0x16ef0e4c0
Here the controller is clearly already deleted (any callback after this will crash for sure)
closedCB: 0x16ef0e4c0

levb · 2024-03-20T17:22:28Z

@thierryba the connection event callbacks are indeed asynchronous, and may be called after natsConnection_Destroy has returned. I question the need to pass the MessagingController's this pointer as the closure context for the connection callbacks. Perhaps, either

(a) separate the necessary context into a different struct (e.g. MessageControllerConnectionData*, and clean it up in closedCB, including deleting it;
(b) use your preferred synchronization mechanism to wait in the destructor until closedCB is done;
or (c) if the cleanup is not needed until the MessageController destructor - place it there, explicitly in the destructor, and don't pass it to closeCB?

Note that if the intention is for MessageController to auto-delete in closedCB e.g. if the server is down, then it should not be allocated on the stack, and can then be deleted from closedCB. A close() method could be exposed to severe the connection (and auto-delete the controller, asynchronously) from the client side.

Please let me know if it helps. I am hesitant to consider controls to "silence them" until I further understand your use case.

thierryba · 2024-03-20T17:31:13Z

Hello @levb , I suppose option a or b could be implemented. My question is more about: how can this be ok that there is no builtin way to control such things. I think it is a very common use case to embed such c-style functionality inside a c++ class. But there is hardly a way to get that done easily. It is at the very least very error prone.

Having to add classes because the not-documented closedCB call is happening after I call destroy. Then I need to see if it was not already closed before and that opens up a can of worms.

I have an ugly workaround to this. But an elegant solution would be really needed.

levb · 2024-03-20T18:41:20Z

@thierryba, understood. I could easily add natsOptions_SetNoCallbacksFromDestroy to disable the callbacks from the destructor, and leaving you to add the cleanup code there explicitly.

I'd like to point out that natsConnection is ref-counted, and references may still exist after natsConnection_Destroy returns. You should generally avoid providing pointers to its callbacks that can be prematurely deleted by the client code. I'd need to do some more research to see if implementing something like natsConnection_Destroy(And-Wait-For-All-Callbacks?) would make sense and how exactly it would work.

thierryba · 2024-03-21T07:41:34Z

@levb thanks you for your reply and I will be waiting for your decision on that one.

levb · 2024-04-11T11:45:51Z

@thierryba I think that adding natsOptions_SetNoCallbacksFromDestroy could be quite confusing. I am still unclear on why do you need to pass the MessagingController pointer to the connection callbacks. Can MessagingController be reference counted so it is not destroyed until all callbacks have released it?

Adding a synchronous method to wait for all callbacks to have finished, in the NATS client itself, seems counter to its asynchronous design philosophy.

thierryba added the defect Suspected defect such as a bug or regression label Mar 20, 2024

levb self-assigned this Mar 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

callbacks on natsConnection are called after natsConnection_destroy #735

callbacks on natsConnection are called after natsConnection_destroy #735

thierryba commented Mar 20, 2024

thierryba commented Mar 20, 2024

levb commented Mar 20, 2024

thierryba commented Mar 20, 2024

levb commented Mar 20, 2024

thierryba commented Mar 21, 2024

levb commented Apr 11, 2024

callbacks on natsConnection are called after natsConnection_destroy #735

callbacks on natsConnection are called after natsConnection_destroy #735

Comments

thierryba commented Mar 20, 2024

Observed behavior

Expected behavior

Server and client version

Host environment

Steps to reproduce

thierryba commented Mar 20, 2024

levb commented Mar 20, 2024

thierryba commented Mar 20, 2024

levb commented Mar 20, 2024

thierryba commented Mar 21, 2024

levb commented Apr 11, 2024