hub_connection::stop doesn't work reliably at end of process-life #162

ChristophAlbert · 2017-02-07T10:26:45Z

I have an application which, at the end of it's lifetime, calls hub_connection::stop(), waits for the returned task to finish, and then exits the process.

It turned out that this doesn't work reliably - sometimes, the process exits without having called abort on the server. For an unsecured SignalR connection, abort is called most of the time (but not always); for secure SignalR connections, abort is usually not called (but sometimes).

I looked a little into the issue and I think it is because connection_impl::shutdown starts two tasks: request_sender::abort and m_transport::disconnect. Only the latter task is returned from the method and ultimately waited upon, while request_sender::abort is fire-and-forget.

This seems to explain why the issue is more prevalent with TLS connections: In order to call abort, a new TLS handshake is performed, which isn't finished within the process' lifetime.

As a fix, I'd suggest wrapping both tasks in a task_group and returning that. In this way, the client can still fire-and-forget-stop the connection, but also wait for it synchronously and reliably:

        auto abort_task = request_sender::abort(*m_web_request_factory, m_base_url, m_transport->get_transport_type(), m_connection_token,
            m_connection_data, m_query_string, m_signalr_client_config)
            .then([](pplx::task<utility::string_t> abort_task)
            {
                try
                {
                    abort_task.get();
                }
                catch (...)
                {
                    // We don't care about the result and even if the request failed there is not much we can do. We do
                    // need to observe the exception though to prevent from a crash due to unobserved exception exception.
                }
            });

        auto disconnect_task = m_transport->disconnect();

        return pplx::task<void>([abort_task, disconnect_task]()
        {
          pplx::task_group task_group;
          task_group.run([abort_task](){abort_task.wait(); });
          task_group.run([disconnect_task](){disconnect_task.wait(); });
          task_group.wait();
        });

The text was updated successfully, but these errors were encountered:

moozzyk · 2017-02-07T17:30:46Z

Interesting. Seems like stop() should wait for abort to finish/fail (and ignore the result) but dtor should be fire, forget and hope it gets through. The server will timeout inactive connections but it would be best not to rely on this behavior if not necessary - it's mostly a fallback for clients that suddenly disappear.

ChristophAlbert · 2017-02-09T08:39:44Z

Yes, the server will notice the connection is gone when the heartbeat is missing, but that takes about 50s (iirc), which is slightly inconvenient for automated tests. The behavior should be as you said.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

hub_connection::stop doesn't work reliably at end of process-life #162

hub_connection::stop doesn't work reliably at end of process-life #162

ChristophAlbert commented Feb 7, 2017

moozzyk commented Feb 7, 2017

ChristophAlbert commented Feb 9, 2017

hub_connection::stop doesn't work reliably at end of process-life #162

hub_connection::stop doesn't work reliably at end of process-life #162

Comments

ChristophAlbert commented Feb 7, 2017

moozzyk commented Feb 7, 2017

ChristophAlbert commented Feb 9, 2017