amqp_ssl_socket_close intermittent crash #803

Open
korymiller1489 opened this issue Dec 4, 2023 · 2 comments

@korymiller1489

We are using the 0.10 build compiled on SCO Unix OS5 with gcc 4.2.4.

We are seeing an intermittent crash in our build during the amqp_destroy_connection -> amqp_ssl_socket_delete -> amqp_ssl_socket_close call chain. Specifically, the program crashes in the SSL_shutdown call.

static int amqp_ssl_socket_close(void *base, amqp_socket_close_enum force) {
  struct amqp_ssl_socket_t *self = (struct amqp_ssl_socket_t *)base;

  if (-1 == self->sockfd) {
    return AMQP_STATUS_SOCKET_CLOSED;
  }

  if (AMQP_SC_NONE == force) {
    /* don't try too hard to shutdown the connection */
    SSL_shutdown(self->ssl);
  }

  SSL_free(self->ssl);
  self->ssl = NULL;

  if (amqp_os_socket_close(self->sockfd)) {
    return AMQP_STATUS_SOCKET_ERROR;
  }
  self->sockfd = -1;

  return AMQP_STATUS_OK;
}

Admittedly, our environment has network issues with latency and dropped or out-of-order packets, and the crashes are likely related to corrupted or malformed packets. When that happens we drop the connection (which is where we sometimes crash), reconnect, and continue running. We do not have a way around the network issues at this time.

Modifying the amqp_ssl_socket_delete call to use AMQP_SC_FORCE (which bypasses SSL_shutdown and goes straight to SSL_free) alleviates the problem, but does not seem like a good long-term solution.

I noticed that the library offers no real mechanism for calling an amqp_socket_close_force type of operation. Would it be agreeable to add this logic? I am happy to provide a PR to this end.
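
To make the request concrete, here is a minimal, self-contained sketch of the shape such an operation could take. It is only an illustration under stated assumptions: every `sketch_*` and `SKETCH_*` name is hypothetical and does not exist in rabbitmq-c, the struct stands in for the library's private socket type, and the TLS shutdown is replaced by a stub that merely counts calls, so no OpenSSL is involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the library's private types and status codes.
 * None of these names exist in rabbitmq-c. */
typedef enum { SKETCH_SC_NONE = 0, SKETCH_SC_FORCE = 1 } sketch_close_enum;

enum { SKETCH_OK = 0, SKETCH_SOCKET_CLOSED = -1 };

typedef struct {
  int sockfd;         /* -1 once the socket has been closed */
  void *ssl;          /* stands in for the SSL handle */
  int shutdown_calls; /* how many times the graceful path ran */
} sketch_socket_t;

/* Stub for the graceful TLS shutdown: it only records that it was called. */
void sketch_tls_shutdown(sketch_socket_t *s) { s->shutdown_calls++; }

/* Mirrors the control flow of the close routine quoted above:
 * the graceful shutdown runs only when the caller did not ask to force. */
int sketch_socket_close(sketch_socket_t *s, sketch_close_enum force) {
  assert(s != NULL);
  if (s->sockfd == -1) {
    return SKETCH_SOCKET_CLOSED;
  }
  if (force == SKETCH_SC_NONE) {
    sketch_tls_shutdown(s);
  }
  s->ssl = NULL; /* the real code would free the TLS handle here */
  s->sockfd = -1; /* the real code would close the descriptor here */
  return SKETCH_OK;
}

/* The proposed addition: an entry point that always takes the forced path,
 * so callers on an unreliable link can skip the graceful shutdown. */
int sketch_socket_close_force(sketch_socket_t *s) {
  return sketch_socket_close(s, SKETCH_SC_FORCE);
}
```

A caller that has already decided the connection is broken would call `sketch_socket_close_force` before tearing down the rest of the connection state. Closing an already-closed socket returns `SKETCH_SOCKET_CLOSED` on either path.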

@alanxz
Owner

alanxz commented Dec 4, 2023

Can you describe what sort of crash you're getting?

@korymiller1489
Author

korymiller1489 commented Dec 6, 2023

We're periodically getting segfaults in the SSL_shutdown function.

The Rabbit logs show a corresponding frame header error:

[error] closing AMQP connection {handshake_timeout,frame_header}

We don't crash every time a frame header error is seen, but every time the library crashes, we see a frame header error.

Basically, when we detect that the connection has a problem (connection closed, network issues, etc.) and we need to reconnect, we call amqp_destroy_connection to clean up resources before the next attempt. That call will sometimes segfault in the call chain mentioned above. The crash is always associated with the frame header error in the Rabbit log.
