[DTLS] Problem processing queued client connections with multi-threaded server #430

Closed
adriancdiego opened this issue Mar 22, 2024 · 13 comments

@adriancdiego

Hi,

I'm working with the DTLS examples, both single-threaded and multi-threaded, and I noticed different results with queued client connections.

Everything works when server_dtls.c handles multiple client connections from client-dtls-threaded.c. On the server side the client requests are processed sequentially, one by one, but all of them are processed. In contrast, when I use server-dtls-threaded.c with only 2 threads and client-dtls-threaded.c with 4 threads, the server processes the first two client connections simultaneously, but afterwards it processes only the third one, and the fourth client gets error -395 (Duplicate message error) from wolfSSL_connect.

I expected the multi-threaded server to process all the queued connections, but it processes only the first queued one when a thread becomes free after the first batch of connections. Is this behaviour correct, or might there be a problem?

Thank you very much in advance,
Adrian

@adriancdiego adriancdiego changed the title [DTLS] Problem processing queued client connections with multi-thredaded server [DTLS] Problem processing queued client connections with multi-threaded server Mar 22, 2024
@anhu anhu assigned anhu and rizlik and unassigned anhu Mar 22, 2024
@anhu
Member

anhu commented Mar 22, 2024

Hi @adriancdiego ,
Sorry to hear you're having problems with DTLS in a multi-threaded use case. I'll be assigning this to our team member @rizlik to have a look into this.

Thanks for reaching out to us. Here at wolfSSL we love knowing how our software is being used. Can you please let us know a bit about yourself and your project? For example, where are you located? What are your goals? Is this project out of personal, academic or professional interest? Is there an institution or group associated with this work?

Warm regards, Anthony

@rizlik
Contributor

rizlik commented Mar 22, 2024

Hi @adriancdiego,

Did you try server-dtls13-event.c? It's more advanced and it can also be used for DTLSv1.2 connections.

@adriancdiego
Author

Hi @anhu and @rizlik and thank you for your answer :)

I'm from Spain and I'm working on a project where we have to develop a server capable of communicating with hundreds of clients using DTLS + PSK.

I forgot to mention in my message that we are using v5.6.6 of your library.

We continued testing the example and observed that the same problem arises when multiple clients try to connect simultaneously (after removing the one-second sleep between consecutive connections in the client example). We also found that everything works fine with a 1 ms gap between connections; our understanding is that the server then has enough time to process each handshake, so connections aren't queued.

Did you have the opportunity to check this out? Next, we are going to try server-dtls13-event.c as suggested.

Thanks,
Adrian

@rizlik
Contributor

rizlik commented Apr 2, 2024

> Did you have the opportunity to check this out?

Yes, I'm on this. I should provide a fix for the example soon.

@adriancdiego
Author

ok, thank you. I'll be waiting for it

@rizlik
Contributor

rizlik commented Apr 4, 2024

Hi @adriancdiego,
I've submitted a potential solution for the issues you've encountered in PR #431. It's important to note that the examples provided are basic and not intended for production use, particularly in terms of performance. A more robust and efficient solution likely needs to manage and multiplex the packets in a more sophisticated way. We plan to update the library to facilitate easier multiplexing of packets from a single socket across multiple SSL objects. Suggestions are always welcome. If you want to share more of your use case in a more private context, feel free to send an email to support@wolfssl.com.
Meanwhile, let me know if PR #431 fixes the problem for you.
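
As a rough illustration of the multiplexing idea mentioned above (not the approach taken in the PR, and not an API that wolfSSL provides), the sketch below routes datagrams arriving on one UDP socket to per-peer slots keyed by source address and port. Every name in it (`demux_table`, `demux_route`, `demux_release`, `make_peer`, `MAX_SESSIONS`) is hypothetical, and it assumes IPv4 peers only; a real server would also keep the peer's SSL object and a queue of pending datagrams in each slot.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define MAX_SESSIONS 64 /* hypothetical cap chosen for the sketch */

/* One slot per DTLS peer (assumption: IPv4 only). */
struct session_slot {
    int in_use;
    struct sockaddr_in peer;
    unsigned long datagrams; /* datagrams routed to this peer so far */
};

struct demux_table {
    struct session_slot slots[MAX_SESSIONS];
};

/* Two peers are the same session if address and port both match. */
static int same_peer(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return a->sin_addr.s_addr == b->sin_addr.s_addr &&
           a->sin_port == b->sin_port;
}

/* Return the slot index for this peer, claiming the first free slot if the
 * peer is new. Returns -1 when the table is full. */
static int demux_route(struct demux_table *t, const struct sockaddr_in *peer)
{
    int free_idx = -1;
    for (int i = 0; i < MAX_SESSIONS; i++) {
        if (t->slots[i].in_use) {
            if (same_peer(&t->slots[i].peer, peer)) {
                t->slots[i].datagrams++;
                return i;
            }
        } else if (free_idx < 0) {
            free_idx = i;
        }
    }
    if (free_idx < 0)
        return -1;
    t->slots[free_idx].in_use = 1;
    t->slots[free_idx].peer = *peer;
    t->slots[free_idx].datagrams = 1;
    return free_idx;
}

/* Free a slot once its session has finished. */
static void demux_release(struct demux_table *t, int idx)
{
    if (idx >= 0 && idx < MAX_SESSIONS)
        memset(&t->slots[idx], 0, sizeof t->slots[idx]);
}

/* Helper for building a peer address from dotted-quad text and a port. */
static struct sockaddr_in make_peer(const char *ip, unsigned short port)
{
    struct sockaddr_in sa;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    inet_pton(AF_INET, ip, &sa.sin_addr);
    return sa;
}
```

In this design a single receiver thread would call `recvfrom` on the shared socket, pass the source address to `demux_route`, and hand the datagram to whichever worker owns the returned slot, so a ClientHello from a new peer is never consumed by a thread that is busy with a different handshake.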

@adriancdiego
Author

Hi @rizlik and thank you very much for your effort to improve the multi-threaded DTLS server example :)

I applied the changes you proposed and I no longer see the issue with multiple client connections, but we are now facing another socket error (maybe a timeout), apparently because the server processes queued connections too slowly. Moreover, there seems to be a limit on the number of concurrent handshakes, or something similar, as the server is not able to use all the available threads.

Let me carry out some additional tests so that I can give you more information about it.

Thank you!

@rizlik
Contributor

rizlik commented Apr 9, 2024

Hi @adriancdiego ,
Thanks for testing it out. Yes, as I said, this example isn't production-ready and in this specific case, when multiple client hellos are received simultaneously, some may be discarded. Clients keep retrying and eventually connect, but this is not optimal. DTLS sessions are usually P2P with one session per port.
I've some ideas to improve the example further but I'm not sure when it will be ready. In the meantime, I'm waiting for your feedback.

@adriancdiego
Author

Hi @rizlik,

We tested the code and saw that the server struggles with handshakes when we use more than 32 client threads simultaneously. Moreover, sometimes it works perfectly, or with some delay for the last clients' connections, whereas at other times some clients can't connect and wolfSSL_connect returns error -308 (error state on socket).

Is there anything we or you can do to make this example work with more clients?

Thank you!

@adriancdiego
Author

Hi, we continued testing the example and observed that it is still advisable to add a 1 ms gap between client connections to avoid handshake problems. With smaller gaps, some clients are not served in time.
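
For anyone reproducing this, one possible way to apply such a gap is sketched below. It is not taken from client-dtls-threaded.c; `stagger_delay` and `stagger_sleep` are hypothetical helpers, and the sketch assumes each client thread knows its own index and sleeps `index * gap_ms` milliseconds before it starts its handshake.

```c
#define _POSIX_C_SOURCE 200809L /* for nanosleep */
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Delay before client number `index` starts its handshake, so that
 * consecutive clients start gap_ms milliseconds apart. */
static struct timespec stagger_delay(unsigned int index, unsigned int gap_ms)
{
    unsigned long long total_ns =
        (unsigned long long)index * gap_ms * 1000000ULL;
    struct timespec ts;
    ts.tv_sec = (time_t)(total_ns / 1000000000ULL);
    ts.tv_nsec = (long)(total_ns % 1000000000ULL);
    return ts;
}

/* Sleep for the stagger delay, resuming if a signal interrupts the sleep.
 * Returns 0 on success and -1 on any other nanosleep failure. */
static int stagger_sleep(unsigned int index, unsigned int gap_ms)
{
    struct timespec req = stagger_delay(index, gap_ms);
    struct timespec rem;
    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR)
            return -1;
        req = rem; /* continue with the time that was left */
    }
    return 0;
}
```

Each client thread would call `stagger_sleep(thread_index, 1)` just before calling wolfSSL_connect. This only spaces out the initial handshake messages; it does not remove the underlying limitation on the server side.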

Thank you!

@rizlik
Contributor

rizlik commented May 15, 2024

Thanks @adriancdiego . Can we consider this issue closed?

@adriancdiego
Author

Hi @rizlik

The original issue is solved, as we no longer see error -395 (Duplicate message error). However, the server still struggles with multiple simultaneous client connections, and a small time gap between consecutive client connections is necessary. It might be worth looking into at some point.

Thank you very much for your time and the patch you proposed :)

@rizlik
Contributor

rizlik commented May 16, 2024

Thanks @adriancdiego ,

Yes, I agree with you. In a production environment, I think a different approach to packet multiplexing is needed for this to work properly. While there is room for improvement in the library APIs, this also involves the OS and the networking layer. At the same time, we don't want to overcomplicate this example.

Thanks again,
Marco

@rizlik rizlik closed this as completed May 16, 2024