Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DIRMINA-1173 #44

Open
wants to merge 16 commits into
base: 2.2.X
Choose a base branch
from
Open

DIRMINA-1173 #44

wants to merge 16 commits into from

Conversation

jon-valliere
Copy link

@jon-valliere jon-valliere commented Feb 22, 2024

Implements a Non Blocking Handler for SSL.

The new Handler implements 3 locks and additional queues for guaranteeing order of execution once the locks are released.

  • Read and Write operations are still primarily locked on each other. However, each operation now has a Queue for which to deposit produced objects. The Read operation is special in that it can cause Write operations which in turn cause Objects to be placed in both the Read and Write outbound queues. To keep this as simple as possible, this is the reason why they block on each other.
  • New Message Receive Queue; the output of the Read operation is a queue containing the Objects which need to be sent to the next filter via the messageReceived operation. The code which pulls objects out of this Queue is blocked by itself so no two threads can be pulling from the Queue concurrently. This design is NOT because ConcurrentLinkedQueues are unsafe in guaranteeing FIFO, its because existing applications may expect a more linear style lock handling versus having multiple threads operate on the FilterChain simultaneously and this could expose bugs which were not there previously.
  • New Message Write Queue, the output of the Write operation is a queue containing the Objects which will need to be send to the previous filter via the filterWrite operation. Otherwise essentially the same in function as the Receive Queue.
  • The Receive and Write Queues will NOT block on each other.
  • The main entry points into the Handler have been wrapped by try..catch blocks to ensure that the Queues can be released regardless of any errors that may have occurred.

CleanShot 2024-02-22 at 07 47 04

The following public endpoints for SSLHandlerG1 now correctly handle the non-block operations
- open
- write
- receive
- ack
- flush
- close
I added try..finally blocks to ensure processed messages are fired even if a subsequent message caused the SSL to fail.
@jon-valliere jon-valliere marked this pull request as ready for review February 22, 2024 23:29
@@ -37,7 +37,7 @@ public class DefaultSocketSessionConfig extends AbstractSocketSessionConfig {

private static final int DEFAULT_SO_LINGER = -1;

private static final boolean DEFAULT_TCP_NO_DELAY = false;
private static final boolean DEFAULT_TCP_NO_DELAY = true; // Disable Nagle algorithm by default
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @elecharny I'm not sure this is a good idea. If we want todo this also, I think it should be as part of a separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jon-valliere, just read this blog post: https://brooker.co.za/blog/2024/05/09/nagle.html

Somehow, Nagle's algorithm is a thing of the past (ie when we were dealing with terminals over slow connections). In MINA's context, and if we except SSHD, I do think that the default to disable Nagle's algorithm actually makes sense. Most of MINA based application don't deal with char by char communication, rather with quite big chunks of data, so there is little need to wait for an ACK from the server.

Anyway, this is configurable, so I guess it should not be a big deal.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disabling Nagle by default may create context switching performance hits in certain applications. Any situation where you may invoke more than one write in sequence, receives performance boosts from the way the IO scheduler works in TCP because it can produce MORE packets in a single go and does not have to be woken up as often. The OS itself does not disable Nagle by default and I do not think we should make this decision for the end user. As long as they can configure it, I think the default should be unset (to allow the OS default to apply). If other applications, built on MINA, want to configure a new default then it is up to them. For SSHD or Apache Directory, if you want to disable Nagle there, that is fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not arguing here, I will revert but...

I do think that this flag should be opt-in, ie the Nagle algorithm should be disabled by default. The use cases were it's useful are not really the most frequent, and the drawbacks are numerous. If the default sendBuffer size is set to 64Ko, then you have a delayed for every packet that has a lower number of bytes to transfert. if you set the sendBuffer size to a lower value, then you are sending way more packet than needed. In any case, if you are to send some data that span on more than one packet, and the last packet is not completed, then you'll pay a 200ms delay every time. Not very cool.

With Gigabits connections, sending a lot of packets is not really an issue anymore, except on heavily loaded applications that aren't interactive (HTTP may enter into this category, but with REST application, I would argue that an immediate sending is a plus).

Regarding LDAP, requests are small there is no interest in differing the packet sendings, so yes, I'm likely to deactivate the Nagle Algorithm.

Seems like on Linux they have added the TCP_CORK parameter to let the sender decide when to send the data, bypassing the Nagle's algorithm completely, and also solving the context switching pb at the same time (somehow it's a way to say "you're responsability, not teh OS one", and I tend to agree). Sadly, this is not portable, so currently it's not an option.

Last, not least, for TLS HS, Nagle's Algorithm is delaying the whole negociation, which is a real PITA...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my other IO framework, all of the TCP/SOCK options all have unset states where the framework code will not call the SOCK configuration at all and allow the OS defaults to be applied.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://en.wikipedia.org/wiki/Nagle%27s_algorithm

Nagle only really applies for when the packets are smaller than the MSS which is usually 1280 bytes. AFAIK only the HELLO message in TLS HS is smaller than this; all other messages SHOULD be aggregated together by the SSL layer. There IS a well known interaction with Nagle and Delayed Ack which can cause Nagle to be worse than it would be by itself. What is the order of delays you are seeing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed the MSS part. Ok, so makes sense to keep the parameter as is, but I do agree that not setting a default at all would be a better option. At least, you let the developer makes a choice depending on what they actually measure.

Thanks Jon!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants