VPN Azure fails easily due to OpenSSL 3.0 error queue leftover #1995

Open
1 task done
domosekai opened this issue May 10, 2024 · 3 comments

@domosekai
Contributor

Are you using SoftEther VPN 5.x?

  • Yes, I'm using SoftEther VPN 5.x, not 4.x.

Version

No response

Component

VPN Server

Operating system & version

Linux

Architecture or Hardware model

No response

Steps to reproduce

  1. Build with OpenSSL 3.0
  2. Start the server, create some user and enable VPN Azure
  3. Connect to the server directly in admin mode with either server manager or vpncmd
  4. Disconnect the admin connection
  5. Connect to the VPN Azure hostname with SSTP or SE

✔️ Expected Behavior

VPN connects successfully.

❌ Actual Behavior

A VPN Azure connection made immediately after an admin connection is highly likely to fail.
With SSTP, the connection is shut down immediately after the client sends the first SSTP hello message.
With SE, the connection is cut after the SE handshake completes. If the first connection fails, all subsequent connections fail too.

The error message in debug mode is

/home/admin/SoftEtherVPN/src/Mayaqua/Network.c 12313 SSL Fatal Error on ASYNC socket !!!

which is right after calling SSL_peek() in SecureRecv().

It turned out that the fatal error does not come from SSL_peek() but from a leftover entry in the OpenSSL error queue.

Anything else?

  1. This issue does not affect programs built with OpenSSL 1.1.
  2. The same behavior was observed on stable edition build 9772 (the first build embedding OpenSSL 3.0) but not on build 9760.
  3. It affects connections via VPN Azure only, including the custom VPN Azure service. Direct connections are not affected.
  4. It might be related to a behavior change in OpenSSL 3.0.

    On an unexpected EOF, versions before OpenSSL 3.0 returned SSL_ERROR_SYSCALL, nothing was added to the error stack, and errno was 0. Since OpenSSL 3.0 the returned error is SSL_ERROR_SSL with a meaningful error on the error stack (SSL_R_UNEXPECTED_EOF_WHILE_READING).
    https://www.openssl.org/docs/man3.0/man3/SSL_get_error.html

  5. Clearing the error queue before SSL operations seems to fix the problem.
    diff --git a/src/Mayaqua/Network.c b/src/Mayaqua/Network.c
    index 31890322..f5d2ac14 100644
    --- a/src/Mayaqua/Network.c
    +++ b/src/Mayaqua/Network.c
    @@ -12290,6 +12290,7 @@ UINT SecureRecv(SOCK *sock, void *data, UINT size)
                                    Debug("%s %u SecureRecv() Disconnect\n", __FILE__, __LINE__);
                                    return 0;
                            }
    +ERR_clear_error();
                            ret = SSL_peek(ssl, &c, sizeof(c));
     e = SSL_get_error(ssl, ret);
     Debug("SecureRecv 1: async=%d sslerror=%d ssl=%p\n", sock->AsyncMode, e, ssl);
    @@ -12349,7 +12350,7 @@ Debug("SecureRecv 1: async=%d sslerror=%d ssl=%p\n", sock->AsyncMode, e, ssl);
     #ifdef UNIX_SOLARIS
                    ttparam = NewSocketTimeout(sock);
     #endif // UNIX_SOLARIS
    -
    +ERR_clear_error();
                    ret = SSL_read(ssl, data, size);
     e = SSL_get_error(ssl, ret);
     Debug("SecureRecv 2: async=%d sslerror=%d ssl=%p\n", sock->AsyncMode, e, ssl);
    @@ -12447,6 +12448,7 @@ UINT SecureSend(SOCK *sock, void *data, UINT size)
                            Debug("%s %u SecureSend() Disconnect\n", __FILE__, __LINE__);
                            return 0;
                    }
    +ERR_clear_error();
     
                    ret = SSL_write(ssl, data, size);
                    if (ret < 0)
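
To illustrate why a leftover entry can break an unrelated connection, here is a toy model in plain C. It does not call OpenSSL: every `toy_*` name is hypothetical, and the model only mirrors the documented rule that SSL_get_error() also inspects the calling thread's error queue, so the queue must be empty before the I/O call. It is simplified (the real function also distinguishes system errors), and it assumes the stale entry comes from an earlier connection's unexpected EOF, which has not been confirmed as the actual source here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for OpenSSL result codes; the values are arbitrary. */
enum toy_result { TOY_ERROR_NONE, TOY_ERROR_WANT_READ, TOY_ERROR_SSL };

/* OpenSSL keeps one error queue per thread; this model uses one static queue. */
#define TOY_QUEUE_CAP 16
static unsigned long toy_queue[TOY_QUEUE_CAP];
static size_t toy_queue_len = 0;

/* Record an error, as a failing library call would. */
void toy_err_push(unsigned long code)
{
    assert(toy_queue_len < TOY_QUEUE_CAP);
    toy_queue[toy_queue_len++] = code;
}

/* Models ERR_clear_error(): drop every queued entry. */
void toy_err_clear(void)
{
    toy_queue_len = 0;
}

/* Models a peek on a healthy non-blocking socket with no data yet:
 * it returns -1 and adds nothing to the queue. */
int toy_peek_no_data(void)
{
    return -1;
}

/* Simplified model of the SSL_get_error() rule: for a failed call,
 * a non-empty queue is reported as a fatal error before "would block"
 * is even considered. */
enum toy_result toy_get_error(int ret)
{
    if (ret > 0)
        return TOY_ERROR_NONE;
    if (toy_queue_len > 0)
        return TOY_ERROR_SSL;
    return TOY_ERROR_WANT_READ;
}

/* Assumption: an earlier connection ended with an unexpected EOF and,
 * as OpenSSL 3.0 does, left an entry behind that nobody consumed. */
void toy_earlier_connection_hits_eof(void)
{
    toy_err_push(1UL);
}

/* Receive path without the workaround: the stale entry is misread as fatal. */
enum toy_result toy_recv_without_clear(void)
{
    int ret = toy_peek_no_data();
    return toy_get_error(ret);
}

/* Receive path with the workaround: clear the queue before the I/O call. */
enum toy_result toy_recv_with_clear(void)
{
    int ret;

    toy_err_clear();
    ret = toy_peek_no_data();
    return toy_get_error(ret);
}
```

In this model, calling `toy_earlier_connection_hits_eof()` and then `toy_recv_without_clear()` reports a fatal error on a connection that merely has no data yet, while `toy_recv_with_clear()` reports the expected "want read" condition. It does not explain why only VPN Azure connections are affected.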
@domosekai
Contributor Author

domosekai commented May 10, 2024

Although I have come up with a workaround, I am not sending it as a PR because important questions remain unanswered, e.g. why the bug only affects VPN Azure and why an old error can interfere with a new connection.
SSL safety is fundamental to this program, and I hope this can be handled carefully and properly.

@domosekai
Contributor Author

There is a similar report, in Japanese, on the vpnusers forum: the server cannot be reached via VPN Azure while the server manager is connected to it.
https://www.vpnusers.com/viewtopic.php?f=15&t=68458

@chipitsine
Member

Nice, VPN Azure seems to be the easiest way to reproduce this. I have not tried it yet, but I plan to soon.
