Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure text mode file BIOs are buffered on Windows #24249

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

joshcooper
Copy link

Commit 8300a87 disabled file buffering for text mode file BIOs on Windows to work around a bug in UCRT ftell. This is surprising to applications that expect BIO read/writes to be buffered, such as when reading cert.pem.

This wraps text mode file BIOs with a buffered BIO, so the overall result is buffered (just once).

Commit 8300a87 disabled file buffering for text
mode file BIOs on Windows to work around a bug in UCRT ftell. This is surprising
to applications that expect BIO read/writes to be buffered, such as when reading
cert.pem.

This wraps text mode file BIOs with a buffered BIO, so the overall result is
buffered (just once).
Copy link
Member

@slontis slontis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to add some kind of test? (I assume there isn't a test or this would have been detected before now?)

@t8m t8m added branch: master Merge to master branch triaged: feature The issue/pr requests/adds a feature labels Apr 24, 2024
@t8m
Copy link
Member

t8m commented Apr 24, 2024

I do not think this is a good idea. This is going to surprisingly return a BIO chain instead of a single BIO on one platform and that means to free it you'd have to use BIO_free_all() instead of BIO_free(). This is an API change.

@joshcooper
Copy link
Author

joshcooper commented Apr 24, 2024

@t8m ah sorry, I thought BIO_free behaved like BIO_free_all. I see now the docs call that out "BIO_free() frees up a single BIO"

Would it make sense to instead create macros to wrap BIO_new_file and BIO_free like:

#if defined(OPENSSL_SYS_WINDOWS)
#define BIO_NEW_TEXT_FILE(b,f,m) do { \
       (b) = BIO_new_file((f),(m));  \
       if ((b) != NULL) \
           (b) = BIO_push(BIO_new(BIO_f_buffer()),(b)); \
       } while(0)
#define BIO_FREE_TEXT_FILE(b)    BIO_free_all(b)
#else
#define BIO_NEW_TEXT_FILE(b,f,m) ((b) = BIO_new_file((f),(m)))
#define BIO_FREE_TEXT_FILE(b)    BIO_free(b)
#endif

And then update the places where openssl creates file BIOs in text mode?

$ git grep 'BIO_new_file(.*"r")' crypto providers
crypto/conf/conf_def.c:    in = BIO_new_file(name, "r");
crypto/conf/conf_def.c:                next = BIO_new_file(include_path, "r");
crypto/conf/conf_def.c:    next = BIO_new_file(include, "r");
crypto/conf/conf_def.c:            bio = BIO_new_file(newpath, "r");
crypto/conf/conf_lib.c:    in = BIO_new_file(file, "r");
crypto/engine/eng_openssl.c:    in = BIO_new_file(key_id, "r");
crypto/ts/ts_conf.c:    if ((cert = BIO_new_file(file, "r")) == NULL)
crypto/ts/ts_conf.c:    if ((certs = BIO_new_file(file, "r")) == NULL)
crypto/ts/ts_conf.c:    if ((key = BIO_new_file(file, "r")) == NULL)
crypto/x509/by_file.c:    in = BIO_new_file(file, "r");
crypto/x509/v3_pci.c:            BIO *b = BIO_new_file(valp, "r");

For example in by_file.c:

-   in = BIO_new_file(file, "r");
+   BIO_NEW_TEXT_FILE(in, file, "r");
    if (in == NULL) {
        ERR_raise(ERR_LIB_X509, ERR_R_BIO_LIB);
        return 0;
    }
    inf = PEM_X509_INFO_read_bio_ex(in, NULL, NULL, "", libctx, propq);
-   BIO_free(in);
+   BIO_FREE_TEXT_FILE(in);

If that makes sense, where should the BIO_NEW_TEXT_FILE macro be placed so its not public API? Also I'm terrible at naming, so suggestions welcome.

@t8m
Copy link
Member

t8m commented Apr 25, 2024

I would like to know what problem is this trying to solve. Does it matter if reading certificates/keys is not buffered? These data files aren't large.

@joshcooper
Copy link
Author

Sure, the problem is our application takes more than 10% longer to start up using openssl 3.0 due to 116k calls to ReadFile, each one of which reads 2 bytes from cert.pem. Details are in puppetlabs/puppet-runtime#786

@davidben
Copy link
Contributor

davidben commented Apr 25, 2024

I might suggest that OpenSSL consider just open those cases internally in binary mode. The PEM parser already handles CRLF itself and this helps avoid platform differences in processing the same bytes. The ftell bug is then moot in those cases.

That doesn't account for the cases where OpenSSL takes a FILE* from the caller. But in those cases, the caller presumably is just expecting OpenSSL to call the libc functions on that FILE, platform warts and all. I.e. I think you can just assume the caller either doesn't care about ftell (most likely) or has applied a suitable workaround on their end.

Put those together and I would suggest reverting the ftell workaround and leaving it to the call sites to apply, seeing as the workaround has significant perf costs.

@t8m
Copy link
Member

t8m commented Apr 25, 2024

I might suggest that OpenSSL consider just open those cases internally in binary mode. The PEM parser already handles CRLF itself and this helps avoid platform differences in processing the same bytes. The ftell bug is then moot in those cases.

Yeah, that would make sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch: master Merge to master branch hold: cla required The contributor needs to submit a license agreement triaged: feature The issue/pr requests/adds a feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants