Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

constexpr StringView without literals #123

Open
jdumas opened this issue Oct 7, 2021 · 9 comments
Open

constexpr StringView without literals #123

jdumas opened this issue Oct 7, 2021 · 9 comments

Comments

@jdumas
Copy link

jdumas commented Oct 7, 2021

Hi. I noticed that a constexpr Containers::StringView can only be created from const char * via a literal _s. Reading the doc this seems to be a deliberate design decision, but I'm not entirely sure why? E.g., the tag Global is documented as:

The referenced string is global, i.e., with an unlimited lifetime. Enabling this flag can avoid needless allocations and copies in cases where the consumer needs to extend lifetime of passed string view.

But why would you want to extend the lifetime of a passed string view? This is a view, not a smart pointer no?

@pezcode
Copy link
Contributor

pezcode commented Oct 7, 2021

There's also constexpr StringView(const char*, size_t) you could use if you know the literal length (e.g. const char x[] = "x" -> sizeof(x) - 1). I assume the issue with normal const char* literals is that there's no guaranteed constexpr equivalent of strlen before C++17.

The Global tag is a way to reason about the lifetime of the view's data pointer. If you need to, for example, keep the content of a view passed to a function around for later, you can avoid an owned copy if that tag is set. See for example String::nullTerminatedGlobalView which does exactly that.

@mosra
Copy link
Owner

mosra commented Oct 7, 2021

The main reason is std::strlen() not being constexpr, yes. However, assuming a possibility to have a constexpr std::strlen(), I'm afraid we'd either have to pick between being fast at runtime (by having a possibility to use 128/256-bit SSE4.2 / AVX instructions) or being executable at compile time (using a dumb slow loop in C++14, or worse, using a nasty recursion in C++11). Or we'd have to wait until C++20 and consteval, which I suppose could allow us to pick a different implementation for compile time and runtime. Not great, and something I'm disappointed about as well.

OTOH, as @pezcode says, it's possible to get away with using the char array length for size, although it may give different results than std::strlen() if the literal contains '\0' bytes. There's a conversion constructor from an ArrayView which could be abused to do this in a single expression, although it's currently not constexpr because I didn't want to add an implicit dependency between the StringView and ArrayView headers. But this use case is compelling enough that I'll try to make that work.

It'd then be something like the following, it's very explicit but since an arbitrary char[] can also be neither null-terminated nor a global constant, I'm afraid that adding a more convenient API for this could lead to error-prone code -- and since eventually a lot of Corrade APIs will depend on the NullTerminated flag (such as file opening, where a null-terminated string view is passed to std::fopen() as-is and a null-terminated copy is allocated otherwise), I don't want to introduce any API that could implicitly mark things as null terminated or global while they may be not.

// will be constexpr once I make that possible
constexpr Containers::StringView hello{Containers::arrayView("hello world!!").except(1), // strips the \0
    // you're responsible for ensuring these are correct
    Containers::StringViewFlag::Global|Containers::StringViewFlag::NullTerminated};

About the Global tag, it's unrelated to the constexpr, and it's set in the literal because the literal is the only place where it's clear at compile time that the string lives in a global constant memory and is null-terminated. The documentation for it is misleading, sorry about that. It was phrased in a way that suggested the flag somehow affects how the memory gets managed, wherereas it merely describes if the memory is a global constant or not. I'll change it to the following, which I hope explains the intent better:

The referenced string is global, i.e., with an unlimited lifetime. A string view with this flag set doesn't need to have a copy allocated in order to ensure it stays in scope.

@mosra mosra added this to the 2020.0b milestone Oct 7, 2021
@mosra mosra added this to TODO in Containers via automation Oct 7, 2021
@jdumas
Copy link
Author

jdumas commented Oct 7, 2021

How does std::string_view achieves this then? Does it perform a slow loop at compile time to find the length of the char *? Or it uses a constexpr version of strlen()? The following code:

    constexpr std::string_view a = "foo\0bar";
    std::string_view b = "foo\0bar";
    std::cout << a.size() << " " << b.size() << std::endl;

outputs 3 3. Maybe we could make that possible when compiling with C++17.

Thanks for the clarification on the Global tag.

@mosra
Copy link
Owner

mosra commented Oct 7, 2021

How does std::string_view achieves this then?

Hmm. Using std::char_traits::length(), which needs a good amount of compiler magic, apparently. This is the implementation in libstdc++ 11:

      static _GLIBCXX17_CONSTEXPR size_t
      length(const char_type* __s)
      {
#if __cplusplus >= 201703L
	if (__constant_string_p(__s))
	  return __gnu_cxx::char_traits<char_type>::length(__s); // <-- and this is the dump loop, yes
#endif
	return __builtin_strlen(__s);
      }

(Sigh.) For obvious reasons I don't want to depend on the <string> header for a string replacement to make use of std::char_traits, so I'm not really sure what to do here except for relying on compile-time-array length, or waiting for consteval that could make this possible even without compiler magic.

@jdumas
Copy link
Author

jdumas commented Oct 7, 2021

Why do you need consteval? We could reimplement a constrexpr strlen using a simple loop no? E.g.

constexpr std::size_t length(const char* start) {
   const char* end = start;
   for(; *end != '\0'; ++end)
      ;
   return end - start;
}

constexpr size_t s = length("foo\0bar");

@mosra
Copy link
Owner

mosra commented Oct 7, 2021

Because, as I said above, I don't want to give up on a possibility to use 128/256-bit SSE4.2 / AVX instructions at runtime instead of always iterating on a single byte at a time and then hoping the compiler optimizes that somehow. That's what std::strlen() / __builtin_strlen() does.

@jdumas
Copy link
Author

jdumas commented Oct 7, 2021

Got it. Would be nice if we could dispatch code based on whether the arg is constexpr or not...

@mosra
Copy link
Owner

mosra commented Oct 7, 2021

Yeah that's what consteval from C++20 does: https://en.cppreference.com/w/cpp/types/is_constant_evaluated

@mosra
Copy link
Owner

mosra commented Dec 29, 2022

A constexpr creation of a StringView from an ArrayView mentioned above got implemented in 81b7089, so the following is now possible, at least, although I'm not sure if it's any better than writing a helper that consumes char(&)[n] and passes that to the already-constexpr StringView constructor taking a pointer + size:

constexpr Containers::StringView a = Containers::arrayView("hello").exceptSuffix(1);

Everything else, related to constexpr-aware construction from a char*, needs something like #152, and even then I'm not sure if it's a good idea given that it needs something like a compile-time while loop that iterates over the array until the first '\0' is found to be really compatible with what strlen() does. Worth investigating is also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86590 and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80265 where the GCC devs deal with poor codegen related to doing this very operation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Containers
  
TODO
Development

No branches or pull requests

3 participants