Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iterator proxy pair ought to decay to value semantics on demand #31

Open
ned14 opened this issue May 22, 2020 · 4 comments
Open

iterator proxy pair ought to decay to value semantics on demand #31

ned14 opened this issue May 22, 2020 · 4 comments
Assignees
Labels
feature New feature or enhancement request

Comments

@ned14
Copy link

ned14 commented May 22, 2020

Lots of generic STL code would always do:

for(auto &i : container)

This prevents the value copy of items being iterated. You have proxy pair in place to prevent this occurring for:

for(auto i : container)

Firstly, any compiler which is not MSVC will elide that copy under optimisation if i never escapes or is modified. So your proxy optimisation probably doesn't actually save much, except on MSVC.

Meanwhile, the lack of value sematics for the second for loop causes much generic code to break. If the code asked for a copy, it expects an as-if copy. If you wish to retain your proxy, you ought to have it implicitly convert to a value type upon demand in order to retain expected semantics.

@ned14 ned14 added the bug Something isn't working label May 22, 2020
@marzer
Copy link
Owner

marzer commented May 22, 2020

Yeah, this is tricky. It's mainly to disguise the use of a unique_ptr<node> internally; I don't want to expose that directly, hence the proxy that exposes it as references instead (i.e. it wasn't done that way for optimization).

Do you have implementation suggestions? I don't want to implement anything that involves creating unnecessary copies of internal nodes, since they're created on the heap to satisfy polymorphism.

@marzer
Copy link
Owner

marzer commented May 23, 2020

Just realized that I could remove the need for the proxy altogether if I had the nodes represent the value types via composition (i.e. make it a glorified discriminated union), rather than inheritance, a bit like what nlohmann::json does with its json object.

Probably quite a bit more work than a basic refactor, but if I ever do a toml++ v2 I'd give that some very real thought.

@ned14
Copy link
Author

ned14 commented May 25, 2020

Do you have implementation suggestions? I don't want to implement anything that involves creating unnecessary copies of internal nodes, since they're created on the heap to satisfy polymorphism.

shared_ptr is probably the easiest given your present design.

Just realized that I could remove the need for the proxy altogether if I had the nodes represent the value types via composition (i.e. make it a glorified discriminated union), rather than inheritance, a bit like what nlohmann::json does with its json object.

Personally speaking, I wouldn't take this approach. Union storage is hard on the compiler, both on compile times and on the ability to optimise. I appreciate that it's pure in the sense of compsci theory, but as I get more experienced, I find myself avoiding union storage for non-trivially-copyable types more and more, like how I avoid threads nowadays. And if I ever do use threads, they're always like a child process i.e. very self contained, only ever touch shared state at the very beginning and very end.

BTW, if I were implementing a language parser in C++ 20, I think I'd use Coroutines with generators. Then iteration is just a suspended coroutine, and all your iteration state is inside the coroutine frame. I'd then wrap up that Coroutine based implementation into a non-Coroutine API using a bunch of preprocessor metaprogramming, because C++ 20 has currently really lousy coroutine vs non-coroutine integration. It'll get a lot better in 23 with Reflection, and hopefully sanding down of Coroutines many, many, many syntax rough edges.

The reason why coroutine generators are great for language parsing is because they're lazy. So, if given a 1Gb TOML file to parse, it's only as you query and iterate does parsing occur, and only on the bits you actually traverse. MSVC used to do that trick, it was much faster than other compilers because it simply never parsed any of a C++ file you didn't use, which was cool, but not standards conforming. Still, the right approach for general language parsing though.

@marzer
Copy link
Owner

marzer commented May 25, 2020

shared_ptr is probably the easiest given your present design.

Hmmn. This honestly strengthens my inclination towards the refactoring I was describing. I won't get any serious free time to do the work necessary to really pull things apart and put them back together for quite some time, so the current design will have to do for the time being.

Having said that, as part of the implementation for #29 I refactored how the temporary proxy pair is stored so all of the following should work:

for(auto i : container) {}
for(auto &i : container) {}
for(auto &&i : container) {}

The proxy pair object still contains references, but the reference category of the pair object shouldn't be able to break generic code now.

... coroutines ...

Sounds fun. I don't have a lot of experience with coroutines in any language, but I look forward to learning about them. In some far-future version of toml++ I'll remove support for C++17 and maybe have a crack at it then.

@marzer marzer added feature New feature or enhancement request and removed bug Something isn't working labels Jun 24, 2020
@marzer marzer mentioned this issue Jul 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or enhancement request
Projects
None yet
Development

No branches or pull requests

2 participants