Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow specifying a minimum size for Vec #2445

Open
eira-fransham opened this issue May 23, 2018 · 2 comments
Open

Allow specifying a minimum size for Vec #2445

eira-fransham opened this issue May 23, 2018 · 2 comments
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.

Comments

@eira-fransham
Copy link

At the moment when a vector is created it doesn't allocate anything, but then when a single element is pushed to it it grows to hold precisely 4 elements. As further elements are pushed it doubles each time. That first allocation is a problem, though. It's quite common to use SmallVec in performance-sensitive code when there is an "expected maximum", a value which is normally not exceeded but may be exceeded in edge-cases. However, profiling shows that using a Vec with with_capacity is often faster than using SmallVec because of the reduced complexity.

This has a couple of problems though. Firstly, this isn't a drop-in replacement like SmallVec is. You need to replace every call to new with with_capacity and .collect() will still use the same minimum allocation size (although this is rarely a problem because of size_hint). Worst is that if you create a vector and then never push anything to it then you have a wasted allocation and deallocation.

You could have a wrapper around Vec that checks the capacity and allocates the supplied number of elements on first allocation but that's an extra cost on every insertion. The most efficient method would be to have that "precisely 4" be configurable somehow, probably with const generics (#2000). Because this couldn't be implemented today, I have written this as an issue rather than a PR.

@Lonami
Copy link

Lonami commented Jun 23, 2018

Isn't that "minimum capacity" already the with_capacity you mention? What would be the difference?

@burdges
Copy link

burdges commented Jun 24, 2018

No. Vec::new() is defined to do no allocation so that you can call Vec::reserve yourself, which possibly simplifies code flow. In particular [Vec::new(); 1024] does no allocation.

I think @Vurich wants control over the minimum size for the first implicit Vec::reserve call. After NLL, you do this manually by prefixing your v.push(...); calls with v.reserve(initial_reserve - v.len());, but right now it's more obnoxious.

I believe @Vurich proposes giving Vec a default constant parameter like

pub struct Vec<T, const INITIAL_RESERVE: usize = 4> {
    buf: RawVec<T>,
    len: usize,
}

so that generic code can use Vec::push easier, like say when creating a HashMap<K,Vec<T, const INITIAL_RESERVE=10>>.

If you're calling Vec::push yourself, not via generic code, then the above Vec::reserve sounds more flexible. Also, there is afaik no interaction with iterator adapters here because, as he says, if T is dynamically sized then Iterator::collect::<T>() via <T as FromIterator>::from_iter() should always call Iterator::size_hint(), and one should file a bug against any T that does not do so.

I'd suggest waiting for custom allocators because doing them right sounds like the priority and might overlap here, ala Vec<T, Allocator=MinninumAllocatorAdaptor<DefaultAllocator,10*size_of::<T>()>>

@Centril Centril added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Oct 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests

4 participants