You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently vector_append() uses an allocation strategy that optimizes for space: it increases storage only by as much as needed. This means that repeatedly appending small vectors to a larger vector will be inefficient.
In contrast, vector_push_back() optimizes for time performance with repeated appends, by doubling storage each time it runs out.
The choice between the two can make a huge difference in performance:
However, whenever appending more than a single element to a vector, vector_append() would be more convenient to use, and it could also be more efficient if it used a doubling allocation strategy.
I suggest making it possible to choose which allocation strategy to use: optimize for memory (extend by as much as needed) or optimize for time (double). This can be done either by having two separate functions (my preference) or adding a parameter to vector_append() which chooses between the two strategies.
A third option is of course to just change the allocation strategy of vector_append(). Note that it can't just double storage. vector_append(v1, v2) should use whichever is larger: 2*len(v1) or len(v1) + len(v2).
The text was updated successfully, but these errors were encountered:
Use two separate functions. This is an implementation detail that most users do not (and do not want to) care about. Adding an extra argument is API-breaking and it forces a cognitive burden on the end user.
Adding two functions is fine, and also my preference.
I would not call it an implementation detail though as it directly impacts the complexity of algorithms, as we just painfully found out with GraphML.
When I get the time, I'll look to see if there are any uses of vector_append where NOT doubling allocated storage is actually useful. An option would be to always double, and not even provide a memory-efficient version. Yet another option is to change the old function to double, and add a new memory-efficient variant.
I would not call it an implementation detail though
It's not an implementation detail from our perspective as library maintainers, but for the ordinary end user that just needs a dynamic resizable vector it is an implementation detail 99% of the time. That's what I meant.
Currently
vector_append()
uses an allocation strategy that optimizes for space: it increases storage only by as much as needed. This means that repeatedly appending small vectors to a larger vector will be inefficient.In contrast,
vector_push_back()
optimizes for time performance with repeated appends, by doubling storage each time it runs out.The choice between the two can make a huge difference in performance:
Examples:
However, whenever appending more than a single element to a vector,
vector_append()
would be more convenient to use, and it could also be more efficient if it used a doubling allocation strategy.I suggest making it possible to choose which allocation strategy to use: optimize for memory (extend by as much as needed) or optimize for time (double). This can be done either by having two separate functions (my preference) or adding a parameter to
vector_append()
which chooses between the two strategies.@ntamas, which one shall we go with?
A third option is of course to just change the allocation strategy of
vector_append()
. Note that it can't just double storage.vector_append(v1, v2)
should use whichever is larger:2*len(v1)
orlen(v1) + len(v2)
.The text was updated successfully, but these errors were encountered: