-
-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor H5T_cmp and reintroduce Tony Horrobin's cache of the member-name sort order #1387
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
-- Improve efficiency of sorting by sorting just once for all writes that use a particular compound data type. --- Add an index array for sorting by name. --- Use it instead of local arrays in H5T_cmp(). --- Remember to deep copy it to avoid a SEGV due to using freed memory.
common subexpressions into temporary variables.
PRIdHID and PRId64, respectively.
``` if (r == NULL) { ... l = ... ; ... } else { l = r; } ``` to the simpler form ```if ((l = r) == NULL) { ... l = ...; ... } ``` `compound_cmp` is not called if `sh2->type` does not equal H5T_COMPOUND, so remove the comparison.
"signing" code and documentation using my name.
temporary variables.
verify "at a glance," sizeof(element[0]) instead of sizeof(element_t).
change that saves the sorted order of compound type members. In compound_cmp: be brief.
jump to the `error` label where the resources used to be re-released.
comparing it with zero.
`bubblesort_indices_by_name`. Quiet some GCC signed-overflow warnings. Use safer sizeof() idiom in allocations.
Avoid GCC signed-overflow warnings.
GCC signed-overflow warnings.
pass that, instead. `size_t` seems like a more appropriate type for the map, so make the change from `int`.
variable. Invalidate the cached member-name order.
order may change. Replace common subexpressions with a temporary variable.
gnuoyd
requested review from
derobins,
fortnern,
jhendersonHDF,
jrmainzer,
qkoziol,
soumagne and
vchoi-hdfgroup
as code owners
January 24, 2022 20:48
byrnHDF
reviewed
Jan 25, 2022
size_t total_size; | ||
unsigned i; /* Local index variable */ | ||
herr_t ret_value = SUCCEED; /* Return value */ | ||
H5T_shared_t *const msh = member->shared, *const psh = parent->shared; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the readability of the use of temp vars, however personally I'd prefer declarations on individual lines.
derobins
added
the
Type - Improvement
Improvements that don't add a new feature or functionality
label
Mar 3, 2023
derobins
added
Merge - To 1.12
Merge - To 1.14
This needs to be merged to HDF5 1.14
Priority - 3. Low 🔽
Code cleanup, small feature change requests, etc.
Component - C Library
Core C library issues (usually in the src directory)
labels
May 25, 2023
Closing due to age and lack of movement. Issue #4528 has been created as a reminder to go back over this. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Component - C Library
Core C library issues (usually in the src directory)
Merge - To 1.14
This needs to be merged to HDF5 1.14
Priority - 3. Low 🔽
Code cleanup, small feature change requests, etc.
Type - Improvement
Improvements that don't add a new feature or functionality
WIP
Work in progress (please also convert PRs to draft PRs)
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
I have extracted a couple of subroutines from H5T_cmp and simplified a lot, mainly by using temporary variables to replace repeated expressions. The code is more readable this way.
There are a lot of changes in this PR. It consists of 33 individual commits with descriptions. I recommend reading the individual commits.
I'm marking this WIP because I have some doubts about (re-)sorting compound-type members "on demand" in H5T_cmp.
It seems that applications probably can afford the time and space cost of the library keeping members sorted by both name and index, always. I don't think applications will change any type's members often enough to make an important performance difference. I don't think applications use types in numbers great enough that space will be a problem. Not so?
If the members are always stored in
offset
order, and there is always aname
-ordered array of member indices alongside, then H5T_cmp does not have to allocate any memory. Right now, when an allocation fails, H5T_cmp returns 0, indicating arbitrarily that its H5T_t arguments are "equal". That sets up the library or application to fail mysteriously somewhere down the road.Enum types, too, should keep their cases continually sorted, to avoid on-demand heap allocation.