Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor H5T_cmp and reintroduce Tony Horrobin's cache of the member-name sort order #1387

Closed
wants to merge 36 commits into from

Conversation

gnuoyd
Copy link
Contributor

@gnuoyd gnuoyd commented Jan 24, 2022

I have extracted a couple of subroutines from H5T_cmp and simplified a lot, mainly by using temporary variables to replace repeated expressions. The code is more readable this way.

There are a lot of changes in this PR. It consists of 33 individual commits with descriptions. I recommend reading the individual commits.

I'm marking this WIP because I have some doubts about (re-)sorting compound-type members "on demand" in H5T_cmp.

It seems that applications probably can afford the time and space cost of the library keeping members sorted by both name and index, always. I don't think applications will change any type's members often enough to make an important performance difference. I don't think applications use types in numbers great enough that space will be a problem. Not so?

If the members are always stored in offset order, and there is always a name-ordered array of member indices alongside, then H5T_cmp does not have to allocate any memory. Right now, when an allocation fails, H5T_cmp returns 0, indicating arbitrarily that its H5T_t arguments are "equal". That sets up the library or application to fail mysteriously somewhere down the road.

Enum types, too, should keep their cases continually sorted, to avoid on-demand heap allocation.

triblatron and others added 30 commits January 24, 2022 11:57
-- Improve efficiency of sorting by sorting just once for all writes that use a particular compound data type.
--- Add an index array for sorting by name.
--- Use it instead of local arrays in H5T_cmp().
--- Remember to deep copy it to avoid a SEGV due to using freed memory.
common subexpressions into temporary variables.
```
if (r == NULL) {
	... l = ... ; ...
} else {
	l = r;
}
```

to the simpler form

```if ((l = r) == NULL) {
	... l = ...; ...
}
```

`compound_cmp` is not called if `sh2->type` does not equal H5T_COMPOUND,
so remove the comparison.
"signing" code and documentation using my name.
verify "at a glance," sizeof(element[0]) instead of sizeof(element_t).
change that saves the sorted order of compound type members.

In compound_cmp: be brief.
jump to the `error` label where the resources used to be re-released.
`bubblesort_indices_by_name`.  Quiet some GCC signed-overflow warnings.
Use safer sizeof() idiom in allocations.
pass that, instead. `size_t` seems like a more appropriate type for the
map, so make the change from `int`.
variable.  Invalidate the cached member-name order.
order may change.  Replace common subexpressions with a temporary
variable.
size_t total_size;
unsigned i; /* Local index variable */
herr_t ret_value = SUCCEED; /* Return value */
H5T_shared_t *const msh = member->shared, *const psh = parent->shared;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the readability of the use of temp vars, however personally I'd prefer declarations on individual lines.

@hyoklee
Copy link
Member

hyoklee commented Nov 4, 2022

#1176

@derobins derobins added the Type - Improvement Improvements that don't add a new feature or functionality label Mar 3, 2023
@derobins derobins added Merge - To 1.12 Merge - To 1.14 This needs to be merged to HDF5 1.14 Priority - 3. Low 🔽 Code cleanup, small feature change requests, etc. Component - C Library Core C library issues (usually in the src directory) labels May 25, 2023
@jhendersonHDF jhendersonHDF added this to the 1.14.5 milestone Mar 27, 2024
@fortnern fortnern marked this pull request as draft April 15, 2024 16:15
@derobins
Copy link
Member

Closing due to age and lack of movement. Issue #4528 has been created as a reminder to go back over this.

@derobins derobins closed this May 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component - C Library Core C library issues (usually in the src directory) Merge - To 1.14 This needs to be merged to HDF5 1.14 Priority - 3. Low 🔽 Code cleanup, small feature change requests, etc. Type - Improvement Improvements that don't add a new feature or functionality WIP Work in progress (please also convert PRs to draft PRs)
Projects
Status: Merges Complete
Development

Successfully merging this pull request may close these issues.

None yet

7 participants