Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid custom code to combine hash codes, instead use Object.hash / Object.hashAll #264

Open
mkustermann opened this issue Jan 4, 2023 · 2 comments
Labels
type-enhancement A request for a change that isn't a bug

Comments

@mkustermann
Copy link
Member

Currently the DeepCollectionEquality code has it's own implementation of combining hashcodes, see lib/src/equality.dart:

class _MapEntry {
  ...
  int get hashCode =>
      (3 * equality._keyEquality.hash(key) +
          7 * equality._valueEquality.hash(value)) &
      _hashMask;
}

class MapEquality<K, V> implements Equality<Map<K, V>> {
  ...
  int hash(Map<K, V>? map) {
    if (map == null) return null.hashCode;
    var hash = 0;
    for (var key in map.keys) {
      var keyHash = _keyEquality.hash(key);
      var valueHash = _valueEquality.hash(map[key] as V);
      hash = (hash + 3 * keyHash + 7 * valueHash) & _hashMask;
    }
    hash = (hash + (hash << 3)) & _hashMask;
    hash ^= (hash >> 11);
    hash = (hash + (hash << 15)) & _hashMask;
    return hash;
  }
}

We may want to consider avoiding that, and instead use Object.hash / Object.hashAll in the core libraries, which implementation can optimize according to various modes (web or non-web, JIT/AOT, 32-bit/64-bit, compressed pointers or not)

/cc @lrhn

@lrhn
Copy link
Member

lrhn commented Jan 4, 2023

I think this code is trying to avoid doing two iterations and any intermediate allocations. It still does a lookup for each key, which is also bad, but possibly only constant-factor bad (assuming the map has constant-time lookup).

It's is valid performance concern to avoid linear allocations or double iteration, but that means we can't directly use Object.hash/Object.hashAll.

We could do Object.hashAll(map.entries.expand((e) => [_keyEquality.hash(e.key), _valueEquality.hash(e.value)]), but that allocates a MapEntry and a two-element list for each entry of the map.

Or Object.hashAll(map.keys.map(_keyEquality.hash).concat(map.values.map(_valueEquality.hash))), which is probably only a constant amount of allocation (five iterables, two closures), but does two iterations.

If we could expose an interface to hashing, like:

class Hasher<H> {
  static const Hasher<Object?> defaultHasher = _SystemHasher();

  H initialState();

  // May modify state and return the same object, or return a new state.
  // Always use the new value for all further operations.
  H update(H state, int value);

  int finalize(H state); 
}

based on the same primitives as the platform hash, then we could write this as:

  int hash(Map<K, V>? map) {
    const hasher = Hasher.defaultHasher;
    var state = hasher.initialState();
    map.forEach((k, v) {
      state = hasher.update(hasher.update(state, _keyEquality.hash(k)), _valueEquality.hash(v));
    });
    return hasher.finalize(state);
  }

In other words, the current platform hash operations are not great for someone wanting to write their own hash for a collection.

@lrhn lrhn added the type-enhancement A request for a change that isn't a bug label Jan 4, 2023
@mkustermann
Copy link
Member Author

If we could expose an interface to hashing, like ...

👍 I've also mentioned this use case in bottom of dart-lang/sdk#50693. Maybe we should consider adding this to corelibs?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-enhancement A request for a change that isn't a bug
Projects
None yet
Development

No branches or pull requests

2 participants