Add an ignore_order to DeepHash #335

Open
Eric-Vignola opened this issue Aug 10, 2022 · 2 comments
@Eric-Vignola

It appears that DeepHash ignores the order of items in the given object by default when computing the combined hash.

from deepdiff import DeepHash

# 3 example objects
x = {'a': 0, 'b': [1, 2, 3]}  # a baseline example object
y = {'b': [1, 2, 3], 'a': 0}  # key order swapped
z = {'a': 0, 'b': [2, 1, 3]}  # swapped positions in list for key 'b'

# in all examples, the combined hash is the same
print(DeepHash(x)[x])  # '343d77f8a45dac16bc49a7be37c1ee73250ac4311e316862393f3c2552ff5b64'
print(DeepHash(y)[y])  # '343d77f8a45dac16bc49a7be37c1ee73250ac4311e316862393f3c2552ff5b64'
print(DeepHash(z)[z])  # '343d77f8a45dac16bc49a7be37c1ee73250ac4311e316862393f3c2552ff5b64'

It would be incredibly useful to have an option to respect order when computing hash signatures of complex data structures, something like:
DeepHash(x, ignore_order=False)[x] == DeepHash(z, ignore_order=False)[z] # Returns False

Treating dict key order as an exception would also be great, since it gives more flexibility:
DeepHash(x, ignore_order=False, sort_dict=True)[x] == DeepHash(y, ignore_order=False, sort_dict=True)[y] # Returns True
DeepHash(x, ignore_order=False, sort_dict=True)[x] == DeepHash(z, ignore_order=False, sort_dict=True)[z] # Returns False
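
To make the requested semantics concrete, here is a minimal standard-library sketch. It is not DeepHash and does not reflect how DeepDiff serializes objects; `ordered_hash` and its `sort_dict` flag are hypothetical names that only mirror the proposal above, and only dicts, lists, tuples, and plain scalars are handled.

```python
import hashlib


def ordered_hash(obj, sort_dict=True):
    """Hypothetical helper (not part of DeepDiff): order-sensitive SHA-256 hash."""

    def canon(o):
        # Build a canonical string in which list/tuple order is always preserved.
        if isinstance(o, dict):
            items = [(canon(k), canon(v)) for k, v in o.items()]
            if sort_dict:
                # Dict key order is ignored only when sort_dict is True.
                items.sort()
            return "dict:{" + ",".join(k + "=" + v for k, v in items) + "}"
        if isinstance(o, (list, tuple)):
            return type(o).__name__ + ":[" + ",".join(canon(i) for i in o) + "]"
        # Scalars: include the type name so that 1 and '1' hash differently.
        return type(o).__name__ + ":" + repr(o)

    return hashlib.sha256(canon(obj).encode("utf-8")).hexdigest()


x = {'a': 0, 'b': [1, 2, 3]}
y = {'b': [1, 2, 3], 'a': 0}
z = {'a': 0, 'b': [2, 1, 3]}

print(ordered_hash(x) == ordered_hash(y))  # True: only the key order differs
print(ordered_hash(x) == ordered_hash(z))  # False: the list order differs
print(ordered_hash(x, sort_dict=False) == ordered_hash(y, sort_dict=False))  # False
```

With `sort_dict=False`, even the dict insertion order contributes to the hash, which is the strictest variant of the request.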

@seperman
Owner

@Eric-Vignola interesting idea. Currently DeepDiff uses DeepHash to identify identical objects before it starts digging into the ones that are not identical. Then, and only then, does DeepDiff start ignoring order among those non-identical objects.

What you are asking for would also require rewriting how we serialize objects. That is a non-trivial amount of work.
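
The "hash first, compare the rest" flow described above can be illustrated with a rough sketch. This is an assumption-laden simplification, not DeepDiff's actual code: `item_hash` is a hypothetical stand-in for DeepHash that hashes each item's `repr`, and repeated items are not counted.

```python
import hashlib


def item_hash(item):
    # Hypothetical stand-in for DeepHash: hash the repr of the item.
    return hashlib.sha256(repr(item).encode("utf-8")).hexdigest()


def unmatched_items(old, new):
    """Pair off items with equal hashes regardless of position; return the rest."""
    old_hashes = {item_hash(i) for i in old}
    new_hashes = {item_hash(i) for i in new}
    # Items whose hash appears on both sides are identical and need no further work.
    only_old = [i for i in old if item_hash(i) not in new_hashes]
    only_new = [i for i in new if item_hash(i) not in old_hashes]
    return only_old, only_new


removed, added = unmatched_items([1, 2, [3, 4]], [[3, 4], 1, [3, 5]])
print(removed)  # [2]
print(added)    # [[3, 5]]
```

Only the leftover items would then need a deeper, order-insensitive comparison, which is why the hash itself is not where ordering is currently decided.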

@Okroshiashvili

#373

This is my issue pointing out the same thing. I closed it, thinking it was a silly question 😄
