Unexpected Repetition of Elements when Generating Delta Between Dictionaries #410

Open
kfirc opened this issue Jul 25, 2023 · 1 comment
kfirc commented Jul 25, 2023

Describe the bug
When using the deepdiff library to discern the differences between two dictionaries and generate a delta, an unexpected repetition of elements occurs.

To Reproduce

from deepdiff import DeepDiff, Delta

d1 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}]}
d2 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}

deep_diff_result = DeepDiff(d1, d2, exclude_regex_paths=[r"(?=root.*\['id'\])"], ignore_order=True, report_repetition=True)

result = d2 + Delta(deep_diff_result)
print(result)
  1. Take two dictionaries:
d1 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}]}
d2 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}
  2. Use the following code to compare them:
from deepdiff import DeepDiff, Delta
deep_diff_result = DeepDiff(d1, d2, exclude_regex_paths=[r"(?=root.*\['id'\])"], ignore_order=True, report_repetition=True)
  3. Check the output:
{'repetition_change': {"root['a'][0]": {'old_repeat': 3, 'new_repeat': 4, 'old_indexes': [0, 1, 2], 'new_indexes': [0, 1, 2, 3], 'value': {'id': 1}}}}
  4. Apply the delta to d2:
result = d2 + Delta(deep_diff_result)
print(result)

Expected behavior
I anticipated the only difference between d1 and d2 to be the {'id': 4} entry.
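
For reference, the expected difference can be sketched without DeepDiff at all. This is only an order-insensitive membership check using plain list comprehensions; the names `added` and `removed` are illustrative, and the check deliberately ignores how many times an item repeats.

```python
d1 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}]}
d2 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}

# Items present in one list but not the other, regardless of position.
added = [item for item in d2['a'] if item not in d1['a']]
removed = [item for item in d1['a'] if item not in d2['a']]

print(added)    # [{'id': 4}]
print(removed)  # []
```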

OS, DeepDiff version and Python version (please complete the following information):

  • OS: macOS
  • Version: Ventura 13.4.1
  • Python Version: 3.9.0
  • DeepDiff Version: 6.3.0

Additional context
The result produced was {'a': [{'id': 1}, {'id': 1}, {'id': 1}, {'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}, which had four repetitions of {'id': 1}.

I've not found any similar issue on Stack Overflow, and I've reviewed open and closed issues on the deepdiff GitHub repository without identifying any similar scenarios.

Related Research: I've looked through Delta Documentation, but it didn't provide clarity for this particular case.

Thanks in advance

seperman (Owner) commented Aug 31, 2023

Hi @kfirc
Thanks for reporting this.
The problem is that exclude_regex_paths does not work properly together with report_repetition:

In [1]: from deepdiff import DeepDiff, Delta
   ...:
   ...: d1 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}]}
   ...: d2 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}
   ...:
   ...: deep_diff_result = DeepDiff(d1, d2, exclude_regex_paths=[r"(?=root.*\['id'\])"], ignore_order=True, report_repetition=True)
   ...:

In [2]: deep_diff_result
Out[2]:
{'repetition_change': {"root['a'][0]": {'old_repeat': 3,
   'new_repeat': 4,
   'old_indexes': [0, 1, 2],
   'new_indexes': [0, 1, 2, 3],
   'value': {'id': 1}}}}

In [3]: deep_diff_result = DeepDiff(d1, d2, ignore_order=True, report_repetition=True)

In [4]: deep_diff_result
Out[4]: {'iterable_item_added': {"root['a'][3]": {'id': 4}}}

In your case, the Delta object is told that {'id': 1} needs to be repeated 4 times. That's why you get the unexpected result.
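
One way to picture where the 3 and 4 come from: the regex excludes every path ending in ['id'], so once that key is ignored the list items presumably become indistinguishable from one another. The sketch below mimics that effect with the standard library only. `strip_keys` is a hypothetical helper, and this is an illustration under that assumption, not DeepDiff's actual implementation.

```python
from collections import Counter

def strip_keys(item, excluded):
    """Return a hashable view of a dict with the excluded keys dropped."""
    return tuple(sorted((k, v) for k, v in item.items() if k not in excluded))

d1 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}]}
d2 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}

# With 'id' ignored, every item collapses to the same empty tuple,
# so each list looks like one value repeated several times.
old_counts = Counter(strip_keys(item, {'id'}) for item in d1['a'])
new_counts = Counter(strip_keys(item, {'id'}) for item in d2['a'])

print(old_counts)  # Counter({(): 3})
print(new_counts)  # Counter({(): 4})
```

The counts line up with old_repeat and new_repeat in the reported repetition_change entry, with the first item, {'id': 1}, standing in as the repeated value.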

seperman added the bug label Aug 31, 2023
seperman self-assigned this Nov 14, 2023