Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Firestore: using ArrayUnion in 'update()' or 'set(merge=true)' on nested data removes all previous data #14

Closed
A-Iskakov opened this issue Mar 17, 2019 · 16 comments · Fixed by #290
Assignees
Labels
api: firestore Issues related to the googleapis/python-firestore API. priority: p2 Moderately-important priority. Fix may not be included in next release. 🚨 This issue needs some love. status: blocked Resolving the issue is dependent on other work. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@A-Iskakov
Copy link

Environment details

  1. Cloud Firestore
  2. Ubuntu 18.04
  3. Python 3.7
  4. google-cloud-firestore Version: 0.31.0

Steps to reproduce

  1. Use ArrayUnion([1,2,3]) on updating nested data in Document reference (see code example bellow)

Code example

from google.cloud import firestore
from google.cloud.firestore_v1beta1 import ArrayUnion

# creating reference
db = firestore.Client()
doc_ref = db.collection(u'test_collection').document(u'test_document')

# creating initial data with nested tree_1
doc_ref.set(
            {
                u'forest': {
                    u'tree_1': u'oak'
                }
            }
        )

# getting document after initial data
print('Here is initial data with 1 nested tree', doc_ref.get().to_dict())

# adding tree_2 to nested data (tree_1 remains in nested data)
doc_ref.set(
            {
                u'forest': {
                    u'tree_2': u'pine'
                }
            },
            merge=True
        )

print('Here is 2 trees in nested data after using set(merge=true)', doc_ref.get().to_dict())

# this code removes other nested data in document
# including tree_1 and tree_2, even if merge=true  is used
doc_ref.set(
            {
                u'forest': {
                    u'tree_3': ArrayUnion([u'spruce'])
                }
            },
            merge=True
        )

print('after using ArrayUnion, all previous nested data is removed', doc_ref.get().to_dict())

Stack trace

Here is initial data with 1 nested tree {'forest': {'tree_1': 'oak'}}
Here is 1 initial tree and 1 additional tree after set(merge=true) {'forest': {'tree_1': 'oak', 'tree_2': 'pine'}}
after using set(merge=true) with ArrayUnion all nested data is removed {'forest': {'tree_3': ['spruce']}}

Many Thanks!

@tseaver tseaver changed the title using ArrayUnion in update() or set(merge=true) on nested data removes all previous data Firestore: using ArrayUnion in update() or set(merge=true) on nested data removes all previous data Mar 18, 2019
@tseaver tseaver changed the title Firestore: using ArrayUnion in update() or set(merge=true) on nested data removes all previous data Firestore: using ArrayUnion in 'update()' or 'set(merge=true)' on nested data removes all previous data Mar 18, 2019
@tseaver
Copy link
Contributor

tseaver commented Mar 20, 2019

@A-Iskakov Thanks for the report. I believe this behavior is due to the same underlying cause as googleapis/google-cloud-python#7215: the code leaves behind the parent mapping in the update_mask when moving the transform (ArrayUnion in your case, SERVER_TIMESTAMP in googleapis/google-cloud-python#7215) to the second write of the commit.

@A-Iskakov
Copy link
Author

@tseaver Hello!
Thank you for reply!

I believe this behavior is due to the same underlying cause as googleapis/google-cloud-python#7215

I believe not as I have tried to replace ArrayUnion with SERVER_TIMESTAMP in the code to reproduce current issue, it works fine, nothing is deleted.
So the possible cause of problem in googleapis/google-cloud-python#7215 might be use of dots while addressing nested objects like 'foo.now'

@A-Iskakov A-Iskakov reopened this Mar 20, 2019
@A-Iskakov
Copy link
Author

@tseaver hello
I double checked the issue with SERVER_TIMESTAMP:
when SERVER_TIMESTAMP is only data to be sent with set(dict, merge=True) function, all original data is also erased.
if you add any other data to nested dictionary among with SERVER_TIMESTAMP, function works correct

@HemangChothani
Copy link
Contributor

@A-Iskakov hello
I have checked issue it works corrrect, you just have to use ArrayUnion([u'spruce'].values, else it throws a TypeError.

Code Example

from google.cloud import firestore
from google.cloud.firestore_v1beta1 import ArrayUnion

db = firestore.Client()
doc_ref = db.collection(u'test_collection').document(u'test_document')

# creating initial data with nested tree_1
doc_ref.set(
            {
                u'forest': {
                    u'tree_1': u'oak'
                }
            }
        )
print('Here is initial data with 1 nested tree', doc_ref.get().to_dict())
doc_ref.set(
            {
                u'forest': {
                    u'tree_3': ArrayUnion([u'spruce']).values
                }
            },
            merge=True
        )

print('after using ArrayUnion' , doc_ref.get().to_dict())

Output:

Here is initial data with 1 nested tree {'forest': {'tree_1': 'oak'}}
after using ArrayUnion {'forest': {'tree_1': 'oak'},{'tree_3': ['spruce']}}

@tseaver
Copy link
Contributor

tseaver commented Jul 19, 2019

I can confirm that the issue remains even after the merge of googleapis/google-cloud-python#8701

Python 3.7.1 (default, Oct 22 2018, 11:31:38) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from google.oauth2 import service_account
>>> from google.cloud import firestore
>>> creds = service_account.Credentials.from_service_account_file('/path/to/creds.json')
>>> client = firestore.Client(credentials=creds, project=creds.project_id)
>>> doc_ref = client.document('gcp-7523', 'test-document')
>>> doc_ref.get().to_dict()
>>> doc_ref.set({'forest': {'tree_1': 'oak'}})
update_time {
  seconds: 1563560252
  nanos: 187033000
}

>>> doc_ref.get().to_dict()
{'forest': {'tree_1': 'oak'}}
>>> doc_ref.set({'forest': {'tree_2': 'pine'}}, merge=True)
update_time {
  seconds: 1563560269
  nanos: 162782000
}

>>> doc_ref.get().to_dict()
{'forest': {'tree_1': 'oak', 'tree_2': 'pine'}}
>>> doc_ref.set({'forest': {'tree_3': firestore.ArrayUnion(['spruce'])}}, merge=True)
update_time {
  seconds: 1563560280
  nanos: 164830000
}

>>> doc_ref.get().to_dict()
{'forest': {'tree_3': ['spruce']}}

@jadekler, @schmidt-sebastian ISTM that the behavior here (set-with-merge of data with nested transforms wipes out other nested keys) is mandated by https://github.com/googleapis/conformance-tests/blob/master/firestore/v1/set-st-merge-nonleaf-alone.json. Can you confirm?

@jeanbza
Copy link
Member

jeanbza commented Jul 19, 2019

cc @BenWhitehead who has taken over those tests / firestore.

@tseaver
Copy link
Contributor

tseaver commented Jul 19, 2019

@jadekler Thanks, sorry for the noise.

This is my reproducer system test for this issue:

def test_repro_7523(client, cleanup):

    doc_ref = client.document("gcp-7523", "test-document")
    cleanup(doc_ref.delete)
    doc_ref.delete()
    tree_1 = {"forest": {"tree-1": "oak"}}
    tree_2 = {"forest": {"tree-2": "pine"}}
    tree_3 = {"forest": {"tree-3": firestore.ArrayUnion(["spruce"])}}

    doc_ref.set(tree_1)
    expected = tree_1.copy()
    assert doc_ref.get().to_dict() == expected

    doc_ref.set(tree_2, merge=True)
    expected["forest"]["tree-2"] = tree_2["forest"]["tree-2"]
    assert doc_ref.get().to_dict() == expected

    doc_ref.set(tree_3, merge=True)
    expected["forest"]["tree-3"] = "spruce"
    assert doc_ref.get().to_dict() == expected

@BenWhitehead
Copy link
Collaborator

BenWhitehead commented Jul 22, 2019

After chatting with @schmidt-sebastian this issues does appear to be a bug, and the merge with transform should work with nested fields.

We identified this conformance test set: ServerTimestamp alone with MergeAll as similar to this issue, however this issue is updating in a nested map whereas the test is not.

I'm going to look into adding a new test for this case and we can ensure it's working for all the clients.

Internal bug for new test b/138130675

@tseaver
Copy link
Contributor

tseaver commented Jul 22, 2019

@BenWhitehead OK, cool. Please do also consider the set-st-merge-nonleaf-alone test (whose description reads "set-merge: non-leaf merge field with ServerTimestamp alone").

@tseaver
Copy link
Contributor

tseaver commented Sep 27, 2019

@BenWhitehead Any update on this issue? I don't see any new commits to the firestore conformance tests.

@amitjain-v
Copy link

Is there any known workaround for this issue ?

@crwilcox crwilcox transferred this issue from googleapis/google-cloud-python Jan 31, 2020
@product-auto-label product-auto-label bot added the api: firestore Issues related to the googleapis/python-firestore API. label Jan 31, 2020
@yoshi-automation yoshi-automation added 🚨 This issue needs some love. triage me I really want to be triaged. labels Feb 3, 2020
@crwilcox crwilcox added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p2 Moderately-important priority. Fix may not be included in next release. labels Feb 4, 2020
@BenWhitehead BenWhitehead removed 🚨 This issue needs some love. triage me I really want to be triaged. labels Feb 4, 2020
@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Feb 4, 2020
@BenWhitehead
Copy link
Collaborator

The new conformance test it still on my backlog I hope to get to it soon.

@BenWhitehead BenWhitehead removed the 🚨 This issue needs some love. label Feb 14, 2020
@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Feb 15, 2020
@BenWhitehead BenWhitehead added type: process A process-related concern. May include testing, release, or the like. and removed type: process A process-related concern. May include testing, release, or the like. labels Aug 6, 2020
@crwilcox
Copy link
Contributor

The conformance test work is still pending at this time.

b/138130675

@crwilcox crwilcox added the status: blocked Resolving the issue is dependent on other work. label Nov 4, 2020
@tseaver
Copy link
Contributor

tseaver commented Jan 19, 2021

PR #219, which dropped the second Write.transform message in favor of using update_transforms on the first Write message, may have fixed this bug.

@crwilcox
Copy link
Contributor

@tseaver that seems likely. @craiglabenz and I just paired and have a conformance test as well. I think what might have been missing is some of the machinery around update masks also. But we should be protected once that conformance test is merged :)

@crwilcox
Copy link
Contributor

gcf-merge-on-green bot pushed a commit that referenced this issue Jan 21, 2021
Adds a system test demonstrating that ArrayUnion operation no longer deletes pre-existing data

Fixes #14 🦕
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: firestore Issues related to the googleapis/python-firestore API. priority: p2 Moderately-important priority. Fix may not be included in next release. 🚨 This issue needs some love. status: blocked Resolving the issue is dependent on other work. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants