Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: FirestoreKVStore's aget_all raises AttributeError when collection is not empty #13385

Closed
gblazq opened this issue May 9, 2024 · 1 comment · Fixed by #13386
Closed
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized

Comments

@gblazq
Copy link
Contributor

gblazq commented May 9, 2024

Bug Description

The aget_all method in FirestoreKVStore is assuming the type signature of google.cloud.firestore_v1.async_collection.AsyncCollectionReference.list_documents to be AsyncGenerator[DocumentReference, None]. However, since this is an async collection, the actual return type is AsyncGenerator[AsyncDocumentReference, None] (googleapis/python-firestore#918)

Assuming the wrong return type raises AttributeError: 'coroutine' object has no attribute 'to_dict' on this line

Instead, since doc is an AsyncDocumentReference, it should be awaited before accessing the to_dict method.

Version

0.10.35.post1

Steps to Reproduce

  • Add documents to a FirestoreKVStore
  • Get all documents asynchronously with aget_all

Relevant Logs/Tracbacks

Traceback (most recent call last):
  File "/Users/gubl/repos/mlt-content-generator-backend/contentgenerator/ingestion_pipeline.py", line 196, in <module>
    asyncio.run(main())
  File "/Users/gubl/.pyenv/versions/3.10.12/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/gubl/.pyenv/versions/3.10.12/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/Users/gubl/repos/mlt-content-generator-backend/contentgenerator/ingestion_pipeline.py", line 192, in main
    await asyncio.gather(documents_pipeline.arun(), keywords_pipeline.arun())
  File "/Users/gubl/Library/Caches/pypoetry/virtualenvs/mlt-content-generator-backend-mezdK3Cq-py3.10/lib/python3.10/site-packages/llama_index/core/ingestion/pipeline.py", line 858, in arun
    nodes_to_run = await self._ahandle_upserts(
  File "/Users/gubl/Library/Caches/pypoetry/virtualenvs/mlt-content-generator-backend-mezdK3Cq-py3.10/lib/python3.10/site-packages/llama_index/core/ingestion/pipeline.py", line 781, in _ahandle_upserts
    (await self.docstore.aget_all_document_hashes()).values()
  File "/Users/gubl/Library/Caches/pypoetry/virtualenvs/mlt-content-generator-backend-mezdK3Cq-py3.10/lib/python3.10/site-packages/llama_index/core/storage/docstore/keyval_docstore.py", line 577, in aget_all_document_hashes
    for doc_id in await self._kvstore.aget_all(
  File "/Users/gubl/Library/Caches/pypoetry/virtualenvs/mlt-content-generator-backend-mezdK3Cq-py3.10/lib/python3.10/site-packages/llama_index/storage/kvstore/firestore/base.py", line 209, in aget_all
    data = doc.get().to_dict()
AttributeError: 'coroutine' object has no attribute 'to_dict'
@gblazq gblazq added bug Something isn't working triage Issue needs to be triaged/prioritized labels May 9, 2024
Copy link

dosubot bot commented May 9, 2024

The issue you're experiencing with FirestoreKVStore's aget_all method is due to not awaiting the doc.get() coroutine before attempting to call to_dict() on its result. This leads to the AttributeError: 'coroutine' object has no attribute 'to_dict' because doc.get() returns a coroutine that needs to be awaited to obtain the document snapshot, which then allows you to call .to_dict() on the snapshot.

Here's how you can fix the method:

async def aget_all(self, collection: str = DEFAULT_COLLECTION) -> Dict[str, dict]:
    collection_id = self.firestore_collection(collection)
    docs = self._adb.collection(collection_id).list_documents()
    output = {}
    async for doc in docs:
        key = doc.id
        doc_ref = await doc.get()  # Await the coroutine to get the document snapshot
        if doc_ref.exists:
            data = doc_ref.to_dict()  # Now you can safely call to_dict on the document snapshot
            if data is None:
                continue
            val = self.replace_field_name_get(data)
            output[key] = val
    return output

By awaiting doc.get(), you ensure that you're working with a document snapshot from which you can safely call .to_dict(), thus resolving the AttributeError.

Details

Sources


About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

To continue the conversation, mention @dosu.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant