Discovery document cache file has invalid JSON (possibly being written by two processes) #970

Closed
dilip-grexit opened this issue Jul 14, 2020 · 3 comments
Labels
  • needs more info: This issue needs more information from the customer to proceed.
  • type: question: Request for information or clarification. Not an issue.

Comments

dilip-grexit commented Jul 14, 2020

All of a sudden, the Gmail API call started failing because of a problem with the discovery document. The discovery doc cache file contained invalid JSON; it looks like two processes were writing to it at the same time (in spite of the locking?).
This happened at about 9:50 AM GMT on 9th Jul 2020.

Environment details

  • OS type and version: Ubuntu 14.04.6 LTS
  • Python version: Python 2.7.15
  • pip version: pip 9.0.3 from /home/shard/opt/python-2.7.15/lib/python2.7/site-packages (python 2.7)
  • google-api-python-client version: 1.6.7

Steps to reproduce

  1. Try to make a Gmail API call.

Code example

>>> from googleapiclient.discovery import build
>>> service = build('gmail', 'v1')
  File "/home/shard/opt/python-2.7.15/lib/python2.7/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/home/shard/opt/python-2.7.15/lib/python2.7/site-packages/googleapiclient/discovery.py", line 225, in build
    credentials=credentials)
  File "/home/shard/opt/python-2.7.15/lib/python2.7/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/home/shard/opt/python-2.7.15/lib/python2.7/site-packages/googleapiclient/discovery.py", line 331, in build_from_document
    service = json.loads(service)
  File "/home/shard/opt/python-2.7.15/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/home/shard/opt/python-2.7.15/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/shard/opt/python-2.7.15/lib/python2.7/json/decoder.py", line 380, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 3770 column 104 (char 166290)

example

I've attached an invalid file here: https://drive.google.com/drive/folders/12LWm_EKNeWenJWUeIJA8BRDrXjz27OYO?usp=sharing

Note that line 7 of the following snippet contains invalid JSON, possibly because a different process was writing to the file at the same time.

"quotaUser": {
      "type": "string",
      "description": "Available to use for quota purposes for server-side applications. Can be any arbitrary string assigned to a user, but should not exceed 40 characters.",
      "location": "query"
    },
    "upload_protocol": {
      "description": "Upload protocol for media (e.g. \"raw\", \"eapis.com/auth/gmail.settings.sharing": {
          "description": "Manage your sensitive mail settings, including who can manage your mail"
        },
        "https://www.googleapis.com/auth/gmail.modify": {
          "description": "View and modify but not delete your email"
        },
        "https://www.googleapis.com/auth/gmail.metadata": {
          "description": "View your email message metadata such as labels and headers, but not the email body"
        },
        "https://www.googleapis.com/auth/gmail.labels": {
          "description": "Manage mailbox labels"
        },
        "https://www.googleapis.com/auth/gmail.addons.current.message.metadata": {
          "description": "View your email message metadata when the add-on is running"
        },
        "https://www.googleapis.com/auth/gmail.insert": {
          "description": "Insert mail into your mailbox"
        }
      }
    }
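
The parser error in the traceback points at the character where the file stops being valid JSON. A minimal sketch for inspecting such a file, assuming Python 3 (the report above uses Python 2.7, where `json.loads` raises a plain `ValueError` without position attributes). `describe_json_error` and the `broken` sample are hypothetical names for illustration, not part of google-api-python-client:

```python
import json


def describe_json_error(text, context=40):
    """Return details on where `text` stops being valid JSON, or None if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        start = max(exc.pos - context, 0)
        return {
            "message": exc.msg,
            "line": exc.lineno,
            "column": exc.colno,
            "char": exc.pos,
            # Text on both sides of the failure point, to make a splice visible.
            "context": text[start:exc.pos + context],
        }
    return None


# Tiny stand-in for the corrupted file: unrelated content starts in the
# middle of a string value, as in the snippet above.
broken = '{"a": {"description": "Upload protocol (e.g. \\"raw\\", \\"eapis.com/x": {"b": 1}}}'
report = describe_json_error(broken)
print(report)
```

Reading the real cache file into `text` and printing the `context` field should show the content on either side of the point where the two writes overlap.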
busunkim96 (Contributor) commented

@dilip-grexit What environment are you using the library in? (This changes the type of cache the library chooses.) Are you using multiple threads or processes with the library?

@busunkim96 busunkim96 added needs more info This issue needs more information from the customer to proceed. type: question Request for information or clarification. Not an issue. labels Jul 14, 2020
dilip-grexit (Author) commented

We're not using multiple threads or processes within a single program. However, several independent processes that use the library run on the same server under supervisor. The library uses a file as its caching method. Let me know what details are needed.
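
When independent processes share one cache file, a reader may encounter a document that another process has only partly written. A minimal sketch of treating an unreadable cache file as a cache miss, assuming Python 3 and a hypothetical helper `load_cached_document` that is not part of google-api-python-client:

```python
import json
import os


def load_cached_document(path):
    """Return the parsed cache file, or None if it is missing or not valid JSON."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return None
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError. Treat a corrupt
        # file as a miss and remove it so the next write starts clean.
        try:
            os.remove(path)
        except FileNotFoundError:
            pass
        return None
```

A caller that gets `None` back would fetch the discovery document again rather than fail on the stale file. This only hides the symptom; it does not stop two writers from interleaving.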

parthea (Contributor) commented Dec 7, 2020

Hi @dilip-grexit ,

I'm sorry that it took a while to respond. Based on this comment from #1061, google-api-python-client will download discovery docs and distribute them with the library itself. There is a work-in-progress PR, #1109, and I'm hoping to have it merged soon. At that point, caching discovery documents that are included with google-api-python-client should be unnecessary.

If you still want to use caching, I'd like to use the existing issue #1061 to brainstorm improvements to caching in general, so that all of the discussion around caching is captured in a single issue. One potential improvement is to move from a file-based cache to a directory-based cache, where each discovery document is stored in a separate file. This way we don't need to lock the single cache file. It would also reduce the chance of 'collisions' that result in corrupt files.
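
The directory-based idea can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the design discussed in #1061 and not code from google-api-python-client: the class name `DirectoryCache`, the SHA-256 file naming, and the write-then-rename step are all choices made for this example, and it assumes Python 3.

```python
import hashlib
import json
import os
import tempfile


class DirectoryCache:
    """Hypothetical cache: one file per discovery URL, written atomically."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, url):
        # Hash the URL so that any URL maps to a safe, fixed-length file name.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".json"
        return os.path.join(self.directory, name)

    def get(self, url):
        try:
            with open(self._path(url), "r", encoding="utf-8") as handle:
                return json.load(handle)
        except (FileNotFoundError, ValueError):
            # A missing or unparsable entry is a cache miss.
            return None

    def set(self, url, document):
        # Write to a temporary file in the same directory, then rename it
        # into place, so a reader sees the old file or the new one in full.
        fd, tmp_path = tempfile.mkstemp(dir=self.directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                json.dump(document, handle)
            os.replace(tmp_path, self._path(url))
        except BaseException:
            try:
                os.remove(tmp_path)
            except FileNotFoundError:
                pass
            raise
```

With this layout, two processes caching different documents never touch the same file. Two processes caching the same document each write their own temporary file, and the last rename wins.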

I'm going to close this as a duplicate of #1061, but please feel free to re-open it if you're still having trouble.

@parthea parthea closed this as completed Dec 7, 2020
@parthea parthea self-assigned this Dec 7, 2020