Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Google Drive API does not always return permissionIds in files.list request #7853

Closed
chscott opened this issue May 18, 2024 · 8 comments
Closed
Labels
Remote: Drive Support Contract Issues made for customers with support contracts

Comments

@chscott
Copy link

chscott commented May 18, 2024

The associated forum post URL from https://forum.rclone.org

N/A

What is the problem you are having with rclone?

Google Drive does not always return permissionIds in files.list request. I believe I'm seeing the same issue documented in this StackOverflow topic.

Here's what I see in our mapper log when the problem occurs. Unfortunately, I do not have the rclone log, but this logging just prints the raw object received from rclone:

[1mZ2lGOWiTo1fZ9mToOwy7MoELJlhIlH_] Object from rclone: {"SrcFs":"Source{xrgXa}:","SrcFsType":"drive","DstFs":"Target{qpiyZ}:","DstFsType":"onedrive","Remote":"REDACTED.pdf","Size":97263,"MimeType":"application/pdf","ModTime":"2024-01-17T20:29:42.252Z","IsDir":false,"ID":"1mZ2lGOWiTo1fZ9mToOwy7MoELJlhIlH_","Metadata":{"btime":"2024-01-17T20:30:45Z","content-type":"application/pdf","copy-requires-writer-permission":"false","mtime":"2024-01-17T20:29:42.252Z","starred":"false","viewed-by-me":"true","writers-can-share":"false"}}

Note that there is no permissions property.

We enabled verbose logging for a second attempt, but the file wasn't deleted in the target, so it wasn't marked for transfer. But I see this in the rclone log:

{
  "copyRequiresWriterPermission": false,
  "md5Checksum": "a11510a3be225916b8e61fe37d0bd81a",
  "viewedByMe": true,
  "mimeType": "application/pdf",
  "parents": ["1WLFh3GPi4yuo14my91XXdhKEl3LVLkwr"],
  "webViewLink": "https://drive.google.com/file/d/1mZ2lGOWiTo1fZ9mToOwy7MoELJlhIlH_/view?usp=drivesdk",
  "size": "97263",
  "id": "1mZ2lGOWiTo1fZ9mToOwy7MoELJlhIlH_",
  "name": "REDACTED.pdf",
  "starred": false,
  "trashed": false,
  "explicitlyTrashed": false,
  "createdTime": "2024-01-17T20:30:45.000Z",
  "modifiedTime": "2024-01-17T20:29:42.252Z",
  "viewedByMeTime": "2024-01-17T20:30:16.047Z",
  "hasAugmentedPermissions": false,
  "permissionIds": ["08143305669083284370", "09148056542529227285", "05730075870274744324", "14428220901927124198", "10333761160097088327", "01188868095341529512", "11263538924981798501", "15302165612694109360", "15968751374380107705", "01848923844518485244", "17043712991773318823", "00993337219959399960", "13488475314984407437", "15844604519322965153", "08826457509796904545", "12361398014948882636", "14049068988452217961", "17543280361655486121", "16152041014387613021", "14207865859295967165", "02698292558398922638", "08631903179092097718", "06109885187853851371", "04455832434476281173", "12302732694699397834", "07136569753069756070", "07122999719496272434", "17295166189381661175", "02012353872150768475", "13316925832582353280", "03558791494311673145", "06079240731634784000", "01693074727832241524", "16323005513084617832", "09884135463593788567", "05699788620167125482", "07088479073237879776", "11978124046481694590", "02068273953380346298"]
}

When metadata is enabled and a given file does not have the permissionIds property set, can we issue a request to get that information for the file in question, thereby working around this apparent API defect?

What is your rclone version (output from rclone version)

rclone v1.67.0-beta.7957.625622bcb.fix-onedrive-metadata
- os/version: Microsoft Windows 11 Pro 23H2 (64 bit)
- os/kernel: 10.0.22631.3527 (x86_64)
- os/type: windows
- os/arch: amd64
- go/version: go1.22.3
- go/linking: static
- go/tags: cmount

Which OS you are using and how many bits (e.g. Windows 7, 64 bit)

Windows 11, 64 bit

Which cloud storage system are you using? (e.g. Google Drive)

Google Drive

The command you were trying to run (e.g. rclone copy /tmp remote:tmp)

rclone copy Source: Target:

A log from the command with the -vv flag (e.g. output from rclone -vv copy /tmp remote:tmp)

I'll email these separately. Unfortunately, this issue is intermittent, and we saw the issue before enabling verbose logging.

How to use GitHub

  • Please use the 👍 reaction to show that you are affected by the same issue.
  • Please don't comment if you have no relevant information to add. It's just extra noise for everyone subscribed to this issue.
  • Subscribe to receive notifications on status change and new comments.
@chscott chscott added the Support Contract Issues made for customers with support contracts label May 18, 2024
@ncw
Copy link
Member

ncw commented May 18, 2024

Heads up @rclone/support - the "Support Contract" label was applied to this issue.

@ncw
Copy link
Member

ncw commented May 20, 2024

We only add permissionIds if we didn't have any permissions.

// PermissionIds: Output only. List of permission IDs for users with
// access to this file.
//
// Only process these if we have no Permissions
if len(info.PermissionIds) > 0 && len(info.Permissions) == 0 {

Perhaps this is the problem? I think the reasoning behind this is that decoded permissions will have everything you need in so only show this if not present. This could be a bit cleverer and only show permisisonIds that aren't in permissions.


From the stack overflow post this looks like a bug in google drive (rclone already has quite a few workarounds for them).

If you do an rclone -M -R --disable ListR lsjson source: on the drive remote then this should be identical to what rclone is doing when it is doing an rclone -M sync/copy

Can you do that a few times and see if you can see flakiness?

I suggest you pipe it through jq to make it easy to read, then diff the trials.

I didn't see a bug in the google bug tracker about this.

When metadata is enabled and a given file does not have the permissionIds property set, can we issue a request to get that information for the file in question, thereby working around this apparent API defect?

Possibly but depending on exactly what is happening we may not be able to tell the permissionsIds are missing or not.

@chscott
Copy link
Author

chscott commented May 20, 2024

I don't have direct access to this environment, but I'll see if I can arrange that test. I corrected my original posting, where I mistakenly said that the permissionIds property was not set in the metadata from rclone. That should have said the permissions property; as far as I know, I don't ever get the permissionIds property in the metadata from rclone, as that wouldn't be helpful for translating to the target backend.

If rclone does not receive a permissions property from Google Drive but does receive a permissionIds property, does it fetch the permissions for those IDs so they can be sent along to the mapper? If so, then I think the problem must be that Google Drive will not always provide at least one of these properties so that the permissions can be available for the mapper. Clarifying my suggestion, I'm asking if we can use the absence of both properties as a signal to proactively attempt to fetch the permissions of an object using the permissions.list API. This is a hack, but if Google Drive won't provide the permissions in the files.list API, it seems better to take the performance hit of additional overhead than to risk thinking a folder/file doesn't have permissions when it really does.

Regarding --disable ListR, is that recommended in general for Google Drive? We don't set --fast-list, but we currently don't specifically disable ListR, so I guess rclone can possibly choose it in our scenarios.

@chscott
Copy link
Author

chscott commented May 20, 2024

@ncw The HTTP response from the test shows the presence of the permissionIds property, but the lsjson output does not have any permissions-related information in the metadata.

HTTP Response

{
  "copyRequiresWriterPermission": false,
  "md5Checksum": "a11510a3be225916b8e61fe37d0bd81a",
  "viewedByMe": true,
  "mimeType": "application/pdf",
  "parents": ["1WLFh3GPi4yuo14my91XXdhKEl3LVLkwr"],
  "webViewLink": "https://drive.google.com/file/d/1mZ2lGOWiTo1fZ9mToOwy7MoELJlhIlH_/view?usp=drivesdk",
  "size": "97263",
  "id": "1mZ2lGOWiTo1fZ9mToOwy7MoELJlhIlH_",
  "name": "REDACTED.pdf",
  "starred": false,
  "trashed": false,
  "explicitlyTrashed": false,
  "createdTime": "2024-01-17T20:30:45.000Z",
  "modifiedTime": "2024-01-17T20:29:42.252Z",
  "viewedByMeTime": "2024-01-17T20:30:16.047Z",
  "hasAugmentedPermissions": false,
  "permissionIds": ["08143305669083284370", "09148056542529227285", "05730075870274744324", "14428220901927124198", "10333761160097088327", "01188868095341529512", "11263538924981798501", "15302165612694109360", "15968751374380107705", "01848923844518485244", "17043712991773318823", "00993337219959399960", "13488475314984407437", "15844604519322965153", "08826457509796904545", "12361398014948882636", "14049068988452217961", "17543280361655486121", "16152041014387613021", "14207865859295967165", "02698292558398922638", "08631903179092097718", "06109885187853851371", "04455832434476281173", "12302732694699397834", "07136569753069756070", "07122999719496272434", "17295166189381661175", "02012353872150768475", "13316925832582353280", "03558791494311673145", "06079240731634784000", "01693074727832241524", "16323005513084617832", "09884135463593788567", "05699788620167125482", "07088479073237879776", "11978124046481694590", "02068273953380346298"]
}

lsjson output

[{
        "Path": "REDACTED.pdf",
        "Name": "REDACTED.pdf",
        "Size": 97263,
        "MimeType": "application/pdf",
        "ModTime": "2024-01-17T20:29:42.252Z",
        "IsDir": false,
        "Hashes": {
            "md5": "a11510a3be225916b8e61fe37d0bd81a"
        },
        "ID": "1mZ2lGOWiTo1fZ9mToOwy7MoELJlhIlH_",
        "Metadata": {
            "btime": "2024-01-17T20:30:45.000Z",
            "content-type": "application/pdf",
            "copy-requires-writer-permission": "false",
            "mtime": "2024-01-17T20:29:42.252Z",
            "starred": "false",
            "viewed-by-me": "true",
            "writers-can-share": "false"
        }
    }
]

@chscott
Copy link
Author

chscott commented May 20, 2024

From a data perspective, it looks like this is another case of permissions not being propagated because they are not set directly on the file (i.e., `"hasAugmentedPermissions": false). So I think things are being filtered out here:

if o.fs.isTeamDrive && !info.HasAugmentedPermissions {
// Don't process permissions if there aren't any specifically set
info.Permissions = nil
info.PermissionIds = nil
}

These being inherited permissions was not my understanding when the issue was raised to me, so I'll pursue it on my end for now. I think it would be helpful to log something (perhaps at the DEBUG level) in this spot so it is clear why permissions are not being propagated.

@ncw
Copy link
Member

ncw commented May 20, 2024

If rclone does not receive a permissions property from Google Drive but does receive a permissionIds property, does it fetch the permissions for those IDs so they can be sent along to the mapper?

Yes it does.

The logic is

  • if rclone gets permissions then it ignores permissionIds
  • if there are no permissions then rclone fetches the permissions mentioned in the permissionIds and returns them as the permissions array in the metadata
  • rclone never puts permissionIds in the metadata (sorry about suggesting it did above!)

In the Google Drive API docs it says this about reading permissions

Output only. The full list of permissions for the file. This is only available if the requesting user can share the file. Not populated for items in shared drives.

So in some cases the permissions won't be available in which case rclone uses the permissionIds to fetch them if possible.

Regarding --disable ListR, is that recommended in general for Google Drive? We don't set --fast-list, but we currently don't specifically disable ListR, so I guess rclone can possibly choose it in our scenarios.

rclone sync won't use ListR by default, but rclone lsjon -R will. The ListR code for google drive has a number of workarounds for bugs in the google API so I wanted to make sure they were using the same listing API.

From a data perspective, it looks like this is another case of permissions not being propagated because they are not set directly on the file (i.e., `"hasAugmentedPermissions": false). So I think things are being filtered out here:

That makes sense.

These being inherited permissions was not my understanding when the issue was raised to me, so I'll pursue it on my end for now. I think it would be helpful to log something (perhaps at the DEBUG level) in this spot so it is clear why permissions are not being propagated.

I had a go at that here

v1.67.0-beta.7971.567fa2514.fix-7853-drive-perms on branch fix-7853-drive-perms (uploaded in 15-30 mins)

It will debug

Ignoring %d permissions and %d permissionIds as is shared drive with hasAugmentedPermissions false

@chscott
Copy link
Author

chscott commented May 20, 2024

@ncw The debug statement works well. Thanks! I think we can consider this issue resolved.

@ncw
Copy link
Member

ncw commented May 21, 2024

Great!

I've merged the debug to master now which means it will be in the latest beta in 15-30 minutes and released in v1.67

I'll close this for now.

@ncw ncw closed this as completed May 21, 2024
miku added a commit to internetarchive/rclone that referenced this issue May 23, 2024
* master: (133 commits)
  build: update all dependencies
  drive: debug when we are ignoring permissions rclone#7853
  Add Dominik Joe Pantůček to contributors
  docs: crypt: fix incorrect terminology
  operations: rework rcat so that it doesn't call the --metadata-mapper twice
  operations: ensure SrcFsType is set correctly when using --metadata-mapper
  onedrive: allow setting permissions to fail if failok flag is set
  Add Evan McBeth to contributors
  docs: improve readability in faq
  fs: fix panic when using --metadata-mapper on large google doc files
  Add JT Olio to contributors
  Add overallteach to contributors
  go.mod: update storj.io/uplink to latest release
  chore: fix function name in comment
  build: update issue label notification machinery
  operations: fix missing metadata for multipart transfers to local disk
  local: implement Object.SetMetadata
  fs: define the optional interface SetMetadata and implement it in wrapping backends
  drive: allow setting metadata to fail if failok flag is set
  cmd/gitannex: When tags do not match, run e2e tests anyway
  ...
gotenksIN added a commit to gotenksIN/rclone that referenced this issue May 26, 2024
* github.com:rclone/rclone: (59 commits)
  docs: fix some comments
  build: update all dependencies
  drive: debug when we are ignoring permissions rclone#7853
  Add Dominik Joe Pantůček to contributors
  docs: crypt: fix incorrect terminology
  operations: rework rcat so that it doesn't call the --metadata-mapper twice
  operations: ensure SrcFsType is set correctly when using --metadata-mapper
  onedrive: allow setting permissions to fail if failok flag is set
  Add Evan McBeth to contributors
  docs: improve readability in faq
  fs: fix panic when using --metadata-mapper on large google doc files
  Add JT Olio to contributors
  Add overallteach to contributors
  go.mod: update storj.io/uplink to latest release
  chore: fix function name in comment
  build: update issue label notification machinery
  operations: fix missing metadata for multipart transfers to local disk
  local: implement Object.SetMetadata
  fs: define the optional interface SetMetadata and implement it in wrapping backends
  drive: allow setting metadata to fail if failok flag is set
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Remote: Drive Support Contract Issues made for customers with support contracts
Projects
None yet
Development

No branches or pull requests

2 participants