Skip to content
This repository has been archived by the owner on Feb 12, 2019. It is now read-only.

keybase chat commands get slower and slower #1888

Open
mortenkjarulff opened this issue Oct 31, 2018 · 24 comments
Open

keybase chat commands get slower and slower #1888

mortenkjarulff opened this issue Oct 31, 2018 · 24 comments
Labels

Comments

@mortenkjarulff
Copy link

Attached script do:

while true ; do

  sleep 15

  keybase chat send ...

  ls /keybase/team/...

  echo '{"method": "read", ... | keybase chat api ...

  if 6 hours has gone
    run_keybase
  fi

done

After a while, chat send and chat read gets slower and slower, going from less that 1 second to more than 60 seconds.
sleep and ls times seems constant.
When 6 hours has gone and the run_keybase is executed, times get low again (but build up over the next 6 hours).

If I outcomment the ls command, all times are constant low.

These logs are from the same run, WITH the ls command:
a4f72ae29df5ba7e9a4c2a1c
0e3959537531bc60a74ac21c

These logs are from the same run, WITHOUT the ls command:
41c5342a84ccee599571831c
b7cc0500989c6a67920c0a1c

/Morten

@mortenkjarulff
Copy link
Author

test.sh.txt

@strib
Copy link
Contributor

strib commented Oct 31, 2018

@mortenkjarulff did you mean to file this in keybase/client? Chat is pretty much completely separate from kbfs at this point.

@mortenkjarulff
Copy link
Author

Hi,

Appears not to be so separate 😀

Shall I file it there instead?

/Morten

@strib strib added the acked label Oct 31, 2018
@strib
Copy link
Contributor

strib commented Oct 31, 2018

Oh, I guess I glossed over or didn't understand the line about "outcomment the ls".

Ok, here's what appears to be the issue. The team folder you're lsing has been written to by a device that has since been revoked. When you ls it, that triggers kbfs to attempt to "garbage-collect" the old data in it, by iterating through the old revisions of the folder. When it looks at the revisions that were written by the now-revoked device, it triggers a bunch of extra checking to verify that the device was valid at the time of the revision.

It looks like, in the case of this folder, the revisions by that revoked device are so expensive that they are slowing down chat.

Can you just ls that folder once, and maybe let the garbage collection run overnight and see if it finishes? You can tell if it made progress overnight by looking at the LastGCRevision field in the team folder's .kbfs_status file.

If it doesn't make any progress, even after many hours of work, I'll try to figure out a custom solution for you.

And obviously we need to figure out a way to reduce these extra checks without affecting security...

@mortenkjarulff
Copy link
Author

Not sure I understand. I can reproduce at any time, even long time after I revoked.

I will manually ls now, and start the script tomorrow.

@strib
Copy link
Contributor

strib commented Oct 31, 2018

We only garbage-collect data when it is at least two weeks old, but it could be longer depending on when your devices were online and when they accessed this particular folder. In your case the device was revoked on October 3rd, but it's trying to garbage-collect data written on September 29th.

@mortenkjarulff
Copy link
Author

Ok, will try

Log after manual ls
a0cb1d208188c2886665501c

@mortenkjarulff
Copy link
Author

LastGCRevision": 0,

Is that good?

@strib
Copy link
Contributor

strib commented Oct 31, 2018

That means it has never been successfully garbage collected. If the team/folder is very old, this is surprising and might indicate some other bug.

@mortenkjarulff
Copy link
Author

Timestamp on folder is October 8, that could well be the date I created the subteam

@strib
Copy link
Contributor

strib commented Oct 31, 2018

Looks like it was created on September 20, so that lines up with the garbage collection and revocation messages I mentioned above.

@mortenkjarulff
Copy link
Author

LastGCRevision": 0,
d46b49510feabaac7be4161c

If it can help

@strib
Copy link
Contributor

strib commented Oct 31, 2018

Please just let it run for a long time uninterrupted.

@mortenkjarulff
Copy link
Author

Still 0

@strib
Copy link
Contributor

strib commented Nov 1, 2018

Hrm that's a shame. I'm going to work today on making this whole thing better.

strib added a commit that referenced this issue Nov 1, 2018
Instead of hitting the server every single time we need to verify a
revoked key.  For example, when we have to verify lots of MDs written
by the same revoked key when doing garbage collection.

Issue: KBFS-3581
Issue: #1888
strib added a commit that referenced this issue Nov 1, 2018
Instead of hitting the server every single time we need to verify a
revoked key.  For example, when we have to verify lots of MDs written
by the same revoked key when doing garbage collection.

Issue: KBFS-3581
Issue: #1888
strib added a commit that referenced this issue Nov 1, 2018
Instead of hitting the server every single time we need to verify a
revoked key.  For example, when we have to verify lots of MDs written
by the same revoked key when doing garbage collection.

Issue: KBFS-3581
Issue: #1888
@strib
Copy link
Contributor

strib commented Nov 1, 2018

I put up #1893 to help eliminate much of the redundant work we're doing for revoked devices. It's under review now, but if you're feeling bold and want to try it out before it's merged officially, let me know.

@mortenkjarulff
Copy link
Author

When will it be official maybe?

If it can help, how can I test it?

Would it help me if I created a new subteam, copy over and rename?

/Morten

@strib
Copy link
Contributor

strib commented Nov 2, 2018

It probably won't be out in the official release for a while, but I've made a .deb file for you, if you feel like trying it out:

/keybase/public/strib/keybase-2.10.0-20181102162247.e4a1860091-amd64.deb

or if you trust keybase.pub:

https://keybase.pub/strib/keybase-2.10.0-20181102162247.e4a1860091-amd64.deb

Note this is not a well-tested release, so there could be some rough edges. Also, there might be multiple issues going on with your folder, and I only fixed the most obvious first one, so this debugging might take more steps. We'll see!

If you feel like trying it, please install the above deb, run run_keybase, ls your folder again, and wait a bit. If you do a log send after a few minutes I can check to see if it seems to be making progress. Thanks!

@mortenkjarulff
Copy link
Author

Installed your test version.
Still 0.
921efd3e424dcbad89bf081c

/Morten

@strib
Copy link
Contributor

strib commented Nov 4, 2018

Thanks. I was wrong to say that it starts the GC process after 1 minute -- we changed it a while back to one hour. It looks like you sent the log less than an hour after you did the ls. Can you please send another log now if it's been up this whole time?

@mortenkjarulff
Copy link
Author

"LastGCRevision": 9,
d96ca4acddd6f20834cd6d1c

It has been running for some hours.

/Morten

@strib
Copy link
Contributor

strib commented Nov 4, 2018

Ok cool. It at least stopped chewing up your CPU right? I'll look into why it hasn't gone past 9 yet tomorrow, but that seems like a separate issue.

@mortenkjarulff
Copy link
Author

Have been running my test script all day and send/read times are constant low.

I don't think the issue was CPU, but can be wrong.

Should I reinstall the official version again, or?

/ Morten

@strib
Copy link
Contributor

strib commented Nov 5, 2018

Great!

I wouldn't recommend re-installing the master version, as the problem could come back. Just wait for the next apt release, which will include these changes. Though if you seem to be having unexpected problems with the release you have. let me know.

strib added a commit that referenced this issue Nov 12, 2018
Instead of hitting the server every single time we need to verify a
revoked key.  For example, when we have to verify lots of MDs written
by the same revoked key when doing garbage collection.

Issue: KBFS-3581
Issue: #1888
strib added a commit that referenced this issue Nov 12, 2018
Instead of hitting the server every single time we need to verify a
revoked key.  For example, when we have to verify lots of MDs written
by the same revoked key when doing garbage collection.

Issue: KBFS-3581
Issue: #1888
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

2 participants