Skip to content
This repository has been archived by the owner on Mar 12, 2020. It is now read-only.

Remove user from all classrooms tied to a GH organization that has revoked their access #2480

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

JessRudder
Copy link
Contributor

Addresses to https://github.com/github/education/issues/988

Our previous code assumed a 1-to-1 mapping of GitHub organizations to classrooms. However, many of our teachers are using a single organization on GitHub as a source for multiple classrooms.

This update ensures that when access is revoked for a GitHub organization for a user, the user is removed from all of the associated classrooms as well.

Previously we would not remove classroom access if a user was the only user with admin access to the classroom. With this update, we will be removing access regardless. In a small number of cases, this may result in admin-less classrooms.

@JessRudder JessRudder requested a review from a team November 27, 2019 00:20
@JessRudder JessRudder self-assigned this Nov 27, 2019
@ghost ghost requested a review from tessgriffin November 27, 2019 00:20
@JessRudder JessRudder requested review from jeffrafter and spinecone and removed request for tessgriffin November 27, 2019 00:21
expect(assignment.reload.creator_id).not_to eq(@user.id)
end

it "deletes user from multiple organizations (classrooms) mapped to the same GitHub organization" do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof this is another one of those times when I wish we didn't have 2 different concepts of "organization" in the backend :C

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I can't decide if it's going to get better or worse if we move into the monolith.

@spinecone
Copy link
Contributor

Should we run a migration/transition to clean up existing user/classroom associations that should have been removed already?

@JessRudder
Copy link
Contributor Author

@spinecone probably...since it's triggered by a webhook when someone's removed from the GitHub organization, I don't think there's a simple way of doing it. We'll likely need to iterate through every user/github_organization combo and make sure the user still has access (though I'm definitely open to a cleaner way of doing it).

Should that be part of this PR so we know when/why it was added? Or should that come in a follow on PR?

@spinecone
Copy link
Contributor

I don't know of a cleaner way than iterating through every user (or organization, whichever has fewer records), I would just make sure to use as few github.com API requests as possible so we don't hit any rate limits. It seems cleaner to me to have everything in this one PR, but I defer to what makes the most sense to you.

@JessRudder
Copy link
Contributor Author

@spinecone I prefer having things together, but I end up getting a lot of feedback that my PRs are too big, so, I'm probably overly cautious now. I'll get it added in and re-ping for review.

We attepted to start with orgs as there were fewer orgs than users and this seemed like a way to limit the amount of api calls. However, in order to check if someone is an admin on an org, we have to make an api call to that org authenticated as someone who is already an admin. That would require going through users in that org to see if one is an admin before running through the remaining users to see if they should be removed or not. It seems simpler now to start with a user and iterate through their orgs (removing them from any they're not an admin for).

github_org_ids.each do |gh_id|
github_org = GitHubOrganization.new(user.github_client, gh_id)
next if github_org.admin?(user.github_login)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would love a double check on this line. If this returns true, I want to move on to the next iteration. I only want to move forward to the payload line if github_org.admin?(user.github_login) returns false. I'm 99% sure I've written it right, but, you can never be too sure!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it true that admins cannot have their access to an organization revoked? I would think we would want to respect the right of an organization to restrict access to their repositories from anyone they want.

@JessRudder
Copy link
Contributor Author

@spinecone this should be ready to review again. I've added a rake task that will go through and check for any records that may need to be cleaned up. It relies on the client for the user that is being checked, so, we shouldn't end up hitting any API limits. I went with a rake task instead of a migration as this may be something we need to run again in the future.

Also, no need to rush the check on this. I won't be deploying it this late on a holiday weekend and I won't be back until Tuesday, so there's plenty of time.

def perform(payload_body)
return true unless payload_body.dig("action") == "member_removed"

github_user_id = payload_body.dig("membership", "user", "id")
github_organization_id = payload_body.dig("organization", "id")
@organization = Organization.find_by(github_id: github_organization_id)
organizations = Organization.where(github_id: github_organization_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need an .all? It feels like this will return a scope and the subsequent check for empty? will always be false. Unless that needs to be an any? or .count == 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely want the collection here, so, we shouldn't tack the .all? on, otherwise, we'll be iterating over a boolean on line 18.

I think empty? does work here though. That query returns an ActiveRecord::Relation. If you run .empty? on an AR relation, it behaves as expected (if there are no records in it, it returns true, else false).

But, it seems like .empty? might be a non-standard thing to use here. I ended up using it because I had a vague recollection that one of the checks was less effecient than the others. Based on this post it seemed .empty? was the way to go. That said, it's unlikely we'll be in the situation where this type of optimization will truly matter, so if .any? would be the more common thing to use here, I'm happy to update the check on line 16.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, if it works then it's good! Either way. I realized I had a formatting error above: .all? was supposed to be .all? (i.e., the question mark wasn't meant to be part of the method, but was definitely formatted that way 😭)


@user = @organization.users.find_by(uid: github_user_id)
failed_removals = organizations.reject do |org|
user = org.users.find_by(uid: github_user_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure how many orgs we will generally have or if there is a way to avoid this N+1. It seems like it will be negligible and in a job... so... not a big deal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried fixing this by using an includes where I grab the organizations above:

organizations = Organization.where(github_id: github_organization_id).includes(:users)

However, I end up with an error from the bullet gem:

     Bullet::Notification::UnoptimizedQueryError:
       user: jessrudder
        
       AVOID eager loading detected
         Organization => [:users]
         Remove from your finder: :includes => [:users]
       Call stack
         /Users/jessrudder/github/classroom/spec/support/bullet.rb:16:in `block (2 levels) in <top (required)>'
         bin/rspec:16:in `load'
         bin/rspec:16:in `<main>'

If an includes would be the right thing to do here to avoid the N+1, I can whitelist this for the bullet gem so it doesn't unfairly fail the test. If not, is there another way to avoid the N+1?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm I think this is probably okay as an N+1, I don't think we will have too many of these. And I realized it is just run by the rake task.

Copy link
Contributor

@jeffrafter jeffrafter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants