Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a user id to the collector end point (was record ip/browser agent) #22

Open
2 of 4 tasks
danmarsden opened this issue Dec 4, 2017 · 9 comments
Open
2 of 4 tasks
Assignees

Comments

@danmarsden
Copy link
Member

danmarsden commented Dec 4, 2017

  • When we send the headers we are not yet logged in so we'll send generic ones
  • later in the bootstrap, when we are logged in, lets see if we can overwrite the headers. If we can replace the csp report uri with a collector.php?user=xxx
  • when that is present then a violation comes in then record it too. If it is not then we can fall back to some other method like ip or user agent.
  • recording the user / ip / US against each record will make this a combinatoric explosion in the amount of data we would collect, so not sure if we stash this behind a setting, or just wear it

Original:

It would be really nice if the csp logs also recorded the ip and browser agent strings..

@brendanheywood
Copy link
Contributor

By design we didn't want to collect a ton of data, it's just updating counts of errors so you can see whats the most important stuff to fix first. If we record ips and agents then this feels more like just dumping that into the normal moodle log, or maybe into a special log file or std error with a fixed prefix so can be easily filtered?

@danmarsden
Copy link
Member Author

we've noticed some strange behaviour related to a specific user (probably browser hacked) - it would be nice to have a way to identify this a bit better and match it to the specific user that triggered the event by looking at the csp logs. (and matching against ip/browser agent)

@brendanheywood
Copy link
Contributor

I'm gonna close this. This feels weird to be related to the csp plugin. If it's still an issue please dump some data or more context so I can get a handle on why we'd want this

@danmarsden
Copy link
Member Author

the csp report was showing some extremely suspicious Javascript that was being included in a page that violated the CSP report policy - obviously malicious JS that was being included by a users browser - Turned out the users browser had been hacked but tracking the specific user down was difficult because all CSP reports was the url accessed, if more information was recorded at the time of the report it would have made it much easier to idenify the user so that IT support could make contact and arrange for password resets of all their logins/virus checker installed etc...

@brendanheywood
Copy link
Contributor

That makes sense. You tracked it down via the access log then?

Also was that in report mode or in the real mode? Once we have settled on a policy and are enforcing it we could treat this more as a thing which alerts or collects data. I'm just worried about crazy amounts of data when you first turn this on, as a typical page could have 10 reports on it, multiplied by every page multiplied by every user.

@danmarsden
Copy link
Member Author

only showed in report mode - in real mode you don't get the report at all, because the browser just blocks it.

Was a real pain to track down because the site was under heavy use, and lots of people were hitting the same pages/courses - we had to analyse the list of pages and then try and compare it against the current active users to find a similar pattern of access.

@brendanheywood
Copy link
Contributor

I've softened up on this front so I'll reopen this. I've just come across the exact same scenario and now I can see properly how this is inadequate.

I think when you start out and still don't have a good policy in place you'll be spammed a lot and I don't want to record everything so I'm thinking about either putting this behind a setting so that it only records when you have a good policy in place and you only have a very small report of violations.

I think we can actually instrument this in a way to directly collect the user by adding a user id of some sort into the collection uri in the header

@brendanheywood brendanheywood changed the title record ip/browser agent Add a user id to the collector end point (was record ip/browser agent) Nov 21, 2019
@brendanheywood
Copy link
Contributor

@danmarsden I've just pushed a micro change 10c52a6 which should get us most of the way. Strictly I'm not recording the uid for the user but I just tacking it onto the collector endpoint. I'm not doing anything with it after that but this does let us easy trace these back through via the normal access logs. There will still be some more matching up of things to fully link it, but this is about as good as we can get generically without a crazy performance hit.

@brendanheywood
Copy link
Contributor

Just adding some more thoughts here. In #40 we want to give out finer grained reports to course managers so they can see and fix their own content. However we also want to clearly distinguish between content level violations which need to be fixed by the manager vs violations caused by browser plugins or other issues. These latter tend to be specific to individuals. If we collect data for everyone then we get this massive log explosion, so I'm thinking of a hybrid:

  1. we keep it as a single db record for each class of violation
  2. we have a small field which records a list of affected userids, could be comma sep and contain no more than say 5 ids
  3. if we collect more than 5 we just ignore it
  4. if a report has < 5 ids then we can assume that this is an issue related to a plugin rather than the content

It's not perfect but feels like the right balance of performance and removing noise from the reports. These violations will still be visible to admins and we could also expose them to the course managers too via some warnings around how to interpret them properly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants