Adapting incident reports for quantitative analysis #677

lexeree · 2020-06-18T22:45:05Z

Hey, this is an amazing project and I think there are a lot of possibilities for how to use this data. However, I've noticed that incident reports don't seem to be recorded in a way that facilitates a quantitative analysis of the data.

Looking at amalgamated data from different data sets or providing methods to generate analytics/visualizations could be really helpful, and I'd be happy to code them myself...but more data would be needed.

Specifically, I think that whenever possible, it would be great if the following data points could be parsed out from the incident reports:

Number of people injured.
Number of people killed.
Number of people arrested.
Scenario description (e.g. protest, response to 911 call, random encounter...).
Whether or not the incident has been addressed by the police (and the nature of that response).
Whether or not the incident was reported on by any mainstream news outlets (local or national).
Anything else people might want stats on...?
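
To make the request a bit more concrete, here is a rough sketch of what a per-incident record with these fields might look like. The class, field, and function names (`IncidentStats`, `scenario`, `total_known`, and so on) are my own placeholders, not anything that exists in the repo, and `None` is meant to read as "unknown" rather than zero.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncidentStats:
    """Hypothetical structured fields for one incident report."""
    incident_id: str                            # placeholder identifier
    injured: Optional[int] = None               # None means "unknown"
    killed: Optional[int] = None
    arrested: Optional[int] = None
    scenario: Optional[str] = None              # e.g. "protest", "911 call"
    police_response: Optional[str] = None       # free text, if addressed at all
    mainstream_coverage: Optional[bool] = None


def total_known(incidents, attr):
    """Sum a numeric field over the incidents where it is known.

    Returns (total, number_of_incidents_with_a_known_value) so that
    unknowns are not silently counted as zero.
    """
    values = [getattr(i, attr) for i in incidents if getattr(i, attr) is not None]
    return sum(values), len(values)


reports = [
    IncidentStats("a", injured=2, arrested=1),
    IncidentStats("b"),
    IncidentStats("c", injured=1),
]
print(total_known(reports, "injured"))  # (3, 2)
```

Returning the count of known values alongside the total is deliberate: with sparse data, "3 injured across 2 incidents with known numbers" is more honest than a bare total.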

I realize that some of this data can be extracted from the tags, but they may not always be applied consistently or may be too granular/not granular enough for certain analyses.

Thanks!

ContributingThrowaway · 2020-06-19T05:07:31Z

The vast vast vast majority of these occur at, around, or shortly after protests.

Why do you need to know whether the incident was reported on by media? (I ask because often when it was, the relevant coverage is linked; it might be possible to extract relevant coverage from the links.)

I agree that information like numbers of people injured, killed, arrested, etc would be very helpful, but I suspect it will be extremely hard to extract from many (most?) incidents. Some are such that it's not clear what would even count as being injured, killed, or arrested in the course of them. (There is a tag for the rare cases where an individual dies in a given incident.) Perhaps we could add a tag for serious injuries? (I'd also like to see a "wrongful-arrest" tag -- the "arrest" tag is used both when a wrongful arrest is conducted and when a reasonable arrest is conducted using excessive force.)

I'd be very interested down the road to see how many of these are actually dealt with by police. I'd also like some way of marking severity of incidents -- they range from slightly-careless tear-gassing to five officers pinning a man down and beating him senseless for no good reason.

lexeree · 2020-06-19T10:45:21Z

I personally would like to know about media coverage because an issue I have run into when speaking to people about police brutality at the protests is "well I haven't seen that much about it!" I think it would be helpful to know how much is covered in national news. Using the links is a good idea: I could definitely parse out the site name and compare it against a list...but there's also the possibility that incidents reported in local news could be picked up by larger networks at a later date.
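
Something like the following is what I have in mind for the link check. It is only a sketch: the outlet list is a tiny illustrative sample that would need real curating, and the function names are made up for this example. It also only sees coverage that is already linked from the incident, so later pickup by larger networks would be missed.

```python
from urllib.parse import urlparse

# Illustrative, incomplete sample of national outlets (an assumption, not a vetted list).
NATIONAL_OUTLETS = {"nytimes.com", "washingtonpost.com", "cnn.com", "nbcnews.com"}


def site_name(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def has_national_coverage(links):
    """True if any link points at a listed outlet or one of its subdomains."""
    for link in links:
        host = site_name(link)
        if any(host == outlet or host.endswith("." + outlet) for outlet in NATIONAL_OUTLETS):
            return True
    return False
```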

I agree that the number of people injured is difficult to pin down, but maybe we could write a web scraper to look for any new information published about incidents, or volunteers could check for updates? Either way, I think this project is really great, but its usefulness is limited unless we can efficiently extract analytics. Being able to compare social media posts about police abuse against media coverage, etc., could be a very useful tool for giving solid numbers on the reality of abuse of power among authorities in the US.

ed42311 · 2020-06-19T16:56:08Z

@happy-lambdas a few points of clarification here: are you talking about a standardized format, or just adding more fields for specific sets of information? For instance, each incident already has a description, a set of video links, and a title, although maybe scenario description is something more specific.

Scenario description (e.g. protest, response to 911 call, random encounter...).

This is a possibility, with some research (unless the linked video is from a news outlet).

Whether or not the incident was reported on by any mainstream news outlets (local or national).

These are more difficult, with the possible exception of people killed. Injuries and arrests would be based on estimates; I guess we could record those with some explicit margin of error, fact-checked against recordings.

Number of people injured.
Number of people killed.
Number of people arrested.

This would be great, and I have to imagine that many of these incidents are stored in the public record. With a little bit of legwork we might be able to link the police reports with the incidents. That sounds like a great idea (and a lot of work :D)

Whether or not the incident has been addressed by the police (and the nature of that response).

Also, I'm on board with some sort of rating system, @ContributingThrowaway, or maybe a binary: was there brutality or not? I'm not sure what to base the metric on if we go with a scale. The Bethel, Ohio protests seem to me like a decent talking point. There was brutality (although I did not see recorded instances of police brutality), and the police did not engage at times, but for the most part they tried to act as intermediaries while keeping the peace (this is an observation as a third party, with minimal context).

ContributingThrowaway · 2020-06-19T22:35:01Z

My first thought is to tag with many severity-relevant characteristics, such that researchers and frontend developers could come up with their own indices for severity -- +5 for improper use of rubber bullets, +6 for a wrongful arrest, etc. Possible tags include:

Whether there was a wrongful arrest; if so, how many people were arrested (e.g. a wrongful-arrest tag for one wrongful arrest, a multiple-wrongful-arrest tag for approx. 2-9 wrongful arrests, and a mass-wrongful-arrest tag for 10 or more)
Tags for bad use of projectiles -- things like indiscriminately firing into crowds, aiming for the head, etc
A serious-injury tag for when someone is severely injured (though this would necessarily be quite subjective)
etc.
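
As a rough illustration of how someone downstream might build their own index from tags like these, here is a minimal sketch. The +5 and +6 weights are the examples from above; every other weight is an arbitrary placeholder, "improper-rubber-bullets" is a made-up tag name, and none of these tags exist in the repo yet.

```python
# Example weights only: a researcher would choose their own.
SEVERITY_WEIGHTS = {
    "improper-rubber-bullets": 5,    # hypothetical tag name
    "wrongful-arrest": 6,
    "multiple-wrongful-arrest": 8,   # placeholder weight
    "mass-wrongful-arrest": 10,      # placeholder weight
    "serious-injury": 7,             # placeholder weight
}


def wrongful_arrest_tag(count):
    """Map a wrongful-arrest count to the proposed tag (1, approx. 2-9, 10 or more)."""
    if count <= 0:
        return None
    if count == 1:
        return "wrongful-arrest"
    if count < 10:
        return "multiple-wrongful-arrest"
    return "mass-wrongful-arrest"


def severity_score(tags, weights=SEVERITY_WEIGHTS):
    """Sum the weights of an incident's distinct tags; unweighted tags count as 0."""
    return sum(weights.get(tag, 0) for tag in set(tags))


print(severity_score(["wrongful-arrest", "improper-rubber-bullets", "tear-gas"]))  # 11
```

Keeping the weights in a plain dictionary outside the data means the incident files only need the tags, and anyone who disagrees with a weighting can swap in their own.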

Alternatively, we could just introduce three tags -- high-severity, medium-severity, and low-severity -- and gradually build consensus as to which should be used when. That wouldn't allow the same granular analysis, but it would at least help with filtering out the least severe incidents or focusing on the most severe.

What's the process for adding new tags?
