589/add year selector #642

McEileen · 2019-03-13T16:20:12Z

Status

Ready for review

Description of Changes

Fixes #604 and #589 .

Changes proposed in this pull request:

Add last_employment_date and last_employment_details fields to all officers.
Add last_employment_date and last_employment_details fields to the add/edit officer forms.
Add a year dropdown selector to the page that displays all of a department's officers.

Notes for Deployment

Screenshots (if appropriate)

Tests and linting

I have rebased my changes on current develop
pytests pass in the development environment on my local machine
(There are two failing tests, test_admin_can_disable_user and test_lastname_capitalization, however, they were failing before this PR.
flake8 checks pass

dismantl · 2019-03-20T20:11:17Z

I'm not understanding what the 'Year' dropdown in the filter section does. Does that mean year of employment?

McEileen · 2019-03-20T20:27:18Z

Yes, so for example, if someone were trying to find an officer they interacted with in 2018, they could select "2018" in the dropdown. @camfassett pointed out in this issue that as we obtain rosters for 2019, we want historic data to remain accessible to the public. It's a good way to see how officer's assignments or roster sizes change over time.

McEileen · 2019-03-26T21:14:26Z

Could someone please review this when they get a chance? Thanks! 🙏

dismantl

Great work on this! I left a couple comments, mostly about avoiding static lists of years. Let me know what you think about that.

OpenOversight/app/main/forms.py

.vscode/settings.json

OpenOversight/app/utils.py

OpenOversight/migrations/versions/f01cb1bc6293_.py

McEileen · 2019-06-21T22:09:34Z

@r4v5 @redshiftzero Could one of you please review this PR? @dismantl already reviewed it once and his main feedback was to remove the hardcoded lists of years and to instead use a QuerySelectField. Thank you! 🙏

McEileen · 2020-02-24T04:09:04Z

I found a bug in this PR where officers that started working for a department after the year selected would still appear in results. For example, if a user selected "1970" from the drop-down selector, an officer that started working in "2005" would still appear in the results.
I will fix this bug and then ping someone for a re-review.
The PR is ready for re-review!
I fixed the bug where the year selector returned officers who worked for the department after the selected year. I also fixed a bug where adding a new officer to the department would cause the year_choices() method to return duplicate years.
@redshiftzero @r4v5 @dismantl @b-meson @brianmwaters Your thoughts are appreciated.

McEileen · 2020-03-13T01:38:55Z

OpenOversight/app/utils.py

@@ -62,6 +62,14 @@ def dept_choices():
    return db.session.query(Department).all()


+def year_choices():
+    years = [year[0] for year in db.session.query(cast(func.extract('year', Officer.employment_date), Integer)).filter(Officer.employment_date.isnot(None)).order_by(func.extract('year', Officer.employment_date))]
+    years_deduped = list(set(years))


I realize this isn't a super efficient way to dedupe the years. Please let me know if anyone can think of a more performant approach.

Could we just hardcode the years to be from like the year we founded OO or something (open to discussion how far back to go), to the current year?

Even though the current code isn't ideal, I prefer it to hardcoding a list of years. I think hardcoding is brittle and vulnerable to breakage.

I am not sure I understand what exactly we want to get here. I think whatever years we offer to select should be an uninterrupted range. If it was 1990, 1995, 2003 and 2010 a user would probably be somewhat confused. If you want 2005, do you select 2003 or 2010? I would either hardcode the start year (maybe 2014 or something, I don't think we have much data from before that) and go until current year or take the lowest that's in the database and offer every year from that year on until current year

My goal with year_choices() is to display every year for which a department has roster information.

take the lowest that's in the database and offer every year from that year on until current year

I went for this approach.

brianmwaters · 2020-07-10T20:00:12Z

Still working on my in-depth code review here, but I noticed in the meantime that the "year" selector only appears in the filter bar on the left-hand side of the officer listing page, but not on the "find an officer" page (e.g, /find). This seems to sort of make sense, but just want to verify that that's by design and not an oversight (ba-dum tiss)

brianmwaters

Review done! Sorry if this is too nitpicky, I guess that's what happens when you do a line-by-line review. Also github says there's a merge conflict so we need to fix that too. From a high level this looks like a great new feature, thanks @McEileen

brianmwaters · 2020-07-09T22:21:25Z

OpenOversight/app/main/forms.py

@@ -202,6 +202,9 @@ class AddOfficerForm(Form):
                            query_factory=unit_choices, get_label='descrip',
                            allow_blank=True, blank_text=u'None')
    employment_date = DateField('Employment Date', validators=[Optional()])
+    last_employment_date = DateField('Last Employment Date', validators=[Optional()])
+    last_employment_details = StringField('Last Employment Details', default='', validators=[
+        Regexp(r'\w*'), Length(max=50), Optional()])


Why are we restricting this comment to just alphanumeric characters? What about punctuation, etc? It seems like a length-limited text field should be pretty open ended.

My first thought was, "shoot, you're right." Then I tested this and was able to successfully enter last employment details with punctuation.

When I changed the validator from Regexp(r'\w*') to Regexp(r'\w'), then I was no longer able to enter punctuation. It appears that the wildcard adds flexibility to this regex.

The issue here is that Regexp(r'\w*') doesn't do anything. It matches literally any string. Regexp(r'\w') only fails if the first character of the string is punctuation. So the question here really is which strings should be allowed and which shouldn't?

I think my 2019 self used this regex because it's being used in most of the other string fields in main/forms.py. Since @abandoned-prototype is right that it matches literally any string, I changed the regex to require alphanumeric characters in its input. The field now allows punctuation, but will throw an error if the input is only punctuation.

brianmwaters · 2020-07-09T22:22:16Z

OpenOversight/app/main/forms.py

@@ -248,6 +251,9 @@ class EditOfficerForm(Form):
    gender = SelectField('Gender', choices=GENDER_CHOICES, coerce=lambda x: x or None,
                         validators=[AnyOf(allowed_values(GENDER_CHOICES))])
    employment_date = DateField('Employment Date', validators=[Optional()])
+    last_employment_date = DateField('Last Employment Date', validators=[Optional()])
+    last_employment_details = StringField('Last Employment Details', default='', validators=[
+        Regexp(r'\w*'), Length(max=50), Optional()])


Ditto the above comment about alphanumeric chars

brianmwaters · 2020-07-09T22:31:49Z

OpenOversight/app/commands.py

-                        ((salary.overtime_pay and 'overtime_pay' in row and
-                            Decimal('%.2f' % salary.overtime_pay) == Decimal('%.2f' % float(row['overtime_pay']))) or
-                            (not salary.overtime_pay and ('overtime_pay' not in row or not row['overtime_pay']))):
+                        ((salary.overtime_pay and 'overtime_pay' in row and Decimal('%.2f' % salary.overtime_pay) == Decimal('%.2f' % float(row['overtime_pay']))) or (not salary.overtime_pay and ('overtime_pay' not in row or not row['overtime_pay']))):


This conditional was super confusing before and is super duper confusing now! Not gonna lie I have no idea what's going on here. Do you think you could shed some light w/ a comment?

This only appears in the diff because of a formatting change. Since I didn't write the logic, I'm not the right person to add an explanatory comment.

This is comparing the imported salary with the officer's existing salaries, due to type and formatting differences.

brianmwaters · 2020-07-10T20:34:50Z

OpenOversight/app/main/views.py

@@ -481,12 +476,15 @@ def list_officer(department_id, page=1, race=[], gender=[], rank=[], min_age='16
    if request.args.get('rank') and all(rank in rank_choices for rank in request.args.getlist('rank')):
        form_data['rank'] = request.args.getlist('rank')

+    if request.args.get('year') and request.args.get('year') in year_choices():


It looks like this was already happening the code base before you added this line, but I've always learned to use is not None to check for None in Python, as opposed to relying on the falsy-ness of None. This is recommended by PEP 8 (see the 2nd bullet point under the heading): https://www.python.org/dev/peps/pep-0008/#programming-recommendations

brianmwaters · 2020-07-10T20:41:20Z

OpenOversight/app/utils.py

@@ -62,6 +62,14 @@ def dept_choices():
    return db.session.query(Department).all()


+def year_choices():
+    years = [year[0] for year in db.session.query(cast(func.extract('year', Officer.employment_date), Integer)).filter(Officer.employment_date.isnot(None)).order_by(func.extract('year', Officer.employment_date))]
+    years_deduped = list(set(years))


Could we just hardcode the years to be from like the year we founded OO or something (open to discussion how far back to go), to the current year?

brianmwaters · 2020-07-10T20:42:52Z

OpenOversight/app/utils.py

@@ -300,6 +310,15 @@ def filter_by_form(form, officer_query, department_id=None):
        officer_query = officer_query.filter(db.or_(db.and_(Officer.birth_year <= min_birth_year,
                                                            Officer.birth_year >= max_birth_year),
                                                    Officer.birth_year == None))  # noqa
+    if form.get('year') and form.get('year') in year_choices():


Same comment as above w/ regard to checking the truthyness of None

brianmwaters · 2020-07-10T20:44:42Z

OpenOversight/tests/conftest.py

@@ -81,17 +82,29 @@ def pick_salary():
    return random.randint(100, 100000000) / 100


+def generate_random_date(year, month):


Hmmmmm randomness in tests seems bad. What about picking a random number once and then hardcoding that into the tests?

Unless, if this is standard practice in our test suite already, then I guess it's fine to keep doing it.

I think it's okay because a lot of the other methods in conftest.py, like pick_birth_date() or pick_gender(), use random.randint() or random.choice(). Let me know if you disagree.

It is true that we do this in a lot of places and I think it is actually a problem. I saw an issue with the randomness of our test-data just yesterday.
But given that it is already so widespread I guess it's fine to keep doing it for now. And if it's just the actual value randomness is not as bad as deciding if a field is None or not based on randomness.

PR #756 offers an alternate implementation that might have better random distribution since it hashes the officer name: https://github.com/lucyparsons/OpenOversight/pull/756/files#diff-4eef8ece740b94822be839e43d28d7a7R42

I saw an issue with the randomness of our test-data just yesterday.

What was the problem you saw with test data randomness? Is it something that we need to open an issue for?

I can use the implementation from PR #756, but I would also like to know more about the problems with our existing use of random test data.

I will open an issue on the concrete problem I saw and probably add some ideas I have for moving forward

brianmwaters · 2020-07-10T20:46:19Z

OpenOversight/tests/test_utils.py

@@ -118,6 +119,86 @@ def test_filter_by_partial_unique_internal_identifier_returns_officers(mockdata)
        assert returned_identifier == identifier


+def test_year_filter_selects_officer_who_quit_that_year(mockdata):


Looks like you covered a lot of corner cases, nice job

McEileen · 2020-07-27T01:28:56Z

Still working on my in-depth code review here, but I noticed in the meantime that the "year" selector only appears in the filter bar on the left-hand side of the officer listing page, but not on the "find an officer" page (e.g, /find). This seems to sort of make sense, but just want to verify that that's by design and not an oversight (ba-dum tiss)

Yep, I only put the "year" selector on the browse page, since that's where the original issue, #589, requested it. Still, thanks for opening (ba-dum tiss?) my eyes to this.

McEileen · 2020-07-29T02:27:11Z

I responded to feedback. Here's a video of the feature in action.

McEileen · 2020-08-02T20:16:05Z

I responded to feedback. Thanks to everyone who offered their thoughts.

abandoned-prototype · 2020-08-05T02:59:54Z

OpenOversight/tests/conftest.py

+def pick_date(seed: bytes = None, start_year=2000, end_year=2020):
+    # source: https://stackoverflow.com/questions/40351791/how-to-hash-strings-into-a-float-in-01
+    # Wanted to deterministically create a date from a seed string (e.g. the hash or uuid on an officer object)
+    from struct import unpack
+    from hashlib import sha256
+
+    def bytes_to_float(b):
+        return float(unpack('L', sha256(b).digest()[:8])[0]) / 2 ** 64
+
+    if seed is None:
+        seed = str(uuid.uuid4()).encode('utf-8')
+
+    return datetime.datetime(start_year, 1, 1, 00, 00, 00) \
+        + datetime.timedelta(days=365 * (end_year - start_year) * bytes_to_float(seed))
+


Nice approach. If we want to go with this I think we should put the bytes_to_float function in a utils-file are something so that we can use it everywhere, and I suggest rewriting the function as

def bytes_to_float(seed = None): if seed is None: seed = uuid.uuid4() return float(unpack('L', sha256(str(seed).encode('utf-8')).digest()[:8])[0]) / 2 ** 64

that way any object that has a str function can be used as seed.

if we want the result to be deterministic we have to make sure that the input is deterministic too. In this case it seems that the last name is used, which is unfortunately randomly chosen.

abandoned-prototype · 2020-08-05T03:19:51Z

OpenOversight/app/utils.py

+    oldest_year = db.session.query(cast(func.extract('year', Officer.employment_date), Integer)) \
+                            .filter(Officer.employment_date.isnot(None)) \
+                            .order_by(func.extract('year', Officer.employment_date)) \
+                            .limit(1).first()[0]
+    current_year = datetime.datetime.now().year
+    year_choices = [str(year) for year in range(oldest_year, current_year + 1)]
+    return year_choices


This looks good. I think for the oldest year we should use a SELECT min(employment_date) FROM officers to make it as efficient as possible. I looked up how to do that in sqlalchemy and it seems
oldest_year = db.session.query(func.min(Officer.employment_date)).scalar().year
should be do the trick.
What do we want to do in the case that there is no officer with a (not NULL) employment date?

I keep coming back to this and I am think we should just hard-code the oldest_year. I think we now have some officers in the database whose employment_date is before 1950, which will make the drop-down menu pretty long and we really don't have good data for those year anyways. So my suggestion would be to hard-code it to 1980 or something, but I am also fine with the suggestion I made, if we can come up with a solution for an officer table without any non-NULL employment dates.

abandoned-prototype · 2020-09-03T04:37:03Z

OpenOversight/app/templates/officer.html

+              {% endif %}
+              {% if officer.last_employment_details %}
+                <tr>
+                  <td><b>Last Employment Date Details</b></td>


Can we change the label that we display to the users to something that is easier understood? Like "Details on Resignation" or "Details on End of Employment" or something? Not exactly sure how this field will be used though

dismantl reviewed Mar 27, 2019

View reviewed changes

OpenOversight/app/main/forms.py Outdated Show resolved Hide resolved

.vscode/settings.json Outdated Show resolved Hide resolved

OpenOversight/app/utils.py Outdated Show resolved Hide resolved

OpenOversight/migrations/versions/f01cb1bc6293_.py Outdated Show resolved Hide resolved

McEileen mentioned this pull request Apr 22, 2019

604/add last employment date #636

Closed

3 tasks

McEileen force-pushed the 589/add-year-selector branch 2 times, most recently from 56a7759 to f606b88 Compare June 21, 2019 21:45

McEileen force-pushed the 589/add-year-selector branch from f606b88 to 6a8e1c7 Compare October 20, 2019 02:01

McEileen force-pushed the 589/add-year-selector branch from 48e1e60 to b4fbab1 Compare January 21, 2020 02:16

McEileen changed the title ~~589/add year selector~~ [WIP] 589/add year selector Feb 24, 2020

McEileen force-pushed the 589/add-year-selector branch from b4fbab1 to 1704181 Compare March 13, 2020 01:02

McEileen changed the title ~~[WIP] 589/add year selector~~ 589/add year selector Mar 13, 2020

McEileen force-pushed the 589/add-year-selector branch 2 times, most recently from dd010ad to bb076a1 Compare March 13, 2020 01:28

McEileen commented Mar 13, 2020

View reviewed changes

McEileen force-pushed the 589/add-year-selector branch from bb076a1 to 2dc4ef7 Compare June 25, 2020 03:44

McEileen requested review from redshiftzero, abandoned-prototype, brianmwaters and r4v5 June 25, 2020 03:45

brianmwaters reviewed Jul 10, 2020

View reviewed changes

McEileen force-pushed the 589/add-year-selector branch from 2dc4ef7 to 34972cc Compare July 29, 2020 02:22

McEileen added 3 commits August 2, 2020 12:12

add year selector without hardcoding year options

4c5eeb5

resolve make populate error

d477566

refactor year_choices() and year filter in browse form

a1a1d6d

McEileen added 3 commits August 2, 2020 12:12

do cleanup, including alembic split head fix

88d1dec

make tweaks and fix merge conflict

896496d

refactor regexp validator, test date data and year selector

8eef74c

McEileen force-pushed the 589/add-year-selector branch from 34972cc to 8eef74c Compare August 2, 2020 20:15

abandoned-prototype reviewed Aug 5, 2020

View reviewed changes

abandoned-prototype reviewed Sep 3, 2020

View reviewed changes

abandoned-prototype mentioned this pull request Sep 28, 2020

Indicate whether officers are actively employed or retired #836

Open

		@@ -81,17 +82,29 @@ def pick_salary():
		return random.randint(100, 100000000) / 100


		def generate_random_date(year, month):

		@@ -118,6 +119,86 @@ def test_filter_by_partial_unique_internal_identifier_returns_officers(mockdata)
		assert returned_identifier == identifier


		def test_year_filter_selects_officer_who_quit_that_year(mockdata):

589/add year selector #642

Are you sure you want to change the base?

589/add year selector #642

Conversation

McEileen commented Mar 13, 2019

Status

Description of Changes

Notes for Deployment

Screenshots (if appropriate)

Tests and linting

dismantl commented Mar 20, 2019

McEileen commented Mar 20, 2019

McEileen commented Mar 26, 2019

dismantl left a comment

Choose a reason for hiding this comment

McEileen commented Jun 21, 2019

McEileen commented Feb 24, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brianmwaters commented Jul 10, 2020

brianmwaters left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

McEileen commented Jul 27, 2020

McEileen commented Jul 29, 2020

McEileen commented Aug 2, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

McEileen commented Feb 24, 2020 •

edited