Commit
[Justice Counts] Convert all column names to lowercase during spreadsheet upload. (Recidiviz/recidiviz-data#29389)

## Description of the change

Convert all column names to lowercase during spreadsheet upload.

It turns out that we were running into this mysterious/misleading unexpected error for Carrol County (see Recidiviz/recidiviz-data#29303) because we convert columns to lowercase when looking for unexpected column names, but then do NOT use case-insensitive logic when actually parsing the columns later on. This puts us in a weird state where we don't catch the unexpected "Year" column during sheet validation, but the later parsing steps don't recognize the "Year" column and throw an unexpected error.

One solution here would be to make the parsing logic case-insensitive, but this will lead to issues later if we miss spots or forget to make new parsing code case-insensitive down the road. Instead, let's convert all column names to lowercase as an initial step during workbook upload. We have a place in the workbook uploader code where we drop all rows that contain NaNs. Let's add another spreadsheet cleaning step there and convert all the column names to lowercase.

## Testing

I tested this change locally by submitting a bulk upload sheet with a "Year" column instead of the lowercase "year" column.

Here are the errors that we get _without_ any changes.

<img width="816" alt="Screenshot 2024-04-26 at 10 49 36 AM" src="https://github.com/Recidiviz/recidiviz-data/assets/130382407/997b5040-c81d-43bb-b4fd-ffcfeddc0c8b">

Here is the improved error once we fix the expected-columns check (by not making that process case insensitive).

<img width="773" alt="Screenshot 2024-04-26 at 10 49 58 AM" src="https://github.com/Recidiviz/recidiviz-data/assets/130382407/94de377b-5271-4736-86e0-b133031eb6b9">

And here is the final fix: converting all column names to lowercase makes the uploaded spreadsheet a valid one.

<img width="1401" alt="Screenshot 2024-04-26 at 10 50 39 AM" src="https://github.com/Recidiviz/recidiviz-data/assets/130382407/1835ff84-751e-4fe5-ac85-d9adc2c745d6">

GitOrigin-RevId: c5401416da9877603636ff45e451129b603fc5aa
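For reference, the cleaning step described in this change could look roughly like the sketch below. It is not the actual workbook uploader code: the function name `clean_uploaded_sheet` is hypothetical, the use of pandas is an assumption, and so is the choice to drop only rows that are entirely NaN.

```python
import pandas as pd


def clean_uploaded_sheet(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning step applied to one sheet right after upload."""
    # Assumption: "rows that contain NaNs" is taken to mean rows that are
    # completely empty; the real uploader may use a different rule.
    cleaned = df.dropna(how="all")
    # Lowercase every column name once, up front, so that both validation
    # and the later parsing steps see the same names ("Year" -> "year").
    # str() guards against non-string column labels such as numbers.
    return cleaned.rename(columns=lambda name: str(name).lower())


# Usage: a sheet with a capitalized "Year" column and a trailing empty row.
sheet = pd.DataFrame({"Year": [2023, None], "Value": [1.0, None]})
print(list(clean_uploaded_sheet(sheet).columns))
```

Normalizing once at the upload boundary means no downstream parsing code has to remember to compare column names case-insensitively.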