Malformed Data in Courses JSON #688

Open
noschiff opened this issue Apr 24, 2022 · 0 comments
Labels
bug Something isn't working

Comments

@noschiff
Member

Our script to populate full-courses.json takes the data as is from the API, and this can lead to issues when our code tries to use these courses.

First, many catalogWhenOffered values are not in the format we expect, such as "Fall or Spring." or "Fall, spring." Second, some strings contain NBSPs (non-breaking spaces), particularly in the catalogComments field. NBSPs are almost never a good idea. At best, removing them avoids formatting issues; at worst, nothing changes.

I see two primary paths for fixing this. We can either add "cleaning" to our courses-json-generator.ts script to (attempt to) fix the semesters offered and replace NBSPs with spaces, or we can make our semester-verification code accept "weirder" data. I strongly prefer the former, at least for the NBSPs, because any code that uses the semesters offered can then do its job without checking for irregularities. We could also create a script that goes through full-courses.json afterwards to clean the data, but it seems much simpler to clean the data as we get it from the API.
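As a rough sketch of the first path, the NBSP replacement could be applied to every string in a course object right after it comes back from the API. The names `replaceNbsp` and `cleanStrings` are hypothetical and are not taken from courses-json-generator.ts; the real script may be structured differently.

```typescript
// Hypothetical helpers; not the actual contents of courses-json-generator.ts.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

const NBSP = /\u00A0/g;

/** Replaces every non-breaking space (U+00A0) in a string with a regular space. */
function replaceNbsp(text: string): string {
  return text.replace(NBSP, ' ');
}

/** Recursively applies replaceNbsp to every string value in a JSON-like value. */
function cleanStrings(value: Json): Json {
  if (typeof value === 'string') return replaceNbsp(value);
  if (Array.isArray(value)) return value.map(item => cleanStrings(item));
  if (value !== null && typeof value === 'object') {
    const cleaned: { [key: string]: Json } = {};
    for (const [key, inner] of Object.entries(value)) {
      cleaned[key] = cleanStrings(inner);
    }
    return cleaned;
  }
  return value; // numbers, booleans, and null pass through unchanged
}
```

Walking the whole object means the fix is not limited to catalogComments, so any other field that happens to contain an NBSP is cleaned too. Object keys are left untouched.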

One issue with trying to fix the semesters in the JSON is that there are some values that we can't handle yet, such as 7-week courses. This is good info to have for the user (I believe it's used in the bottom bar?), so I wouldn't want to turn Fall (weeks 1-7) into Fall, but the additional information about the weeks messes with our semester validation. I think we could solve this by storing a field with just the season in full-courses.json while keeping the current catalogWhenOffered field for the user to see. That way, we can provide useful information without having to constantly handle bad data throughout the code. We should, of course, clean catalogWhenOffered as best we can to fix capitalization and make it adhere to a standard overall.
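A minimal sketch of that idea, assuming the extra field is called `seasons` (a hypothetical name) and that the possible seasons are Fall, Spring, Summer, and Winter (also an assumption; only Fall and Spring appear in the examples above):

```typescript
// Hypothetical types and helpers; field and function names are illustrative only.
type Season = 'Fall' | 'Spring' | 'Summer' | 'Winter';
const SEASONS: readonly Season[] = ['Fall', 'Spring', 'Summer', 'Winter'];

/** Returns the seasons mentioned in a raw catalogWhenOffered string, ignoring case and extra detail. */
function extractSeasons(whenOffered: string): Season[] {
  const lower = whenOffered.toLowerCase();
  return SEASONS.filter(season => new RegExp(`\\b${season.toLowerCase()}\\b`).test(lower));
}

/** Capitalizes season names in the display string and leaves everything else as-is. */
function normalizeWhenOffered(whenOffered: string): string {
  return whenOffered.replace(
    /\b(fall|spring|summer|winter)\b/gi,
    match => match.charAt(0).toUpperCase() + match.slice(1).toLowerCase()
  );
}

interface RawCourse {
  catalogWhenOffered?: string | null;
}

/** Adds a machine-readable seasons field while keeping catalogWhenOffered for display. */
function withSeasons<T extends RawCourse>(course: T): T & { seasons: Season[] } {
  return { ...course, seasons: extractSeasons(course.catalogWhenOffered ?? '') };
}

extractSeasons('Fall or Spring.');      // ['Fall', 'Spring']
extractSeasons('Fall (weeks 1-7)');     // ['Fall']
normalizeWhenOffered('Fall, spring.');  // 'Fall, Spring.'
```

The returned seasons follow the order of `SEASONS`, not the order in the text. Validation code could then read only `seasons`, while the bottom bar keeps showing the fuller catalogWhenOffered string.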

If we make this change, we need to be careful to deal with courses already in a user's plan. It seems that courses already in a user's plan won't show the updated seasons after being changed in full-courses.json.

Before fixing the lowercase "s" in "spring":
[screenshot: example course]

After fixing it, the course already in the plan is unchanged, but a newly added copy shows the correct value:
[screenshot: new]
The warning on the fixed course appears because it is a duplicate.

@noschiff noschiff added the bug Something isn't working label Apr 24, 2022