Court records are sometimes updated. Right now, the scraper creates a hash of the HTML file so that it can tell whether the file has changed since the last time it was scraped, but it never actually checks the hash.
We want to keep old versions of the records, distinguishing each version by a field called "revision id". The first version of a record would have revision id = 1, and the latest version would be the one with the highest revision id.
Add database migration:
add a field for revision id
add a field for hash (for easier lookup)
add a field for case number (for easier lookup)
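A minimal sketch of what such a migration could look like, assuming a SQLite database and a table named `records`. The table name, the column names (`revision_id`, `html_hash`, `case_number`), and the index names are all hypothetical, and the project's real migration tooling may look quite different:

```python
import sqlite3

# Hypothetical column names and types; adjust to the real schema.
NEW_COLUMNS = {
    "revision_id": "INTEGER NOT NULL DEFAULT 1",  # existing rows become revision 1
    "html_hash": "TEXT",
    "case_number": "TEXT",
}


def migrate(conn: sqlite3.Connection) -> None:
    """Add the revision, hash, and case number fields to the assumed `records` table."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(records)")}
    for name, declaration in NEW_COLUMNS.items():
        if name not in existing:  # makes the migration safe to run twice
            conn.execute(f"ALTER TABLE records ADD COLUMN {name} {declaration}")
    # Indexes for the "easier lookup" by case number and by hash.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_records_case_rev "
        "ON records (case_number, revision_id)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_records_hash ON records (html_hash)"
    )
    conn.commit()
```

Existing rows would still need their hash and case number backfilled from the stored HTML; that step is not shown here.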
Add the following logic to the scraper:
if it's a brand new case number, store it with revision id = 1
if it's an old case number, but a new hash, store it with the next revision id (one higher than the latest stored revision)
if it's an old case number and the same hash, don't store the record
(Note: the above logic should also solve the problem of multiple court calendar dates pointing to the same record, each of which is currently saved as a separate copy. If that problem remains, create a new issue for it.)
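The three cases above could be sketched roughly as follows. This assumes a SQLite table `records` with columns `case_number`, `revision_id`, `html_hash`, and `html` (all hypothetical names), and it uses SHA-256 as a stand-in for whatever hash the scraper already computes. The function name `store_if_changed` is also made up for illustration:

```python
import hashlib
import sqlite3
from typing import Optional


def store_if_changed(
    conn: sqlite3.Connection, case_number: str, html: str
) -> Optional[int]:
    """Store a scraped record only if it is new or changed.

    Returns the revision id that was stored, or None if nothing was stored.
    """
    # Assumption: SHA-256 of the UTF-8 encoded HTML; the scraper's real hash may differ.
    html_hash = hashlib.sha256(html.encode("utf-8")).hexdigest()

    # Look up the latest stored revision for this case number, if any.
    latest = conn.execute(
        "SELECT revision_id, html_hash FROM records "
        "WHERE case_number = ? ORDER BY revision_id DESC LIMIT 1",
        (case_number,),
    ).fetchone()

    if latest is None:
        revision_id = 1  # brand new case number
    elif latest[1] == html_hash:
        return None  # old case number, same hash: don't store
    else:
        revision_id = latest[0] + 1  # old case number, new hash

    conn.execute(
        "INSERT INTO records (case_number, revision_id, html_hash, html) "
        "VALUES (?, ?, ?, ?)",
        (case_number, revision_id, html_hash, html),
    )
    conn.commit()
    return revision_id
```

One design choice to be aware of: this sketch compares only against the latest revision, so a record that reverts to earlier content would be stored again as a new revision. Because an unchanged record is skipped regardless of how it was reached, several calendar dates linking to the same case would produce a single stored copy.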