Combine the old lobbying DB and the new lobbying DB into one DB #9

Open
rofreg opened this issue Dec 15, 2017 · 1 comment

@rofreg
Collaborator

rofreg commented Dec 15, 2017

It looks like this might be our first major technical task!

Here's the problem:

  • In 2017, the Secretary of State's office started using a brand-new MySQL database for gathering lobbyist data, with a new schema
  • There is an entirely separate MS SQL database that contains all the lobbyist data from the mid-2000s through 2016
  • It would be really great if we could migrate the data from the old MS SQL database and add it to the new MySQL database
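As a rough sketch of what the copy step could look like, here is a batched table copy written against Python's generic DB-API. Everything in it is an assumption for illustration: `copy_table` and `iter_batches` are hypothetical helpers, the SQL strings are supplied by the caller, and the parameter placeholder in `insert_sql` depends on the driver (`?` for `sqlite3`; MySQL drivers typically use `%s`). A real run would need MS SQL and MySQL drivers, which are not chosen here.

```python
import sqlite3  # used only as a stand-in driver for trying the sketch locally
from typing import Iterator, List


def iter_batches(cursor, batch_size: int = 1000) -> Iterator[List[tuple]]:
    """Yield lists of rows from an already-executed DB-API cursor."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            return
        yield rows


def copy_table(src_conn, dst_conn, select_sql: str, insert_sql: str,
               batch_size: int = 1000) -> int:
    """Copy rows from the old DB to the new DB in batches.

    Reading in batches keeps memory use flat, which matters for a
    multi-gigabyte source. Returns the number of rows copied.
    """
    src_cur = src_conn.cursor()
    src_cur.execute(select_sql)
    dst_cur = dst_conn.cursor()
    copied = 0
    for rows in iter_batches(src_cur, batch_size):
        dst_cur.executemany(insert_sql, rows)
        dst_conn.commit()  # commit per batch so a failure loses little work
        copied += len(rows)
    return copied
```

Since this is a one-time job, committing per batch is mostly a convenience: if the import dies partway through, it is easy to see how far it got.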

Here are some details:

  • To the best of our knowledge, the new database schema is backwards compatible with the old schema (i.e. they added new fields in 2017, but we don't think they changed any existing fields)
  • The old DB is likely to have malformed data. Almost all fields are unvalidated strings. Any import process will need to look out for bad data and have a process to correct it.
  • I haven't been able to get access to actual data yet, but the old DB is roughly 3-4 GB in size
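One way to handle the bad-data concern is to validate each row before it is inserted, and set aside anything that fails for manual review instead of aborting the whole import. The sketch below is only an illustration: the field names (`lobbyist_name`, `registered_on`) and the accepted date formats are made up, since we don't have the real schema or data yet.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import Optional

# Assumed date formats; the real list would come from inspecting the old data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%m/%d/%y")


def clean_text(value) -> Optional[str]:
    """Collapse runs of whitespace; treat empty strings as missing."""
    if value is None:
        return None
    text = " ".join(str(value).split())
    return text or None


def parse_date(value) -> Optional[date]:
    """Try each assumed format in turn; raise ValueError if none match."""
    text = clean_text(value)
    if text is None:
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


@dataclass
class ImportReport:
    clean: list = field(default_factory=list)     # rows ready to insert
    rejected: list = field(default_factory=list)  # (original row, reason) pairs


def validate_rows(rows) -> ImportReport:
    """Split rows (dicts of raw strings) into clean and rejected."""
    report = ImportReport()
    for row in rows:
        try:
            name = clean_text(row.get("lobbyist_name"))  # hypothetical field
            if name is None:
                raise ValueError("missing lobbyist_name")
            report.clean.append({
                "lobbyist_name": name,
                "registered_on": parse_date(row.get("registered_on")),  # hypothetical field
            })
        except ValueError as exc:
            report.rejected.append((row, str(exc)))
    return report
```

Keeping the rejected rows alongside the reason they failed gives us a worklist for the "process to correct it" part: fix the rule or fix the row, then re-run just those rows.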

Any initial thoughts? It's worth noting that this is essentially a one-time task – once we successfully convert the pre-2017 data into the 2017 format, we'll never need to work with the old data again.

I'll let y'all know as soon as we get access to actual backups or schema information.

@dj0nes

dj0nes commented Dec 19, 2017

I'd be happy to take this one on. I'll DM you for the data.

857-page DB schema... lol
