Downloading House votes in 2001 and 1991 raises exception #275

Open
Andrew-Chen-Wang opened this issue Jul 8, 2021 · 5 comments

@Andrew-Chen-Wang

Andrew-Chen-Wang commented Jul 8, 2021

Note: I ran the ./votes command for 2001 and 1991

In both 2001 and 1991, two House members share the same first, middle, and last name. This is the 2001 data point:

{
    'C000488': {'type': 'rep', 'start': '1999-01-06', 'end': '2001-01-03', 'state': 'MO', 'district': 1, 'party': 'Democrat'},
    'C001049': {'type': 'rep', 'start': '2001-01-03', 'end': '2003-01-03', 'state': 'MO', 'district': 1, 'party': 'Democrat'}
}

Note that one member's term ends on the same date the other's starts (2001-01-03). The exception is raised when you run:

from .utils import lookup_legislator
from datetime import datetime

lookup_legislator(107, "rep", "Clay", "MO", "D", datetime(year=2001, month=1, day=3), "bioguide")

One possible fix: when there are multiple matches, check whether one member's term start date equals the other member's term end date. If so, choose the member with the later term, since we can compare the date string of when against each member's start and end.
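A minimal sketch of that tie-break, using the 2001 data point above. The function name pick_term_on_handover is made up for illustration and is not part of the repo's utils. It also assumes the outgoing member casts no votes on the handover day, which may not hold.

```python
from datetime import date, datetime

# The two matched terms from the 2001 data point above.
matches = {
    "C000488": {"type": "rep", "start": "1999-01-06", "end": "2001-01-03"},
    "C001049": {"type": "rep", "start": "2001-01-03", "end": "2003-01-03"},
}


def pick_term_on_handover(matches, when):
    """Hypothetical helper: return the bioguide ID whose term covers `when`.

    If exactly two terms cover `when` and one ends on the day the other
    starts, prefer the later term. Returns None when there is no match or
    the ambiguity cannot be resolved this way.
    """
    # ISO date strings compare correctly as plain strings.
    day = when.strftime("%Y-%m-%d")
    covering = {
        bid: term
        for bid, term in matches.items()
        if term["start"] <= day <= term["end"]
    }
    if len(covering) == 1:
        return next(iter(covering))
    if len(covering) == 2:
        (id_a, term_a), (id_b, term_b) = covering.items()
        if term_a["end"] == term_b["start"]:
            return id_b  # b is the later term
        if term_b["end"] == term_a["start"]:
            return id_a  # a is the later term
    return None


later = pick_term_on_handover(matches, datetime(year=2001, month=1, day=3))
earlier = pick_term_on_handover(matches, date(2000, 6, 1))
```

On the handover date this picks the term that starts that day. On any other date only one term covers when, so the tie-break is never reached.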

The only thing that worries me is this comment:

# This is a possible match. Remember which term matched, but because of term overlaps
# on Jan 3's, don't key on the term uniquely, only on the moc.

Does that mean a representative going out can vote on the same day one comes in?

@JoshData
Member

JoshData commented Jul 8, 2021

Does that mean a representative going out can vote on the same day one comes in?

Of course. In the general case, a member might resign after a vote on the same day that another member, elected by special election, is sworn in. In the more specific Jan 3 case, there can be a vote on the morning of Jan 3 and another on the afternoon of Jan 3, and those would be in different Congresses, each with an overlapping but distinct set of legislators serving.

In this particular case, it's a father-son pair.

To help debugging, the issue you found can be reproduced by running one of:

./run votes --chamber=house --congress=107 --session=2001
./run votes --vote_id=h2-107.200

This was all working at some point because this is how I got the vote data into GovTrack in the first place, but something must have broken.

The proper fix is to compare the congress number of the vote against the congress numbers covered by each matched term. The latter have to be computed (there is a function named get_term_congresses, but I can't say whether it is correct).
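As a rough sketch of that comparison (not the repo's get_term_congresses, whose behavior isn't confirmed here), the congress numbers for a term can be derived from its dates. The helper names below are hypothetical. The arithmetic assumes the modern calendar in which a Congress runs from January 3 of an odd year, and it treats a term's end date as exclusive so that a term ending on Jan 3 is not counted in the incoming Congress.

```python
from datetime import date, timedelta


def congress_for_date(d):
    """Congress number in session on date `d` (assumes Jan 3 odd-year starts)."""
    year = d.year
    if (d.month, d.day) < (1, 3):
        year -= 1  # Jan 1-2 still belong to the previous Congress's calendar year
    return (year - 1789) // 2 + 1


def term_congress_numbers(term):
    """Hypothetical helper: set of congress numbers a term dict spans."""
    start = date.fromisoformat(term["start"])
    end = date.fromisoformat(term["end"])
    # Treat the end date as exclusive, but never step back before the start.
    last_day = max(start, end - timedelta(days=1))
    first = congress_for_date(start)
    last = congress_for_date(last_day)
    return set(range(first, last + 1))


def match_by_congress(matches, congress):
    """Keep only the matched bioguide IDs whose term covers `congress`."""
    return [
        bid
        for bid, term in matches.items()
        if congress in term_congress_numbers(term)
    ]


# The two matched terms reported in this issue.
matches = {
    "C000488": {"type": "rep", "start": "1999-01-06", "end": "2001-01-03"},
    "C001049": {"type": "rep", "start": "2001-01-03", "end": "2003-01-03"},
}
for_107 = match_by_congress(matches, 107)
for_106 = match_by_congress(matches, 106)
```

Because the vote's congress number is already known, this avoids guessing from the date alone whether a Jan 3 vote belongs to the outgoing or the incoming Congress.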

@Andrew-Chen-Wang
Author

Andrew-Chen-Wang commented Jul 8, 2021

Thanks for responding quickly. IIRC, the congress-legislators repo had an XML file that included a tag <congress id="Congress number">. I just can't recall where I saw it or which link provides all the congressmen data. Which link are we getting all the historical congressmen from?

In that case, we could update the files with that new data point, congress.

@JoshData
Member

JoshData commented Jul 8, 2021

It reads the YAML files at https://github.com/unitedstates/congress-legislators/. (I don't think the XML file you are describing comes from these repos.)

@Andrew-Chen-Wang
Author

Right, this repo reads the files stored in that repo. I was asking which sources that repository collects its data from, not this one.

@JoshData
Member

We scrape several sources in that repository. I don't remember offhand what all of the URLs are, but you can scan through the scripts at https://github.com/unitedstates/congress-legislators/tree/main/scripts to see.

For this issue, we could also take an easier route and just hard-code the right bioguide ID to use for each of these votes.
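A hard-coded override could look roughly like the sketch below. The table name, the key shape, and the wrapper function are all assumptions for illustration. Keying on the congress rather than on individual vote IDs is also an assumption. Only the 107th Congress entry is filled in, from the term data quoted earlier in this thread; the 1991 case would need its own entry.

```python
# Hypothetical override table: (congress, role type, last name, state) -> bioguide ID.
# The one entry comes from the term data in this issue: the term starting
# 2001-01-03 (the 107th Congress) belongs to C001049.
BIOGUIDE_OVERRIDES = {
    (107, "rep", "Clay", "MO"): "C001049",
}


def lookup_with_override(congress, role_type, name, state, fallback):
    """Check the override table first, then defer to `fallback`.

    `fallback` stands in for the normal lookup; it is any callable that takes
    the same four arguments and returns a bioguide ID or None.
    """
    key = (congress, role_type, name, state)
    if key in BIOGUIDE_OVERRIDES:
        return BIOGUIDE_OVERRIDES[key]
    return fallback(congress, role_type, name, state)


# Stand-in for the regular lookup so the sketch runs on its own.
def no_match(congress, role_type, name, state):
    return None


overridden = lookup_with_override(107, "rep", "Clay", "MO", no_match)
passed_through = lookup_with_override(107, "rep", "Smith", "TX", no_match)
```

This keeps the special cases in one visible place and leaves the normal matching logic untouched for everyone else.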
