Vote format has changed for House 2020? #258

Open
demongolem opened this issue Apr 10, 2020 · 6 comments

@demongolem

demongolem commented Apr 10, 2020

Here is one that is not Python 3 :)

I am stepping through the code in vote.py and I see that the regex on the vote id is failing. Instead of the 4 parts that were expected, some vote strings have a 5th part. The 4th part used to be the year, but in this string the 5th part is the year and the 4th part is something I have not identified yet. Let me give you an example string. Perhaps the format has changed and newer vote ids need separate processing.

h102-116.5.2020

For the regex I have something like this in split_vote_id, which is actually in utils.py. Maybe I am missing the trailing $ in mine, but in any case an additional number group for the 5 above needs to be added.

    return re.match("^(h|s)(\d+)\-(\d+)\.(\d+)\.(\d\d\d\d|[0-9A-Z])", vote_id).groups()
    #return re.match("^(h|s)(\d+)-(\d+).(\d\d\d\d|[0-9A-Z])$", vote_id).groups()

demongolem changed the title from "Vote format has changed for Senate 2020?" to "Vote format has changed for House 2020?" on Apr 10, 2020

@JoshData
Member

I run these scripts every few hours every day to pull in new data and haven't been having a problem.

What command line are you using? Where is this vote id coming from?

@demongolem
Author

demongolem commented Apr 10, 2020

I use ./run votes. A typical value for vote_id at the above line which I commented out is

h102-116.5.2020h102-116.5.2020h102-116.5.2020h102-116.5.2020h102-116.5.2020

except that they are unique ids concatenated together (not the same vote id over and over again), which need to be split. (I don't have the output in front of me right now.)

At https://github.com/unitedstates/congress/wiki/votes I see the vote id looks like

"vote_id": "h202-113.2013"

@JoshData
Member

"h202-113.2013" is what the vote IDs should look like. I'm not sure where the .5 is coming from.

Can you post a stack trace when you get a chance? Hopefully that'll point us in the right direction. :)

@demongolem
Author

When I was logging these vote_ids to disk, I omitted a newline :(. So each vote_id is really just a single id of the form I indicated.

When I do ./run votes, here is the beginning of the output I get

Going to fetch 102 votes from congress #116.5 session 2020
h102-116.5.2020
h101-116.5.2020
h100-116.5.2020
h99-116.5.2020
h98-116.5.2020
h97-116.5.2020
h96-116.5.2020
h95-116.5.2020
h94-116.5.2020
h93-116.5.2020
h92-116.5.2020
h91-116.5.2020
h90-116.5.2020
h89-116.5.2020
h88-116.5.2020
h87-116.5.2020
h86-116.5.2020
h85-116.5.2020

And here is the stack trace I get with the regex as it originally was

    [h1-116.5.2020] Exception:

    Traceback (most recent call last):
      File "/home/gwerner/from_greg/congress/tasks/utils.py", line 182, in process_set
        results = fetch_func(id, options, *extra_args)
      File "/home/gwerner/from_greg/congress/tasks/vote_info.py", line 15, in fetch_vote
        vote_chamber, vote_number, vote_congress, vote_session_year = utils.split_vote_id(vote_id)
      File "/home/gwerner/from_greg/congress/tasks/utils.py", line 156, in split_vote_id
        return re.match("^(h|s)(\d+)-(\d+).(\d\d\d\d|[0-9A-Z])$", vote_id).groups()
    AttributeError: 'NoneType' object has no attribute 'groups'

If I go to an online python regex validator, obviously there will be no matches for the vote_ids which I have supplied.
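
For anyone who wants to reproduce this without running the scraper, here is a minimal sketch. The pattern is copied from the traceback above and written as a raw string; `try_split` is a hypothetical helper for illustration, not a function in utils.py. It only shows that `re.match` returns `None` for the five-part id, which is what leads to the `AttributeError` when `.groups()` is called on the result.

```python
import re

# Pattern as it appears in the traceback above, written as a raw string.
VOTE_ID_PATTERN = r"^(h|s)(\d+)-(\d+).(\d\d\d\d|[0-9A-Z])$"


def try_split(vote_id):
    """Hypothetical helper: return the four id parts, or None if no match."""
    match = re.match(VOTE_ID_PATTERN, vote_id)
    return match.groups() if match else None


# The documented format from the wiki splits into four parts.
print(try_split("h202-113.2013"))    # ('h', '202', '113', '2013')

# The id from the failing run does not match at all.
print(try_split("h102-116.5.2020"))  # None
```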

@JoshData
Member

I'm going to go out on a limb here and say that you are somehow running this with Python 3 or non-standard Python 2 command-line arguments? "116.5" looks like 116-and-a-half which suggests some Python 3 division is happening.

@demongolem
Author

Yes, I see where the division is happening in utils.py

    def congress_from_legislative_year(year):
        return ((year + 1) / 2) - 894

Of course, in Python 3 that would need // instead of / to keep the integer division.
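
A small sketch of the difference under Python 3, in case it helps anyone else who hits this. Both function names below are hypothetical and exist only for the comparison; the first uses the same expression as the utils.py function quoted above, and the second swaps in floor division. This is an illustration of the arithmetic, not a proposed patch to the project.

```python
def congress_true_division(year):
    # Same expression as quoted above; in Python 3, / is true division
    # and returns a float.
    return ((year + 1) / 2) - 894


def congress_floor_division(year):
    # Hypothetical variant: // keeps the result an integer in Python 3.
    return ((year + 1) // 2) - 894


print(congress_true_division(2020))   # 116.5
print(congress_floor_division(2020))  # 116
print(congress_floor_division(2013))  # 113
```

With floor division, 2020 maps to congress 116 and 2013 maps to 113, which agrees with the "h202-113.2013" example from the wiki.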
