Seems to be missing some data #230

Open
programmin1 opened this issue Sep 13, 2018 · 18 comments

Comments

@programmin1

Some of the congressional votes seem to be missing. I just ran ./run votes and I can't find the latest ones that McHenry voted in, for example:
https://mchenry.house.gov/voterecord/

Maybe I made a dumb mistake in processing the data, but I found only three non-neutral votes in h(number)/data.json in my downloaded dataset.

@simon555

Hi, did you get any updates on this?

Thanks!

@programmin1
Author

It appears the latest commit to this repo was a year ago. Is there more accurate and up-to-date data somewhere else?

@JoshData
Member

Can you be more specific about the problem you're having? The vote-fetching code in this repo is working fine for me.

@simon555

Thanks for your answers,

When I run the script for scraping the votes, say with congress=114 and session=2015, I collect ~400 files from the Senate (with chamber=senate in the votes params).

However, in the bills data, I can see that there are 3000+ bills that have been passed this year.

Maybe I am missing something, but it looks like some of the bill data doesn't have vote data, right?

Thanks for your help! :)

@JoshData
Member

JoshData commented Sep 24, 2019 via email

@simon555

Sure,

For instance in the 114th congress,

`python ./run votes --congress=114 --chamber=senate`
returns 502 votes across both sessions of this Congress (2015 and 2016).

However, the text data scraped with
`!python ./run govinfo --collections=BILLS --congress=114 --store=text --bulkdata=False`
returns 3275 bills with associated texts.

It seems that there are some bills that don't have votes recorded, right? In this case it would concern 3275 - 502 = 2773 bills

@dwillis
Member

dwillis commented Sep 24, 2019

@simon555 Most bills never get a single vote, fwiw, so this isn't a case of missing data.

@JoshData
Member

@simon555 Derek is of course right, but I just want to add, my suggestion is to pick a particular bill to look at, if you want to understand better what's going on. It'll become clearer if you look at any of the actual bill data files. Or any bill on Congress.gov.

@simon555

Sorry for the confusion, here is what I want to do:

Given the vote data on a particular bill, I want to retrieve the text data of that bill.

I can download the vote data with the previously cited command, but it seems that I am not looking for the text data in the right place...

Taking congress 114 as an example, as above, how can I retrieve the text data of the 502 bills that have been voted on?

Thank you!

@JoshData
Member

Did you try following the steps on the Bill Text page in the wiki?

@simon555

I did, but I end up with more bill text than bill votes.

@JoshData
Member

Unless your hard drive is really small and this is causing you to run out of disk space, I'm not understanding the problem. Since you have downloaded the vote data and the bill text data, you should have everything you need. Where are you getting stuck?

@simon555

Sorry, I was not clear:

I cannot link vote data and text data. I can't find a common key in the vote data that allows me to collect the text data associated with a given vote.

@JoshData
Member

JoshData commented Sep 29, 2019

The vote data (edit: the vote data JSON files) includes information about what the vote was about. The very end of the Votes wiki page documents the fields that indicate the bill that the vote is related to (if any, and note that most votes that are related to a bill are not votes to pass the bill).
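A minimal sketch of how one might check this on downloaded data. It assumes each vote's data.json is a JSON object with an optional `bill` field and a `category` field; the `category` name and its values are assumptions that should be checked against the Votes wiki page, and the sample records below are made up for illustration.

```python
from collections import Counter

def tally_bill_vote_categories(votes):
    """Count the votes that reference a bill, grouped by category."""
    tally = Counter()
    for vote in votes:
        if vote.get("bill"):  # skip votes that are not tied to any bill
            tally[vote.get("category", "unknown")] += 1
    return tally

# Hypothetical records (not real vote data); only the bill reference
# itself is taken from this thread.
bill = {"congress": 114, "type": "hr", "number": 3762}
sample_votes = [
    {"category": "passage", "bill": bill},
    {"category": "amendment", "bill": bill},
    {"category": "cloture", "bill": bill},
    {"category": "nomination"},  # no "bill" key: not about a bill
]

tally = tally_bill_vote_categories(sample_votes)
print(tally)
```

Running the same function over the parsed contents of every downloaded data.json would show how many bill-related votes are passage votes versus other kinds.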

@dwillis
Member

dwillis commented Sep 29, 2019

In each vote data file there is a bill object if the vote was on a bill (many votes, especially in the Senate, do not pertain to a specific bill, so you will not find a bill for every vote). For example:

"bill": {
    "congress": 114,
    "number": 3762,
    "title": "Restoring Americans' Healthcare Freedom Reconciliation Act of 2015",
    "type": "hr"
},

The bill is hr3762.
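As a rough sketch, the common key can be derived from that object like this. The function name `bill_key_from_vote` is hypothetical, and how the resulting key maps onto the bill text files on disk is not shown here; check the Bill Text wiki page for the actual layout.

```python
from typing import Optional, Tuple

def bill_key_from_vote(vote: dict) -> Optional[Tuple[int, str]]:
    """Return (congress, bill id) for a vote record, or None if no bill is attached."""
    bill = vote.get("bill")
    if not bill:
        return None  # many votes do not pertain to a specific bill
    return bill["congress"], f"{bill['type']}{bill['number']}"

# The "bill" object quoted above, wrapped in an otherwise empty vote record.
vote = {
    "bill": {
        "congress": 114,
        "number": 3762,
        "title": "Restoring Americans' Healthcare Freedom Reconciliation Act of 2015",
        "type": "hr",
    }
}

key = bill_key_from_vote(vote)
print(key)  # (114, 'hr3762')

# A hypothetical vote record with no "bill" field yields None.
no_bill = bill_key_from_vote({"question": "hypothetical vote with no bill"})
```

Collecting these keys into a set across all vote files gives the distinct bills that were voted on, which will usually be fewer than the number of votes.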

@simon555

simon555 commented Oct 29, 2019

Hi,

Thanks for all these clarifications.

I had a question regarding the amendment of a bill. Given a vote that amends a bill, I will have access to the bill information, as well as the amendment info.

I have the possibility to retrieve the text version of the bill through the BILL TEXT section of this repo. However, I can't see where I can download an AMENDMENT TEXT. The AMENDMENT section of the bill contains general info about it but does not provide the updated version of the bill text if the amendment has passed.

Do you know how I can collect the updated version of the bill text through the several amendments?

@Gneedsausername

Gneedsausername commented Aug 6, 2022

Hey,
Wondering if anyone has had issues with the votes scraper? Specifically, House votes before session 108 don't seem to pull any data, while all Senate sessions from 101 onwards and House sessions from 108 onwards download data, and the vote counts match the totals from the source URL for each session.

I used a bash file to try to get them all initially, but calling specific House session votes (for instance, usc-run votes --congress=107 --session=2001 --chamber=h) does not pull any data.

@JoshData
Member

JoshData commented Aug 6, 2022

I ran that command and it started pulling data and creating files e.g. data/107/votes/2001/h488/data.json.
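One way to check what actually landed on disk is to count the data.json files per Congress, session, and chamber. This is only a sketch: it assumes the layout `<congress>/votes/<session>/<vote id>/data.json` under the data directory, generalized from the single path above, and the demo file names other than h488 are made up.

```python
import tempfile
from collections import Counter
from pathlib import Path

def count_vote_files(data_root):
    """Count data.json files per (congress, session, chamber letter)."""
    counts = Counter()
    for path in Path(data_root).glob("*/votes/*/*/data.json"):
        # e.g. parts = ("107", "votes", "2001", "h488", "data.json")
        congress, _, session, vote_id = path.relative_to(data_root).parts[:4]
        counts[(congress, session, vote_id[0])] += 1
    return counts

# Demo on a throwaway directory tree so the sketch runs without real data.
with tempfile.TemporaryDirectory() as root:
    for rel in ("107/votes/2001/h488", "107/votes/2001/h489", "107/votes/2001/s1"):
        vote_dir = Path(root, rel)
        vote_dir.mkdir(parents=True)
        (vote_dir / "data.json").write_text("{}")
    counts = count_vote_files(root)

print(counts)
```

Pointing `count_vote_files` at the real data directory instead of the temporary one would show at a glance which House sessions produced no files.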

5 participants