Scraping statutes at large (bill text 1951-1992) #238

Open
TomHSYu opened this issue Feb 3, 2019 · 5 comments

@TomHSYu

TomHSYu commented Feb 3, 2019

Hi,
I am trying to download bill text for the pre-1993 period, for which no regular bill text is available. I followed the instructions on the bill text page and successfully downloaded the statutes data (replacing fdsys with govinfo) using the commands below. However, the scraper line does not appear to run, because it relies on modules that were removed per #169. I tracked down the missing file bill_versions.py and downloaded other older related files (e.g., fdsys.py), but the latter appears to depend on older versions of some current files (e.g., bill_info.py). Is there any way I could download the older bill text?
./run fdsys --collections=STATUTE --store=mods,pdf --granules
./run statutes --volumes=65-106 --textversions --extracttext
Thanks in advance, and please let me know if you need any clarification!

@JoshData
Member

JoshData commented Feb 3, 2019

Probably, when I overhauled the fdsys/govinfo scraper last year because fdsys was shut down, I broke the other parts while getting the scraper to work with govinfo.gov. (As far as I knew, I was the only person who ran the other scripts, and I didn't need them at the time.) Unfortunately, I don't really have the time to revisit the statutes scripts and get them back into shape.

@qdupouy

qdupouy commented Mar 10, 2019

If the statutes scraper is broken and THOMAS is gone, does that mean there is no way to access the bill data from the 93rd to the 100th Congress? I am hoping to find the committee(s) of referral for each bill from the 93rd Congress to the present.

@JoshData
Member

JoshData commented Mar 10, 2019

The Statutes at Large scraper (which reorganizes Statute text as bill text) doesn't really have anything to do with bill data like that, but, yes, now that THOMAS is gone, there is no official raw data or a scraper for bill data from the 93rd to the 112th Congress.

I have archived my last scrape of THOMAS, which has that data, here (please credit the Library of Congress, the maintainer of THOMAS, and GovTrack.us):
https://www.govtrack.us/files/thomas-bill-status-data.tgz [link updated]

@lwaltman

Is there any way you can re-expose the old bill (I believe THOMAS) directories that used to be exposed under https://www.govtrack.us/data/congress/ for a bit? I just need to collect a couple bills from there. If not, I understand if it takes too much time.

@JoshData
Member

Sorry, I renamed the file that I posted in my previous comment. Archival data from THOMAS is now posted at:
