CDC Pregnancy page no longer in QA format. #99

Open
mfleming99 opened this issue Apr 18, 2020 · 3 comments

Comments

@mfleming99
Contributor

https://github.com/deepset-ai/COVID-QA/blob/master/datasources/scrapers/CDC_Pregnancy_scraper.py

The CDC changed this page from a QA style page to a factual page on 7 April 2020.
This scraper no longer produces any data when run.

@Timoeller
Contributor

Hey @mfleming99, I worked on the general CDC scraper, which now has many more QA pairs, and moved CDC_Pregnancy_scraper.py to datasources/outdated in #101.

Would you be interested in updating the pregnancy scraper yourself so we can add this data to our backend?

@mfleming99
Contributor Author

Hi @Timoeller, I would update the pregnancy scraper, but the CDC pregnancy page no longer has question-answer pairs to scrape. The page was converted into an informational page.

@Timoeller
Contributor

OK, understood. If you think the information is valuable, we should still add it to our service.

If you update the scraper, please also update the manual check for a "?" at the end of each question/statement in our META_scraper.py.
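
For illustration only, a relaxed version of such a check might look like the sketch below. The function name `is_valid_entry` and the `require_question_mark` flag are hypothetical and are not taken from META_scraper.py; the actual check there may be structured differently.

```python
def is_valid_entry(text: str, require_question_mark: bool = True) -> bool:
    """Sanity-check one scraped question or statement.

    Hypothetical helper: with require_question_mark=True it accepts only
    text ending in "?"; with False it also accepts plain statements, which
    an informational (non-QA) page would produce.
    """
    stripped = text.strip()
    if not stripped:
        # Empty or whitespace-only entries are never valid.
        return False
    if require_question_mark:
        return stripped.endswith("?")
    return True
```

A scraper for a statement-style page could then call the check with `require_question_mark=False`, while the QA scrapers keep the stricter default.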
