CA data source #172

Open
zstumgoren opened this issue Aug 3, 2014 · 3 comments

@zstumgoren
Contributor

Write CA data source.

@rkiddy

rkiddy commented Feb 1, 2016

@zstumgoren Does every state need separate repos for -data, -results, and -sources? Would a scraper go into the -data repo?

I am interested in picking up some of the CA work. Making sure I understand your process.

@dwillis
Contributor

dwillis commented Feb 1, 2016

Hi @rkiddy: thanks! Every state gets a -results repository at the end of the process: that's where the raw results are published. The -data repos contain results that are pre-processed before we load them into our system (usually this means converting from PDFs or other formats into CSV files). -sources repos are for results files that we get from official sources that aren't posted online. Does that help?

@zstumgoren
Contributor Author

Yep. What @dwillis said. Also, to emphasize, you don't necessarily need to have a -data repo. Our normal pipeline is intended to handle most processing tasks, as long as the files fit the mold. The pre-processing code in -data directories is used when the available files require some extra initial wrangling to whip them into shape for the normal processing pipeline.
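
To make "extra initial wrangling" a bit more concrete, here is a rough sketch of what a pre-processing step might look like. It is only an illustration, not code from any of the project's repos: the input layout (columns separated by runs of spaces, as in text pulled out of a PDF), the column names, the sample values, and the `preprocess` function are all assumptions.

```python
import csv
import io
import re

# Hypothetical raw input with made-up values: one result per line, fields
# separated by runs of two or more spaces (typical of PDF-extracted text).
RAW = """\
County       Candidate        Votes
Alameda      Jane Doe         1,234
Alpine       John Roe            56
"""


def preprocess(raw_text):
    """Convert loosely aligned text into CSV text (assumed input format)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in raw_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on 2+ spaces so names containing single spaces stay intact.
        fields = re.split(r"\s{2,}", line.strip())
        # Drop thousands separators from purely numeric fields.
        fields = [
            f.replace(",", "") if re.fullmatch(r"[\d,]+", f) else f
            for f in fields
        ]
        writer.writerow(fields)
    return out.getvalue()


csv_text = preprocess(RAW)
print(csv_text)
```

The idea is just that the -data step ends with plain CSV, so the regular pipeline never has to know how messy the original file was.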
