Develop tooling for caching and accessing SEC 10k filings during experimental work #3499

Open
2 tasks done
zschira opened this issue Mar 25, 2024 · 0 comments
Labels
mozilla_sec_to_eia Mozilla AI for EJ grant to link SEC utility ownership data to EIA operational data

Comments

zschira commented Mar 25, 2024

Background

We have all of the SEC filings available in GCS, along with a metadata DB. To aid exploratory extraction from SEC filings, we need tooling to work with these documents. Getting generic company data out of filings is fairly straightforward because it follows a standard structure, but Exhibit 21s (which contain information on each company's subsidiaries) are much less standardized and will require more complex models to extract.

Scope

  • Develop tools to cache filings locally so that test/training sets can be built with low-latency access
  • Add the ability to create images of Exhibit 21s from filings, which can be used as input to extraction models
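
As a rough illustration of the first item, a local read-through cache could look something like the sketch below. Everything here is hypothetical: the `FilingCache` class, the `fetch` callable (which stands in for whatever actually downloads a filing from GCS), and the file-naming scheme are assumptions for illustration, not existing project code.

```python
from pathlib import Path
from typing import Callable


class FilingCache:
    """Read-through local cache for filing contents (hypothetical helper)."""

    def __init__(self, cache_dir: Path, fetch: Callable[[str], bytes]):
        # `fetch` is an assumed callable that returns the raw bytes of a
        # filing given its remote name, e.g. a thin wrapper around GCS.
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch

    def _local_path(self, filing_name: str) -> Path:
        # Flatten remote-style names such as "edgar/data/1/x.txt"
        # into a single file name inside the cache directory.
        return self.cache_dir / filing_name.replace("/", "__")

    def get(self, filing_name: str) -> bytes:
        path = self._local_path(filing_name)
        if path.exists():
            # Cache hit: read from local disk, no remote call.
            return path.read_bytes()
        # Cache miss: download once, then store for later reads.
        data = self.fetch(filing_name)
        path.write_bytes(data)
        return data
```

With this shape, repeated reads of the same filing while building a test/training set only touch remote storage once, and the fetch function can be swapped for a stub in tests.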
@zschira zschira added the mozilla_sec_to_eia Mozilla AI for EJ grant to link SEC utility ownership data to EIA operational data label Mar 25, 2024
@zschira zschira self-assigned this Mar 25, 2024
Projects
Status: Done