Add ability to download the structured files directly in XLSX / CSV format #43

Open
pyturn opened this issue Jun 1, 2020 · 3 comments
Labels
enhancement New feature or request

Comments


pyturn commented Jun 1, 2020

Hi,

Can you add the ability to download the structured file formats (XLSX / CSV) directly? For example:

[Screenshot: Capture_ability]

Owner

jadchaar commented Jun 7, 2020

Hey @pyturn, thanks for suggesting this feature. Can you provide a link to the page shown in the screenshot?

Also, what value does the Excel document provide? I'm curious to understand the use cases :).

jadchaar changed the title from "Add ability to download the structured files directly. In (XLSX / CSV formats)" to "Add ability to download the structured files directly in XLSX / CSV format" on Jun 30, 2020
jadchaar added the "enhancement" (New feature or request) label on Jun 30, 2020
Owner

jadchaar commented Jan 9, 2022

Examples of XLSX:

It seems the URL hierarchy is standard: https://www.sec.gov/Archives/edgar/data/{CIK}/{ACCESSION_NUM}/Financial_Report.xlsx

This should be quite easy to add if the filename is always Financial_Report.xlsx; otherwise, adding this functionality would require web scraping. Without web scraping, I could attempt the download at https://www.sec.gov/Archives/edgar/data/{CIK}/{ACCESSION_NUM}/Financial_Report.xlsx and, if I get an error, skip that download and move on. If every filing uses Financial_Report.xlsx, all of these downloads should succeed, since the resource would exist at that URL.
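
A minimal sketch of the "attempt the download and move on if it fails" idea, using only the Python standard library. The function names, the `user_agent` parameter, and the CIK/accession normalization are assumptions made for illustration; this is not the library's actual API.

```python
import urllib.request
from typing import Optional

BASE_URL = "https://www.sec.gov/Archives/edgar/data"


def build_financial_report_url(cik: str, accession_number: str) -> str:
    """Build the presumed Financial_Report.xlsx URL for one filing."""
    # Assumption: the path uses the CIK without leading zeros and the
    # accession number without dashes, as in the URLs quoted in this thread.
    cik_part = str(int(cik))
    accession_part = accession_number.replace("-", "")
    return f"{BASE_URL}/{cik_part}/{accession_part}/Financial_Report.xlsx"


def try_download(url: str, user_agent: str, timeout: float = 10.0) -> Optional[bytes]:
    """Return the file contents, or None if the resource cannot be fetched."""
    # Assumption: the caller supplies a User-Agent string identifying itself.
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except OSError:
        # URLError, HTTPError (e.g. 404), and socket timeouts are all
        # OSError subclasses: ignore this filing and move on.
        return None
```

A caller would build the URL per filing, call `try_download`, and write the bytes to disk only when the result is not `None`, so filings without the spreadsheet are silently skipped.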


lkl2050 commented Jul 25, 2022

I hope you can also add a function to download other types of attachment documents, such as https://www.sec.gov/Archives/edgar/data/1934348/000166919122000687/offeringstatement.pdf

Their names are not always like offeringstatement.pdf, but they are usually PDF or JPG files, so it's possible to use a regex to download all URLs of the form https://www.sec.gov/Archives/edgar/data/{CIK}/{ACCESSION_NUM}/.*\.pdf
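
One possible shape for that regex filter, sketched in Python. The pattern name, the `filter_attachment_urls` helper, and the assumption that a list of candidate URLs has already been collected elsewhere (for example from a filing's index page) are all hypothetical.

```python
import re
from typing import Iterable, List

# Matches PDF and JPG attachments directly under a filing's archive folder.
# Dots are escaped so that "." only matches a literal dot.
ATTACHMENT_URL = re.compile(
    r"^https://www\.sec\.gov/Archives/edgar/data/"
    r"(?P<cik>\d+)/(?P<accession>\d+)/"
    r"(?P<filename>[^/]+\.(?:pdf|jpg))$",
    re.IGNORECASE,
)


def filter_attachment_urls(urls: Iterable[str]) -> List[str]:
    """Keep only the URLs that look like PDF or JPG filing attachments."""
    return [url for url in urls if ATTACHMENT_URL.match(url)]
```

The named groups make it easy to recover the CIK, accession number, and filename from a match when deciding where to save each file.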
