The screenshot file is missing when pulling API data from the official website #311

Open
r90tpass opened this issue Feb 16, 2024 · 3 comments
Labels
enhancement New feature or request help wanted Extra attention is needed question Further information is requested

Comments

@r90tpass

Notes

The screenshot file is missing when pulling API data from the official website
Hello,
After setting up the project, I ran poetry run tools/import_from_instance.py
and found that the screenshot files are missing. What should I do?

Right now my web page shows the screenshot button, but the image does not exist on the back end, so the request returns a 404.
Would running the scrape for all of them through poetry run regenerate the screenshot files and fix this?
Please let me know what to do.

I hope you can help me, thank you

@r90tpass r90tpass added the help wanted Extra attention is needed label Feb 16, 2024
@FafnerKeyZee
Collaborator

I need to update the importer :/

@liamhess

liamhess commented May 13, 2024

Heyhey,
I noticed this too, but seeing how big the screenshots folder gets in just a couple of days, is this even feasible?

As I saw in another issue recently, the site is already having uptime issues, so an image export would have to be heavily rate-limited to avoid constantly DoSing the site, right?

One option might be to put the images on a MinIO S3 server and host that separately; however, that might add too much complexity and licensing problems for a questionable return.

Or do you have any other ideas on how to achieve this image export feature? I know relatively little about the main instance, but it currently runs on just one server, right?
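
To make the rate-limiting point concrete, here is a minimal client-side sketch of what "super limited" could mean on the importer side. Everything in it is an assumption for illustration: `throttled_fetch`, `fetch` and `min_interval` are hypothetical names, not part of the project's importer or API, and `fetch` stands in for whatever call would download a single image.

```python
import time
from typing import Callable, Iterable, List


def throttled_fetch(
    names: Iterable[str],
    fetch: Callable[[str], bytes],
    min_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[bytes]:
    """Call fetch(name) for each name, starting at most one call per min_interval seconds.

    `fetch` is a hypothetical downloader supplied by the caller; `sleep` and
    `clock` are injectable so the pacing can be tested without real waiting.
    """
    results: List[bytes] = []
    last_start = None
    for name in names:
        if last_start is not None:
            # Wait out whatever remains of the interval since the previous request started.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        results.append(fetch(name))
    return results
```

This only spaces out requests from one well-behaved client; it does not protect the server from several importers running at once, so a server-side limit would still be needed.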

@FafnerKeyZee
Collaborator

Hello,

Just for your information, we have something like 6K screenshots from scraping and more than 23K images extracted from Telegram channels...
If we check the size of all the data, it's more than 7.5 GB, so I'm not sure that using the API is the best way... (and the Telegram images only use 1.5 GB)
So we need some time to think about the best way. :/
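
As a rough back-of-the-envelope check on those numbers, the sketch below estimates average file sizes. It assumes the 1.5 GB of Telegram images is included in the 7.5 GB total and treats the approximate counts as exact, so the results are only order-of-magnitude estimates.

```python
# Rough figures taken from the comment above; all are approximations.
GB = 1000 ** 3
total_bytes = 7.5 * GB       # "more than 7.5 GB" for all data
telegram_bytes = 1.5 * GB    # Telegram images (assumed to be part of the total)
screenshot_count = 6_000     # "something like 6K screenshots"
telegram_count = 23_000      # "more than 23K images"

# Assumption: whatever is not Telegram data is scraped screenshots.
screenshot_bytes = total_bytes - telegram_bytes

avg_screenshot_mb = screenshot_bytes / screenshot_count / 1000 ** 2
avg_telegram_kb = telegram_bytes / telegram_count / 1000

print(f"average screenshot size: about {avg_screenshot_mb:.1f} MB")
print(f"average Telegram image size: about {avg_telegram_kb:.0f} KB")
```

Under these assumptions the screenshots, not the Telegram images, dominate the storage, which matters for any decision about exposing them through the API.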

Regards,

BTW, we upgraded the server today due to a hardware failure on the old one :(

@FafnerKeyZee FafnerKeyZee added enhancement New feature or request question Further information is requested labels May 20, 2024