New requests.exceptions.HTTPError today while wget and browser can still fetch, old python quirk maybe? #319
If I take the URL that the exception is thrown for and attempt to fetch it with Python this way:
Then it also works perfectly fine... So something happened with whatever Python library or function is used to fetch the video clips, is my best guess... But I cannot quite find where that is...
For the record, in case I lose track, I began to encounter this issue at precisely: Tue 19 Dec 2023 10:50:16 AM PST. Before that time it seems to have been fetching video clips properly with some custom archival code and loops, which I have been using for a few years now... Maybe Ring or AWS is rejecting my user_agent, which is ChromeAtHome/20220801; I tried updating it to today's date, and also tried using nothing. But that part seems to be working fine: I still get all of the event IDs, eliminate those I already have locally, then try to fetch any that I do not already have... Same as the past few years. A second system at home yields the same results. But Chromium, wget, and other Python URL-fetching libraries all seem to be able to get the video downloaded... Just this one is returning 404 for some reason all of a sudden... So weird... Any crazy ideas what's wrong?
So, I resorted to upgrading to a brand new Raspbian 12 64-bit (Bookworm); same error... However, this time it threw us a clue...
ring_doorbell.exceptions.RingError: HTTP error with status code 404 during query of url
Although, perhaps Ring has changed something with these URLs so that they no longer work with the python-ring-doorbell project? Or perhaps the python-urllib3 that I believe this project uses for the GET calls for clips no longer works because Ring has changed something about the way these are fetched? Shrugs.
Is no one else having this problem? I have no idea why this behavior changed all of a sudden on me. The Traceback dumps the 404 error and the URL it failed at, as described above, and if I just run "wget url_for_share-service-download-bucket.s3.amazonaws.com_blah_blah -O name_wanted_for_saved_file.mp4" then it works perfectly fine. I think I could go so far as to write a further bash wrapper that would scrape off my intended filename, grab this URL from the Traceback, and then use the pair to have wget do my video downloads... But that is getting a little bit insane, like a bad inception nightmare...
So, I resorted to bash to catch and filter out the FILE and the URL to attempt with wget from outside the Python. It appears that there is more latency than there once was: I would also get the 404 response with wget when I did not have the natural delay of manually copy/pasting the failed URL. By adding an extraordinary 50 seconds of sleep between the url = URL_RECORDING.format(recording_id) line and the req = self._ring.query(url, timeout=timeout) line (around lines 382 and 385 in the doorbot.py file), I am now back to successfully downloading clips again.

Even with this 50 seconds of added sleep, download attempts do still fail: 30% of the time thus far, with a sample size of 3. Manually retrying the third was successful, and now a fourth needed more than 50 seconds. Going to bump this up to 90 seconds and see if I can get fewer failures. So it appears the code needs to explicitly catch the 404 response code and then do an escalating delay and retries, such as sleeping for 15, 30, 60, 120, 240 seconds and then finally giving up. Or it could perhaps gain a pair of new variables, like a NotFoundRetryDelay of 5 seconds and a NotFoundRetryAttempts of 12 (= 60 seconds) or 24 (= 2 minutes) worth of retrying. The goal should be to try a few more times, but not so much that the S3 bucket thinks it is being DoSed by retrying too rapidly.

Still got a 404 at 90 seconds which succeeded when retried manually using wget, so bumping up to 120 seconds. It might be possible that an initial attempt triggers S3 to actually make the target object available, so each retry might improve the odds that the object is actually ready to be downloaded? Still failed at 120 but manually succeeded upon retry; increasing the delay to 150 seconds (2.5 minutes), which is absolutely ridiculous... If I am forced to wait multiple minutes between downloading each video then my process here will never keep up with archiving all of my videos like I have been doing for the past several years...
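The escalating-delay idea above could be sketched roughly like this. This is only an illustration, not code from the library: NotFoundError stands in for whatever exception the 404 surfaces as, and fetch is any callable that attempts the download.

```python
import time

class NotFoundError(Exception):
    """Stand-in for the 404 error raised by the download query."""

def fetch_with_backoff(fetch, delays=(15, 30, 60, 120, 240), sleep=time.sleep):
    """Retry `fetch` with the escalating delays suggested above.

    Sleeps between attempts while the clip still returns 404; the final
    attempt lets the exception propagate so the caller sees the failure.
    """
    for delay in delays:
        try:
            return fetch()
        except NotFoundError:
            sleep(delay)  # S3 object may not be ready yet; wait and retry
    return fetch()  # last attempt, no catch
```

The sleep parameter is injectable only so the loop can be exercised without actually waiting; real use would keep the default.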
So, I might be able to muddle through and add a new handler for 404 to get that function in the mentioned file to retry after a delay a few times, but I promise any such patch is bound to be ugly. So someone else should probably pick it up from here. I find it difficult to believe that other people are not encountering this problem.
The bash-flavored exception handler in all its glory; I was using this to simply keep my session alive since late Dec, adapted into retrying the 404-failed attempts:
I normally run a bash alias that calls this custom py of mine and progressively increases the queue_depth, starting at 8 and going up to about 4096-ish or more, depending on how deep I need to catch up with clips. Most cameras are not so busy, and a depth this large can go back a month or so. But some cameras are very busy and have many hundreds or thousands of clips per day. I have yet another bash alias that can take a date.time string and walk backwards in time by the hour or day so as to fetch deeper and deeper in time in order to catch up... I'll probably need to leverage that one, as I am coming up on 30 days without having been able to fetch any clips. Yet another rub: now getting 403: Forbidden
I believe the 403 Forbidden is because the clip download URL only remains valid for less than 3~5 minutes... So perhaps a valid method is to slowly retry during 404 until 403 is returned, which indicates it has expired.
The clip seems to remain valid for about 5 minutes to 5 minutes 30 seconds-ish.
I was not able to create a 404 handler and retry because the script exits with the Traceback before getting to that point, right after the line that attempts to download the clip... Perhaps instead, before fetching the clip for download, it could query some other property of the clip, like size or modified time, and only proceed to attempt the download when those do not return 404?
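The pre-check idea could look something like the sketch below: poll cheaply (e.g. a HEAD request) until something other than 404 comes back, treating 403 as the URL having already expired. check_status is a hypothetical callable returning an HTTP status code (in practice it might wrap requests.head(url).status_code); this is an assumption about how the S3 URLs behave, not library code.

```python
import time

def wait_until_ready(check_status, timeout=330, interval=5,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until check_status() stops returning 404.

    Returns True when the clip looks downloadable, False if the URL
    expired (403) or the timeout (~5.5 min, the observed validity
    window) elapsed while still getting 404.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        status = check_status()
        if status == 404:       # object not available yet
            sleep(interval)
            continue
        return status != 403    # 403 means the signed URL expired
    return False
```

The clock and sleep parameters exist only so the loop is testable without real waiting.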
A little context: my ./RingFetch.py is a crummy wrapper Python script that gives me some progress output with nice colors and such, which looks something like this: Sometimes a clip is 0 bytes, so we skip those. Sometimes we skip cameras by their name, such as ## PlasmaRandom ## being skipped here. If we will attempt to fetch, we prefix with "Fetchin", then the file name, and a suffix of "Fetch" while in progress. Upon success, the prefix "Fetchin" is replaced with "Success#" in green, and we print the file size as a suffix.
This last one is the new (as of a month ago) unhandled 404 exception. As it turns out, if I could get it to retry until it succeeded, OR until we get a 403 response because the elapsed time has expired the URL, then these would work again. But since I'm no good at Python, I've composed this nasty bash loop in the meantime, which, while ugly, seems to be working and handles all exceptions I've encountered thus far:
DP is the clip depth for each camera. I normally wrap my RingFetch.py in a bash alias loop that increases the queue depth while counting the zero-length files, incrementing the depth only if the count of zero-size videos is unchanged... If the zero-length video count increases, then something went wrong and we should retry at the same queue depth again until we completely fetch all clips for that given depth; only then do we fetch a larger list of clips to be downloaded.
@4Dolio quite the extensive history you've recorded here. Definitely provides some valuable insight. I've also been experiencing 404 errors when attempting to download many consecutive camera recordings. Here's what seems to be happening:
From what it seems, there's some kind of wait period that needs to be satisfied before following the redirect, since the Ring API / S3 needs time to actually prepare the file for download. To confirm this theory, you can grab the S3 URL from the thrown error message and paste it into your browser a few seconds later, and the file will start downloading just fine (you may need to refresh a few times if it's not ready yet). In addition, this can be confirmed by the behavior exhibited on the official Ring website when downloading recordings the normal way. The solution for this library (@tchellomello 😉🔔) would be to implement logic that handles 404 errors and then retries the post-redirect link (S3) until no 404 is returned (with some retry counter/cap, of course). The solution above should resolve this issue and prevent additional confusion.
@sdb9696 going to ping you on this one to bring it to your attention, see above ^
@5E7EN do you have the details of the API call it makes? |
@sdb9696 sure, and feel free to have a look yourself as well on the Ring history web dashboard. Clicking the "Download" button triggers the following series of API calls:
I managed to resolve the download issue in my case. The 404 error occurs when calling |
@5E7EN does this mean that you do not get the 404 for normal, non-older_than (current) clips?

I managed to work around the problem by wrapping my Python (I fumbled around until I got a Python process that uses the video file names I want and redraws the progress lines while it attempts, succeeds, fails, gets an empty file, etc.)... Anyway, I wrapped that Python in a bash shell (which I'm way better at) alias that can scrape out the final failed 404 URL and the intended file name, and then repeat the download attempt using wget until it succeeds or I decide it's been too long (about 55.5 minutes). I've been collecting counts of how many times it loops until it succeeds, so I could try to quantify the spread of the delays I've seen over the past few months of using this new method. I was seeing the same issue: the clips only become valid after some variable delay of 5 seconds up to 5.5-ish minutes, after which they expire again. Since I use mine to download all my videos locally for archival, I can't just wait 5 minutes.

Tangent: I also implemented a parallel race-condition lock, so I can run many instances of the bash(Python) to fetch in parallel without DOSing the Ring service, which will cause your IP to get blocked for hours to days. Perk of having multiple uplinks: I'm able to switch the uplink my RasPi is using. FYI, you should not try to fetch the clip list more than once per 5 seconds, as I recall...
I began getting the following error earlier today:
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://share-service-download-bucket.s3.amazonaws.com/FooBar...ManyLinesLongURL...OfVideoFile
If I manually open up that url in a browser or with wget then it works fine...
So something seems to have broken with the Python side in the past 24 hours. Two RasPis running my fetch scripts are both doing this. I know just enough Python to have written a wrapper for my use case some years ago. But I'm not sure why Python can no longer fetch these, when wget and the local browser can? Did some core certificates change, perhaps?