tumblr [downloader.http][warning] '404 Not Found' #5565

Open
openDef opened this issue May 8, 2024 · 7 comments

openDef commented May 8, 2024

Good afternoon, dear developer!
Please help me solve this problem.
I apologize in advance for the grammar (automatic translation).
When downloading from tumblr, the download very often stalls for quite a long time, then continues until the next error.
The failing URL opens normally in the browser.
If I restart the download, the error repeats with a different link.
Very rarely, the download completes without errors.
This is what it shows:

[downloader.http][warning] '404 Not Found' for 'https://64.media.tumblr.com/561d741270e97b2f944093eda765f06b/720934f95e54bd55-5f/s99999x99999/7dc8a9a2aa7debaafa8faf365d06a050d34348ad.png'
[download][info] Trying fallback URL #1

Author

openDef commented May 8, 2024

I also noticed this error, which is extremely rare:

[tumblr][warning] Unable to fetch higher-resolution version of https://64.media.tumblr.com/9dfa1ca50184eaa057b8ebf44837f62e/ec050682828daa53-14/s99999x99999/e8456d7273a15107ee4943b37cdc603a29ab23e1.jpg (734239171816898560)
[download][error] Failed to download e8456d7273a15107ee4943b37cdc603a29ab23e1.jpg

Contributor

Hrxn commented May 14, 2024

@openDef

Well, #2957 was closed for a reason: the problem has been solved.

Errors with tumblr do occur occasionally, though rarely; they depend on how tumblr's servers handle the requests gallery-dl makes with its URL modifications. Fundamentally, nothing more can be done about this on gallery-dl's side. Or, to put it another way, gallery-dl already does everything it can to mitigate the problem.

Here's what you should be doing:

  1. Use a continuous "logfile" ("mode": "a") so that you always keep a record of any error that occurs.
{
   "output":
   {
       "log": {
           "level": "info",
           "format-date": "%Y-%m-%dT%H:%M:%S",
           "format": {
               "debug"  : "\u001b[0;37mDebug  :  {name} -> {message}\u001b[0m",
               "info"   : "\u001b[1;37mInfo   :  {name} -> {message}\u001b[0m",
               "warning": "\u001b[1;33mWarning:  {name} -> {message} {extractor.url:?[/]/}\u001b[0m",
               "error"  : "\u001b[1;31mError  :  {name} -> {message} {extractor.url:?[/]/}\u001b[0m"
           }
       },

       "logfile": {
           "path": "D:\\gallery-dl\\gallery-dl.log.txt",
           "mode": "a",
           "format": {
               "debug"  : "[{asctime}][{levelname}] {message}",
               "info"   : "[{asctime}][{levelname}] {message}",
               "warning": "[{asctime}][{levelname}] {message} [Source URL: {extractor.url}]",
               "error"  : "[{asctime}][{levelname}] {message} [Source URL: {extractor.url}]"
           },
           "format-date": "%Y-%m-%dT%H:%M:%S",
           "level": "info"
       }
   }
}
  2. Use the correct extractor options for tumblr. Here's what I am using (except for the tokens, obviously):
{
    "extractor":
    {
        "tumblr":
        {
            "avatar": false,
            "external": true,
            "inline": true,
            "original": true,
            "ratelimit": "wait",
            "reblogs": true,
            "posts": "all",
            "fallback-delay": 90.0,
            "fallback-retries": 6,

            "retries": 24,
            "skip": "abort:12",
            "sleep-request": [0.2, 0.6],
            "sleep-extractor": [0.4, 2.0],
            "blacklist": ["twitter", "instagram", "flickr"]
        }
    }
}

Using "original": true is optional, but it prevents lower-resolution versions from being downloaded as a fallback, so you don't have to clean them up later if you don't want to keep them.

The "fallback-delay" and "fallback-retries" options matter here for the URL substitution trick. These errors come either from tumblr's CDN not returning the response we want or from other intermittent network issues, so simply raise the retry count and the delay to deal with them.
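
As a rough illustration of what a retry count and a fixed delay amount to, here is a generic retry loop. This is not gallery-dl's actual code: `fetch_with_retries` and its `fetch` callback are hypothetical names, and the sketch assumes one initial attempt followed by up to `retries` further attempts. Under that assumption, the values above (6 retries, 90 seconds) mean a file that never succeeds costs up to 6 × 90 = 540 seconds of waiting.

```python
import time


def fetch_with_retries(fetch, retries=6, delay=90.0, sleep=time.sleep):
    """Call fetch() until it returns something other than None.

    Makes one initial attempt plus up to `retries` further attempts,
    sleeping `delay` seconds before each retry. Returns None if every
    attempt fails. `fetch` is a hypothetical callback standing in for
    "request the full-resolution file".
    """
    for attempt in range(retries + 1):
        if attempt:
            # Only wait before retries, never before the first attempt.
            sleep(delay)
        result = fetch()
        if result is not None:
            return result
    return None
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting.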

Should their servers still act up after all the waiting and retrying, we now have the log to fall back on:

[yyyy-mm-ddTHH:MM:SS][warning] Unable to fetch higher-resolution version of https://64.media.tumblr.com/<...>  (<post_ID>) [Source URL: <blog_URL>]

You can reconstruct the post URL by taking the <blog_URL> and the <post_ID> and putting them together like this:
<blog_URL>/post/<post_ID>

Once you have a bunch of these, simply feed them to gallery-dl again.
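
The reconstruction can also be scripted. The following is a minimal sketch, assuming the logfile uses the custom "warning" format shown above (the `[Source URL: ...]` suffix only exists with that format); `post_urls` is a hypothetical helper name, not part of gallery-dl.

```python
import re

# Matches warning lines written with the custom logfile format above.
# Assumption: the post ID is numeric and appears in parentheses right
# before the "[Source URL: ...]" suffix.
PATTERN = re.compile(
    r"Unable to fetch higher-resolution version of \S+\s+"
    r"\((?P<post_id>\d+)\)\s+\[Source URL: (?P<blog_url>[^\]]+)\]"
)


def post_urls(log_lines):
    """Yield unique <blog_URL>/post/<post_ID> URLs in first-seen order."""
    seen = set()
    for line in log_lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # not a "higher-resolution" warning
        url = f"{match['blog_url'].rstrip('/')}/post/{match['post_id']}"
        if url not in seen:
            seen.add(url)
            yield url
```

One way to use it is to write the resulting URLs to a text file, one per line, and pass that file to gallery-dl with `-i` (`--input-file`).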

Author

openDef commented May 14, 2024

Thanks a lot for the reply!
At my skill level, I can hardly follow your answer))
But it looks beautiful)
Here's a config I've built.
Is this how the whole structure (config.json) should look?

    {
    "output":
    {
   "log": {
       "level": "info",
       "format-date": "%Y-%m-%dT%H:%M:%S",
       "format": {
           "debug"  : "\u001b[0;37mDebug  :  {name} -> {message}\u001b[0m",
           "info"   : "\u001b[1;37mInfo   :  {name} -> {message}\u001b[0m",
           "warning": "\u001b[1;33mWarning:  {name} -> {message} {extractor.url:?[/]/}\u001b[0m",
           "error"  : "\u001b[1;31mError  :  {name} -> {message} {extractor.url:?[/]/}\u001b[0m"
       }
   },

   "logfile": {
       "path": "D:\\gallery-dl\\gallery-dl.log.txt",
       "mode": "a",
       "format": {
           "debug"  : "[{asctime}][{levelname}] {message}",
           "info"   : "[{asctime}][{levelname}] {message}",
           "warning": "[{asctime}][{levelname}] {message} [Source URL: {extractor.url}]",
           "error"  : "[{asctime}][{levelname}] {message} [Source URL: {extractor.url}]"
       },
       "format-date": "%Y-%m-%dT%H:%M:%S",
       "level": "info"
       }
       }
        }
       {
         "extractor": {
         "tumblr":
    { 
"avatar": false,
        "external": true,
        "inline": true,
        "original": true,
        "ratelimit": "wait",
        "reblogs": true,
        "posts": "all",
        "fallback-delay": 90.0,
        "fallback-retries": 6,

        "retries": 24,
        "skip": "abort:12",
        "sleep-request": [0.2, 0.6],
        "sleep-extractor": [0.4, 2.0],
        "blacklist": ["twitter", "instagram", "flickr"]	
"api-key": "6--------------------------------------c",
"api-secret": "5----------------------------------------X",
"filename": "{filename}.{extension}",		
"image-filter": "extension not in ('m4v', 'gif', 'mp3', 'webm', 'avi', 'mp4', 'mkv', '')"
 }
    
 }
 }

openDef closed this as completed May 14, 2024
openDef reopened this May 14, 2024
Author

openDef commented May 14, 2024

I definitely made a mistake somewhere,
but I can't figure out where.

Author

openDef commented May 14, 2024

@openDef
Quotes and commas defeated me
and I couldn't put everything right to make it work

Contributor

Hrxn commented May 14, 2024

Here is a full config for tumblr, just change values as you need:

{
    "extractor":
    {
        "base-directory": "D:\\Home\\Downloads\\",
        "archive": "D:\\Home\\Apps\\gallery-dl\\gallery-dl.archive.global.db",
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
        "skip": true,

        "keywords": {"category": ""},
        "keywords-default": "",
        "parent-directory": true,
        "extension-map":
        {
            "jpeg": "jpg",
            "jpe" : "jpg"
        },

        "tumblr":
        {
            "user":
            {
                "directory": {
                    "locals().get('category')": ["Tumblr", "Blogs", "{category}", "{blog_name!c}"],
                    ""                        : ["Tumblr", "Blogs", "Unsorted", "{blog_name!c}"]
                },
                "filename": {
                    "locals().get('slug')": "{date:%Y-%m-%d-%H%M%S}.{id}.{num:>02}.{slug:R.//}.{extension}",
                    ""                    : "{date:%Y-%m-%d-%H%M%S}.{id}.{num:>02}.{extension}"
                }
            },
            "post":
            {
                "directory": {
                    "locals().get('category')": ["Tumblr", "Posts", "{category}"],
                    ""                        : ["Tumblr", "Posts", "Unsorted"]
                },
                "filename": {
                    "locals().get('slug')": "{date:%Y-%m-%d-%H%M%S}.{blog_name}.{id}.{num:>02}.{slug:R.//}.{extension}",
                    ""                    : "{date:%Y-%m-%d-%H%M%S}.{blog_name}.{id}.{num:>02}.{extension}"
                }
            },

            "archive-prefix": "",
            "archive": "D:\\Home\\Apps\\gallery-dl\\gallery-dl.archive.tumblr.db",
            "avatar": false,
            "external": true,
            "inline": true,
            "original": true,
            "ratelimit": "wait",
            "reblogs": true,
            "posts": "all",
            "fallback-delay": 90.0,
            "fallback-retries": 6,

            "retries": 24,
            "skip": "abort:12",
            "sleep-request": [0.2, 0.6],
            "sleep-extractor": [0.4, 2.0],
            "blacklist": ["twitter", "instagram", "flickr"],

            "api-key": "  ----  ",
            "api-secret": "  ----  ",
            "access-token": "  ----  ",
            "access-token-secret": "  ----  "
        }
    },
    "output":
    {
        "mode": "color",
        "ansi": true,
        "shorten": "eaw",
        "skip": true,
        "progress": true,
        "log": {
            "level": "info",
            "format-date": "%Y-%m-%dT%H:%M:%S",
            "format": {
                "debug"  : "\u001b[0;37mDebug  :  {name} -> {message}\u001b[0m",
                "info"   : "\u001b[1;37mInfo   :  {name} -> {message}\u001b[0m",
                "warning": "\u001b[1;33mWarning:  {name} -> {message} {extractor.url:?[/]/}\u001b[0m",
                "error"  : "\u001b[1;31mError  :  {name} -> {message} {extractor.url:?[/]/}\u001b[0m"
            }
        },
        "logfile": {
            "path": "D:\\Home\\Apps\\gallery-dl\\gallery-dl.log.txt",
            "mode": "a",
            "format": {
                "debug"  : "[{asctime}][{levelname}] {message}",
                "info"   : "[{asctime}][{levelname}] {message}",
                "warning": "[{asctime}][{levelname}] {message} [Source URL: {extractor.url}]",
                "error"  : "[{asctime}][{levelname}] {message} [Source URL: {extractor.url}]"
            },
            "format-date": "%Y-%m-%dT%H:%M:%S",
            "level": "info"
        }
    }
}
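
Syntax slips like the ones in the config posted earlier in this thread (a second top-level `{ ... }` object after the first one, and a missing comma after the "blacklist" array) can be located before running gallery-dl. A minimal sketch, assuming the config file is plain JSON; `check_config` is a hypothetical helper that only uses Python's standard `json` module:

```python
import json


def check_config(text):
    """Return None if `text` is valid JSON, else a short error description.

    The description contains the line and column where parsing stopped,
    which is usually at or just after the actual mistake.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None
```

To check a file, read it as text (for example with `pathlib.Path("config.json").read_text(encoding="utf-8")`, where the path is a placeholder) and pass the result to `check_config`. The parser stops at the first error, so fix that one and run the check again.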

Author

openDef commented May 15, 2024

Thank you so much for your help
Health and prosperity to you!
