Remote Sync Strips Trailing Forward Slash - Results in 404 #3240

Open
joey-grant opened this issue Sep 1, 2023 · 8 comments

@joey-grant

Summary

I have two Pulp servers: a primary (where I control package promotion, etc.) and a secondary that simply syncs repositories from the primary. I am using nginx as my reverse proxy and am also using certguard (though I don't think it has an impact here). The problem is that my secondary server's rpm remote points to the primary's distribution and includes a trailing slash, but pulp rpm sync seems to strip that slash, resulting in 404s.

Steps to reproduce

[root@primary ~]# pulp rpm distribution show --name application-eng
{
  "pulp_href": "/pulp/api/v3/distributions/rpm/rpm/ea16f53b-8a78-4395-8935-eaa7d96f06c7/",
  "pulp_created": "2023-08-09T15:28:09.236805Z",
  "base_path": "application-x86_64-eng",
  "base_url": "https://primary.env.company.com/pulp/content/application-x86_64-eng/",
  "content_guard": "/pulp/api/v3/contentguards/certguard/x509/b0fa42ca-5eff-4c0d-a53c-dfadf9268ae5/",
  "pulp_labels": {},
  "name": "application-eng",
  "repository": null,
  "publication": "/pulp/api/v3/publications/rpm/rpm/5f217017-ab21-4ed6-8b5e-96afab630321/"
}
[root@secondary ~]# pulp rpm remote show --name application-uri-eng
{
  "pulp_href": "/pulp/api/v3/remotes/rpm/rpm/8c4ee2f0-e095-4967-bc48-fa1a041d37e2/",
  "pulp_created": "2023-09-01T14:12:29.471030Z",
  "name": "application-uri-eng",
  "url": "https://primary.env.company.com/pulp/content/application-x86_64-eng/",
  "ca_cert": <redacted>,
  "client_cert": <redacted>,
  "tls_validation": true,
  "proxy_url": null,
  "pulp_labels": {},
  "pulp_last_updated": "2023-09-01T14:29:33.775680Z",
  "download_concurrency": null,
  "max_retries": null,
  "policy": "on_demand",
  "total_timeout": null,
  "connect_timeout": null,
  "sock_connect_timeout": null,
  "sock_read_timeout": null,
  "headers": null,
  "rate_limit": null,
  "hidden_fields": [
    {
      "name": "client_key",
      "is_set": true
    },
    {
      "name": "proxy_username",
      "is_set": false
    },
    {
      "name": "proxy_password",
      "is_set": false
    },
    {
      "name": "username",
      "is_set": false
    },
    {
      "name": "password",
      "is_set": false
    }
  ],
  "sles_auth_token": null
}

[root@secondary ~]# pulp rpm repository show --name application-eng
{
  "pulp_href": "/pulp/api/v3/repositories/rpm/rpm/36a955a4-f5d6-4b48-947f-46edbe396201/",
  "pulp_created": "2023-09-01T14:12:31.994781Z",
  "versions_href": "/pulp/api/v3/repositories/rpm/rpm/36a955a4-f5d6-4b48-947f-46edbe396201/versions/",
  "pulp_labels": {},
  "latest_version_href": "/pulp/api/v3/repositories/rpm/rpm/36a955a4-f5d6-4b48-947f-46edbe396201/versions/0/",
  "name": "application-eng",
  "description": "Mirror for: https://primary.env.company.com/pulp/content/application-x86_64-eng/",
  "retain_repo_versions": null,
  "remote": "/pulp/api/v3/remotes/rpm/rpm/8c4ee2f0-e095-4967-bc48-fa1a041d37e2/",
  "autopublish": true,
  "metadata_signing_service": null,
  "retain_package_versions": 0,
  "metadata_checksum_type": null,
  "package_checksum_type": null,
  "gpgcheck": 0,
  "repo_gpgcheck": 0,
  "sqlite_metadata": false
}

[root@secondary ~]# pulp rpm repository sync --name application-eng
Started background task /pulp/api/v3/tasks/676d1531-ba40-43f6-9c76-48bb3f4c4f87/
Error: Task /pulp/api/v3/tasks/676d1531-ba40-43f6-9c76-48bb3f4c4f87/ failed: '404, message='Not Found', url=URL('https://primary.env.company.com/pulp/content/application-x86_64-eng')'

[root@secondary ~]# curl --key pulp.key --cert pulp.pem https://primary.env.company.com/pulp/content/application-x86_64-eng/

<html>
<head><title>Index of /pulp/content/application-x86_64-eng/</title></head>
<body bgcolor="white">
<h1>Index of /pulp/content/application-x86_64-eng/</h1>
<hr><pre><a href="../">../</a>
<a href="Packages/">Packages/</a>                                                                                           29-Jun-2022 03:54
<a href="config.repo">config.repo</a>
<a href="repodata/">repodata/</a>                                                                                           25-Aug-2023 15:39
</pre><hr></body>
</html>

Expected behavior

I expected pulp rpm repository sync on the secondary to retain the trailing forward slash as defined in the rpm remote.

Stacktrace/Error log

[root@primary ~]# tail -n2 /var/log/nginx/access.log
XXX.XXX.XXX.XXX - - [01/Sep/2023:15:19:58 +0000] "GET /pulp/content/application-x86_64-eng HTTP/1.1" 404 14 "-" "pulpcore/3.22.1 (cpython 3.8.11-final0, Linux x86_64) (aiohttp 3.8.1)"
XXX.XXX.XXX.XXX - - [01/Sep/2023:15:21:33 +0000] "GET /pulp/content/application-x86_64-eng/ HTTP/1.1" 200 649 "-" "curl/7.29.0"

Pulp and pulp-cli version info

[root@primary ~]# pulp status
{
  "versions": [
    {
      "component": "core",
      "version": "3.22.1",
      "package": "pulpcore"
    },
    {
      "component": "rpm",
      "version": "3.19.7",
      "package": "pulp-rpm"
    },
...
[root@secondary ~]# pulp --version
pulp3 command line interface, version 0.19.2

Additional context

@mdellweg
Member

mdellweg commented Sep 1, 2023

It looks to me like the remote is configured correctly. And since the trailing slash is part of the url there, I cannot see that the cli is to blame either. Would you be able to provide a full stacktrace of this failure?
You should get that either from pulp task show or from the server logs.

@joey-grant
Author

Sure thing, thanks for looking at this with me.

[root@secondary ~]# pulp task show --href /pulp/api/v3/tasks/676d1531-ba40-43f6-9c76-48bb3f4c4f87/
{
  "pulp_href": "/pulp/api/v3/tasks/676d1531-ba40-43f6-9c76-48bb3f4c4f87/",
  "pulp_created": "2023-09-01T15:19:58.646463Z",
  "state": "failed",
  "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
  "logging_cid": "6cc165caf2204c5b97bf55399a0b057b",
  "started_at": "2023-09-01T15:19:58.769975Z",
  "finished_at": "2023-09-01T15:19:59.018980Z",
  "error": {
    "traceback": "  File \"/usr/local/lib/pulp/lib64/python3.8/site-packages/pulpcore/tasking/pulpcore_worker.py\", line 444, in _perform_task\n    result = func(*args, **kwargs)\n  File \"/usr/local/lib/pulp/lib64/python3.8/site-packages/pulp_rpm/app/tasks/synchronizing.py\", line 486, in synchronize\n    remote_url = fetch_remote_url(remote, url)\n  File \"/usr/local/lib/pulp/lib64/python3.8/site-packages/pulp_rpm/app/tasks/synchronizing.py\", line 305, in fetch_remote_url\n    remote_url = fetch_mirror(remote)\n  File \"/usr/local/lib/pulp/lib64/python3.8/site-packages/pulp_rpm/app/tasks/synchronizing.py\", line 254, in fetch_mirror\n    result = downloader.fetch()\n  File \"/usr/local/lib/pulp/lib64/python3.8/site-packages/pulpcore/download/base.py\", line 175, in fetch\n    return done.pop().result()\n  File \"/usr/local/lib/pulp/lib64/python3.8/site-packages/pulpcore/download/http.py\", line 273, in run\n    return await download_wrapper()\n  File \"/usr/local/lib/pulp/lib64/python3.8/site-packages/backoff/_async.py\", line 151, in retry\n    ret = await target(*args, **kwargs)\n  File \"/usr/local/lib/pulp/lib64/python3.8/site-packages/pulpcore/download/http.py\", line 258, in download_wrapper\n    return await self._run(extra_data=extra_data)\n  File \"/usr/local/lib/pulp/lib64/python3.8/site-packages/pulp_rpm/app/downloaders.py\", line 117, in _run\n    self.raise_for_status(response)\n  File \"/usr/local/lib/pulp/lib64/python3.8/site-packages/pulp_rpm/app/downloaders.py\", line 102, in raise_for_status\n    response.raise_for_status()\n  File \"/usr/local/lib/pulp/lib64/python3.8/site-packages/aiohttp/client_reqrep.py\", line 1004, in raise_for_status\n    raise ClientResponseError(\n",
    "description": "404, message='Not Found', url=URL('https://primary.env.company.com/pulp/content/application-x86_64-eng')"
  },
  "worker": "/pulp/api/v3/workers/0cd785be-c0ad-4415-89e9-6a7c9d6161fb/",
  "parent_task": null,
  "child_tasks": [],
  "task_group": null,
  "progress_reports": [],
  "created_resources": [],
  "reserved_resources_record": [
    "/pulp/api/v3/repositories/rpm/rpm/36a955a4-f5d6-4b48-947f-46edbe396201/",
    "shared:/pulp/api/v3/remotes/rpm/rpm/8c4ee2f0-e095-4967-bc48-fa1a041d37e2/"
  ]
}
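
The traceback field in the task output above is a JSON string with escaped newlines, which makes it hard to read in a terminal. A minimal sketch of how it could be unescaped, assuming the task JSON has been saved or piped into a Python script; the function name `task_traceback` and the shortened sample data are hypothetical and not part of pulp or pulp-cli:

```python
import json


def task_traceback(task_json: str) -> str:
    """Return the traceback stored in a task's JSON, with real newlines."""
    task = json.loads(task_json)
    # "error" is null for tasks that did not fail, so fall back to an empty dict.
    error = task.get("error") or {}
    return error.get("traceback", "")


# Hypothetical, shortened sample shaped like the task output above.
sample = json.dumps(
    {
        "state": "failed",
        "error": {
            "traceback": (
                '  File "synchronizing.py", line 254, in fetch_mirror\n'
                "    result = downloader.fetch()\n"
            ),
            "description": "404, message='Not Found'",
        },
    }
)

readable = task_traceback(sample)
print(readable)
```

With the real output, something like redirecting `pulp task show --href ...` into a file and passing the file contents to `task_traceback` should give one frame per line.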

@mdellweg
Member

mdellweg commented Sep 1, 2023

@dralley Does this look familiar to you? Would you agree to reassigning this issue to pulp_rpm?

@dralley
Contributor

dralley commented Sep 1, 2023

I'm fine with reassigning it to pulp_rpm. I agree that it probably isn't a CLI issue, at least.

mdellweg transferred this issue from pulp/pulp-cli Sep 2, 2023
@joey-grant
Author

It appears that this line of code in synchronizing.py is the culprit. What is the purpose of deliberately stripping the trailing forward slash?

downloader = remote.get_downloader(url=remote.url.rstrip("/"), urlencode=False)
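
To make the effect of that `rstrip("/")` concrete, here is a small standalone illustration using only the Python standard library. It is not pulp_rpm code; it only shows what the string operation does to the remote URL from this report, and how generic relative-URL resolution differs with and without the trailing slash:

```python
from urllib.parse import urljoin

remote_url = "https://primary.env.company.com/pulp/content/application-x86_64-eng/"

# rstrip("/") removes every trailing slash, producing the URL seen in the 404.
stripped = remote_url.rstrip("/")
print(stripped)

# With the trailing slash, a relative path is appended to the directory.
with_slash = urljoin(remote_url, "repodata/repomd.xml")
print(with_slash)

# Without it, standard URL resolution replaces the last path segment instead.
without_slash = urljoin(stripped, "repodata/repomd.xml")
print(without_slash)
```

The stripped value matches the URL in the task error and in the nginx access log entry that returned 404.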

@ipanova
Member

ipanova commented Oct 12, 2023

What happens here is that your URL has been identified as a mirrorlist, which is not true:

remote_url = fetch_mirror(remote)

Can you share with us the contents of /repodata?
Is repomd.xml present?
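
One way to answer that question from the secondary is to request repodata/repomd.xml directly with the same client certificate used in the curl call from the report. The following is a rough sketch using only the standard library; the helper names `repomd_url` and `repomd_status` are made up for this example, and the certificate and key paths are whatever the remote is configured with:

```python
import ssl
import urllib.error
import urllib.request
from urllib.parse import urljoin


def repomd_url(base_url: str) -> str:
    """Build the repomd.xml URL, tolerating a missing trailing slash."""
    base = base_url if base_url.endswith("/") else base_url + "/"
    return urljoin(base, "repodata/repomd.xml")


def repomd_status(base_url: str, certfile: str, keyfile: str) -> int:
    """Return the HTTP status for repomd.xml, using a client certificate."""
    context = ssl.create_default_context()
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    try:
        with urllib.request.urlopen(
            repomd_url(base_url), context=context, timeout=30
        ) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as exceptions; report their status code.
        return exc.code


print(repomd_url("https://primary.env.company.com/pulp/content/application-x86_64-eng/"))
```

Calling `repomd_status(url, "pulp.pem", "pulp.key")` should return 200 if the file is served, and 404 if it is missing. If the primary uses a private CA, the default trust store may need that CA added first.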

@dralley
Contributor

dralley commented Oct 23, 2023

Related: pulp/pulpcore#3173

pedro-psb self-assigned this Jan 23, 2024
@pedro-psb
Member

pedro-psb commented Jan 24, 2024

Hello @munkey01,
@ipanova has a point: this error is only raised if there is an error finding repodata/repomd.xml (see the context). It would be really helpful to know whether repomd.xml is there or not.

Another possible reason for get_repomd_file not "finding" the file (aka raising ClientResponseError) would be something related to the downloader configuration. It's a long shot, but I can look into that if the repomd.xml file is confirmed to be available at repodata/repomd.xml.

I think this is not related to slashes, although it may look that way at first sight.

PS: as additional information, the 404 is expected when requesting https://primary.env.company.com/pulp/content/application-x86_64-eng (no slash) before pulpcore 3.40.0, but that is not meaningful here, because the sync task only requests this URL because it thinks the remote is a mirrorlist in the first place, as Ina already said.
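
The flow described in this thread can be sketched roughly as follows. This is a simplified model built from the comments and the traceback, not the actual pulp_rpm implementation: `resolve_remote_url`, `NotFound`, the injected `fetch` callable, and the "first line of the mirrorlist" parsing are all assumptions made for illustration.

```python
from typing import Callable


class NotFound(Exception):
    """Stand-in for a 404 (the real code raises aiohttp's ClientResponseError)."""


def resolve_remote_url(remote_url: str, fetch: Callable[[str], bytes]) -> str:
    """Try repomd.xml first; only on failure treat the URL as a mirrorlist."""
    base = remote_url if remote_url.endswith("/") else remote_url + "/"
    try:
        fetch(base + "repodata/repomd.xml")
        return base  # repomd.xml found: the URL is a regular repository
    except NotFound:
        pass
    # Fallback (assumed): fetch the slash-stripped URL as a mirrorlist and
    # take its first line as the real repository URL.
    body = fetch(remote_url.rstrip("/"))
    return body.decode().splitlines()[0].strip()


requested = []


def fake_fetch(url: str) -> bytes:
    """Simulate a server where neither repomd.xml nor the bare path is served."""
    requested.append(url)
    raise NotFound(url)


try:
    resolve_remote_url(
        "https://primary.env.company.com/pulp/content/application-x86_64-eng/",
        fake_fetch,
    )
except NotFound as exc:
    failed_url = str(exc)
    print("failed on:", failed_url)
```

In this model the slash-less request is only ever made after the repomd.xml request has failed, which is why the missing slash in the error message is a symptom rather than the cause.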
