hadoop-winutils33: Add version 3.3.5 #1716

Merged
niheaven merged 2 commits into ScoopInstaller:master from hadoop-winutils33 on May 13, 2024

Conversation

nightscape
Contributor

The author of the original hadoop-winutils is too busy to continue working on it, but there is a recommended fork that provides newer versions at cdarlint/winutils.
This PR adds version 3.3 from @cdarlint's fork.

Resolves ScoopInstaller/Extras#3009

Contributor

github-actions bot commented May 3, 2024

All changes look good.

Wait for review from human collaborators.

hadoop-winutils33

  • Description
  • License
  • Hashes
  • Checkver
  • Autoupdate

@nightscape
Contributor Author

@niheaven would you mind having a look?

@niheaven
Member

niheaven commented May 8, 2024

You are downloading the whole repo. Is there a way to download only the subdirectory?

@nightscape
Contributor Author

@niheaven not trivially... Here's a SO thread regarding downloading a single directory from Git:
https://stackoverflow.com/questions/7106012/download-a-single-folder-or-directory-from-a-github-repo

The solutions require either

  • a third-party service like DownGit or ssgithub.com, which is not great from a security perspective (e.g. Chrome displays a big red warning page when accessing DownGit)
  • using SVN to download the directory (which would then require installing SVN first)
  • using Git sparse checkouts (which would make the installation a non-standard multi-step process)

They all would make the installation more complicated or less safe.
Personally, I would bite the bullet and accept the bigger download (~25 MB).
But if you say one of the three options above is preferable nonetheless, I'll go for it.

@niheaven
Member

niheaven commented May 8, 2024

Hmm, another way: use https://github.com/ScoopInstaller/Binary to store these bins. Please refer to https://github.com/ScoopInstaller/Main/blob/master/bucket/dark.json to make a proper manifest, and upload the needed zip (or 7z) to the Binary repo.

@nightscape
Contributor Author

@niheaven so the approach would be to

  1. take the directories from https://github.com/cdarlint/winutils and zip each one
  2. create a PR against https://github.com/ScoopInstaller/Binary which adds the zips
  3. adapt this PR to download the artifacts from https://github.com/ScoopInstaller/Binary and maybe add the other minor versions (3.0, 3.1, 3.2) as well?
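For illustration, step 1 could be scripted roughly as follows. This is only a sketch: the function name `zip_version_dirs`, the `hadoop-*` glob, and the assumption that each version lives in its own top-level directory of a local clone are mine, not something the Binary repo prescribes.

```python
import shutil
from pathlib import Path


def zip_version_dirs(repo_root, out_dir):
    """Create one zip per top-level hadoop-* directory and return the archive paths."""
    repo_root = Path(repo_root)
    # Resolve to an absolute path so the archive location does not depend on
    # the current working directory while the archive is being written.
    out_dir = Path(out_dir).resolve()
    out_dir.mkdir(parents=True, exist_ok=True)

    archives = []
    for version_dir in sorted(repo_root.glob("hadoop-*")):
        if not version_dir.is_dir():
            continue
        # make_archive appends ".zip" to the base name and returns the final path.
        # With root_dir set, entries are stored relative to the version directory
        # (for example "bin/winutils.exe").
        archive = shutil.make_archive(
            str(out_dir / version_dir.name), "zip", root_dir=str(version_dir)
        )
        archives.append(Path(archive))
    return archives
```

Calling `zip_version_dirs("winutils", "dist")` on a local clone (both paths hypothetical) would leave one archive per version in `dist`, ready to be attached to a PR against the Binary repo.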

@niheaven
Member

Hmm, another way to make the manifest: download each file in that dir individually:

hadoop-winutils33
{
    "version": "3.3.5",
    "description": "Windows binaries for Hadoop versions.",
    "homepage": "https://github.com/cdarlint/winutils/",
    "license": "Apache-2.0",
    "suggest": {
        "JDK": "java/openjdk"
    },
    "url": [
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/hadoop",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/hadoop.cmd",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/hadoop.dll",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/hadoop.exp",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/hadoop.lib",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/hadoop.pdb",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/hdfs",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/hdfs.cmd",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/libwinutils.lib",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/mapred",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/mapred.cmd",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/winutils.exe",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/winutils.pdb",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/yarn",
        "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-3.3.5/bin/yarn.cmd"
    ],
    "hash": [
        "e8a1f032a56beceb1989c8467b58109d2bff47f8c7bc5de3dde76cf2c6452abf",
        "2d7f5d9b5ea01189cb0e0dfcea9e06eb58a6427a35e67afe72a80da30f9fd324",
        "d3dd64afdc85f2a7eb5345abf2ecaa744b0a157de40859313337d47f81ee1c7b",
        "03d8a1ed662ee4fcb7393e83386c1c086897a4b2f5a5df20eee84bc215ac5311",
        "2923e73622a1b6ecc5785692f9fb0f32361862ea369adf931a186a079dbec220",
        "6b8ac1bf6985f73f59f5633a7f02c1e945fde23080a8056c11ed760fbb54dd9a",
        "db79a2c81efa42a1255232f1239b67da920130f6df04807212ab1ea59edfa0ff",
        "ddc96e03b6ff62bd551ff5b9ec54a8b5d228aaf4836a7a4156f6a2e1b1c23741",
        "5c35b97e49b639112135deaf94afc5753333d6f9fc815adbe3ef6a3843a36ae5",
        "11c9502db17e000b838664ae76ace002f7bd6607a61e73a010022b5ee6bd6566",
        "d743c658af11eebc5350768e9f29d0a1dca42bbfad6b60b1ebf023bbf9de24fc",
        "a0ca6e358357c41ef56ebdb02c38e4a4d55da7ca7a13001678bb2ef7d644adea",
        "ba630ced9e7d587e17bbf78b17df7e1c83f07b32a0eef3b16cb3bf442454e627",
        "56ac42988d7fe5758667c5ca6be0c4455cb0bf073fa14dbdb6a105e3b3d6f234",
        "ad3544310c6687376d8d6b5f796c4b0d9fe64edec3907e63dd517f82a861d0bf"
    ],
    "bin": "winutils.exe",
    "checkver": {
        "url": "https://api.github.com/repos/cdarlint/winutils/contents/",
        "jsonpath": "$..name",
        "reverse": true,
        "regex": "hadoop-(3\\.3\\.\\d+)"
    },
    "autoupdate": {
        "url": [
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/hadoop",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/hadoop.cmd",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/hadoop.dll",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/hadoop.exp",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/hadoop.lib",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/hadoop.pdb",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/hdfs",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/hdfs.cmd",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/libwinutils.lib",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/mapred",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/mapred.cmd",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/winutils.exe",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/winutils.pdb",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/yarn",
            "https://raw.githubusercontent.com/cdarlint/winutils/master/hadoop-$version/bin/yarn.cmd"
        ]
    }
}

You could remove any unneeded files from the lists above (together with their matching hashes).
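To make the moving parts of the manifest easier to follow, here is a rough Python model of what the `checkver` and `autoupdate` blocks are meant to do, plus how a `hash` entry is produced. It is a sketch under assumptions, not Scoop's actual implementation (which is PowerShell): the helper names `pick_version`, `expand_urls` and `sha256_hex` are made up for illustration, the directory listing in the usage lines is hypothetical, and I assume the GitHub contents API returns names in ascending order so that `"reverse": true` ends up selecting the newest 3.3.x directory.

```python
import hashlib
import re

# Same pattern as the manifest's checkver regex, with every dot escaped.
VERSION_PATTERN = re.compile(r"hadoop-(3\.3\.\d+)")

# Mirrors the autoupdate URLs; "$version" is the placeholder to substitute.
URL_TEMPLATE = (
    "https://raw.githubusercontent.com/cdarlint/winutils/master/"
    "hadoop-$version/bin/{name}"
)


def pick_version(names):
    """Return the version captured by the last matching name, or None."""
    matches = []
    for name in names:
        found = VERSION_PATTERN.search(name)
        if found:
            matches.append(found.group(1))
    # Taking the last match models "reverse": true.
    return matches[-1] if matches else None


def expand_urls(version, file_names):
    """Fill the version into one URL per file, keeping the input order."""
    return [
        URL_TEMPLATE.format(name=name).replace("$version", version)
        for name in file_names
    ]


def sha256_hex(data: bytes) -> str:
    """SHA-256 digest as lowercase hex, the form used in the hash array."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical directory listing, standing in for the "$..name" values.
names = ["README.md", "hadoop-3.2.2", "hadoop-3.3.1", "hadoop-3.3.5"]
version = pick_version(names)
print(version)
print(expand_urls(version, ["winutils.exe", "hadoop.dll"])[0])
```

Because the `hash` array pairs with the `url` array by position, dropping a file means dropping the entry at the same index in both lists.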

@nightscape
Contributor Author

@niheaven I like the last approach best 🙂
It doesn't need any intermediate uploads or third-party tools, and the list of files hopefully won't change a lot.
I tested it locally and it works nicely. My last commit takes your proposed changes verbatim 👍
Regarding your question about unneeded files: Depending on the use case, each file might be required, so I'd leave them all in.
Good to merge from my side!

@niheaven
Member

/verify

Contributor

All changes look good.

Wait for review from human collaborators.

hadoop-winutils33

  • Description
  • License
  • Hashes
  • Checkver
  • Autoupdate

@niheaven niheaven merged commit 10788ae into ScoopInstaller:master May 13, 2024
2 checks passed
@nightscape nightscape deleted the hadoop-winutils33 branch May 13, 2024 13:35
Development

Successfully merging this pull request may close these issues.

hadoop-winutils should probably point to new location
2 participants