Combined workflow for building git-annex on Ubuntu and macOS #33

Merged
yarikoptic merged 12 commits into master from gh-30b on Sep 25, 2020

Conversation

jwodder
Member

@jwodder jwodder commented Sep 11, 2020

This is an alternative to #31 that combines the Ubuntu and macOS workflows into a single workflow.

Closes #30.

@yarikoptic
Member

Please rebase on top of current master. Unfortunately, I managed to cause conflicts: I recently pushed a small change, which will need to be redone in this unified workflow.

$> git diff 9a8535654b558bbc908a09fdc29f40bb6a806c4e..0848032c1840dfaf4f9d3589ba1efe4d853fb087
diff --git a/.github/workflows/build-git-annex-debianstandalone.yaml b/.github/workflows/build-git-annex-debianstandalone.yaml
index 92f1c1d..99f5319 100644
--- a/.github/workflows/build-git-annex-debianstandalone.yaml
+++ b/.github/workflows/build-git-annex-debianstandalone.yaml
@@ -57,7 +57,9 @@ jobs:
     needs: build-package
     strategy:
       matrix:
-        flavor: [normal, crippled-tmp, crippled-home]
+        # TODO: add , nfs-home
+        # https://git-annex.branchable.com/bugs/running_tests_on_NFS_HOME_does_not_exit_cleanly__58___gpgtmp/
+        flavor: [normal, crippled-tmp, crippled-home, nfs-tmp]
       fail-fast: false
 
     steps:
@@ -79,6 +81,13 @@ jobs:
           if echo "${{ matrix.flavor }}" | grep -q "crippled" ; then
             scripts/ci/setup_crippledfs /crippledfs 500
           fi
+          if echo "${{ matrix.flavor }}" | grep -q "nfs" ; then
+            mkdir /tmp/nfsmount_ /tmp/nfsmount
+            echo "/tmp/nfsmount_ localhost(rw)" | sudo bash -c 'cat - > /etc/exports'
+            sudo apt-get install -y nfs-kernel-server
+            sudo exportfs -a
+            sudo mount -t nfs localhost:/tmp/nfsmount_ /tmp/nfsmount
+          fi
 
       - name: Run tests
         run: |
@@ -90,6 +99,12 @@ jobs:
             crippled-home)
               export HOME=/crippledfs
               ;;
+            nfs-tmp)
+              export TMPDIR=/tmp/nfsmount
+              ;;
+            nfs-home)
+              export HOME=/tmp/nfsmount
+              ;;
             normal)
               ;;
             *)

No need to add the NFS flavors to the OSX run.
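
As a rough illustration of keeping the NFS setup out of the OSX run, the check from the diff above could be wrapped in a small helper that also looks at the runner OS. This is only a sketch: the function name `needs_nfs` is hypothetical, and the OS labels `ubuntu-*` and `macos-*` are assumed to be the values the combined workflow's matrix would pass in.

```shell
#!/bin/sh
# needs_nfs FLAVOR OS
# Succeeds (exit status 0) only when an NFS mount should be prepared:
# the flavor name contains "nfs" AND the runner is an Ubuntu one.
# Both the function name and the OS label patterns are assumptions.
needs_nfs() {
    flavor=$1
    os=$2
    case "$os" in
        ubuntu-*) ;;          # NFS flavors are only exercised on Linux
        *) return 1 ;;        # e.g. macos-latest: never set up NFS
    esac
    case "$flavor" in
        *nfs*) return 0 ;;    # nfs-tmp, nfs-home
        *) return 1 ;;        # normal, crippled-tmp, crippled-home
    esac
}

# Hypothetical usage inside a workflow step:
# if needs_nfs "$FLAVOR" "$RUNNER_OS_LABEL"; then
#     ... NFS export and mount commands from the diff above ...
# fi
```

An alternative with the same effect would be to exclude the NFS flavors from the macOS half of the matrix, so the helper is never reached there.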

@jwodder
Member Author

jwodder commented Sep 14, 2020

@yarikoptic Rebased.

@yarikoptic
Member

Great, thank you @jwodder! test-annex didn't run at all, even for OSX, for which the build succeeded. I guess that is because of needs: build-package, and that job failed for Ubuntu. I will try to finally figure that out today so we can progress. Also, I do not see any "download artifact" drop-down, even for OSX, even though that step is all good ("Finished uploading artifact git-annex-macos-dmg. Reported size is 26666118 bytes. There were 0 items that failed to upload"). Maybe that is also because some other matrix run failed, which is rather inconvenient.

@yarikoptic
Member

OK, the singularity issue is fixed now, so the Linux build succeeds and the tests are running. So we could proceed further with this PR (some of the test setup on OSX seems to need a bit more work).

@jwodder
Member Author

jwodder commented Sep 18, 2020

There are at least three problems currently preventing the workflow from running successfully:

  • Sometimes, test-datalad on macos-latest fails on the "Set up SSH target" step (seemingly on the docker-machine create --driver virtualbox default command) with the message:

      Error with pre-create check: "failure getting a version tag from the Github API response (are you getting rate limited by Github?)"
    
  • Sometimes, test-datalad on macos-latest fails on the "Set up SSH target" step with ssh: connect to host localhost port 42241: Connection refused. I suspect a race condition of some sort.

  • Sometimes, test-datalad on macos-latest with the version "release" fails on the "install release datalad" step because the curl request to https://api.github.com/repos/datalad/datalad/releases/latest is returning "403 rate limit exceeded". This may be the same problem as the first bullet above.

@yarikoptic
Member

I think actions/runner-images#602 (comment) sheds light on this:

> The issue comes from the way how GitHub counts requests for rate limit. For unauthorized requests, it limits by IP. All macOS VMs have the same IP address because of infrastructure.

The resolution there was to provide a GitHub token.

@yarikoptic
Member

Re the ssh failure: not sure yet; it may indeed be a race. If you have a specific log handy, please cut/paste the details (or upload the entire log to smaug to share) so we can understand what is going on. @kyleam might have ideas as well.

@jwodder
Member Author

jwodder commented Sep 18, 2020

> All macOS VMs have the same IP address because of infrastructure.

I'm skeptical whether this is still true when using GitHub Actions' built-in macOS environments instead of actions/virtual-environments. Of note, in the macOS-only workflow in #31, I don't believe docker-machine create ever failed due to rate limiting, but the curl request to get the latest datalad release did fail.

> the resolution was to provide a GITHUB token

That can easily be done for curl requests, but there doesn't seem to be a way to do that when invoking docker-machine create.

> If you have specific log handy, please cut/paste details

All I have is what's in the GitHub Action output. Basically the only thing about the failure in there is "ssh: connect to host localhost port 42241: Connection refused".

@kyleam
Contributor

kyleam commented Sep 18, 2020

> "Set up SSH target" step with ssh: connect to host localhost port 42241: Connection refused. I suspect a race condition of some sort.

Yes, there is a race:

https://github.com/datalad/datalad/blob/454baaac1392dd7b6a4b6fe2711d08d11a94ab8a/tools/ci/prep-travis-forssh.sh#L31-L38
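
One generic way to paper over a race of this kind, where a client connects before the server is ready, is to retry the connection a bounded number of times. The following is a minimal sketch of that idea and not the code from the linked script or from any fix to it; the helper name `retry` and the ssh options in the usage comment are assumptions.

```shell
#!/bin/sh
# retry MAX DELAY CMD [ARGS...]
# Run CMD until it succeeds, at most MAX times, sleeping DELAY seconds
# between attempts. Returns 0 on the first success, 1 after MAX failures.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Hypothetical usage: wait until the SSH target accepts connections.
# Port 42241 is the one from the error message quoted above; the ssh
# options are illustrative assumptions.
# retry 10 2 ssh -p 42241 -o BatchMode=yes localhost true
```

A bounded retry keeps the step from hanging forever if the server never comes up, at the cost of a fixed worst-case delay of roughly MAX times DELAY seconds.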

@yarikoptic
Member

Note: actions/virtual-environments is the repository for the underlying recipes of GitHub Actions environments. So I took it as a generic statement about how OSX VMs are set up for GitHub Actions, which would explain why we observe rate limiting only on OSX.

We could start by adding the token to the curl invocation.

@yarikoptic
Member

Note: we already have a token in the secrets

@jwodder
Member Author

jwodder commented Sep 21, 2020

> Note: we already have a token in the secrets

What token is that? I don't have permission to view the repository settings. If you're referring to secrets.GITHUB_TOKEN, that can only be used for requests to the same repository, yet the failing curl request is for datalad/datalad.

@yarikoptic
Member

Yes, that token. It is generated for my user, not for any specific repo, so it should be usable across the entirety of GitHub. It might lack some permissions, but if that turns out to be the case, I could add them.

@jwodder
Member Author

jwodder commented Sep 21, 2020

I added an Authorization header to the one curl request to api.github.com, but now both the "master" and "maint" versions of test-datalad on macos-latest failed at the docker-machine create step due to rate limiting.
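
For illustration, an authenticated request of the kind described might look roughly like the sketch below. It is not the exact change made in this PR: the function name `github_api_get` is hypothetical, and `GITHUB_TOKEN` is assumed to be an environment variable that the workflow exports from a secret.

```shell
#!/bin/sh
# github_api_get URL
# Fetch URL with curl, sending an Authorization header only when a
# token is available. GITHUB_TOKEN is an assumed environment variable.
github_api_get() {
    url=$1
    if [ -n "${GITHUB_TOKEN:-}" ]; then
        # Authenticated requests are counted against the token's quota
        # instead of the per-IP quota shared by unauthenticated clients.
        curl -fsSL -H "Authorization: token ${GITHUB_TOKEN}" "$url"
    else
        curl -fsSL "$url"
    fi
}

# Hypothetical usage, with the URL taken from the discussion above:
# github_api_get https://api.github.com/repos/datalad/datalad/releases/latest
```

Falling back to an unauthenticated request when no token is set keeps the same step usable in forks, where secrets may not be available.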

@jwodder
Member Author

jwodder commented Sep 21, 2020

I tried passing the GitHub token to docker-machine as described in the README, but one of the tests is still failing with the same error.

@jwodder
Member Author

jwodder commented Sep 21, 2020

I've submitted a PR to the datalad repo that should eliminate the race condition in prep-travis-forssh.sh: datalad/datalad#4940

@yarikoptic
Member

woohoo !

Re "docker machine": it might be due to the still-open docker/machine#2296, and maybe the workaround from docker/machine#2765 (comment) could help?

But overall, since we know that datalad still needs fixups, while the build on OSX and the testing of annex succeed, let's just disable testing of datalad on OSX altogether for now and merge this, then initiate a subsequent PR on top to poke at periodically to make sure that DataLad is getting green.

(Note that the badges in README.md might need to be adjusted since the workflow got renamed; see https://github.com/datalad/datalad-extensions/blob/master/CONTRIBUTING.md on how to regenerate them after the tune-up.)

@jwodder
Member Author

jwodder commented Sep 21, 2020

I've disabled running test-datalad on macOS. Updating the badges doesn't seem to be necessary, as they refer to the workflow by its display name (which didn't change) rather than filename.

@jwodder jwodder marked this pull request as ready for review September 23, 2020 14:13
@yarikoptic yarikoptic merged commit d413c8c into master Sep 25, 2020
@jwodder jwodder deleted the gh-30b branch October 30, 2020 18:20
Successfully merging this pull request may close these issues.

Establish github workflow to build git-annex on OSX
3 participants