bug: conda cache does not work #1527

Open
1 of 2 tasks
gaocegege opened this issue Mar 14, 2023 · 5 comments

@gaocegege
Member

Are you using the envd server?

  • Yes, I am using the envd server.
  • No, I am not using the envd server.

Describe the bug

def build():
    config.repo(url="https://github.com/tensorchord/envd", description="gnn")
    base(os="ubuntu20.04", language="python3.7")

    install.cuda(version="11.3.1")

    install.python_packages(name = [
        "dgllife",
    ])

    install.conda_packages(
        name=[
            "pytorch",
            "cudatoolkit=11.3",
            "rdkit",
            "dgl-cuda11.3",
        ],
        channel=[
            "pytorch",
            "conda-forge",
            "dglteam",
        ],
    )
    shell("bash")

The conda package installation step cannot be cached.

To Reproduce

  • Run a build with the build.envd file above.

Expected behavior

No response

The docker info output

None

The envd version output

v0.3.11

Additional context

No response

@gaocegege
Member Author

@cutecutecat Could you please have a look?

@Electronic-Waste
Contributor

Electronic-Waste commented Jan 3, 2024

Maybe I can give it a try.

@Electronic-Waste
Contributor

I can't download the dependencies. I wonder if it's due to my OS (macOS).

#32 [internal] /opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3 pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3
#32 73.23 done
#32 73.23 Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
#32 410.1 Solving environment: ...working... failed with repodata from current_repodata.json, will retry with next repodata source.
#32 1866.9 Collecting package metadata (repodata.json): ...working... WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.8.0.*, but conda is ignoring the .* and treating it as 1.8.0
#32 2078.4 WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.9.0.*, but conda is ignoring the .* and treating it as 1.9.0
#32 2078.4 WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.6.0.*, but conda is ignoring the .* and treating it as 1.6.0
#32 2157.3 done
#32 2157.3 Solving environment: ...working... DEBU[2024-01-09T13:15:48+08:00] stopping session                             

#32 ERROR: process "/opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3" did not complete successfully: exit code: 137
------
 > importing cache manifest from docker.io/tensorchord/python-cache:envd-v0.3.43-cuda-11.3.1-cudnn-8:
------
------
 > [internal] /opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3 pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3:
#32 2157.3 Solving environment: ...working... 

#0 1.682 Collecting package metadata (current_repodata.json): ...working... WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.7.1.*, but conda is ignoring the .* and treating it as 1.7.1
#32 73.23 done
failed with initial frozen solve. Retrying with flexible solve.
WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.8.0.*, but conda is ignoring the .* and treating it as 1.8.0
#32 2078.4 WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.9.0.*, but conda is ignoring the .* and treating it as 1.9.0
#32 2078.4 WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.6.0.*, but conda is ignoring the .* and treating it as 1.6.0
#32 2157.3 done
------
ERRO[2024-01-09T13:15:48+08:00] Buildkit error: failed to solve: process "/opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3" did not complete successfully: exit code: 137
(1) attached stack trace
  -- stack trace:
  | github.com/tensorchord/envd/pkg/builder.generalBuilder.build.func1
  |     /home/runner/work/envd/envd/pkg/builder/build.go:265
  | golang.org/x/sync/errgroup.(*Group).Go.func1
  |     /home/runner/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75
  | runtime.goexit
  |     /opt/hostedtoolcache/go/1.19.10/x64/src/runtime/asm_arm64.s:1172
Wraps: (2) Buildkit error
Wraps: (3) failed to solve: process "/opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3" did not complete successfully: exit code: 137
  | (1) failed to solve: process "/opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3" did not complete successfully: exit code: 137
  | Error types: (1) *builder.BuildkitdErr
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *builder.BuildkitdErr 
ERRO[2024-01-09T13:15:48+08:00]                                               error="failed to load docker image: Post \"http://%2Fvar%2Frun%2Fdocker.sock/v1.43/images/load?quiet=1\": context canceled" language-version=v0 tag="envd-quick-start:dev"
FATA[2024-01-09T13:15:48+08:00] exit                                          app=envd error="failed to build the image: failed to build: failed to wait error group: Buildkit error: failed to solve: process \"/opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3\" did not complete successfully: exit code: 137" version=v0.3.43

My docker info:

Client:
 Version:    24.0.2
 Context:    desktop-linux
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.0
    Path:     /Users/x/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.19.1
    Path:     /Users/x/.docker/cli-plugins/docker-compose
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.0
    Path:     /Users/x/.docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.20
    Path:     /Users/x/.docker/cli-plugins/docker-extension
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v0.1.0-beta.6
    Path:     /Users/x/.docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /Users/x/.docker/cli-plugins/docker-sbom
  scan: Docker Scan (Docker Inc.)
    Version:  v0.26.0
    Path:     /Users/x/.docker/cli-plugins/docker-scan
  scout: Command line tool for Docker Scout (Docker Inc.)
    Version:  0.16.1
    Path:     /Users/x/.docker/cli-plugins/docker-scout

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 3
 Server Version: 24.0.2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
 runc version: v1.1.7-0-g860f061
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.49-linuxkit-pr
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 5
 Total Memory: 7.667GiB
 Name: docker-desktop
 ID: 0a1c4432-d01a-4090-b9da-8cf7b4464c9d
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false
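
A note on the exit code in the log above: by shell convention, an exit status above 128 means the process was terminated by a signal, with the signal number equal to the status minus 128. Exit code 137 therefore corresponds to signal 9 (SIGKILL). One common cause of SIGKILL inside a container is the kernel out-of-memory killer, which may be relevant given the 7.667GiB total memory reported by docker info, but that is an assumption and nothing in this thread confirms it. The helper below is a hypothetical sketch for decoding such codes and is not part of envd.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a shell-style exit code.

    Codes above 128 conventionally mean "terminated by signal (code - 128)".
    The function name is hypothetical and used only for illustration.
    """
    if code > 128:
        signum = code - 128
        try:
            # Map the signal number to its symbolic name on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # The number has no named signal on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


# Exit code taken from the Buildkit error in the log.
print(describe_exit_code(137))
```

On Linux and macOS this reports SIGKILL for code 137. Confirming an out-of-memory kill would need separate evidence, such as kernel logs from the Docker Desktop VM.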

@kemingy
Member

kemingy commented Jan 9, 2024

@Electronic-Waste can you provide your envd build file?

@Electronic-Waste
Contributor

Electronic-Waste commented Jan 9, 2024

My envd build file is the one provided by @gaocegege at the beginning of this issue.
