Description

After applying the JupyterHub on EKS blueprint I am able to access the home screen via port forwarding. I can then select one of the provided options to set up a server:
Data Engineering (CPU)
Trainium (trn1)
Inferentia (inf2)
Data Science ...
...
All of these options immediately fail with the same or similar error messages; see the terminal output section below.
The documentation does not mention that any node labels need to be customized. The only changes made to the blueprint were replacing the VPC module with a VPC data source and updating the references to the VPC module in the other Terraform files.
✋ I have searched the open/closed issues and my issue is not listed.
Versions

Module version [Required]: v1.0.2
Terraform version: v1.8.3
Provider version(s):

Reproduction Code [Required]

Steps to reproduce the behavior:

1. Update the references to the VPC module and specify the subnet to deploy the workloads to (see the sketch below this list).
2. Run terraform apply.
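For reference, a minimal sketch of the change from step 1, assuming the blueprint consumes the stock VPC module through outputs such as module.vpc.vpc_id and module.vpc.private_subnets; the tag filters below are illustrative placeholders, not the exact values used:

```hcl
# Sketch only: the blueprint's VPC module was removed and replaced with
# data sources pointing at a pre-existing VPC. Tag values are hypothetical.
data "aws_vpc" "existing" {
  tags = {
    Name = "pre-existing-vpc" # hypothetical tag filter
  }
}

data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.existing.id]
  }
  # hypothetical filter for the subnets the workloads should be deployed to
  tags = {
    "kubernetes.io/role/internal-elb" = "1"
  }
}

# References in the other Terraform files were then updated, e.g.:
#   module.vpc.vpc_id          -> data.aws_vpc.existing.id
#   module.vpc.private_subnets -> data.aws_subnets.private.ids
```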
Expected behavior
JupyterHub resources are created
JupyterHub is reachable
Starting a server (e.g. Data Engineering (CPU)) creates a pod on a matching node or provisions a new node
Actual behavior
JupyterHub resources are created
JupyterHub is reachable
Starting a server (e.g. Data Engineering (CPU)) fails
Terminal Output Screenshot(s)
```
Server requested
2024-05-13T13:55:13.094920Z [Warning] 0/4 nodes are available: 4 node(s) didn't match Pod's node affinity/selector. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling..
2024-05-13T13:55:14Z [Warning] Failed to schedule pod, incompatible with nodepool "trainium", daemonset overhead={"cpu":"710m","memory":"740Mi","pods":"8"}, did not tolerate aws.amazon.com/neuroncore=true:NoSchedule; did not tolerate aws.amazon.com/neuron=true:NoSchedule; incompatible with nodepool "inferentia", daemonset overhead={"cpu":"710m","memory":"740Mi","pods":"8"}, did not tolerate aws.amazon.com/neuroncore=true:NoSchedule; did not tolerate aws.amazon.com/neuron=true:NoSchedule; incompatible with nodepool "gpu-ts", daemonset overhead={"cpu":"710m","memory":"740Mi","pods":"8"}, did not tolerate nvidia.com/gpu=:NoSchedule; incompatible with nodepool "gpu", daemonset overhead={"cpu":"710m","memory":"740Mi","pods":"8"}, did not tolerate nvidia.com/gpu=:NoSchedule; incompatible with nodepool "default", daemonset overhead={"cpu":"710m","memory":"740Mi","pods":"8"}, no instance type satisfied resources {"cpu":"2710m","memory":"8932Mi","pods":"9"} and requirements NodeGroupType In [default], NodePool In [default], hub.jupyter.org/node-purpose In [user], karpenter.k8s.aws/instance-family In [c5 m5 r5], karpenter.k8s.aws/instance-size In [16xlarge 24xlarge 2xlarge 4xlarge 8xlarge and 1 others], karpenter.sh/capacity-type In [on-demand spot], karpenter.sh/nodepool In [default], kubernetes.io/arch In [amd64] (no instance type met the scheduling requirements or had a required offering)
2024-05-13T13:55:23Z [Normal] pod didn't trigger scale-up: 1 node(s) didn't match Pod's node affinity/selector
2024-05-13T14:00:35.378113Z [Warning] 0/4 nodes are available: 4 node(s) didn't match Pod's node affinity/selector. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling..
Spawn failed: pod jupyterhub/jupyter-user1 did not start in 1200 seconds!
```
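The mismatch can be narrowed down by comparing the scheduling constraints on the spawned pod with the NodePool definitions and the labels on the existing nodes. A sketch, assuming the karpenter.sh NodePool API referenced in the log above (the pod name is also taken from the log):

```sh
# Node selector and tolerations that the spawner places on the user pod
kubectl get pod jupyter-user1 -n jupyterhub \
  -o jsonpath='{.spec.nodeSelector}{"\n"}{.spec.tolerations}{"\n"}'

# Labels and requirements each Karpenter NodePool stamps on its nodes
kubectl get nodepools -o yaml

# Labels on the four existing nodes that fail the pod's selector
kubectl get nodes --show-labels
```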