[BUG] - Config generated by guided init should contain default settings #2122

Open
rsignell opened this issue Nov 18, 2023 · 8 comments · May be fixed by #2294

Comments

rsignell commented Nov 18, 2023

Describe the bug

We generated a config using guided init and deployed Nebari on AWS with no apparent issues.

We then tried to use the deployment for training, and students were not able to get the 30 workers they requested. We double-checked our service quota in the region where we were running the training and it was set to 500, so we were mystified (and frustrated) about why things weren't working.

After the training we realized that the generated config didn't contain the node_groups section with the max limits for the different worker pools. In other words, the amazon_web_services section just looked like:

amazon_web_services:
  kubernetes_version: '1.26'
  region: us-west-2

So I imagine we were just getting the default max, which must have been something small.

When I expanded that section to:

amazon_web_services:
  kubernetes_version: '1.26'
  region: us-west-2
  node_groups:
    general:
      instance: m5.2xlarge
      min_nodes: 1
      max_nodes: 1
    user:
      instance: m5.xlarge
      min_nodes: 1
      max_nodes: 100
    worker:
      instance: m5.xlarge
      min_nodes: 1
      max_nodes: 450

and re-rendered and redeployed, multiple students could spin up clusters of 30 workers.

Expected behavior

The expected behavior would be for guided init to create a config that contains the default settings, so that it (1) shows the user what the defaults are, and (2) provides a template that the user can easily modify.

OS and architecture in which you are running Nebari

aws

How to Reproduce the problem?

Use the guided init on AWS.

Command output

No response

Versions and dependencies used.

nebari 2023.10.1

Compute environment

AWS

Integrations

No response

Anything else?

No response

@rsignell rsignell added needs: triage 🚦 Someone needs to have a look at this issue and triage type: bug 🐛 Something isn't working labels Nov 18, 2023
aktech (Member) commented Feb 23, 2024

Yes, indeed. This would be super helpful.

Adam-D-Lewis (Member) commented Feb 27, 2024

We discussed this in the community meeting and agreed that displaying the entire config in nebari-config.yaml is better; we would accept PRs doing that (reverting to prior Nebari behavior).

Adam-D-Lewis (Member) commented Feb 28, 2024

It looks like we're manually putting each thing we want in the config into src/_nebari/initialize.py. I think we should do something similar to config = read_configuration(config_filename, config_schema=config_schema) in src/_nebari/subcommands/deploy.py when we write the config file.
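
For illustration, here's a rough sketch of what writing out the full config could look like (the write_full_configuration helper name and the dump flags are assumptions for this sketch, not Nebari's actual API):

import yaml
from pydantic import BaseModel

def write_full_configuration(config_model: BaseModel, config_filename: str) -> None:
    # Hypothetical helper: dump the entire validated model, defaults included,
    # rather than only the keys that initialize.py sets explicitly.
    data = config_model.dict(exclude_unset=False, exclude_defaults=False)
    with open(config_filename, "w") as f:
        yaml.safe_dump(data, f, default_flow_style=False, sort_keys=False)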

If we want to prevent some attributes from showing up in the config, I think we could add a private _hide_from_config set on each model whose fields we don't want to show up in the config by default (e.g. terraform overrides, maybe). I put together a small demo showing how this might work; I assume we'd modify the Base class in src/nebari/schema.py. There's still more work to do, but here's the example so it's not lost.

from typing import Optional

from pydantic import BaseModel  # uses the Pydantic v1 API (__fields__, field.type_, .json)


class Base(BaseModel):
    # field names listed here are omitted when the model is written to the config
    _hide_from_config: set = set()

    @classmethod
    def _get_config_exclude_attrs(cls):
        # build the nested `exclude` mapping expected by .dict()/.json(),
        # recursing into any field whose type is itself a Base subclass
        exclude_dict = {}
        for attr_name, field in cls.__fields__.items():
            if attr_name in cls._hide_from_config:
                exclude_dict[attr_name] = True
            elif hasattr(field.type_, '_get_config_exclude_attrs'):
                exclude_dict[attr_name] = field.type_._get_config_exclude_attrs()
        return exclude_dict

    def write_config(self):
        return self.json(exclude=self._get_config_exclude_attrs())


class SubSubModel(Base):
    _hide_from_config = {'hidden_sub_sub_model_value'}
    sub_sub_model_value: int = 2
    hidden_sub_sub_model_value: int = -9999


class SubModel(Base):
    _hide_from_config = {'hidden_sub_model_value'}
    sub_model_value: SubSubModel = SubSubModel()
    hidden_sub_model_value: int = -9999


class MyModel(Base):
    _hide_from_config: set = {'should_not_be_written'}
    should_be_written: SubModel
    should_not_be_written: Optional[str] = None
    other_should_be_written: str = '1'


mm = MyModel(should_be_written=SubModel())
config = mm.write_config()
print('=' * 80)
# manually built exclude mapping, for comparison with write_config()
print(mm.dict(exclude={
    'should_not_be_written': True,
    'should_be_written': {
        'hidden_sub_model_value': True,
        'sub_model_value': {
            'hidden_sub_sub_model_value': True
        },
    },
}))
print('=' * 80)
print(mm.write_config())
print('=' * 80)

# Output looks like
# ================================================================================
# {'should_be_written': {'sub_model_value': {'sub_sub_model_value': 2}}, 'other_should_be_written': '1'}
# ================================================================================
# {"should_be_written": {"sub_model_value": {"sub_sub_model_value": 2}}, "other_should_be_written": "1"}
# ================================================================================

Update: I looked into this a bit more. Potential issues are:

  • key order (provider, version at the top of the config file, etc.)
    • define an ordered dict with the desired order for deployment (see the sketch after this list)
  • complex logic about when fields should be displayed or not (e.g. only write the other attributes of ci_cd if ci_cd.type is not None; only show the azure section if provider == 'azure')
  • Currently, when updating Nebari, if the latest Nebari changes the default value of a setting not specified in the nebari-config.yaml file, that value changes to the new default. If we put everything in the config, that will no longer be the case.
    • I doubt this was relied upon, so it's fine for this behavior to change.
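
A minimal sketch of the key-order idea, assuming the rendered config is already a plain dict (the TOP_LEVEL_ORDER list below is illustrative, not the actual ordering Nebari would use):

TOP_LEVEL_ORDER = ["provider", "nebari_version", "namespace", "project_name"]

def reorder_top_level(config: dict) -> dict:
    # put the well-known keys first, then keep everything else in its existing order
    ordered = {key: config[key] for key in TOP_LEVEL_ORDER if key in config}
    ordered.update({k: v for k, v in config.items() if k not in ordered})
    return ordered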

Adam-D-Lewis (Member) commented Mar 6, 2024

With a very small code change we could get the config file to look something like what I've pasted below.
PR here

prevent_deploy: false
nebari_version: 2024.3.1rc1
provider: local
namespace: dev
project_name: scratch-2efc
ci_cd:
  type: none
  branch: main
  commit_render: true
  before_script: []
  after_script: []
terraform_state:
  type: remote
  backend:
  config: {}
digital_ocean:
azure:
amazon_web_services:
google_cloud_platform:
existing:
local:
  kube_context:
  node_selectors:
    general:
      key: kubernetes.io/os
      value: linux
    user:
      key: kubernetes.io/os
      value: linux
    worker:
      key: kubernetes.io/os
      value: linux
external_container_reg:
  enabled: false
  access_key_id:
  secret_access_key:
  extcr_account:
  extcr_region:
dns:
  provider:
  auto_provision: false
ingress:
  terraform_overrides: {}
certificate:
  type: self-signed
  secret_name:
  acme_email:
  acme_server: https://acme-v02.api.letsencrypt.org/directory
domain:
security:
  authentication:
    type: password
  shared_users_group: true
  keycloak:
    initial_root_password: sltj9yhbg2uatvknvasd6kb8bnzp49q4
    overrides: {}
    realm_display_name: Nebari
jhub_apps:
  enabled: false
jupyterlab:
  default_settings: {}
  idle_culler:
    terminal_cull_inactive_timeout: 15
    terminal_cull_interval: 5
    kernel_cull_idle_timeout: 15
    kernel_cull_interval: 5
    kernel_cull_connected: true
    kernel_cull_busy: false
    server_shutdown_no_activity_timeout: 15
  initial_repositories: []
  preferred_dir:
jupyterhub:
  overrides: {}
telemetry:
  jupyterlab_pioneer:
    enabled: false
    log_format:
monitoring:
  enabled: true
argo_workflows:
  enabled: true
  overrides: {}
  nebari_workflow_controller:
    enabled: true
    image_tag: 2024.2.1rc2
conda_store:
  extra_settings: {}
  extra_config: ''
  image: quansight/conda-store-server
  image_tag: 2024.1.1
  default_namespace: nebari-git
  object_storage: 200Gi
environments:
  environment-dask.yaml:
    name: dask
    channels:
    - conda-forge
    dependencies:
    - python==3.11.6
    - ipykernel==6.26.0
    - ipywidgets==8.1.1
    - nebari-dask==2024.1.1
    - python-graphviz==0.20.1
    - pyarrow==14.0.1
    - s3fs==2023.10.0
    - gcsfs==2023.10.0
    - numpy=1.26.0
    - numba=0.58.1
    - pandas=2.1.3
    - xarray==2023.10.1
  environment-dashboard.yaml:
    name: dashboard
    channels:
    - conda-forge
    dependencies:
    - python==3.11.6
    - cufflinks-py==0.17.3
    - dash==2.14.1
    - geopandas==0.14.1
    - geopy==2.4.0
    - geoviews==1.11.0
    - gunicorn==21.2.0
    - holoviews==1.18.1
    - ipykernel==6.26.0
    - ipywidgets==8.1.1
    - jupyter==1.0.0
    - jupyter_bokeh==3.0.7
    - matplotlib==3.8.1
    - nebari-dask==2024.1.1
    - nodejs=20.8.1
    - numpy==1.26.0
    - openpyxl==3.1.2
    - pandas==2.1.3
    - panel==1.3.1
    - param==2.0.1
    - plotly==5.18.0
    - python-graphviz==0.20.1
    - rich==13.6.0
    - streamlit==1.28.1
    - sympy==1.12
    - voila==0.5.5
    - xarray==2023.10.1
    - pip==23.3.1
    - pip:
      - streamlit-image-comparison==0.0.4
      - noaa-coops==0.1.9
      - dash_core_components==2.0.0
      - dash_html_components==2.0.0
profiles:
  jupyterlab:
  - access: all
    display_name: Small Instance
    description: Stable environment with 2 cpu / 8 GB ram
    default: true
    users:
    groups:
    kubespawner_override:
      cpu_limit: 2
      cpu_guarantee: 1
      mem_limit: 8G
      mem_guarantee: 5G
  - access: all
    display_name: Medium Instance
    description: Stable environment with 4 cpu / 16 GB ram
    default: false
    users:
    groups:
    kubespawner_override:
      cpu_limit: 4
      cpu_guarantee: 3
      mem_limit: 16G
      mem_guarantee: 10G
  dask_worker:
    Small Worker:
      worker_cores_limit: 2
      worker_cores: 1
      worker_memory_limit: 8G
      worker_memory: 5G
      worker_threads: 2
    Medium Worker:
      worker_cores_limit: 4
      worker_cores: 3
      worker_memory_limit: 16G
      worker_memory: 10G
      worker_threads: 4
theme:
  jupyterhub:
    hub_title: Nebari - scratch-2efc
    hub_subtitle: Your open source data science platform, hosted
    welcome: Welcome! Learn about Nebari's features and configurations in <a href="https://www.nebari.dev/docs/welcome">the
      documentation</a>. If you have any questions or feedback, reach the team on
      <a href="https://www.nebari.dev/docs/community#getting-support">Nebari's support
      forums</a>.
    logo: 
      https://raw.githubusercontent.com/nebari-dev/nebari-design/main/logo-mark/horizontal/Nebari-Logo-Horizontal-Lockup-White-text.svg
    favicon: 
      https://raw.githubusercontent.com/nebari-dev/nebari-design/main/symbol/favicon.ico
    primary_color: '#4f4173'
    primary_color_dark: '#4f4173'
    secondary_color: '#957da6'
    secondary_color_dark: '#957da6'
    accent_color: '#32C574'
    accent_color_dark: '#32C574'
    text_color: '#111111'
    h1_color: '#652e8e'
    h2_color: '#652e8e'
    version: v2024.3.1rc1
    navbar_color: '#1c1d26'
    navbar_text_color: '#f1f1f6'
    navbar_hover_color: '#db96f3'
    display_version: 'True'
storage:
  conda_store: 200Gi
  shared_filesystem: 200Gi
default_images:
  jupyterhub: quay.io/nebari/nebari-jupyterhub:2024.2.1rc2
  jupyterlab: quay.io/nebari/nebari-jupyterlab:2024.2.1rc2
  dask_worker: quay.io/nebari/nebari-dask-worker:2024.2.1rc2
tf_extensions: []
helm_extensions: []

Issues:

  • having all cloud providers show up (most of them blank) is too confusing
  • certificate section is confusing with both secret_name and acme_* keys since you'll only need one or the other depending on the type.

I think we can solve this by using more specific Pydantic models, e.g. SelfSignedCertificate and SecretCertificate, instead of capturing both cases in one Pydantic model.
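
For example, a rough sketch of that split (class and field names here are illustrative and may not match the final schema):

from typing import Literal, Union

from pydantic import BaseModel

class SelfSignedCertificate(BaseModel):
    type: Literal["self-signed"] = "self-signed"

class SecretCertificate(BaseModel):
    type: Literal["existing"] = "existing"
    secret_name: str

class LetsEncryptCertificate(BaseModel):
    type: Literal["lets-encrypt"] = "lets-encrypt"
    acme_email: str
    acme_server: str = "https://acme-v02.api.letsencrypt.org/directory"

# only the fields belonging to the selected certificate type would be written out
Certificate = Union[SelfSignedCertificate, SecretCertificate, LetsEncryptCertificate]

Dumping a config whose certificate field holds a SelfSignedCertificate would then only emit type: self-signed, with no dangling secret_name or acme_* keys.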

Update: I think I fixed the certs using more specific Pydantic models. I'd want to fix the providers also though.

@Adam-D-Lewis Adam-D-Lewis linked a pull request Mar 6, 2024 that will close this issue
rsignell (Author) commented Mar 7, 2024

Well, I think even with its issues this would be a big improvement!

Adam-D-Lewis (Member) commented Mar 9, 2024

After a few more changes, the config now looks like what's below (I got rid of the extra aws, gcp, etc. sections).

prevent_deploy: false
nebari_version: 2024.3.1rc2.dev2+g44c9f71.d20240306
provider: local
namespace: dev
project_name: scratch-2efc
ci_cd:
  type: none
  branch: main
  commit_render: true
  before_script: []
  after_script: []
terraform_state:
  type: remote
  backend:
  config: {}
local:
  kube_context:
  node_selectors:
    general:
      key: kubernetes.io/os
      value: linux
    user:
      key: kubernetes.io/os
      value: linux
    worker:
      key: kubernetes.io/os
      value: linux
external_container_reg:
  enabled: false
  access_key_id:
  secret_access_key:
  extcr_account:
  extcr_region:
dns:
  provider:
  auto_provision: false
ingress:
  terraform_overrides: {}
certificate:
  type: self-signed
domain:
security:
  authentication:
    type: password
  shared_users_group: true
  keycloak:
    initial_root_password: syrl4br916u50lhxi7ifv3m921qkmckh
    overrides: {}
    realm_display_name: Nebari
jhub_apps:
  enabled: false
jupyterlab:
  default_settings: {}
  idle_culler:
    terminal_cull_inactive_timeout: 15
    terminal_cull_interval: 5
    kernel_cull_idle_timeout: 15
    kernel_cull_interval: 5
    kernel_cull_connected: true
    kernel_cull_busy: false
    server_shutdown_no_activity_timeout: 15
  initial_repositories: []
  preferred_dir:
jupyterhub:
  overrides: {}
telemetry:
  jupyterlab_pioneer:
    enabled: false
    log_format:
monitoring:
  enabled: true
argo_workflows:
  enabled: true
  overrides: {}
  nebari_workflow_controller:
    enabled: true
    image_tag: 2024.2.1rc2
conda_store:
  extra_settings: {}
  extra_config: ''
  image: quansight/conda-store-server
  image_tag: 2024.1.1
  default_namespace: nebari-git
  object_storage: 200Gi
environments:
  environment-dask.yaml:
    name: dask
    channels:
    - conda-forge
    dependencies:
    - python==3.11.6
    - ipykernel==6.26.0
    - ipywidgets==8.1.1
    - nebari-dask==2024.1.1
    - python-graphviz==0.20.1
    - pyarrow==14.0.1
    - s3fs==2023.10.0
    - gcsfs==2023.10.0
    - numpy=1.26.0
    - numba=0.58.1
    - pandas=2.1.3
    - xarray==2023.10.1
  environment-dashboard.yaml:
    name: dashboard
    channels:
    - conda-forge
    dependencies:
    - python==3.11.6
    - cufflinks-py==0.17.3
    - dash==2.14.1
    - geopandas==0.14.1
    - geopy==2.4.0
    - geoviews==1.11.0
    - gunicorn==21.2.0
    - holoviews==1.18.1
    - ipykernel==6.26.0
    - ipywidgets==8.1.1
    - jupyter==1.0.0
    - jupyter_bokeh==3.0.7
    - matplotlib==3.8.1
    - nebari-dask==2024.1.1
    - nodejs=20.8.1
    - numpy==1.26.0
    - openpyxl==3.1.2
    - pandas==2.1.3
    - panel==1.3.1
    - param==2.0.1
    - plotly==5.18.0
    - python-graphviz==0.20.1
    - rich==13.6.0
    - streamlit==1.28.1
    - sympy==1.12
    - voila==0.5.5
    - xarray==2023.10.1
    - pip==23.3.1
    - pip:
      - streamlit-image-comparison==0.0.4
      - noaa-coops==0.1.9
      - dash_core_components==2.0.0
      - dash_html_components==2.0.0
profiles:
  jupyterlab:
  - access: all
    display_name: Small Instance
    description: Stable environment with 2 cpu / 8 GB ram
    default: true
    users:
    groups:
    kubespawner_override:
      cpu_limit: 2
      cpu_guarantee: 1
      mem_limit: 8G
      mem_guarantee: 5G
  - access: all
    display_name: Medium Instance
    description: Stable environment with 4 cpu / 16 GB ram
    default: false
    users:
    groups:
    kubespawner_override:
      cpu_limit: 4
      cpu_guarantee: 3
      mem_limit: 16G
      mem_guarantee: 10G
  dask_worker:
    Small Worker:
      worker_cores_limit: 2
      worker_cores: 1
      worker_memory_limit: 8G
      worker_memory: 5G
      worker_threads: 2
    Medium Worker:
      worker_cores_limit: 4
      worker_cores: 3
      worker_memory_limit: 16G
      worker_memory: 10G
      worker_threads: 4
theme:
  jupyterhub:
    hub_title: Nebari - scratch-2efc
    hub_subtitle: Your open source data science platform, hosted
    welcome: Welcome! Learn about Nebari's features and configurations in <a href="https://www.nebari.dev/docs/welcome">the
      documentation</a>. If you have any questions or feedback, reach the team on
      <a href="https://www.nebari.dev/docs/community#getting-support">Nebari's support
      forums</a>.
    logo: 
      https://raw.githubusercontent.com/nebari-dev/nebari-design/main/logo-mark/horizontal/Nebari-Logo-Horizontal-Lockup-White-text.svg
    favicon: 
      https://raw.githubusercontent.com/nebari-dev/nebari-design/main/symbol/favicon.ico
    primary_color: '#4f4173'
    primary_color_dark: '#4f4173'
    secondary_color: '#957da6'
    secondary_color_dark: '#957da6'
    accent_color: '#32C574'
    accent_color_dark: '#32C574'
    text_color: '#111111'
    h1_color: '#652e8e'
    h2_color: '#652e8e'
    version: v2024.3.1rc2.dev2+g44c9f71.d20240306
    navbar_color: '#1c1d26'
    navbar_text_color: '#f1f1f6'
    navbar_hover_color: '#db96f3'
    display_version: 'True'
storage:
  conda_store: 200Gi
  shared_filesystem: 200Gi
default_images:
  jupyterhub: quay.io/nebari/nebari-jupyterhub:2024.2.1rc2
  jupyterlab: quay.io/nebari/nebari-jupyterlab:2024.2.1rc2
  dask_worker: quay.io/nebari/nebari-dask-worker:2024.2.1rc2
tf_extensions: []
helm_extensions: []

I want to take a look at the other config files (aws, gcp, etc.) and see if we should make any other changes.

Adam-D-Lewis (Member) commented Apr 11, 2024

@rsignell We want to be careful to get this fix right, so a PR is going to take more time. However, in the meantime I've created a PR to add node groups back to the nebari config, which is a fairly straightforward change.

rsignell (Author) commented

@Adam-D-Lewis , awesome, that alone will be super useful!
