Guidance on Autoscaling with distinct_hosts Constraint #797

DTTerastar opened this issue Dec 11, 2023 · 3 comments
Labels: stage/waiting-reply, theme/policy, type/question

@DTTerastar

I am seeking advice on the best practices for using the Nomad Autoscaler in scenarios where the distinct_hosts constraint is applied in job configurations. Specifically, I'd like to understand how to effectively scale up a Nomad cluster using the autoscaler when each job instance must be placed on a separate host.

My primary concern is ensuring that the autoscaler responds appropriately to the unique requirements imposed by the distinct_hosts constraint. For instance, if a job is configured to launch instances across different hosts, how can the autoscaler be configured to ensure there are enough hosts in the cluster to accommodate scaling actions?

Any insights, recommendations, or examples of similar configurations would be greatly appreciated.

Thank you for your assistance and for the great work on Nomad Autoscaler.

lgfa29 (Contributor) commented Dec 22, 2023

Hi @DTTerastar 👋

I'm not sure if I fully understood the question, so I will try to answer as best I can, but let me know if I missed anything.

The autoscaler doesn't take any job constraints into consideration; it only affects the count value of a group (in the case of horizontal application scaling) or the number of instances in a cluster node group (like an AWS ASG, GCP MIG, Azure VMSS, etc.).

It sounds like you want to have a 1:1 match between the number of allocations of a job and the number of clients in your cluster?

If that's the case then you need two components:

  1. A query that returns one of the numbers.
  2. A policy that uses the pass-through strategy.

Let's say that you will control the number of allocations manually, and so you want to have an equal number of clients. You can accomplish this with a policy like so:

scaling "match_job" {
  enabled = true
  min     = 1
  max     = 5

  policy {
    cooldown            = "2m"
    evaluation_interval = "5m"

    check "number_of_allocs" {
      source = "prometheus"
      query  = "sum(nomad_nomad_job_summary_queued{exported_job=~\"example\"} + nomad_nomad_job_summary_running{exported_job=~\"example\"}) OR on() vector(0)"

      strategy "pass-through" {}
    }

    target "aws-asg" {
      dry-run             = "false"
      aws_asg_name        = "hashistack-nomad_client"
      node_class          = "hashistack"
      node_drain_deadline = "15m"
    }
  }
}

Notice the query value. It's using the job_summary to sum the number of allocations running and those that are queued for a job called example. The queued allocations are those that Nomad tried to place but failed due to the distinct_hosts constraint. Using the pass-through strategy, we instruct the autoscaler to always match this sum.
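
These job summary metrics have to reach Prometheus for the query to return data, which means the Nomad agents need to expose metrics in Prometheus format (and Prometheus needs to scrape /v1/metrics). A minimal telemetry sketch for the Nomad agent configuration, with illustrative values rather than anything specific to this issue:

telemetry {
  collection_interval        = "1s"
  disable_hostname           = true
  prometheus_metrics         = true
  publish_allocation_metrics = true
  publish_node_metrics       = true
}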

So if you increase the job group count to a value higher than the current number of clients in your cluster, these unplaced allocations will increase the queued counter, triggering the autoscaler to create a new node. Similarly, if you reduce count to a value below the number of clients you have, the autoscaler will remove nodes.

Here's how that would look:

[image: screenshot of the scaling behavior]
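
For completeness, a policy like the one above also assumes a Nomad Autoscaler agent configured with the Prometheus APM plugin and the aws-asg target plugin. A minimal agent configuration sketch, where the addresses, region, and policy directory are placeholders rather than values from this issue:

nomad {
  address = "http://127.0.0.1:4646"
}

apm "prometheus" {
  driver = "prometheus"
  config = {
    address = "http://127.0.0.1:9090"
  }
}

target "aws-asg" {
  driver = "aws-asg"
  config = {
    aws_region = "us-east-1"
  }
}

policy {
  dir = "/etc/nomad-autoscaler/policies"
}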

You could also invert this: control the number of clients manually and let the job's count follow it. You would then apply the policy to the job group:

job "example" {
  constraint {
    operator = "distinct_hosts"
    value    = "true"
  }

  group "cache" {
    count = 3

    scaling {
      min = 1
      max = 5


      policy {
        cooldown            = "2m"
        evaluation_interval = "10s"

        check "number_of_clients" {
          source = "prometheus"
          query  = "count(nomad_client_allocations_running)"

          strategy "pass-through" {}
        }
      }
    }

    task "redis" {
      driver = "docker"

      config {
        image = "redis:7"
        ports = ["db"]
      }
    }
  }
}

Unfortunately this doesn't work as well, because the group policy is not able to take the number of queued allocations into consideration. So you will be able to scale up the number of clients, but not down 😅

So the general idea is to isolate cause and effect. Instead of trying to figure out a way to execute multiple actions, focus on a simple "when X happens, do Y". In the example above: "when the number of running and queued allocations goes up/down, add/remove clients from the cluster".

Check out this tutorial for more details: https://developer.hashicorp.com/nomad/tutorials/autoscaler/horizontal-cluster-scaling-on-demand-batch.

I hope this helps!

Blefish commented Apr 8, 2024

Stumbled upon this myself. However, it can be partially mitigated if the cluster has other jobs and only some of them need to be on separate hosts. Nomad will then kick off some allocations that are more lenient in their constraints and make room for those that have the constraint.

Would it make sense to expose these distinct_hosts-blocked allocations via some kind of metric from Nomad? Something similar to the current metrics for CPU and memory blocked evals, such as nomad_nomad_blocked_evals_distinct_hosts?
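
If Nomad exposed such a metric, a cluster scaling check could consume it directly. A purely hypothetical sketch, using the metric name proposed above (which does not exist in Nomad today) and illustrative threshold values:

check "blocked_by_distinct_hosts" {
  source = "prometheus"
  # Hypothetical metric; Nomad does not currently emit it.
  query  = "sum(nomad_nomad_blocked_evals_distinct_hosts) OR on() vector(0)"

  strategy "threshold" {
    # Add one client whenever at least one evaluation is blocked by distinct_hosts.
    lower_bound = 1
    delta       = 1
  }
}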

@DTTerastar (Author)

I keep seeing so many metrics we need, but for some reason the architecture of Nomad makes proper metrics difficult to implement.
