feat: add autoscaler local exec option #895

Open · wants to merge 9 commits into base: main
6 changes: 5 additions & 1 deletion docs/src/guide/extensions_cluster_autoscaler.md
@@ -17,6 +17,10 @@ The following parameters may be added on each pool definition to enable management

Don't set `allow_autoscaler` and `autoscale` to `true` on the same pool. This will cause the cluster autoscaler pod to be unschedulable as the `oke.oraclecloud.com/cluster_autoscaler: managed` node label will override the `oke.oraclecloud.com/cluster_autoscaler: allowed` node label specified by the cluster autoscaler `nodeSelector` pod attribute.

If you aren't using the operator, you can deploy the Helm chart from the same machine that runs Terraform:
set `var.cluster_autoscaler_remote_exec` to `false`, and make sure your kubectl configuration is reachable
via the `KUBE_CONFIG_PATH` environment variable.
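For example, a minimal local-exec setup might look like the following (the file name is illustrative):

```javascript
# cluster-autoscaler.auto.tfvars
cluster_autoscaler_install     = true
cluster_autoscaler_remote_exec = false
```

Then export `KUBE_CONFIG_PATH` (e.g. `export KUBE_CONFIG_PATH=~/.kube/config`) in your shell before running `terraform apply`.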

### Usage
```javascript
{{#include ../../../examples/extensions/vars-extensions-cluster-autoscaler.auto.tfvars:4:}}
@@ -30,4 +34,4 @@ Don't set `allow_autoscaler` and `autoscale` to `true` on the same pool. This wi
* [Cluster Autoscaler Helm chart](https://github.com/kubernetes/autoscaler/tree/master/charts/cluster-autoscaler)
* [Autoscaling Kubernetes Node Pools and Pods](https://docs.oracle.com/en-us/iaas/Content/ContEng/Tasks/contengautoscalingclusters.htm)
* [OCI Provider for Cluster Autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/oci#cluster-autoscaler-for-oracle-cloud-infrastructure-oci)
* [Cluster Autoscaler FAQ](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md)
examples/extensions/vars-extensions-cluster-autoscaler.auto.tfvars
@@ -6,3 +6,4 @@ cluster_autoscaler_namespace = "kube-system"
cluster_autoscaler_helm_version = "9.24.0"
cluster_autoscaler_helm_values = {}
cluster_autoscaler_helm_values_files = []
cluster_autoscaler_remote_exec = true
1 change: 1 addition & 0 deletions module-extensions.tf
@@ -53,6 +53,7 @@ module "extensions" {
cluster_autoscaler_helm_values = var.cluster_autoscaler_helm_values
cluster_autoscaler_helm_values_files = var.cluster_autoscaler_helm_values_files
expected_autoscale_worker_pools = coalesce(one(module.workers[*].worker_pool_autoscale_expected), 0)
cluster_autoscaler_remote_exec = var.cluster_autoscaler_remote_exec

# Gatekeeper
gatekeeper_install = var.gatekeeper_install
89 changes: 86 additions & 3 deletions modules/extensions/autoscaler.tf
@@ -9,12 +9,21 @@ locals {
worker_pools_autoscaling = { for k, v in var.worker_pools : k => v if tobool(lookup(v, "autoscale", false)) }

# Whether to enable cluster autoscaler deployment based on configuration, active nodes, and autoscaling pools
cluster_autoscaler_enabled = alltrue([
remote_cluster_autoscaler_enabled = alltrue([
var.cluster_autoscaler_install,
var.expected_node_count > 0,
var.expected_autoscale_worker_pools > 0,
var.cluster_autoscaler_remote_exec
])

local_cluster_autoscaler_enabled = alltrue([
var.cluster_autoscaler_install,
var.expected_node_count > 0,
var.expected_autoscale_worker_pools > 0,
var.cluster_autoscaler_remote_exec == false
])


# Templated Helm manifest values
cluster_autoscaler_manifest = sensitive(one(data.helm_template.cluster_autoscaler[*].manifest))
cluster_autoscaler_manifest_path = join("/", [local.yaml_manifest_path, "cluster_autoscaler.yaml"])
@@ -41,7 +50,7 @@ locals {
}

data "helm_template" "cluster_autoscaler" {
count = local.cluster_autoscaler_enabled ? 1 : 0
count = local.remote_cluster_autoscaler_enabled ? 1 : 0
chart = "cluster-autoscaler"
repository = "https://kubernetes.github.io/autoscaler"
version = var.cluster_autoscaler_helm_version
@@ -118,7 +127,7 @@ data "helm_template" "cluster_autoscaler" {
}

resource "null_resource" "cluster_autoscaler" {
Contributor Author:
I think it's cleaner to change this to remote_cluster_autoscaler so it matches the new local_cluster_autoscaler; however, I didn't want to break anything upstream. Options:

  • Change the name and add a moved {} block.
  • Leave the name as it is.

Any thoughts?
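For reference, the rename-with-`moved {}` option would be a small sketch along these lines (names follow the suggestion above, not this PR's code):

```
moved {
  from = null_resource.cluster_autoscaler
  to   = null_resource.remote_cluster_autoscaler
}
```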

Contributor:
In this case, all extensions will need this kind of feature. I like your idea of using a variable to control remote or local exec. Perhaps just use the same null resource but different conditional blocks based on the variable?

Contributor Author:

I ended up going with the helm_release resource, I think that is the better approach. I attempted to use a separate null_resource with a local-exec provisioner, but I would also have to add a local_file resource to create the manifest and then apply it with kubectl. Using kubectl makes sense when you can't use the helm_release resource remotely, but you can use it locally.

Member:
I don't see the helm provider being initialized.
I recommend using a kube-config datasource and creating a local kube-config file instead of relying on the existing one and the currently configured context.

Note that this approach assumes that OCI CLI is already installed and configured locally.
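A rough sketch of that suggestion (the data source wiring, variable name, and file path are assumptions, not part of this diff):

```
# Fetch a kubeconfig for the cluster rather than relying on the caller's
# current context.
data "oci_containerengine_cluster_kube_config" "autoscaler" {
  cluster_id = var.cluster_id
}

# Materialize it locally for the helm provider. Note the file has to exist
# before the provider is configured, so this is a sketch of the idea only.
resource "local_sensitive_file" "autoscaler_kubeconfig" {
  content         = data.oci_containerengine_cluster_kube_config.autoscaler.content
  filename        = "${path.root}/generated/kubeconfig"
  file_permission = "0600"
}

provider "helm" {
  kubernetes {
    config_path = "${path.root}/generated/kubeconfig"
  }
}
```

The generated kubeconfig authenticates via `oci ce cluster generate-token` exec auth, which is why the OCI CLI still has to be installed and configured locally, as noted above.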

count = local.cluster_autoscaler_enabled ? 1 : 0
count = local.remote_cluster_autoscaler_enabled ? 1 : 0

triggers = {
manifest_md5 = try(md5(local.cluster_autoscaler_manifest), null)
@@ -148,3 +157,77 @@ resource "null_resource" "cluster_autoscaler" {
inline = ["kubectl apply -f ${local.cluster_autoscaler_manifest_path}"]
}
}

resource "helm_release" "local_cluster_autoscaler" {
count = local.local_cluster_autoscaler_enabled ? 1 : 0
chart = "cluster-autoscaler"
repository = "https://kubernetes.github.io/autoscaler"
version = var.cluster_autoscaler_helm_version

name = "cluster-autoscaler"
namespace = var.cluster_autoscaler_namespace
create_namespace = true

values = length(var.cluster_autoscaler_helm_values_files) > 0 ? [
for path in var.cluster_autoscaler_helm_values_files : file(path)
] : null

set {
name = "nodeSelector.oke\\.oraclecloud\\.com/cluster_autoscaler"
value = "allowed"
}

dynamic "set" {
for_each = local.cluster_autoscaler_defaults
iterator = helm_value
content {
name = helm_value.key
value = helm_value.value
}
}

dynamic "set" {
for_each = var.cluster_autoscaler_helm_values
iterator = helm_value
content {
name = helm_value.key
value = helm_value.value
}
}

dynamic "set" {
for_each = local.worker_pools_autoscaling
iterator = pool
content {
name = "autoscalingGroups[${index(keys(local.worker_pools_autoscaling), pool.key)}].name"
value = lookup(pool.value, "id")
}
}

dynamic "set" {
for_each = local.worker_pools_autoscaling
iterator = pool
content {
name = "autoscalingGroups[${index(keys(local.worker_pools_autoscaling), pool.key)}].minSize"
value = lookup(pool.value, "min_size", lookup(pool.value, "size"))
}
}

dynamic "set" {
for_each = local.worker_pools_autoscaling
iterator = pool
content {
name = "autoscalingGroups[${index(keys(local.worker_pools_autoscaling), pool.key)}].maxSize"
value = lookup(pool.value, "max_size", lookup(pool.value, "size"))
}
}

lifecycle {
precondition {
condition = alltrue([for path in var.cluster_autoscaler_helm_values_files : fileexists(path)])
error_message = format("Missing Helm values files in configuration: %s",
jsonencode([for path in var.cluster_autoscaler_helm_values_files : path if !fileexists(path)])
)
}
}
}
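For reference, for a hypothetical pool keyed `np1` with `min_size = 1` and `max_size = 3`, the three dynamic `set` blocks above flatten to Helm values equivalent to (pool key and OCID are hypothetical):

```
autoscalingGroups[0].name    = "ocid1.nodepool.oc1..example"
autoscalingGroups[0].minSize = 1
autoscalingGroups[0].maxSize = 3
```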
1 change: 1 addition & 0 deletions modules/extensions/variables.tf
@@ -72,6 +72,7 @@ variable "cluster_autoscaler_helm_version" { type = string }
variable "cluster_autoscaler_helm_values" { type = map(string) }
variable "cluster_autoscaler_helm_values_files" { type = list(string) }
variable "expected_autoscale_worker_pools" { type = number }
variable "cluster_autoscaler_remote_exec" { type = bool }

# Prometheus
variable "prometheus_install" { type = bool }
6 changes: 6 additions & 0 deletions variables-extensions.tf
@@ -207,6 +207,12 @@ variable "cluster_autoscaler_helm_values_files" {
type = list(string)
}

variable "cluster_autoscaler_remote_exec" {
default = true
description = "Whether to deploy the cluster autoscaler remotely via the operator server. If false, the cluster autoscaler Helm chart will be installed from the machine running Terraform."
type = bool
}

# Prometheus

variable "prometheus_install" {