feat: add autoscaler local exec option #895

Open
wants to merge 9 commits into base: main
1 change: 1 addition & 0 deletions module-extensions.tf
@@ -53,6 +53,7 @@ module "extensions" {
cluster_autoscaler_helm_values = var.cluster_autoscaler_helm_values
cluster_autoscaler_helm_values_files = var.cluster_autoscaler_helm_values_files
expected_autoscale_worker_pools = coalesce(one(module.workers[*].worker_pool_autoscale_expected), 0)
cluster_autoscaler_remote_exec = var.cluster_autoscaler_remote_exec

# Gatekeeper
gatekeeper_install = var.gatekeeper_install
27 changes: 24 additions & 3 deletions modules/extensions/autoscaler.tf
@@ -9,12 +9,21 @@ locals {
worker_pools_autoscaling = { for k, v in var.worker_pools : k => v if tobool(lookup(v, "autoscale", false)) }

# Whether to enable cluster autoscaler deployment based on configuration, active nodes, and autoscaling pools
cluster_autoscaler_enabled = alltrue([
remote_cluster_autoscaler_enabled = alltrue([
var.cluster_autoscaler_install,
var.expected_node_count > 0,
var.expected_autoscale_worker_pools > 0,
var.cluster_autoscaler_remote_exec
])

local_cluster_autoscaler_enabled = alltrue([
var.cluster_autoscaler_install,
var.expected_node_count > 0,
var.expected_autoscale_worker_pools > 0,
var.cluster_autoscaler_remote_exec == false
])


# Templated Helm manifest values
cluster_autoscaler_manifest = sensitive(one(data.helm_template.cluster_autoscaler[*].manifest))
cluster_autoscaler_manifest_path = join("/", [local.yaml_manifest_path, "cluster_autoscaler.yaml"])
@@ -41,7 +50,7 @@
}

data "helm_template" "cluster_autoscaler" {
count = local.cluster_autoscaler_enabled ? 1 : 0
count = local.remote_cluster_autoscaler_enabled || local.local_cluster_autoscaler_enabled ? 1 : 0
chart = "cluster-autoscaler"
repository = "https://kubernetes.github.io/autoscaler"
version = var.cluster_autoscaler_helm_version
@@ -118,7 +127,7 @@ data "helm_template" "cluster_autoscaler" {
}

resource "null_resource" "cluster_autoscaler" {
Contributor Author

I think it's cleaner to rename this to remote_cluster_autoscaler so it matches the new local_cluster_autoscaler, but I didn't want to break anything upstream. Options:

  • Change the name and add a moved {} block (sketch below).
  • Leave the name as it is.

Any thoughts?
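
A minimal sketch of the rename option, assuming the resource is renamed to remote_cluster_autoscaler (Terraform 1.1+ moved block, so the existing state is migrated in place):

# Assumes the null_resource is renamed; migrates existing state without recreation
moved {
  from = null_resource.cluster_autoscaler
  to   = null_resource.remote_cluster_autoscaler
}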

Contributor

In that case, all extensions would need this kind of feature. I like your idea of using a variable to control remote or local exec. Perhaps just use the same null resource, with different conditional blocks based on the variable?

Contributor Author

I ended up going with the helm_release resource, which I think is the better approach. I attempted to use a separate null_resource with a local-exec provisioner, but I would also have to add a local_file resource to create the manifest and then apply it with kubectl. Using kubectl makes sense when you can't use the helm_release resource remotely, but locally helm_release can be used directly.
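
For reference, a minimal sketch of the helm_release approach, assuming the helm provider (2.x block syntax) is configured against the cluster and reusing the chart arguments already passed to data.helm_template.cluster_autoscaler; the kube-system namespace is an assumption:

resource "helm_release" "cluster_autoscaler" {
  count      = local.local_cluster_autoscaler_enabled ? 1 : 0
  name       = "cluster-autoscaler"
  chart      = "cluster-autoscaler"
  repository = "https://kubernetes.github.io/autoscaler"
  version    = var.cluster_autoscaler_helm_version
  namespace  = "kube-system" # assumed namespace

  # Same values files already passed to the helm_template data source
  values = [for f in var.cluster_autoscaler_helm_values_files : file(f)]

  # Per-key overrides from the existing map(string) variable
  dynamic "set" {
    for_each = var.cluster_autoscaler_helm_values
    content {
      name  = set.key
      value = set.value
    }
  }
}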

Member

I don't see the helm provider being initialized.
I recommend using a kube-config datasource and creating a local kube-config file instead of relying on the existing one and the currently configured context.

Note that this approach assumes that OCI CLI is already installed and configured locally.
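
A minimal sketch of that suggestion; var.cluster_id is a hypothetical input name here, and the generated kube-config still relies on a locally installed and configured OCI CLI for its exec credential plugin:

# Fetch the cluster's kube-config and write it to a dedicated local file,
# so the helm provider does not depend on the user's existing context
data "oci_containerengine_cluster_kube_config" "autoscaler" {
  cluster_id = var.cluster_id # hypothetical input
}

resource "local_sensitive_file" "autoscaler_kube_config" {
  content         = data.oci_containerengine_cluster_kube_config.autoscaler.content
  filename        = "${path.root}/generated/kubeconfig"
  file_permission = "0600"
}

provider "helm" {
  kubernetes {
    config_path = local_sensitive_file.autoscaler_kube_config.filename
  }
}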

count = local.cluster_autoscaler_enabled ? 1 : 0
count = local.remote_cluster_autoscaler_enabled ? 1 : 0

triggers = {
manifest_md5 = try(md5(local.cluster_autoscaler_manifest), null)
@@ -148,3 +157,15 @@ resource "null_resource" "cluster_autoscaler" {
inline = ["kubectl apply -f ${local.cluster_autoscaler_manifest_path}"]
}
}

resource "null_resource" "local_cluster_autoscaler" {
count = local.local_cluster_autoscaler_enabled ? 1 : 0

triggers = {
manifest_md5 = try(md5(local.cluster_autoscaler_manifest), null)
}

provisioner "local-exec" {
command = "cat ${local.cluster_autoscaler_manifest} | kubectl apply --dry-run='client' -f -"
Contributor Author

cat-ing that local variable into kubectl apply doesn't seem to work. I get an error like the following:

│       serviceAccountName: cluster-autoscaler-oci-oke-cluster-autoscaler                                      
│       tolerations:                                                                                                           
│         []                                                                                                                   
│  | kubectl apply --dry-run='client' -f -': exit status 2. Output: : -: not found                             
│ /bin/sh: 81: -: not found                                    
│ /bin/sh: 82: -: not found                                                                                                    
│ /bin/sh: 83: -: not found                                    
│ /bin/sh: 84: -: not found 

Maybe there are newlines in the variable that are messing it up?

Options:

  • Use the helm provider that is already in versions.tf to just apply the helm chart.
  • Use the local_file resource to actually save the manifest YAML, which would have a similar effect to what the remote-exec is doing (sketch below).
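
For the second option, a minimal sketch that writes the rendered manifest to disk so the shell never has to interpolate the multi-line value (resource names and paths are illustrative):

# Persist the rendered manifest, then apply the file instead of piping the value
resource "local_sensitive_file" "cluster_autoscaler_manifest" {
  count    = local.local_cluster_autoscaler_enabled ? 1 : 0
  content  = local.cluster_autoscaler_manifest
  filename = "${path.root}/generated/cluster_autoscaler.yaml"
}

resource "null_resource" "local_cluster_autoscaler" {
  count = local.local_cluster_autoscaler_enabled ? 1 : 0

  triggers = {
    manifest_md5 = try(md5(local.cluster_autoscaler_manifest), null)
  }

  provisioner "local-exec" {
    command = "kubectl apply --dry-run=client -f ${local_sensitive_file.cluster_autoscaler_manifest[0].filename}"
  }
}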

}
}
1 change: 1 addition & 0 deletions modules/extensions/variables.tf
@@ -72,6 +72,7 @@ variable "cluster_autoscaler_helm_version" { type = string }
variable "cluster_autoscaler_helm_values" { type = map(string) }
variable "cluster_autoscaler_helm_values_files" { type = list(string) }
variable "expected_autoscale_worker_pools" { type = number }
variable "cluster_autoscaler_remote_exec" { type = bool }

# Prometheus
variable "prometheus_install" { type = bool }
6 changes: 6 additions & 0 deletions variables-extensions.tf
@@ -207,6 +207,12 @@ variable "cluster_autoscaler_helm_values_files" {
type = list(string)
}

variable "cluster_autoscaler_remote_exec" {
default = true
description = "Whether to execute deploy the cluster autoscaler remotely via the operator server. If false, the cluster autoscaler helm chart will be installed on the same machine you are running Terraform from."
type = bool
}

# Prometheus

variable "prometheus_install" {