Commit

fix: update docs with suggestions
Rafael Oliveira committed Feb 20, 2024
1 parent aa5d7c3 commit a202969
Showing 7 changed files with 51 additions and 29 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -229,4 +229,4 @@ apply-crds: manifests kustomize ## Install CRDs into the K8s cluster specified i
$(KUSTOMIZE) build config/crd | $(KUBECTL) apply -f - --context kind-$(KIND_CLUSTER_NAME)

gen-docs-images:
$(CONTAINER_TOOL) run --rm -v $(pwd):/data dstockhammer/plantuml:1.2024.2 docs/**/*.puml
$(CONTAINER_TOOL) run --rm -v $(shell pwd):/data dstockhammer/plantuml:1.2024.2 docs/**/*.puml
5 changes: 2 additions & 3 deletions README.md
@@ -1,12 +1,11 @@
# kubernetes-crossplane-infrastructure-operator

## What is Kubernetes Crossplane Infrastructure Operator?
Kubernetes Crossplane Infrastructure Operator is a operator designed to complement the [Kops Operator](https://github.com/topfreegames/kubernetes-kops-operator) by providing a way to manage infrastructure resources in AWS leveraging on [Crossplane](https://github.com/crossplane/crossplane).

Kubernetes Crossplane Infrastructure Operator is an operator initially designed to complement [Cluster API](https://github.com/kubernetes-sigs/cluster-api) providers by managing infrastructure resources in AWS through Crossplane. Today it supports only the [Kops Operator](https://github.com/topfreegames/kubernetes-kops-operator) provider; support for other Cluster API providers is planned.
See this [document](docs/README.md) for more details.

## Features
- Manages Security Groups using as reference [Kops Operator](https://github.com/topfreegames/kubernetes-kops-operator) custom resources.
- Manages mesh between Kubernetes clusters being managed by [Kops Operator](https://github.com/topfreegames/kubernetes-kops-operator)
- Manages Clustermesh between Kubernetes clusters being managed by [Kops Operator](https://github.com/topfreegames/kubernetes-kops-operator)


52 changes: 31 additions & 21 deletions docs/README.md
@@ -1,16 +1,30 @@
## Glossary
### Clustermesh
Is a network abstraction that facilitates communication among a group of Kubernetes clusters. It offers the essential infrastructure to ensure traffic flow between these clusters. Currently, it exclusively supports clusters running on AWS and comprises AWS VPC Peerings, AWS Security Groups, and AWS Routes.

## Custom Resources
- [Clustermesh](#clustermesh)
This resource lists the clusters that are part of the Clustermesh along with their information. The Clustermesh Controller generates it automatically when the required label and annotation are applied to the Cluster CRs. It also gives visibility into the cloud resources created for the Clustermesh.

- Wildlife Security Group
This resource is used as an abstraction of the Crossplane Security Group CR so we don't need to specify information like region and VPC ID. It has a list of ingress rules and a list of infrastructure references that are going to use the Security Group.

## Controllers
### Clustermesh Controller
#### Description
This is the controller responsible for managing the mesh between the Clusters. It watches `clusters.cluster.x-k8s.io` and create the necessary cloud resources to establish the mesh between them leveraging on Crossplane CRs.
- Clustermesh Controller
This is the controller responsible for managing the Clustermesh between the Clusters. It watches `clusters.cluster.x-k8s.io` and creates the necessary cloud resources to establish the Clustermesh between them using Crossplane CRs.

- Wildlife Security Group Controller
This is the controller responsible for managing AWS Security Groups. It watches `securitygroups.ec2.aws.wildlife.io`, which references Kops Operator custom resources, and creates `securitygroups.ec2.aws.crossplane.io` CRs with that information. It is also responsible for attaching the Security Groups to the referenced resources.

### Clustermesh Controller
#### How to use
To add a Cluster to a mesh it needs to have a label `clusterGroup` which is the name of the mesh it will be part of and an annotation `clustermesh.infrastructure.wildlife.io: true` to enable it.
To add a Cluster to a Clustermesh it needs to have a label `clusterGroup` which is the name of the Clustermesh it will be part of and an annotation `clustermesh.infrastructure.wildlife.io: true` to enable it.
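
The opt-in rule can be illustrated with a minimal sketch, written in Python for brevity rather than taken from the operator's code. The function name `clustermesh_group` and the dict-shaped metadata are assumptions for illustration; only the `clusterGroup` label and the `clustermesh.infrastructure.wildlife.io` annotation come from the text above. Kubernetes annotation values are strings, so the sketch compares against `"true"`.

```python
from typing import Optional

GROUP_LABEL = "clusterGroup"
MESH_ANNOTATION = "clustermesh.infrastructure.wildlife.io"


def clustermesh_group(metadata: dict) -> Optional[str]:
    """Return the Clustermesh name a Cluster opts into, or None.

    Hypothetical helper: a Cluster counts as opted in only when it carries
    both the clusterGroup label and the annotation set to "true".
    """
    labels = metadata.get("labels") or {}
    annotations = metadata.get("annotations") or {}
    group = labels.get(GROUP_LABEL)
    enabled = annotations.get(MESH_ANNOTATION) == "true"
    return group if group and enabled else None
```

A Cluster that has the label but not the annotation (or the reverse) yields `None` here, which matches the requirement that both must be present.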

#### Diagram
![Diagram](./clustermesh/clustermesh-controller.png)

#### Deep Dive
The controller watches for changes in the Cluster CRs and in order to avoid the complexity of having a broad view of everything in the mesh it only worries about the specific cluster that is being reconcilied at the moment, that means that will only create/update the cloud resources for the current cluster. The trick to make this work without generating drifts or missing resources is that we enqueue every other cluster that belongs to the mesh when a cluster is reconciled. This way we can focus only in the current cluster during the reconciliation.
The controller watches for changes in the Cluster CRs. To avoid the complexity of keeping a broad view of everything in the Clustermesh, it only deals with the specific cluster that is being reconciled at the moment, which means it only creates or updates the cloud resources for the current cluster. The trick to make this work without generating drift or missing resources is that we enqueue every other cluster that belongs to the Clustermesh when a cluster is reconciled. This way we can focus only on the current cluster during the reconciliation.
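
The enqueue step can be sketched as follows. This is an illustrative Python sketch, not the controller's implementation: `clusters_to_enqueue` is a hypothetical name, and clusters are represented only by their names.

```python
from typing import Iterable, List


def clusters_to_enqueue(current: str, members: Iterable[str]) -> List[str]:
    """Return every other member of the Clustermesh, in a stable order.

    The cluster being reconciled is excluded because it is already
    being handled by the current reconciliation.
    """
    return sorted({name for name in members if name != current})
```

Each returned cluster would then go through its own reconciliation, where it only handles its own side of the Clustermesh.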

The cloud resources are created using Crossplane CRs and mapped using Kubernetes ownership relationships, which makes them easier to retrieve and clean up. The resources are also mapped in the status of the Clustermesh CR for visibility.

@@ -21,30 +35,27 @@ The normal reconciliation have four main steps:
- Reconcile the Routes

##### Retrieve the cluster information
When a reconciliation is triggered the controller retrieves the necessary information of the current cluster. If the cluster already belongs to the mesh it will update that information, if not it'll only update it.
When a reconciliation is triggered, the controller retrieves the necessary information of the current Cluster. If the Cluster already belongs to the Clustermesh, the controller updates that information; if not, the Cluster is added to the Clustermesh CR along with the needed information.

##### Reconcile the VPC peerings
A VPC peering connection is a one to one relationship between two VPCs, because of that we create the peering using <origin-cluster>-<destination-cluster> as the name of the peering so we can easily identify if the peering between two VPCs are already created.
A VPC peering connection is a one-to-one relationship between two VPCs. Because of that, we name the peering `<origin-cluster>-<destination-cluster>` so we can easily identify whether the peering between two VPCs has already been created.
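
A small Python sketch of this naming convention follows. The function names are hypothetical, and the lookup treats either name order as a match, which is an assumption based on a peering being a single connection between two VPCs; the controller may check this differently.

```python
from typing import Iterable


def peering_name(origin: str, destination: str) -> str:
    """Build the peering name from the origin and destination cluster names."""
    return f"{origin}-{destination}"


def peering_exists(existing: Iterable[str], a: str, b: str) -> bool:
    """Report whether a peering between clusters a and b is already listed.

    Assumption: one peering covers both directions, so either name order counts.
    """
    names = set(existing)
    return peering_name(a, b) in names or peering_name(b, a) in names
```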

##### Reconcile the Security Groups
Security Groups are created leveraging in the Security Group Controller, so the Clustermesh Controller only creates a Security Group CR for the cluster being reconciled allowing the vpc of the other clusters that belongs to the mesh and the Security Group Controller will create the Security Group in AWS and attach it to the necessary resources.
Security Groups are created leveraging the Security Group Controller. When the Clustermesh Controller reconciles a Cluster, it creates a Security Group CR for that Cluster. This CR allows the VPC of other Clusters within the Clustermesh to communicate. The Security Group Controller then creates the Security Group in AWS and attaches it to the necessary resources.

##### Reconcile the Routes
Routes are created using the VPC peering as the target, so after verifying that the VPC peering is already in a ready state, the controller creates a route in every route table of the cluster being reconciled to every VPC peering that is part of the mesh.
Routes are created using the VPC peering as the target, so after verifying that the VPC peering is already in a ready state, the controller creates a route in every route table of the cluster being reconciled to every VPC peering that is part of the Clustermesh.
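
The route step amounts to pairing every route table with every ready peering. The sketch below shows that in Python under stated assumptions: the `Peering` type, its `destination_cidr` field (the CIDR of the peer VPC), and `plan_routes` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Peering:
    name: str
    destination_cidr: str  # assumed: CIDR block of the peer VPC
    ready: bool


def plan_routes(route_tables: List[str], peerings: List[Peering]) -> List[Tuple[str, str, str]]:
    """Return (route_table, destination_cidr, peering_name) for every ready peering.

    Peerings that are not ready yet are skipped; they would be picked up
    by a later reconciliation.
    """
    return [
        (table, p.destination_cidr, p.name)
        for table in route_tables
        for p in peerings
        if p.ready
    ]
```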

The deletion happens when a cluster don't have both the label and annotation and still is part of a mesh, this is checked by going through every clustermesh spec and verifying its clusters. When that happens it first delete the security group associated with the cluster. After that it proceeds to remove the vpc peerings associated with the cluster, this will indirectly also remove the routes because of their ownership relationship. At the end if the cluster is the last one in the clustermesh CR it will delete the CR as well. The deletion process relies uses the status of the clustermesh resource as the source of truth.
Deletion happens when a cluster no longer has both the label and the annotation but is still part of a Clustermesh. The controller detects this by going through every Clustermesh spec and checking its clusters. It first deletes the security group associated with the cluster, then removes the VPC peerings associated with the cluster, which indirectly removes the routes because of their ownership relationship. Finally, if the cluster is the last one in the Clustermesh CR, the controller deletes the CR as well. The deletion process uses the status of the Clustermesh resource as the source of truth.
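
The deletion trigger and ordering can be sketched in Python as below. Both functions and their argument shapes are hypothetical; the step labels are plain strings standing in for the real Crossplane operations.

```python
from typing import Dict, List


def meshes_to_leave(cluster: str, opted_in: bool, mesh_members: Dict[str, List[str]]) -> List[str]:
    """Return the Clustermeshes that still list a Cluster that is no longer opted in."""
    if opted_in:
        return []
    return sorted(name for name, members in mesh_members.items() if cluster in members)


def deletion_steps(cluster: str, members: List[str]) -> List[str]:
    """Return the ordered cleanup steps for removing a cluster from a Clustermesh.

    Routes are not listed: they are removed indirectly through their
    ownership relationship with the VPC peerings.
    """
    steps = ["delete security group", "delete vpc peerings"]
    if members == [cluster]:
        # The cluster is the last member, so the Clustermesh CR goes too.
        steps.append("delete clustermesh")
    return steps
```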

#### Limitations
- It only supports AWS as a cloud provider.
- It only supports Clusters using KopsControlPlane as a control plane provider.
- A Cluster can only belong to one mesh at a time.
- A Cluster can only belong to one Clustermesh at a time.

### Security Group Controller
#### Description
This is the controller responsible for managing AWS Security Groups. It watches `securitygroups.ec2.aws.wildlife.io` which references Kops Operator custom resources and creates `securitygroups.ec2.aws.crossplane.io` CRs with that information. It also is responsible to attach the Security Groups to the referenced resources.

#### How to use
Create a `securitygroups.ec2.aws.wildlife.io` CR with the desired configuration. Here's an example of a CR that creates a Security Group with two ingress rules and attaches it to two KopsMachinePools:
Create a `securitygroups.ec2.aws.wildlife.io` CR with the desired configuration. Here's an example of a CR that creates a Security Group with one ingress rule and attaches it to two KopsMachinePools:
```yaml
apiVersion: ec2.aws.wildlife.io/v1alpha2
kind: SecurityGroup
@@ -72,14 +83,13 @@ spec:
![Diagram](./security-group/sg-controller.png)
#### Deep Dive
The controller watches for changes in the Security Group CRs and creates the Crossplane CR that reflects it, after the Crossplane CR is reporting in its status that it's ready, the controller attaches them to the resources referenced in the CR. The reconciliation starts by retrieving the required information from the referenced resources they are the providerConfigName that it's going to be used as credentials in the Crossplane resource, the region and the VPC ID. The next step is to create the crossplane resource with the information retrieved and the desired configuration. After that the controller waits for the Crossplane resource to be ready by checking its status and enqueing the SecurityGroup CR again if it's not ready. When the Crossplane resource is ready the controller attaches it to the resources referenced in the SecurityGroup CR, the attachment is made in the AWS API using its SDK, both in the configuration resource like ASGs and LTs, but also in the EC2 instances to avoid drifts.
The controller watches for changes in the Security Group CRs and creates the Crossplane CR that reflects them. Once the Crossplane CR reports in its status that it is ready, the controller attaches the Security Group to the resources referenced in the CR. The reconciliation starts by retrieving the required information from the referenced resources: the providerConfigName that is going to be used as credentials in the Crossplane resource, the region, and the VPC ID. The next step is to create the Crossplane resource with the retrieved information and the desired configuration. After that the controller waits for the Crossplane resource to be ready by checking its status and enqueuing the SecurityGroup CR again if it is not ready. When the Crossplane resource is ready, the controller attaches it to the resources referenced in the SecurityGroup CR. The attachment is made through the AWS API using its SDK, both on configuration resources such as ASGs and LTs and on the EC2 instances, to avoid drift.
The SecurityGroup CR has a finalizer so that the Security Group can be properly detached from the referenced resources before it is deleted. The deletion process is similar to the creation process, but in reverse: the controller first detaches the Security Group from the resources and then deletes the Crossplane resource. After that it removes the finalizer from the SecurityGroup CR and deletes it.
#### Limitations
- It only supports AWS as a cloud provider.
- It only supports Clusters using KopsControlPlane as a control plane provider.
- A Cluster can only belong to one mesh at a time.
- Currently it only supports KopsMachinePool and KopsControlPlane as infrastructure references.
- It doesn't support egress rules yet.
## Running
@@ -105,6 +115,6 @@ KOPS_DOMAIN=your.k8s.domain KOPS_BUCKET=kops-config-bucket KOPS_ZONE=aws-zone ha
```

TODO:
- Talking about pause annotations
- Talk about how the credendials are being used
- Explanation about pause annotations
- Explanation about how the credentials are being used
- Complement the documentation with the work being done in the deletion process
Binary file modified docs/clustermesh/clustermesh-controller.png
14 changes: 11 additions & 3 deletions docs/clustermesh/clustermesh-controller.puml
@@ -6,9 +6,17 @@ if (Is clustermesh enabled for that Cluster?) is (true) then
if (Is Cluster belongs to Clustermesh already?) is (false) then
:Add Cluster to Clustermesh CR;
endif
:Create a VPC Peering to each Cluster of the Clustermesh;
:Create or Update a Security Group to the Cluster allowing the Clusters of the Clustermesh;
:Create Routes for each VPC Peering;
if (Are VPC peering connections established between each cluster in the Clustermesh?) is (false) then
:Create VPC Peerings to each Cluster of the Clustermesh;
endif
if (Is Wildlife Security Group created for the Cluster?) is (false) then
:Create a Wildlife Security Group for the Cluster that permits communication with other clusters in the Clustermesh;
else
:Update the Wildlife Security Group of the Cluster permitting communication with other clusters in the Clustermesh;
endif
if (Are Routes created for each VPC Peering?) is (false) then
:Create Routes for each VPC Peering;
endif
:Validate Clustermesh;
else (false)
:Create Clustermesh CR with the Cluster;
Binary file modified docs/security-group/sg-controller.png
7 changes: 6 additions & 1 deletion docs/security-group/sg-controller.puml
@@ -4,10 +4,15 @@ start
if (CR has Pause annotation?) is (false) then
if (Is Wildlife SecurityGroup CR marked for deletion?) is (false) then
:Add finalizer to Wildlife SecurityGroup CR;
:Create or Update Crossplane SecurityGroup;
if (Is Crossplane SecurityGroup created?) is (false) then
:Create Crossplane SecurityGroup;
else (true)
:Update Crossplane SecurityGroup;
endif
if (Is Crossplane SecurityGroup available?) is (true) then
:Attach SecurityGroup to InfrastructureRef;
else (false)
stop
endif
else (true)
:Get Crossplane SecurityGroup;