A Kubernetes pod monitor for safely terminating pods with persistent volumes in case of node failures

dell/karavi-resiliency

Dell Container Storage Modules (CSM) for Resiliency

CSM for Resiliency is part of the CSM (Container Storage Modules) open-source suite of Kubernetes storage enablers for Dell products. The project is designed to make Kubernetes applications, including those that use persistent storage, more resilient to various failures. Its first component is a pod monitor designed specifically to protect stateful applications from those failures. It is not a standalone application; it is deployed as a sidecar to Dell CSI (Container Storage Interface) drivers, in both the driver's controller pods and the driver's node pods. Deploying CSM for Resiliency as a sidecar allows it to make direct requests to the driver through the Unix domain socket that Kubernetes sidecars use to make CSI requests.
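
To illustrate the sidecar pattern described above, here is a minimal sketch in Go of two processes exchanging a message over a Unix domain socket. It is not how CSM for Resiliency is implemented: real CSI traffic is gRPC over the socket, while this sketch sends a plain text line to stay within the standard library. The socket file name `csi.sock`, the `ack:` reply format, and the function names are all assumptions made for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// serveOnce plays the role of the driver: it accepts one connection on the
// socket, reads one line, and echoes it back with an "ack: " prefix.
func serveOnce(l net.Listener) {
	conn, err := l.Accept()
	if err != nil {
		return
	}
	defer conn.Close()
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return
	}
	fmt.Fprintf(conn, "ack: %s", line)
}

// request plays the role of the sidecar: it dials the socket file, sends one
// line, and returns the single-line reply.
func request(socketPath, msg string) (string, error) {
	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := fmt.Fprintf(conn, "%s\n", msg); err != nil {
		return "", err
	}
	return bufio.NewReader(conn).ReadString('\n')
}

// roundTrip creates a socket in a temporary directory (standing in for a
// volume shared by the containers of a pod), starts the listener, and sends
// one request through it.
func roundTrip(msg string) (string, error) {
	dir, err := os.MkdirTemp("", "csi-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	sock := filepath.Join(dir, "csi.sock") // hypothetical socket name
	l, err := net.Listen("unix", sock)
	if err != nil {
		return "", err
	}
	defer l.Close()

	go serveOnce(l)
	return request(sock, msg)
}

func main() {
	reply, err := roundTrip("ping")
	if err != nil {
		fmt.Fprintln(os.Stderr, "round trip failed:", err)
		return
	}
	fmt.Print(reply)
}
```

Because the socket is a file on a shared path, no network port is exposed; any container in the pod that mounts the directory holding the socket can reach the driver.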

Some of the methods CSM for Resiliency invokes in the driver are standard CSI methods, such as NodeUnpublishVolume, NodeUnstageVolume, and ControllerUnpublishVolume. CSM for Resiliency also uses proprietary calls that are not part of the standard CSI specification. Currently there is only one, ValidateVolumeHostConnectivity, which reports whether a host is connected to the storage system and/or whether any of a list of specified volumes has seen I/O activity in the recent past. This allows CSM for Resiliency to make more accurate determinations about the state of the system and its persistent volumes.
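
As a rough illustration of why this extra signal helps, the sketch below combines a node's readiness with a connectivity result to pick an action. It is a sketch under stated assumptions, not the project's actual policy: the `ConnectivityResult` type, its field names, the `Action` values, and the decision rules are all hypothetical and only loosely modeled on the description above.

```go
package main

import "fmt"

// ConnectivityResult is a hypothetical, simplified stand-in for the kind of
// information ValidateVolumeHostConnectivity is described as returning.
// The field names are assumptions, not the real message definition.
type ConnectivityResult struct {
	Connected     bool // the host is connected to the storage system
	IosInProgress bool // recent I/O was seen on the specified volumes
}

// Action is an illustrative outcome a pod monitor might choose.
type Action string

const (
	ActionNone    Action = "none"    // node looks healthy, leave pods alone
	ActionWait    Action = "wait"    // signals disagree, do not touch volumes yet
	ActionCleanup Action = "cleanup" // node and storage both indicate failure
)

// Decide applies an assumed policy: only treat a pod's volumes as safe to
// clean up when Kubernetes reports the node as not ready AND the storage
// system sees neither a host connection nor recent I/O.
func Decide(nodeReady bool, r ConnectivityResult) Action {
	switch {
	case nodeReady:
		return ActionNone
	case r.Connected || r.IosInProgress:
		return ActionWait
	default:
		return ActionCleanup
	}
}

func main() {
	// The node is reported as not ready, but the array still sees I/O.
	fmt.Println(Decide(false, ConnectivityResult{IosInProgress: true}))
	// The node is reported as not ready and the array sees nothing from the host.
	fmt.Println(Decide(false, ConnectivityResult{}))
}
```

The point of the second case in `Decide` is that a node can look failed to Kubernetes while still writing to storage; a storage-side check lets a monitor avoid acting on volumes that are still in use.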

Accordingly, CSM for Resiliency is adapted to, and qualified with, each Dell CSI driver it is to be used with. Different storage systems have different nuances and characteristics that CSM for Resiliency must take into account.

For documentation, please visit Container Storage Modules documentation.

Building CSM for Resiliency

If you wish to clone and build CSM for Resiliency, a Linux host is required with the following installed:

Component   Version    Additional Information
Podman      v4.4.1+    Podman installation
Buildah     v1.29.1+   Buildah installation
Golang      v1.21+     Golang installation
git         latest     Git installation

Once all prerequisites are on the Linux host, follow the steps below to clone, build and deploy CSM for Resiliency:

  1. Clone the repository: git clone https://github.com/dell/karavi-resiliency.git

  2. Define and export the following environment variables to point to your Podman registry:

    export REGISTRY_HOST=<registry host>
    export REGISTRY_PORT=<registry port>
    export VERSION=<version>

  3. At the root of the source tree, run the following to build and deploy: make
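
Before running make, it can help to confirm that the variables from step 2 are actually exported. The following is a minimal sketch of such a check in Go. It is not part of the repository; the `missingVars` helper and the printed messages are hypothetical, and only the three variable names come from the steps above.

```go
package main

import (
	"fmt"
	"os"
)

// requiredVars lists the variables the build steps above ask you to export.
var requiredVars = []string{"REGISTRY_HOST", "REGISTRY_PORT", "VERSION"}

// missingVars returns the required variables that are unset or empty,
// using the supplied lookup function (os.LookupEnv in normal use).
func missingVars(lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range requiredVars {
		if v, ok := lookup(name); !ok || v == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	if missing := missingVars(os.LookupEnv); len(missing) > 0 {
		fmt.Println("set these variables before running make:", missing)
		return
	}
	fmt.Println("environment looks ready for make")
}
```

Passing the lookup function as a parameter keeps the check easy to exercise without touching the real environment.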

Testing CSM for Resiliency

From the root directory where the repo was cloned, the unit tests can be executed as follows:

make unit-test

Versioning

This project adheres to Semantic Versioning.

About

Dell Container Storage Modules (CSM) is 100% open source and community-driven. All components are available under the Apache 2 License on GitHub.