Development notes ‐ operator‐sdk version
Scott Trent edited this page Aug 26, 2024 · 24 revisions
(ALWAYS UPDATING!!!!!!)
Changes to the `main` branch automatically rebuild and push the container image via a GitHub Action, but non-main branches or forked repos need to be hand-built and pushed to a developer-specific location to avoid overwriting the official images.
Tips:
- As of August 26, 2024, it is recommended to locally install operator-sdk version 1.36.1.
- Set the default namespace to a known working location, such as `oc project default`.
- To force a newly updated image to be used, bump up the version count in the `VERSION` file.
- As desired, export `CONTAINER_TOOL` to `docker` or `podman` before building. (Default is `docker`.)
- If an official SusQL operator is installed on the cluster, be sure to uninstall it first.
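The version bump mentioned in the tips could be scripted along these lines. This is only a minimal sketch: it assumes `VERSION` holds a dotted numeric version such as `0.0.30` with no leading zeros, and `bump_version` is a hypothetical helper, not something shipped in the repository.

```shell
# Hypothetical helper (not part of the SusQL repository).
# Assumes a dotted numeric version with no leading zeros, e.g. 0.0.30.
bump_version() {
  version="$1"
  case "$version" in
    *.*)
      prefix="${version%.*}"   # everything before the final dot
      last="${version##*.}"    # final numeric field
      echo "${prefix}.$((last + 1))"
      ;;
    *)
      # No dot: treat the whole string as a single counter.
      echo "$((version + 1))"
      ;;
  esac
}

# Example use from the repository root (commented out so the sketch has no side effects):
# bump_version "$(cat VERSION)" > VERSION.tmp && mv VERSION.tmp VERSION
# export CONTAINER_TOOL="${CONTAINER_TOOL:-docker}"   # docker is the documented default
```

Only the last field is incremented, which is enough to force a new image tag without deciding on a versioning policy.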
export BUNDLE_IMG="REGISTRYURL/REPOSITORYNAME/susql-controller:v$(cat VERSION)"
export IMG=REGISTRYURL/REPOSITORYNAME/susql-controller
export IMAGE_TAG_BASE=${IMG}
export CONTAINER_TOOL=podman
podman login
make all
make bundle-build bundle-push
make operator-build operator-push
make test
make run
(Use control-c to terminate `make run`.)
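The three image variables in the build commands share one base path, so they could be derived from a single helper. The sketch below assumes the bundle image lives in the same registry and repository as the controller image; `susql_image_exports` and its arguments are hypothetical placeholders, not names from the repository.

```shell
# Hypothetical helper: print export lines for the image variables used by
# the build commands, given a registry, repository, and version.
# Assumes the bundle and controller images share the same base path.
susql_image_exports() {
  registry="$1"
  repository="$2"
  version="$3"
  img="${registry}/${repository}/susql-controller"
  echo "export IMG=${img}"
  echo "export IMAGE_TAG_BASE=${img}"
  echo "export BUNDLE_IMG=${img}:v${version}"
}

# Example use (registry and repository names are placeholders):
# eval "$(susql_image_exports REGISTRYURL REPOSITORYNAME "$(cat VERSION)")"
```

Printing the exports first, rather than setting them directly, makes it easy to inspect the values before evaluating them.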
- Log in to the cluster on the command line using the command from "Copy login command" in the upper right corner of the OpenShift web console.
- Be sure to remove previously installed SusQL operators.
operator-sdk cleanup susql-operator
operator-sdk run bundle ${BUNDLE_IMG}
cd susql-operator/test
oc create -f labelgroups.yaml
oc create -f training-job-1.yaml
oc create -f training-job-2.yaml
bash labelgroups.sh
sleep 10
bash labelgroups.sh
# remove test artifacts on completion
oc delete -f training-job-2.yaml
oc delete -f training-job-1.yaml
oc delete -f labelgroups.yaml
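The cleanup above deletes the manifests in the reverse of the order in which they were created. A minimal sketch of that pattern follows, assuming manifest file names contain no spaces; `reverse_args` is a hypothetical helper introduced here for illustration.

```shell
# Hypothetical helper: print the arguments in reverse order, space-separated.
# Assumes no argument contains whitespace.
reverse_args() {
  reversed=""
  for arg in "$@"; do
    # Prepend each argument so the last one seen ends up first.
    reversed="${arg}${reversed:+ ${reversed}}"
  done
  echo "$reversed"
}

# Example use (commented out because it needs a logged-in cluster):
# manifests="labelgroups.yaml training-job-1.yaml training-job-2.yaml"
# for f in $manifests; do oc create -f "$f"; done
# bash labelgroups.sh; sleep 10; bash labelgroups.sh
# for f in $(reverse_args $manifests); do oc delete -f "$f"; done
```

Keeping a single manifest list means the create and delete steps cannot drift apart when a test job is added.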
- Make sure user monitoring is set up correctly. (Messing with label settings can be an unsupported action; consider verifying within a newly created namespace.)
- Is the Kepler source correct?
- Verify the configuration displayed at install and run time.
- Double-check that Kepler is functioning (e.g., expected output from OpenShift->Observe->Dashboards, etc.).
- Try looking at OpenShift->Observe->Metrics searches such as:
kepler_container_joules_total
kepler_container_joules_total{container_namespace="default"}
- Standard Kepler troubleshooting: https://sustainable-computing.io/usage/trouble_shooting/
- Look at SusQL controller pod log output
Depending on how the operator is installed, it may be in one of the following namespaces:
- `susql-operator-system`, `openshift-operators`, or `default`
oc project default
oc logs $( oc get pod | grep susql-operator | cut -f 1 -d" " )
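Since the controller may be in any of the three namespaces, the lookup above could be repeated per namespace. The following is a rough sketch, not a tested procedure: `first_susql_pod` is a hypothetical helper, and it assumes the pod name is the first column of the `oc get pod` output and contains `susql-operator`.

```shell
# Hypothetical helper: read `oc get pod` output on stdin and print the name
# (first column) of the first line that mentions susql-operator.
first_susql_pod() {
  awk '/susql-operator/ { print $1; exit }'
}

# Example use (commented out because it needs a logged-in cluster):
# for ns in susql-operator-system openshift-operators default; do
#   pod="$(oc get pod -n "$ns" 2>/dev/null | first_susql_pod)"
#   if [ -n "$pod" ]; then
#     oc logs -n "$ns" "$pod"
#   fi
# done
```

Unlike the `grep` and `cut` pipeline, this picks only the first match, so `oc logs` always receives a single pod name.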
- Verify accessibility and contents of appropriate Prometheus databases.
- The log level can be changed by editing `zapcore.Level(-2)` in `cmd/main.go` and recreating the container image. (Eventually, log level will be configurable.)
- To allow CLI access to the SusQL container, change the final tag in the `gcr.io` line in the `Dockerfile` from `:nonroot` to `:debug`. For example: `FROM gcr.io/distroless/static:debug`. This will include busybox and allow entry into the container by specifying an entrypoint of `sh`. (If you really want to use a full path it would be `/busybox/sh`.)
- With a debug container in place, modify the `containers.command` spec in `config/manager/manager.yaml` to replace `- /manager` with `- /debug-entrypoint.sh`
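The `:nonroot` to `:debug` tag change described above could be applied with `sed` rather than by hand. This is a minimal sketch, assuming the base image line starts with `FROM gcr.io/` and carries the `:nonroot` tag; `use_debug_base` is a hypothetical helper, not part of the repository.

```shell
# Hypothetical helper: read a Dockerfile on stdin and switch the tag of any
# gcr.io FROM line from :nonroot to :debug, leaving other lines untouched.
use_debug_base() {
  sed '/^FROM gcr\.io\//s/:nonroot/:debug/'
}

# Example use (commented out so the sketch does not modify any files):
# use_debug_base < Dockerfile > Dockerfile.debug
# diff Dockerfile Dockerfile.debug   # review the change before building
```

Writing to a separate file keeps the original `Dockerfile` intact, so the debug base image is less likely to be committed by accident.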