[WIP] ON-15452: Add unit tests for "dirty" clusters #89

Draft: wants to merge 5 commits into base: master

@tcrawley-xilinx (Contributor) commented Nov 28, 2023

Adds new unit tests that check how the onload controller behaves when its operands already exist.

TODO:

  • Mid upgrade recovery at different stages
  • Check whether we can add fake nodes to the envtest cluster.

[1/2] Cleanup: Refactor envtest cluster creation

Previously the code to start/stop an envtest cluster only existed in the
BeforeSuite/AfterSuite functions, which meant that all tests in the same
suite would use the same cluster. This change moves this code into some
new functions that can allow for the creation of additional clusters in
the same suite.

[2/2] ON-15452: Add initial unit test for a dirty cluster

Unfortunately, we can't use the existing envtest infrastructure since it
starts the onload reconciler when the cluster is created, so there is
no window in which we can make the cluster "dirty".

This change adds a new container for unit tests which will create a new
envtest cluster for each test without starting the onload reconciler.

As an initial test it will create the onload CR and the device plugin
before starting the reconciler and then check that it wasn't
deleted/re-created.
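
The commit message does not say how the test detects a delete/re-create. One common approach, sketched below as an assumption rather than as the PR's actual code, relies on the fact that Kubernetes assigns a fresh UID to a re-created object: record the UID before starting the reconciler and compare it afterwards. The `objectMeta` type and `compareAfterReconcile` helper are hypothetical stand-ins, not the real Kubernetes metadata types.

```go
package main

import "fmt"

// objectMeta is a hypothetical stand-in for the metadata of a Kubernetes
// object (it is not the real metav1.ObjectMeta type).
type objectMeta struct {
	Name string
	UID  string
}

// compareAfterReconcile classifies what happened to an object that existed
// before the reconciler was started. A nil "after" means the object could no
// longer be found; a different UID means it was deleted and created again.
func compareAfterReconcile(before objectMeta, after *objectMeta) string {
	switch {
	case after == nil:
		return "deleted"
	case after.UID != before.UID:
		return "recreated"
	default:
		return "kept"
	}
}

func main() {
	// Made-up names and UIDs, for illustration only.
	before := objectMeta{Name: "device-plugin", UID: "uid-1"}

	same := objectMeta{Name: "device-plugin", UID: "uid-1"}
	fresh := objectMeta{Name: "device-plugin", UID: "uid-2"}

	fmt.Println(compareAfterReconcile(before, &same))
	fmt.Println(compareAfterReconcile(before, &fresh))
	fmt.Println(compareAfterReconcile(before, nil))
}
```

In a real envtest-based test the two snapshots would come from fetching the object before and after the reconciler runs.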

Testing done

None in cluster. All unit tests pass.

The added unit tests are quite slow, so I compared the performance of go test versus ginkgo run -p:
* go test can run the tests for each package in parallel, but within a package each test is run sequentially.
* ginkgo run -p will parallelise the tests in each package, but packages are run serially.
Overall there was no significant difference between the two options, so I stuck with go test since it should be more universal.
In a newer version I switched to ginkgo, because even more tests had been added.

The controller was missing some logic to handle the case where a node
had an onload version label that was out-of-sync with the Onload CR.

This commit makes the controller remove the label and requeue. This
forces the device plugin to be removed (along with the local files)
before the label is re-added on the next reconciliation loop.
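
The remove-then-requeue behaviour described above can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not the controller's real code: the label key `example.com/onload-version`, the `action` type, and `reconcileNodeLabel` are all hypothetical, and the real controller may wait for the device plugin to be gone before re-adding the label.

```go
package main

import "fmt"

// action names the decision taken for one node in one reconcile pass.
type action string

const (
	actionNone        action = "none"
	actionAddLabel    action = "add-label"
	actionRemoveLabel action = "remove-label-and-requeue"
)

// versionLabel is a hypothetical label key, not the operator's real one.
const versionLabel = "example.com/onload-version"

// reconcileNodeLabel compares a node's version label with the version in the
// Onload CR. It mutates labels in place and reports whether to requeue.
func reconcileNodeLabel(labels map[string]string, crVersion string) (action, bool) {
	current, ok := labels[versionLabel]
	switch {
	case !ok:
		// No label yet: add it so the operands get scheduled on this node.
		labels[versionLabel] = crVersion
		return actionAddLabel, false
	case current != crVersion:
		// Out of sync: drop the label and requeue, so the label is only
		// re-added on a later pass.
		delete(labels, versionLabel)
		return actionRemoveLabel, true
	default:
		return actionNone, false
	}
}

func main() {
	// A "dirty" node that still carries a label from an older version.
	labels := map[string]string{versionLabel: "v1"}
	for pass := 1; pass <= 3; pass++ {
		act, requeue := reconcileNodeLabel(labels, "v2")
		fmt.Printf("pass %d: %s requeue=%t labels=%v\n", pass, act, requeue, labels)
	}
}
```

Splitting the removal and the re-add across two passes is what gives the old operands time to be cleaned up in between.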

@pcolledg-amd (Collaborator) left a comment

Useful so far

Review threads: Makefile (resolved); controllers/onload_controller_test.go (outdated, resolved)

Adds two new tests that ensure the operands are updated as expected (but
not removed and re-added).

Also adds a new type of test that adds a Node to the envtest cluster.
These tests check how the controller handles starting when the operands
and node labels might not be as expected.

As well as being a test framework, ginkgo can be used as an application
to run the tests. It will run the same tests as `go test`, but ginkgo
allows tests within a suite to be run concurrently (`go test` can
run suites in parallel, but runs each test in a suite sequentially). Since
the majority of our tests are in the same suite (controllers/), running
with ginkgo directly can save time.

@ivatet-amd (Collaborator)

Looks good!

A couple of questions:

  1. Am I reading it right that the fundamental change is the startReconciler(), which now the tests can kick off after preparing a "dirty" cluster, whilst previously they would have competed with the Onload reconciler?
  2. How much quicker is ginkgo run than go test?
