Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cgroups v2 support to resourcemanager #1744

Merged
merged 11 commits into from
Sep 22, 2023
Merged
84 changes: 49 additions & 35 deletions arbnode/resourcemanager/resource_management.go
Original file line number Diff line number Diff line change
Expand Up @@ -90,18 +90,28 @@ type limitChecker interface {
String() string
}

func isSupported(c limitChecker) bool {
_, err := c.isLimitExceeded()
return err == nil
}

// newLimitChecker attempts to auto-discover the mechanism by which it
// can check system limits. Currently Cgroups V1 is supported,
// with Cgroups V2 likely to be implmemented next. If no supported
// mechanism is discovered, it logs an error and fails open, ie
// it creates a trivialLimitChecker that does no checks.
// can check system limits. Currently Cgroups V1 and V2 are supported.
// If no supported mechanism is discovered, it logs an error and
// fails open, ie it creates a trivialLimitChecker that does no checks.
func newLimitChecker(conf *Config) limitChecker {
Copy link
Contributor

@anodar anodar Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've mentioned in original review where this was introduced, please return concrete type here, as returning interfaces are not idiomatic in Go. You can change this function to:

func newLimitChecker(conf *Config) (cgroupsMemoryLimitChecker, error) {
       if c := newCgroupsMemoryLimitChecker(cgroupsV1MemoryFiles, conf.MemoryLimitPercent); isSupported(c) {
		return c, nil
	}
      if c := newCgroupsMemoryLimitChecker(cgroupsV2MemoryFiles, conf.MemoryLimitPercent); isSupported(c) {
            return c, nil
      }
      return nil, errNotSupported
}

Then in the call site you can call this, check with error.Is(err, errNotSupported) to instantiate trivialLimitChecker{} when needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored it as suggested.

c := newCgroupsV1MemoryLimitChecker(DefaultCgroupsV1MemoryDirectory, conf.MemoryLimitPercent)
c := newCgroupsMemoryLimitChecker(cgroupsV1MemoryFiles, conf.MemoryLimitPercent)
if isSupported(c) {
log.Info("Cgroups v1 detected, enabling memory limit RPC throttling")
return c
}

c = newCgroupsMemoryLimitChecker(cgroupsV2MemoryFiles, conf.MemoryLimitPercent)
if isSupported(c) {
log.Info("Cgroups v2 detected, enabling memory limit RPC throttling")
return c
}

log.Error("No method for determining memory usage and limits was discovered, disabled memory limit RPC throttling")
return &trivialLimitChecker{}
}
Expand All @@ -115,28 +125,37 @@ func (_ trivialLimitChecker) isLimitExceeded() (bool, error) {

func (_ trivialLimitChecker) String() string { return "trivial" }

const DefaultCgroupsV1MemoryDirectory = "/sys/fs/cgroup/memory/"
type cgroupsMemoryFiles struct {
limitFile, usageFile, statsFile string
inactiveRe *regexp.Regexp
}

type cgroupsV1MemoryLimitChecker struct {
cgroupDir string
memoryLimitPercent int
const defaultCgroupsV1MemoryDirectory = "/sys/fs/cgroup/memory/"
const defaultCgroupsV2MemoryDirectory = "/sys/fs/cgroup/"

limitFile, usageFile, statsFile string
var cgroupsV1MemoryFiles = cgroupsMemoryFiles{
limitFile: defaultCgroupsV1MemoryDirectory + "/memory.limit_in_bytes",
usageFile: defaultCgroupsV1MemoryDirectory + "/memory.usage_in_bytes",
statsFile: defaultCgroupsV1MemoryDirectory + "/memory.stat",
inactiveRe: regexp.MustCompile(`total_inactive_file (\d+)`),
}
var cgroupsV2MemoryFiles = cgroupsMemoryFiles{
limitFile: defaultCgroupsV2MemoryDirectory + "/memory.max",
usageFile: defaultCgroupsV2MemoryDirectory + "/memory.current",
statsFile: defaultCgroupsV2MemoryDirectory + "/memory.stat",
inactiveRe: regexp.MustCompile(`inactive_file (\d+)`),
}

func newCgroupsV1MemoryLimitChecker(cgroupDir string, memoryLimitPercent int) *cgroupsV1MemoryLimitChecker {
return &cgroupsV1MemoryLimitChecker{
cgroupDir: cgroupDir,
memoryLimitPercent: memoryLimitPercent,
limitFile: cgroupDir + "/memory.limit_in_bytes",
usageFile: cgroupDir + "/memory.usage_in_bytes",
statsFile: cgroupDir + "/memory.stat",
}
type cgroupsMemoryLimitChecker struct {
files cgroupsMemoryFiles
memoryLimitPercent int
}

func isSupported(c limitChecker) bool {
_, err := c.isLimitExceeded()
return err == nil
func newCgroupsMemoryLimitChecker(files cgroupsMemoryFiles, memoryLimitPercent int) *cgroupsMemoryLimitChecker {
return &cgroupsMemoryLimitChecker{
files: files,
memoryLimitPercent: memoryLimitPercent,
}
}

// isLimitExceeded checks if the system memory used exceeds the limit
Expand All @@ -145,24 +164,25 @@ func isSupported(c limitChecker) bool {
// See the following page for details of calculating the memory used,
// which is reported as container_memory_working_set_bytes in prometheus:
// https://mihai-albert.com/2022/02/13/out-of-memory-oom-in-kubernetes-part-3-memory-metrics-sources-and-tools-to-collect-them/
func (c *cgroupsV1MemoryLimitChecker) isLimitExceeded() (bool, error) {
func (c *cgroupsMemoryLimitChecker) isLimitExceeded() (bool, error) {
var limit, usage, inactive int
var err error
limit, err = readIntFromFile(c.limitFile)
if err != nil {
if limit, err = readIntFromFile(c.files.limitFile); err != nil {
return false, err
}
usage, err = readIntFromFile(c.usageFile)
if err != nil {
if usage, err = readIntFromFile(c.files.usageFile); err != nil {
return false, err
}
inactive, err = readInactive(c.statsFile)
if err != nil {
if inactive, err = readInactive(c.files.statsFile, c.files.inactiveRe); err != nil {
return false, err
}
return usage-inactive >= ((limit * c.memoryLimitPercent) / 100), nil
}

func (c cgroupsMemoryLimitChecker) String() string {
return "CgroupsMemoryLimitChecker"
}

func readIntFromFile(fileName string) (int, error) {
file, err := os.Open(fileName)
if err != nil {
Expand All @@ -176,9 +196,7 @@ func readIntFromFile(fileName string) (int, error) {
return limit, nil
}

var re = regexp.MustCompile(`total_inactive_file (\d+)`)

func readInactive(fileName string) (int, error) {
func readInactive(fileName string, re *regexp.Regexp) (int, error) {
file, err := os.Open(fileName)
if err != nil {
return 0, err
Expand All @@ -201,7 +219,3 @@ func readInactive(fileName string) (int, error) {

return 0, errors.New("total_inactive_file not found in " + fileName)
}

func (c cgroupsV1MemoryLimitChecker) String() string {
return "CgroupsV1MemoryLimitChecker"
}
36 changes: 20 additions & 16 deletions arbnode/resourcemanager/resource_management_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,52 +6,57 @@ package resourcemanager
import (
"fmt"
"os"
"regexp"
"testing"
)

func updateFakeCgroupv1Files(c *cgroupsV1MemoryLimitChecker, limit, usage, inactive int) error {
limitFile, err := os.Create(c.limitFile)
func updateFakeCgroupFiles(c *cgroupsMemoryLimitChecker, limit, usage, inactive int) error {
limitFile, err := os.Create(c.files.limitFile)
if err != nil {
return err
}
_, err = fmt.Fprintf(limitFile, "%d\n", limit)
if err != nil {
if _, err = fmt.Fprintf(limitFile, "%d\n", limit); err != nil {
return err
}

usageFile, err := os.Create(c.usageFile)
usageFile, err := os.Create(c.files.usageFile)
if err != nil {
return err
}
_, err = fmt.Fprintf(usageFile, "%d\n", usage)
if err != nil {
if _, err = fmt.Fprintf(usageFile, "%d\n", usage); err != nil {
return err
}

statsFile, err := os.Create(c.statsFile)
statsFile, err := os.Create(c.files.statsFile)
if err != nil {
return err
}
_, err = fmt.Fprintf(statsFile, `total_cache 1029980160
if _, err = fmt.Fprintf(statsFile, `total_cache 1029980160
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

_, err = ....
return err

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

total_rss 1016209408
total_inactive_file %d
total_active_file 321544192
`, inactive)
if err != nil {
`, inactive); err != nil {
return err
}
return nil
}

func TestCgroupsv1MemoryLimit(t *testing.T) {
func TestCgroupsMemoryLimit(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test basically has duplicate code for 3 subtests. You can just describe each as a test-case and de-duplicate the code. (Sorry overlooked on the first review).

https://go.dev/blog/subtests

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

cgroupDir := t.TempDir()
c := newCgroupsV1MemoryLimitChecker(cgroupDir, 95)
testFiles := cgroupsMemoryFiles{
limitFile: cgroupDir + "/memory.limit_in_bytes",
usageFile: cgroupDir + "/memory.usage_in_bytes",
statsFile: cgroupDir + "/memory.stat",
inactiveRe: regexp.MustCompile(`total_inactive_file (\d+)`),
}

c := newCgroupsMemoryLimitChecker(testFiles, 95)
_, err := c.isLimitExceeded()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: inline. Same below

if _, err := ... ; err == nil {
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if err == nil {
t.Error("Should fail open if can't read files")
}

err = updateFakeCgroupv1Files(c, 1000, 1000, 51)
err = updateFakeCgroupFiles(c, 1000, 1000, 51)
if err != nil {
t.Error(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When errors are not nil, there's probably no use of continuing the test, so this should be Fatal instead.

Same below/up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I had used the wrong one by mistake.

}
Expand All @@ -63,7 +68,7 @@ func TestCgroupsv1MemoryLimit(t *testing.T) {
t.Error("Expected under limit")
}

err = updateFakeCgroupv1Files(c, 1000, 1000, 50)
err = updateFakeCgroupFiles(c, 1000, 1000, 50)
if err != nil {
t.Error(err)
}
Expand All @@ -74,5 +79,4 @@ func TestCgroupsv1MemoryLimit(t *testing.T) {
if !exceeded {
t.Error("Expected over limit")
}

}
Loading