Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/ups: shutdown UPS at host shutdown #344940

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

bjornfor
Copy link
Contributor

@bjornfor bjornfor commented Sep 27, 2024

Description of changes

Implement the missing bit of the NUT shutdown design[1]. This ensures
that machines come back up automatically after a power outage. (Without
this change they will only come back up if the UPS completely empties
its battery.)

[1] https://networkupstools.org/docs/user-manual.chunked/Configuration_notes.html#Shutdown_design

Tested out-of-tree in NixOS 24.05. (This PR cannot be cherry-picked to
release-24.05 / has conflicts.)

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

They're listed twice (documentation and implementation) and this change
makes it easier to compare the attrsets.
Copy link
Contributor

@Majiir Majiir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't tested this yet, but it generally looks good.

Are there any risks to introducing this as a default for existing users? Should we put the POWERDOWNFLAG default behind a stateVersion check?

@@ -334,6 +335,7 @@ let
MINSUPPLIES = lib.mkDefault 1;
MONITOR = lib.flip lib.mapAttrsToList cfg.upsmon.monitor (name: monitor: with monitor; [ system powerValue user "\"@upsmon_password_${name}@\"" type ]);
NOTIFYCMD = lib.mkDefault "${pkgs.nut}/bin/upssched";
POWERDOWNFLAG = "/run/killpower";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about putting this in /run/nut/ along with the configs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might work now, because of how /run/nut gets created. But I think it should be a RuntimeDirectory=, which means it'll be unmounted by the time the killpower service runs. We need to store the flag somewhere that outlives all normal services.

Note that upstream suggests /etc/killpower. I opted for /run because I prefer ramfs when that is possible. (And upsmon supports both volatile and non-volatile filesystems -- it'll remove the POWERDOWNFLAG on startup if it exists.)

@@ -575,6 +577,23 @@ in
];
};

systemd.services.ups-killpower = {
enable = cfg.upsd.enable;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also check cfg.upsmon.settings.POWERDOWNFLAG != null here, so that users can easily disable this behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. Or if we want to make this configurable (and discoverable), maybe it's better to have an explicit option? power.ups.allowUpsKillPower? Or tell users to set systemd.services.ups-killpower.enable = false?

Is it overdesigning though? Why would someone not want to kill the UPS power? It's needed to ensure the systems come back up, so...

But I guess allowing null is an easy way to handle it -- it's not very obvious, but at least it's doable.

Data point: the apcupsd module does not have option to disable killpower. Nobody complained in 10+ years.

@bjornfor
Copy link
Contributor Author

Are there any risks to introducing this as a default for existing users? Should we put the POWERDOWNFLAG default behind a stateVersion check?

I guess a risk could be if the UPS firmware is broken and turns power off immediately, instead of waiting a few seconds? But the system should have unmounted all the normal filesystems by then, so I don't think it should cause any damage.

But maybe I should write a release note entry and mention the POWERDOWNFLAG = null as a way to keep the old functionality?

@bjornfor
Copy link
Contributor Author

The latest push has these changes:

  • Support nullable POWERDOWNFLAG.
  • Add ups-killpower.service to the new system-ups.slice, like the other ups/nut services.

@bjornfor
Copy link
Contributor Author

Forgot the release notes entry. Added now.

Implement the missing bit of the NUT shutdown design[1]. This ensures
that machines come back up automatically after a power outage. (Without
this change they will only come back up if the UPS completely empties
its battery.)

[1] https://networkupstools.org/docs/user-manual.chunked/Configuration_notes.html#Shutdown_design
@bjornfor
Copy link
Contributor Author

Last two pushes:

  • Fix typo/grammar in release notes entry.
  • Use lib.mkDefault on POWERDOWNFLAG default value, so that users don't need to use lib.mkForce to disable it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants