Capsul outage mitigation: need a way to shut down the server #13
No description provided.
Can we change the default behaviour we observed during the incident? (It tries to shut down the VMs one at a time with a long timeout.)
Like, could we make it shut them off 10 at a time with a much shorter timeout?
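A possible starting point, before patching anything: the stock libvirt-guests script reads shell-style settings from a config file, and upstream documents `PARALLEL_SHUTDOWN` and `SHUTDOWN_TIMEOUT` for this purpose. The sketch below assumes the file lives at `/etc/default/libvirt-guests` (Debian-family; RHEL-family systems typically use `/etc/sysconfig/libvirt-guests`). The 60-second value is only an illustrative guess at "much shorter", not something we measured.

```shell
# /etc/default/libvirt-guests  (path is an assumption; see note above)

# Ask guests to shut down rather than suspending them to disk.
ON_SHUTDOWN=shutdown

# Shut down up to 10 guests concurrently instead of one at a time.
PARALLEL_SHUTDOWN=10

# Seconds to wait for guests to shut down. 60 is a placeholder value.
SHUTDOWN_TIMEOUT=60
```

If this turns out to be enough, it would avoid maintaining a modified copy of the script at all.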
Maybe we monkey-patch all the default shutdown commands to insert our own VM-halting logic? This is related to the task for ACPI graceful shutdown of VMs.
Set up a systemd drop-in:
Create a modified version of the default libvirt-guests script:
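The original snippets for these two steps are not preserved here, so the following is only a rough sketch of what the drop-in could look like, not the one we actually deployed. It assumes the modified script accepts a `stop` argument like the packaged one does. `render_dropin` and the script path in the usage comment are hypothetical names made up for illustration.

```shell
#!/bin/sh
# Print a drop-in for libvirt-guests.service that swaps in our own stop script.
# $1: absolute path to the modified libvirt-guests script (hypothetical).
render_dropin() {
    script_path="$1"
    cat <<EOF
[Service]
# An empty assignment clears the ExecStop= list inherited from the packaged
# unit. Without it, the next line is appended instead of replacing it.
ExecStop=
ExecStop=${script_path} stop
EOF
}

# Example usage (paths are placeholders, run as root):
#   mkdir -p /etc/systemd/system/libvirt-guests.service.d
#   render_dropin /usr/local/lib/capsul/libvirt-guests.sh \
#       > /etc/systemd/system/libvirt-guests.service.d/override.conf
#   systemctl daemon-reload
```

Generating the file from a function keeps the script path in one place, which may help if the deployment plan installs the script somewhere else.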
^ that should be part of our deployment plan for the next deployment
We tried to deploy the systemd drop-in to fix the libvirt-guests shutdown process, but it was still calling the old script, not our new one. systemctl status was showing the drop-in, but it wasn't actually overriding ExecStop with the script we had specified.
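One possible explanation, which we have not confirmed: systemd treats `ExecStop=` as a list, so a drop-in that only adds a new `ExecStop=` line appends to the packaged command rather than replacing it. The list is only cleared by an empty `ExecStop=` assignment placed before the new one. The helper below is a simplified sketch for checking this against the output of `systemctl cat`. `effective_execstop` is a made-up name, and it ignores section headers and line continuations.

```shell
#!/bin/sh
# Read merged unit text on stdin (main unit first, then drop-ins, as printed by
# `systemctl cat`) and print the ExecStop commands that remain in effect.
effective_execstop() {
    awk '
        # An empty assignment resets the list collected so far.
        /^ExecStop=[[:space:]]*$/ { n = 0; next }
        # Any other ExecStop= line is appended to the list.
        /^ExecStop=/ { sub(/^ExecStop=/, ""); cmds[++n] = $0 }
        END { for (i = 1; i <= n; i++) print cmds[i] }
    '
}

# Example usage on the host:
#   systemctl cat libvirt-guests.service | effective_execstop
# systemd's own view can be compared with:
#   systemctl show -p ExecStop libvirt-guests.service
```

If the old script still shows up in that output, adding the empty `ExecStop=` line to the drop-in and running `systemctl daemon-reload` would be the first thing to try.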