Capsul outage mitigation: need a way to shutdown the server #13

Open
opened 2021-09-18 07:40:43 +00:00 by j3s · 3 comments
Owner

Can we change the default behaviour we observed during the incident? (It tries to shut down the VMs one by one with a long timeout.)

Like, could we make it shut them off 10 at a time with a much shorter timeout?

Maybe we monkey-patch all the default shutdown commands to insert our own VM-halting logic? This is related to the ACPI graceful shutdown of VMs task.
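
For reference, the stock libvirt-guests script reads a handful of settings that map onto the "10 at a time with a much shorter timeout" idea. A minimal sketch, assuming the settings file lives at /etc/default/libvirt-guests (Debian-style; some distros use /etc/sysconfig/libvirt-guests instead) and that 60 seconds is an acceptable timeout. Both of those are assumptions, not something decided in this issue:

```shell
# Sketch of libvirt-guests settings (file location varies by distro, see above).
# Ask guests to shut down on host shutdown instead of suspending them.
ON_SHUTDOWN=shutdown
# Number of guests to shut down concurrently; 0 means one after another.
PARALLEL_SHUTDOWN=10
# Seconds to wait for guests to shut down (illustrative value, not from the thread).
SHUTDOWN_TIMEOUT=60
```

With parallel shutdown enabled, the timeout applies to all guests on a URI together rather than to each guest separately, so the value would need to be picked with that in mind.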

Owner

forest (he/him)
we should also implement #13

Capsul outage mitigation: need a way to shutdown the server
How do we wanna approach that? My 1st thought was to add a button to the admin panel that stops all VMs
Obviously modifying the default behaviour would be better because then you can't forget to shut them down -- it's built in

f0x
you can (partly) override an existing systemd service
adding conf files in /etc/systemd/system/<unit-name>.service.d/ apparently

techieb0y
'drop-ins' is the name of that feature, it's kinda handy

Set up the systemd drop-in:

mkdir -p /etc/systemd/system/libvirt-guests.service.d/
echo '
# NOTE this is a systemd "drop-in" which modifies the ExecStart and ExecStop properties of the existing libvirt-guests service unit

[Service]
ExecStart=/opt/capsul-flask/capsulflask/shell_scripts/libvirt-guests.sh start
ExecStop=/opt/capsul-flask/capsulflask/shell_scripts/libvirt-guests.sh stop
' > /etc/systemd/system/libvirt-guests.service.d/libvirt-guests.conf

Create a modified version of the default libvirt-guests script:

cp /usr/lib/libvirt/libvirt-guests.sh /opt/capsul-flask/capsulflask/shell_scripts/libvirt-guests.sh
and then edit /opt/capsul-flask/capsulflask/shell_scripts/libvirt-guests.sh to do virsh destroy instead of virsh shutdown
Owner

^ that should be part of our deployment plan for the next deployment

Owner

We tried to deploy the systemd drop-in for fixing the shutdown process of libvirt-guests, but it was still calling the old script, not our new one. systemctl status was showing the drop-in, but it wasn't actually overriding ExecStop with the script we had specified.
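
One possible explanation, not confirmed in this thread: ExecStart and ExecStop are list-type settings in systemd, so a drop-in that assigns them adds to the commands from the packaged unit rather than replacing them. Assigning an empty value first resets the list. A minimal sketch of a drop-in written that way, where `write_dropin` is a hypothetical helper name and the script path and file name are the ones used above:

```shell
#!/bin/sh
# Sketch: write a drop-in that clears the inherited commands before setting new ones.
# write_dropin is a hypothetical helper; it takes the drop-in directory as its argument.
write_dropin() {
    dropin_dir="$1"
    script=/opt/capsul-flask/capsulflask/shell_scripts/libvirt-guests.sh
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/libvirt-guests.conf" <<EOF
[Service]
# An empty assignment resets the command list inherited from the packaged unit.
ExecStart=
ExecStart=$script start
ExecStop=
ExecStop=$script stop
EOF
}
```

It would be called as `write_dropin /etc/systemd/system/libvirt-guests.service.d`, followed by `systemctl daemon-reload`. Running `systemctl cat libvirt-guests.service` then shows the unit together with its drop-ins, and `systemctl show -p ExecStop libvirt-guests.service` shows which stop commands are actually in effect.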

Reference: cyberia/capsul-flask#13