Improve talos upgrade command #12152

@astro-stan

Feature Request

Description

Context: #12132

TL;DR:

talosctl upgrade does not check whether the chosen image is compatible with the node. To make matters worse, the --image option has a default value, so omitting it upgrades the node to a "default" image.

The "default" image is effectively arbitrary from the node's point of view: it is unlikely to have the same schematic ID as the image on the node, and it might be an older version than what is currently installed.

An implicit upgrade to a "default" image would be unexpected for a Talos administrator, but it should generally be safe, since it is assumed to be reversible thanks to Talos' A/B upgrades and the talosctl rollback command.

However, as the discussion linked above shows, there is an edge case. For Talos nodes with secure boot enabled, upgrading to a non-secure boot image completely bricks the node, because systemd-boot gets replaced.

Recovering from such a situation requires, at the very least, physical access and a live USB. However, if secure boot is combined with TPM-encrypted partitions/disks, booting from a USB causes the node to get stuck in a "booting" state, as the TPM will refuse to unseal the keys for the partitions/disks. Wiping the STATE and EPHEMERAL partitions and starting fresh then becomes the only option, leading to data loss.

Feature Request

With that in mind, I would like to suggest a few ideas on how to improve the upgrade command, so that you cannot accidentally shoot yourself in the foot:

  • Remove the default value for the --image option
  • Perform validation checks when upgrading and require --force (or something to that effect) if upgrading to an older image, from a secure boot to a non-secure boot image, or if changing the image architecture.
  • Change the default --image value to be "what is currently specified in the machine config" or require --image to be provided if --insecure is used
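To make the second suggestion concrete, here is a minimal sketch of the decision logic, not Talos code. ImageInfo, parse_version, upgrade_risks and check_upgrade are hypothetical names invented for illustration, and the sketch assumes the version, secure boot status and architecture of both the installed and the target image can be determined before the upgrade starts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageInfo:
    """Hypothetical summary of an installer image; not a Talos type."""
    version: str       # release tag, e.g. "v1.11.5"
    secure_boot: bool  # True if the image is a secure boot image
    arch: str          # e.g. "amd64" or "arm64"


def parse_version(tag: str) -> tuple[int, ...]:
    """Turn 'v1.11.5' into (1, 11, 5). Pre-release suffixes are not handled."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def upgrade_risks(current: ImageInfo, target: ImageInfo) -> list[str]:
    """List the risky transitions named in the request, in a fixed order."""
    risks = []
    if parse_version(target.version) < parse_version(current.version):
        risks.append("downgrade")
    if current.secure_boot and not target.secure_boot:
        risks.append("secure boot to non-secure boot")
    if current.arch != target.arch:
        risks.append("architecture change")
    return risks


def check_upgrade(current: ImageInfo, target: ImageInfo, force: bool = False) -> None:
    """Raise unless the upgrade is risk-free or was explicitly forced."""
    risks = upgrade_risks(current, target)
    if risks and not force:
        raise ValueError("refusing upgrade without --force: " + ", ".join(risks))
```

With a check like this, a plain upgrade to a newer image of the same kind passes silently, while the secure boot edge case described above is rejected unless the administrator opts in explicitly.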

Bonus: Add a "boot into maintenance mode without wiping the system disk" boot entry to the Talos ISO.
