Sunday, January 15, 2023

Ubuntu 22.04 LTS servers and phased apt updates

I was working on getting one of our 22.04 LTS servers up to date, even for packages we normally hold, when I hit a mystery and posted about it on the Fediverse:

Why does apt on this 22.04 Ubuntu machine want to hold back a bunch of package updates even with '--with-new-pkgs --ignore-hold'? Who knows, it won't tell me why it doesn't like any or all of:

open-vm-tools openssh-client openssh-server openssh-sftp-server osinfo-db python3-software-properties software-properties-common

(Apt is not my favorite package manager for many reasons, this among them.)

Steve suggested that it was Ubuntu's "Phased Update" system, which is what it turned out to be. This sent me off to investigate, and it turns out that phased (apt) updates explain some other anomalies we've seen with package updates on our Ubuntu 22.04 machines.

The basic idea of phased updates is explained in the "Phasing" section of Ubuntu's page on Stable Release Updates (SRUs); it's a progressive rollout of the package to more and more of the system base. Ubuntu introduced phased updates in 2013 (cf) but initially they weren't directly supported by apt, only by the desktop upgrade programs. Ubuntu 21.04 added apt support for phased updates and Ubuntu 22.04 LTS is thus the first LTS version to subject servers to phased updates. More explanations of phased updates are in this askubuntu answer, which includes one way to work around them.

(Note that as far as I know and have seen, security updates are not released as phased updates; if it's a security update, everyone gets it right away. Phased updates are only used for regular, non-security updates.)

Unfortunately apt (or apt-get) won't tell you if an update is being held back because of phasing. This user-hostile apt issue is tracked in Ubuntu bug #1988819 and you should add yourself as someone it affects if this is relevant to you. Ubuntu has a web page on what updates are currently in phased release, although packages are removed from this page once they reach 100%. Having reached 100%, such a package is no longer a phased update, which will become relevant soon. If you can't see a reason for a package to be held back, it's probably a phased update but you can check the page to be sure.
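One way to check a specific package without the web page is to look at its package index record. As far as I know, packages that are currently phased carry a 'Phased-Update-Percentage' field there, and 'apt-cache show' prints it along with the other fields. Here is a minimal Python sketch that pulls the field out of such output; the function names are my own invention and the parsing is deliberately simple-minded, so treat it as a starting point and not a tested tool.

```python
import subprocess


def phased_percentages(show_output: str) -> dict[str, int]:
    """Map version -> phased percentage for every record that has the field.

    Expects text in the format 'apt-cache show' prints: one block of
    'Field: value' lines per version, with blank lines between blocks.
    """
    result = {}
    for record in show_output.strip().split("\n\n"):
        fields = {}
        for line in record.splitlines():
            # Continuation lines start with whitespace; skip them.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if "Version" in fields and "Phased-Update-Percentage" in fields:
            result[fields["Version"]] = int(fields["Phased-Update-Percentage"])
    return result


def check_package(name: str) -> dict[str, int]:
    """Run 'apt-cache show NAME' and report any phased versions."""
    out = subprocess.run(
        ["apt-cache", "show", name],
        capture_output=True, text=True, check=True,
    ).stdout
    return phased_percentages(out)
```

An empty result from check_package("openssh-server") would mean that no available version of the package is marked as phased, so something else is holding it back.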

(As covered in the "Phasing" section, packages normally move forward through the phased rollout every six hours, so a package can be held back on some server in the morning and then no longer be held in the afternoon. This is great fun for troubleshooting why a given server didn't get a particular update.)

Your place in a phased update is randomized across both different servers and different packages. If you have a fleet of servers, they will get each phased update at different times, and the order won't be consistent from package to package. This explains an anomaly we've been seeing in our package updates for some time, where different 22.04 servers would get updates at different times without any consistent pattern.

The available apt settings for phased updates and some of the technical details are mostly explained in this askubuntu answer. If you want to opt out of phased updates entirely, you have two options: you can have your servers install all phased updates right away (basically putting you at the 0% start line), or you can skip all phased updates and only install such packages once they reach 100% and stop being considered phased updates at all. Unfortunately, as of 22.04 there's no explicit option to give a server a particular position within all phased rollouts (so that you could have, for example, a 'canary' server that always installs updates at 0% or 10%, ahead of the rest of the fleet).
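As I read apt_preferences(5), the two opt-out choices correspond to the apt options APT::Get::Always-Include-Phased-Updates and APT::Get::Never-Include-Phased-Updates. A small Python sketch that renders the matching apt.conf.d fragment might look like the following; the function name and the mode strings are my own, and where you put the output (for example a file under /etc/apt/apt.conf.d/) is up to you.

```python
def phasing_conf(mode: str) -> str:
    """Render a one-line apt.conf fragment for opting out of phasing.

    mode "always": install phased updates immediately.
    mode "never":  skip updates until they are no longer phased.
    """
    options = {
        "always": "APT::Get::Always-Include-Phased-Updates",
        "never": "APT::Get::Never-Include-Phased-Updates",
    }
    if mode not in options:
        raise ValueError(f"mode must be one of {sorted(options)}")
    return f'{options[mode]} "true";\n'


if __name__ == "__main__":
    # Print the fragment so it can be reviewed before being installed.
    print(phasing_conf("always"), end="")
```

You would set only one of the two options on any given machine, since they ask for opposite things.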

For any given package update, machines are randomized based on the contents of /etc/machine-id, which can be overridden for apt by setting APT::Machine-ID to a 32 hex digit value of your choice (the current version of apt appears to only use the machine ID for phased updates). If you set this to the same value across your fleet, your fleet will update in sync (although not at a predictable point in the phase process); you can also set subsets of your fleet to different shared values so that the groups will update at different times.

The assignment of a particular machine to a point in the phased rollout is done through a relatively straightforward approach; the package name, version, and machine ID are all combined into a seed for a random number generator, and then the random number generator is used to produce a 0 to 100 value, which is your position in the phased rollout. The inclusion of the package name and version means that a given machine ID will be at different positions in the phased update for different packages. All of this turns out to be officially documented in the "Phased Updates" section of apt_preferences(5), although not in much detail.
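To make the seeding scheme concrete, here is an illustrative Python sketch of the idea. It is not apt's implementation: apt is written in C++ and uses its own random number generator, so the numbers this produces will not match what apt computes on a real machine. The way the seed string is put together and the rule that a machine is held back when its position is above the current percentage are both assumptions on my part.

```python
import random


def rollout_position(package: str, version: str, machine_id: str) -> int:
    """Return a stable 0..100 position for this (package, version, machine)."""
    # Assumed seed format; the point is only that all three values go in.
    seed = f"{package}-{version}-{machine_id}"
    # A private Random instance keeps the result repeatable and avoids
    # disturbing the global random state.
    return random.Random(seed).randint(0, 100)


def is_held_back(position: int, phased_percentage: int) -> bool:
    """Assumed rule: hold the update while the rollout hasn't reached us."""
    return position > phased_percentage


if __name__ == "__main__":
    machine = "0123456789abcdef0123456789abcdef"  # made-up machine ID
    for pkg in ("openssh-server", "open-vm-tools"):
        pos = rollout_position(pkg, "1.0-1", machine)  # made-up version
        print(pkg, pos, is_held_back(pos, 30))
```

Because the package name and version are part of the seed, the same machine ID generally lands at a different position for each package update, which matches the inconsistent ordering we see across the fleet.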

(There is a somewhat different mechanism for desktop updates, covered in the previously mentioned askubuntu answer.)

As far as I can see from looking at the current apt source code, apt doesn't log anything at any verbosity if it holds a package back because the package is a phased update and your machine doesn't qualify for it yet. The fact that a package was a phased update the last time apt looked may possibly be recorded in /var/log/apt/eipp.log.xz, but documentation on this file is sparse.

Now that I've looked at all of this and read about APT::Machine-ID, we'll probably set it to a single value across all of our fleet because we find different machines getting updates at different times to be confusing and annoying (and it potentially complicates troubleshooting problems that are reported to us, since we normally assume that all 22.04 machines have the same version of things like OpenSSH). If we could directly control the position within a phased rollout we'd probably set up some canary machines, but since we can't I don't think there's a strong reason to have more than one machine-id group of machines.

(We could set some very important machines to only get updates when packages reach 100% and stop being phased updates, but Ubuntu has a good record of not blowing things up with, e.g., OpenSSH updates.)


