Tuesday, March 30, 2021

Systemd 248 Released: A concept of system extension images is introduced

systemd System and Service Manager

CHANGES WITH 248:

* A concept of system extension images is introduced. Such images may be used to extend the /usr/ and /opt/ directory hierarchies at runtime with additional files (even if the file system is read-only). When a system extension image is activated, its /usr/ and /opt/ hierarchies and os-release information are combined via overlayfs with the file system hierarchy of the host OS. A new systemd-sysext tool can be used to merge, unmerge, list, and refresh system extension hierarchies. See https://ift.tt/3sI6vtu. The systemd-sysext.service automatically merges installed system extensions during boot (before basic.target, but not in very early boot, since various file systems have to be mounted first). The SYSEXT_LEVEL= field in os-release(5) may be used to specify the supported system extension level.

* A new ExtensionImages= unit setting can be used to apply the same system extension image concept from systemd-sysext to the namespaced file hierarchy of specific services, following the same rules and constraints.

* Support for a new special "root=tmpfs" kernel command-line option has been added. When specified, a tmpfs is mounted on /, and mount.usr= should be used to point to the operating system implementation.

* A new configuration file /etc/veritytab may be used to configure dm-verity integrity protection for block devices. Each line is in the format "volume-name data-device hash-device roothash options", similar to /etc/crypttab.

* A new kernel command-line option systemd.verity.root_options= may be used to configure dm-verity behaviour for the root device.

* The key file specified in /etc/crypttab (the third field) may now refer to an AF_UNIX/SOCK_STREAM socket in the file system. The key is acquired by connecting to that socket and reading from it. This allows the implementation of a service to provide key information dynamically, at the moment when it is needed.

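The socket-based key file mechanism in the last item above can be illustrated with a small sketch. It models only the general pattern the notes describe (a service listens on an AF_UNIX/SOCK_STREAM socket, and the consumer connects and reads the key). It does not show how systemd-cryptsetup itself is implemented. The socket path, the key bytes, and the function names `serve_key_once`, `read_key`, and `demo` are all hypothetical, and the assumption that the key ends when the peer closes the connection is made for illustration.

```python
import os
import socket
import tempfile
import threading


def serve_key_once(sock_path: str, key: bytes) -> threading.Thread:
    """Listen on an AF_UNIX/SOCK_STREAM socket and send `key` to the first client."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(sock_path)
    server.listen(1)  # listening before the thread starts, so clients can connect at once

    def handle() -> None:
        with server:
            conn, _ = server.accept()
            with conn:
                # Assumption: closing the connection marks the end of the key.
                conn.sendall(key)

    thread = threading.Thread(target=handle, daemon=True)
    thread.start()
    return thread


def read_key(sock_path: str) -> bytes:
    """Connect to the socket and read until the peer closes the connection."""
    chunks = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        while True:
            data = client.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def demo() -> bytes:
    """Run the hypothetical key service and a reader against a temporary socket."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "key.sock")  # hypothetical socket path
        worker = serve_key_once(path, b"example-key-material")
        key = read_key(path)
        worker.join(timeout=5)
        return key
```

In a real setup, the third field of the /etc/crypttab entry would name the socket path, and the listening service could fetch or derive the key only at the moment a client connects.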
* When the hostname is set explicitly to "localhost", systemd-hostnamed will respect this. Previously such a setting would be mostly silently ignored. The goal is to honour configuration as specified by the user.

* The fallback hostname that will be used by the system manager and systemd-hostnamed can now be configured in two new ways: by setting DEFAULT_HOSTNAME= in os-release(5), or by setting $SYSTEMD_DEFAULT_HOSTNAME in the environment block. As before, it can also be configured during compilation. The environment variable is intended for testing and local overrides, the os-release(5) field is intended to allow customization by different variants of a distribution that share the same compiled packages.

* The environment block of the manager itself may be configured through a new ManagerEnvironment= setting in system.conf or user.conf. This complements existing ways to set the environment block (the kernel command line for the system manager, the inherited environment and user@.service unit file settings for the user manager).

* systemd-hostnamed now exports the default hostname and the source of the configured hostname ("static", "transient", or "default") as D-Bus properties.

* systemd-hostnamed now exports the "HardwareVendor" and "HardwareModel" D-Bus properties, which are supposed to contain a pair of cleaned up, human readable strings describing the system's vendor and model. It's typically sourced from the firmware's DMI tables, but may be augmented from a new hwdb database. hostnamectl shows this in the status output.

* Support has been added to systemd-cryptsetup for extracting the PKCS#11 token URI and encrypted key from the LUKS2 JSON embedded metadata header. This allows the information how to open the encrypted device to be embedded directly in the device and obviates the need for configuration in an external file.

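The three sources for the fallback hostname listed above (environment variable, os-release field, compile-time default) can be sketched as a simple lookup chain. The precedence used here (environment first, then os-release, then the compiled-in value) is an assumption based on the stated purpose of the environment variable as a local override. The notes do not spell out the order. The helper names, the simplified os-release parser, and the `COMPILED_DEFAULT` placeholder are hypothetical and are not systemd's own code.

```python
from typing import Dict, Mapping, Tuple

COMPILED_DEFAULT = "localhost"  # placeholder standing in for the compile-time default


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blanks and comments (simplified quoting rules)."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        fields[key.strip()] = value
    return fields


def default_hostname(
    env: Mapping[str, str],
    os_release_text: str,
    compiled_default: str = COMPILED_DEFAULT,
) -> Tuple[str, str]:
    """Return (hostname, source) using the assumed precedence described above."""
    from_env = env.get("SYSTEMD_DEFAULT_HOSTNAME", "")
    if from_env:
        return from_env, "environment"
    from_os_release = parse_os_release(os_release_text).get("DEFAULT_HOSTNAME", "")
    if from_os_release:
        return from_os_release, "os-release"
    return compiled_default, "compiled-in"
```

For example, passing `os.environ` and the contents of the os-release file would report which of the three sources supplies the fallback on a given machine, under the assumptions stated above.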
* systemd-cryptsetup gained support for unlocking LUKS2 volumes using TPM2 hardware, as well as FIDO2 security tokens (in addition to the pre-existing support for PKCS#11 security tokens).

* systemd-repart may enroll encrypted partitions using TPM2 hardware. This may be useful for example to create an encrypted /var partition bound to the machine on first boot.

* A new systemd-cryptenroll tool has been added to enroll TPM2, FIDO2 and PKCS#11 security tokens to LUKS volumes, list and destroy them. See: https://ift.tt/38DZ0fq
  It also supports enrolling "recovery keys" and regular passphrases.

* The libfido2 dependency is now based on dlopen(), so that the library is used at runtime when installed, but is not a hard runtime dependency.

* systemd-cryptsetup gained support for two new options in /etc/crypttab: "no-write-workqueue" and "no-read-workqueue" which request synchronous processing of encryption/decryption IO.

* The manager may be configured at compile time to use the fexecve() instead of the execve() system call when spawning processes. Using fexecve() closes a window between checking the security context of an executable and spawning it, but unfortunately the kernel displays stale information in the process' "comm" field, which impacts ps output and such.

* The configuration option -Dcompat-gateway-hostname has been dropped. "_gateway" is now the only supported name.

* The ConditionSecurity=tpm2 unit file setting may be used to check if the system has at least one TPM2 (tpmrm class) device.

* A new ConditionCPUFeature= has been added that may be used to conditionalize units based on CPU features. For example, ConditionCPUFeature=rdrand will condition a unit so that it is only run when the system CPU supports the RDRAND opcode.

* The existing ConditionControlGroupController= setting has been extended with two new values "v1" and "v2". "v2" means that the unified v2 cgroup hierarchy is used, and "v1" means that legacy v1 hierarchy or the hybrid hierarchy are used.

* A new PrivateIPC= setting on a unit file allows executed processes to be moved into a private IPC namespace, with separate System V IPC identifiers and POSIX message queues. A new IPCNamespacePath= allows the unit to be joined to an existing IPC namespace.

* The tables of system calls in seccomp filters are now automatically generated from kernel lists exported on https://ift.tt/1VXMPv1. The following architectures should now have complete lists: alpha, arc, arm64, arm, i386, ia64, m68k, mips64n32, mips64, mipso32, powerpc, powerpc64, s390, s390x, tilegx, sparc, x86_64, x32.

* The MountAPIVFS= service file setting now additionally mounts a tmpfs on /run/ if it is not already a mount point. A writable /run/ has always been a requirement for a functioning system, but this was not guaranteed when using a read-only image. Users can always specify BindPaths= or InaccessiblePaths= as overrides, and they will take precedence. If the host's root mount point is used, there is no change in behaviour.

* New bind mounts and file system image mounts may be injected into the mount namespace of a service (without restarting it). This is exposed respectively as 'systemctl bind …' and 'systemctl mount-image



from Hacker News https://ift.tt/3sCr3DJ
