Thursday, August 26, 2021

There is no such thing as a “glibc based alpine image”

For whatever reason, the alpine-glibc project is apparently being used in production.  Worse yet, some are led to believe that Alpine officially supports or at least approves of its usage.  For the reasons I am about to outline, we don’t.  I have also proposed an update to Alpine which will block the installation of the glibc packages produced by the alpine-glibc project, and have referred acceptance of that update to the TSC to determine whether we actually want to put our foot down.  I have additionally suggested that the TSC may wish to have the Alpine Council reach out to the alpine-glibc project to find a solution which appropriately communicates that the project is not supported in any way by Alpine.  It should hopefully be clear that there is no such thing as a “glibc based alpine image”: Alpine does not use glibc, it uses musl.

What the alpine-glibc project actually does

The alpine-glibc project attempts to package the GNU C library (glibc) in such a way that it can be used on Alpine transparently.  However, it is conceptually flawed, because it uses system libraries where available, which have been compiled against the musl C library.  Combining code built for musl with code built for glibc is like trying to run Windows programs on OS/2: both understand .EXE files to some extent, but they are otherwise very different.

But why are they different?  They are both libraries designed to run ELF binaries, after all.  The answer is due to differences in the application binary interface, also known as an ABI.  Specifically, glibc supports and heavily uses a backwards compatibility technique called symbol versioning, and musl does not support it at all.

How symbol versioning works

Binary programs, such as those compiled against musl or glibc, have something called a symbol table.  The symbol table contains a list of symbols needed from the system libraries; C library functions like printf, for example, are symbols.  When a binary program is run, it is not executed directly by the kernel: instead, a special program known as an ELF interpreter is loaded, which sets up the mapping from symbols in the symbol table to the actual locations where those symbols exist.  That mapping is known as the global offset table (GOT).
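To make the ELF interpreter less abstract: each dynamically linked binary names its interpreter in a PT_INTERP program header, and that path is a quick hint as to which C library the binary was built for.  The following is a minimal sketch that reads it from the raw bytes of an ELF file.  The function names elf_interpreter and guess_libc are made up for this example, and the name matching in guess_libc is only a heuristic (musl interpreters are typically named ld-musl-ARCH.so.1 and glibc ones on most Linux architectures ld-linux-*.so.*, but not on every architecture).

```python
import struct

PT_INTERP = 3  # program header type that holds the ELF interpreter path


def elf_interpreter(data: bytes):
    """Return the PT_INTERP path of an ELF image, or None if it has none."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if data[5] == 1 else ">"    # EI_DATA: 1 = little endian

    # Locate the program header table (field offsets differ by ELF class).
    if is_64:
        (e_phoff,) = struct.unpack_from(end + "Q", data, 32)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 54)
    else:
        (e_phoff,) = struct.unpack_from(end + "I", data, 28)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 42)

    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(end + "I", data, base)
        if p_type != PT_INTERP:
            continue
        if is_64:
            (p_offset,) = struct.unpack_from(end + "Q", data, base + 8)
            (p_filesz,) = struct.unpack_from(end + "Q", data, base + 32)
        else:
            (p_offset,) = struct.unpack_from(end + "I", data, base + 4)
            (p_filesz,) = struct.unpack_from(end + "I", data, base + 16)
        # The segment content is a NUL-terminated path string.
        return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None  # statically linked, or not an executable


def guess_libc(interp):
    """Heuristic only: map an interpreter path to a likely C library."""
    name = (interp or "").rsplit("/", 1)[-1]
    if name.startswith("ld-musl-"):
        return "musl"
    if name.startswith("ld-linux"):
        return "glibc"
    return "unknown"
```

For instance, calling elf_interpreter(open("/bin/sh", "rb").read()) on an Alpine system would be expected to yield a path ending in an ld-musl name, while a binary blob built for glibc would name an ld-linux interpreter instead.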

On a system with symbol versioning, additional data in the symbol table designates what version of a symbol is actually wanted.  For example, when you request printf on a glibc system, you might actually wind up requesting printf@GLIBC_2.34 or some other kind of versioned symbol.  This allows newer programs to prefer the newer printf function, while older programs can reference an older version of the implementation.  That allows for low-cost backwards compatibility: all you have to do is keep around the old versions of the routines until you decide to drop support in the ABI for them.
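The lookup described above can be sketched as a toy model.  This is not how glibc’s dynamic linker is implemented; it only illustrates the idea that a reference carries a version and that old implementations stay available.  Every name in it (the TOY_1.0 and TOY_2.0 version labels, link, resolve) is invented for the illustration.

```python
# Toy model of versioned symbol lookup; all names here are hypothetical.

def printf_old(text):
    return "old:" + text


def printf_new(text):
    return "new:" + text


# One library, two implementations kept under the same symbol name.
EXPORTS = {
    ("printf", "TOY_1.0"): printf_old,
    ("printf", "TOY_2.0"): printf_new,
}
DEFAULT_VERSION = {"printf": "TOY_2.0"}  # what newly built programs get


def link(name):
    """Build time: record the current default version next to the name."""
    return f"{name}@{DEFAULT_VERSION[name]}"


def resolve(reference):
    """Run time: look up the exact (name, version) pair that was recorded."""
    name, _, version = reference.partition("@")
    return EXPORTS[(name, version)]


new_program = link("printf")      # "printf@TOY_2.0"
old_program = "printf@TOY_1.0"    # recorded back when TOY_1.0 was the default

print(resolve(new_program)("hi"))  # new:hi
print(resolve(old_program)("hi"))  # old:hi
```

The old program keeps getting the old behavior for as long as the TOY_1.0 entry stays in the export table, which is the low-cost backwards compatibility the paragraph above describes.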

Why mixing these two worlds is bad

However, if you combine a world expecting symbol versioning and one which does not, you wind up with undefined behavior.  For very simple programs, it appears to work, but for more complicated programs, you will wind up with strange behavior and possible crashes, as the global offset table references routines with different behavior than expected by the program.  For example, a program expecting a C99 compliant printf routine will get one on musl if it asks for printf.  But a program expecting a C99 compliant printf routine on glibc will ask for printf@GLIBC_2.12 or similar.

The symbol versioning problem spreads to the system libraries too.  On Alpine, libraries don’t provide versioned symbols; instead, you get the latest version of each symbol.  But if a glibc program is expecting foo to be an older routine without the semantics of the current implementation of foo, then it will either crash or do something weird.
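The failure mode can be sketched with a toy model, assuming (as the text above describes) a library that exports only the latest implementation and a binding step that ignores the requested version.  The library, the FOO_1.0 and FOO_2.0 labels, and both bind functions are hypothetical and exist only for this illustration.

```python
# Toy model of mixing versioned and unversioned worlds; names are hypothetical.

def foo_v1(items):
    """Old semantics: return how many items there are."""
    return len(items)


def foo_v2(items):
    """Current semantics: return the items in sorted order."""
    return sorted(items)


# A library built with symbol versioning keeps both generations around.
VERSIONED_EXPORTS = {"foo": {"FOO_1.0": foo_v1, "FOO_2.0": foo_v2}}

# A library built without it only has the latest implementation.
UNVERSIONED_EXPORTS = {"foo": foo_v2}


def bind_with_versions(reference):
    name, _, version = reference.partition("@")
    return VERSIONED_EXPORTS[name][version]


def bind_ignoring_versions(reference):
    name, _, _version = reference.partition("@")  # the version is dropped
    return UNVERSIONED_EXPORTS[name]


old_program_ref = "foo@FOO_1.0"  # the program was written against foo_v1

print(bind_with_versions(old_program_ref)([3, 1, 2]))      # 3
print(bind_ignoring_versions(old_program_ref)([3, 1, 2]))  # [1, 2, 3]
```

The program expected an integer and received a list.  In a real C program the equivalent mismatch is not a type error caught at run time but silently wrong data or a crash.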

This has security impacts: the lack of consistency for whether versioned symbols are actually supported by the system basically turns any interaction with versioned symbols into what is called a weird machine.  This means that an attacker possibly controls more attack surface than they would in a situation where the system ran either pure glibc or pure musl.

Alternatives to alpine-glibc

As alpine-glibc is primarily discussed in the context of containers, we will keep this conversation largely focused on that.  If you want a small container that runs a binary blob linked against glibc, there are a few options that are far better than using alpine-glibc.  For example, you can use Google’s distroless tools, which will build a Debian-based container with only the application and its runtime dependencies, which allows for a fairly small container.  You can also try to use the gcompat package, which emulates the GNU C library ABI in the same way that WINE emulates the Windows ABI.

But whatever you do, you shouldn’t use the alpine-glibc project to do this.  You will wind up with something completely broken.

from Hacker News https://ift.tt/2UMLYIq
