Saturday, June 26, 2021

The Evolution of the Unix System Architecture

Unix has evolved for more than five decades, shaping modern operating systems, key software technologies, and development practices. Studying the evolution of this remarkable system from an architectural perspective can provide insights into how to manage the growth of large, complex, and long-lived software systems. In 2016 my colleague Paris Avgeriou and I embarked on this study, aiming to combine his software architecture insights with my software analytics skills. Here is a brief summary of the study, which was published this month in the IEEE Transactions on Software Engineering.

What we did

Starting from the 1970 PDP-7 Unix version (2489 lines of kernel code and 9095 lines of programs, all written in assembly language), we studied the main Unix releases leading to the FreeBSD lineage (currently more than 20 million lines of code), shown as orange boxes in the figure below.

A simplified diagram of Unix variants and releases related through code. We studied the lineage of the highlighted elements

Along those releases we examined core architectural design decisions, the number of features, and code complexity. We did that based on the analysis of the following:

What we found

Growth in size has been uniform, with some notable outliers, while cyclomatic complexity has been religiously safeguarded.

Mean cyclomatic complexity of user-space commands, C libraries, and kernel
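
For readers unfamiliar with the metric, cyclomatic complexity is roughly one plus the number of decision points in a function. The following is a minimal sketch of that idea for C source text, assuming a simple keyword count is good enough for illustration; it is not the tooling used in the study, and the function names and the sample C code are hypothetical.

```python
import re
from statistics import mean

# Tokens that open an additional path through the code (McCabe-style count).
DECISION_PATTERN = re.compile(r"\b(?:if|for|while|case)\b|&&|\|\||\?")


def approx_cyclomatic_complexity(c_function_source: str) -> int:
    """Return 1 plus the number of decision points found in a C function."""
    # Drop comments and string/char literals so their contents are not counted.
    # This is a rough approximation, not a real C parser.
    stripped = re.sub(r"/\*.*?\*/|//[^\n]*", " ", c_function_source, flags=re.S)
    stripped = re.sub(r'"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'', " ", stripped)
    return 1 + len(DECISION_PATTERN.findall(stripped))


def mean_complexity(functions: list[str]) -> float:
    """Average the approximate complexity over a list of function sources."""
    return mean(approx_cyclomatic_complexity(f) for f in functions)


# Hypothetical sample: one loop, one branch, one short-circuit operator.
SAMPLE = """
int count_small_positive(int *v, int n) {
    int i, c = 0;
    for (i = 0; i < n; i++)
        if (v[i] > 0 && v[i] < 100)
            c++;
    return c;
}
"""

print(approx_cyclomatic_complexity(SAMPLE))  # prints 4
```

A function with no branches scores 1, so a mean that stays low across releases suggests that individual functions have stayed simple even as the total code size has grown.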

A large number of Unix-defining design decisions were implemented right from the very beginning, and most of them still play a major role. This is apparent if one compares the high-level architecture diagram of the first released Unix version with that of the current one. Impressively, the architecture of the 1972 First Research Edition has a similar structure to that of the modern FreeBSD version and shares many elements with it. (The common elements share the same color.)

High-level architecture of the First Research Edition

High-level architecture of FreeBSD 11.0

As is apparent in the timeline diagram below, Unix continues to evolve from an architectural perspective, but the rate of architectural innovation has slowed down over the system's lifetime.

Timeline of Unix’s major releases and architectural design decisions

Architectural technical debt has accrued in the forms of functionality duplication and unused facilities, but in terms of cyclomatic complexity it is systematically being paid back through what appears to be a self-correcting process. Some unsung architectural forces that shaped Unix are:

  • the emphasis on conventions over rigid enforcement (think of documented file formats or environment variables),
  • the drive for portability (nowadays Unix runs on systems ranging from the $15 Raspberry Pi Zero to supercomputers),
  • a sophisticated ecosystem of other operating systems and development organizations (we documented millions of code lines coming from third-party subsystems and more than ten third-party contributors), and
  • the emergence of a federated architecture, often through the adoption of third-party subsystems (see for example the I/O subsystem elements in the modern architecture diagram above).

More details

The complete 30-page study is openly available in the IEEE Xplore library: D. Spinellis and P. Avgeriou, "Evolution of the Unix System Architecture: An Exploratory Case Study," in IEEE Transactions on Software Engineering, vol. 47, no. 6, pp. 1134–1163, June 2021, doi: 10.1109/TSE.2019.2892149.

An online supplement provides two things: first, 90 precise, hyperlinked references to Unix source code that support data provided or arguments made in the main text; second, listings of 107 code snippets that were used to derive specific numbers or tables appearing in the study's text.
