This draft has been floating around un-checked-in for too long now

2022-10-21 12:22:59 +02:00 · 4d360e65a6
parent 472d483f3e
commit 4d360e65a6
4 changed files with 434 additions and 4 deletions
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -73,4 +73,5 @@

 - [Syndicated Actor Model](./syndicated-actor-model.md)
 - [Protocol specification](./protocol.md)
- [System layer analysis]()
+- [The System Layer](./system-layer.md)
+- [Synit as a System Layer](./synit-as-system-layer.md)
--- a/src/glossary.md
+++ b/src/glossary.md
@@ -787,9 +787,9 @@ available.

 ## System Layer

-The *system layer* is an essential part of an operating system, mediating between user-facing
-programs and the kernel. It provides the technical foundation for many qualities relevant to
-system security, resilience, connectivity, maintainability and usability.
+The [*system layer*](system-layer.md) is an essential part of an operating system, mediating
+between user-facing programs and the kernel. It provides the technical foundation for many
+qualities relevant to system security, resilience, connectivity, maintainability and usability.

 The concept of a system layer has only been recently recognised—the term itself was [coined by
 Benno Rice in a 2019 conference
--- a/src/synit-as-system-layer.md
+++ b/src/synit-as-system-layer.md
@@ -0,0 +1,14 @@
+# Synit as a System Layer
+
+I will then design dataspace-based interaction protocols that realize
+this functionality. These protocols will form the heart of the system:
+each component will perform one or more roles as
+described.
+
+At the same time, the protocol descriptions will serve as internal and
+external APIs and API documentation for the system layer. The
+project’s thesis predicts that dataspace protocol descriptions will be
+at the correct level to effectively capture the concepts intrinsic to
+a system layer.
+
+ - Protocols capturing a synthesis of system layer behaviours, based on the analysis
--- a/src/system-layer.md
+++ b/src/system-layer.md
@@ -0,0 +1,415 @@
+# The System Layer
+
+*Tony Garnock-Jones  
+October 2022*
+
+The [*system layer*](glossary.md#system-layer) ([Rice 2019][]; [Corbet 2019][]) is an essential
+part of an operating system, mediating between user-facing programs and the kernel. Its
+importance lies in its role as the technical foundation for many qualities[^qualities] relevant
+to system security, resilience, connectivity, maintainability and usability.
+
+In the Linux world, existing system layer realizations cross-cut many, many projects:
+NetworkManager, GNOME, DBus, systemd, OpenRC, apt, apk, and so on. Each project has its own
+role in the overall system layer, and none takes a strong stance on the overall architecture
+that results from their combination. However, there is a group of basic concepts involved in a
+system layer that transcend individual subprojects, relating to issues of IPC, discovery, and
+whole-machine and application state management.
+
+This document examines the architecture of system layers in general, touching on
+responsibilities currently handled at each of these levels, with the aim of bringing the
+concept of "system layer" into sharper focus.
+
+## What is a system layer?
+
+The term "system layer" was coined[^as-far-as-i-know] by Benno Rice in
+[a 2019 talk](https://youtu.be/o_AIw9bGogo). Here's an excerpt from
+[the relevant portion of Rice's talk](https://youtu.be/o_AIw9bGogo?t=911):[^cleaned-up-automated-transcript]
+
+> ... dynamic DHCP, IPv6 auto config, all these kinds of things are
+> more dynamic. Time is more dynamic. Some aspects of device handling,
+> you know, all of these things are a lot more dynamic now, and we
+> need a way of strapping these things together so we can manage them
+> that doesn't involve installing 15 different packages that all
+> behave differently.
+>
+> <small>[15:08]</small> **And so what that ends up becoming, is what
+> I term the system layer.** Which is a bunch of stuff which might be
+> running in user space or might be running in kernel space but is
+> **providing systemic level stuff** as opposed to the stuff that
+> you're writing or using directly. So this could include things like
+> NetworkManager, and udev, and a whole bunch of things.
+>
+> Systemd as a project ends up **complementing the Linux kernel by
+> providing all of this user space system layer**.
+
+(It's a really good talk.) The system layer idea seems to have been
+latent for a long time, and only recently to have been given a name.
+
+Examples of system layers include:
+
+ - The Mac OS frameworks above the kernel level
+ - The Android system with its APIs and SDKs
+ - Various combinations of package manager, init system, service manager, support daemons, and
+   user interface (be it ever so minimal); for example, debian+systemd+udevd+GNOME, or
+   alpine+OpenRC+eudev+SSH.
+
+Both Android and Mac OS embody substantially complete visions of a system layer, while the
+visions are much more fragmented in the Linux world. Even in cases where systemd makes up a
+good fraction of a particular system layer, most systems augment it with a wide variety of
+other software.
+
+## What does a system layer do?
+
+A system layer addresses myriad system-level problems that applications face but that are out
+of scope for the operating system kernel.
+
+It solves these problems so that application developers can rely on shared vocabulary, common
+interfaces, and communal development effort. The result is improved interoperability,
+compositionality and securability, along with less duplicated effort and less scope for design
+flaws.
+
+The scope of the system layer changes with time as the needs of applications and users change
+and grow. The problems it addresses range from the highly abstract to the relatively concrete.
+For example, a system layer may:
+
+ - supply services in response to static or dynamic demand
+ - monitor and react to changes in system state
+ - give higher-level perspectives to users and applications on system state and resources
+ - offer access control mechanisms and enforce access control policies
+ - offer a coherent, system-wide approach to security and privacy
+ - offer inter-process communication media
+ - provide name-binding and name resolution services
+ - provide job-queueing and -scheduling services, including calendar-like and time-based scheduling
+ - provide user interface facilities
+ - provide system-wide "cut-and-paste" services for user-controlled IPC
+ - provide system configuration and user preference databases
+ - support software package installation, upgrade, and removal
+ - offer state (data, configuration) replication services
+ - provide data backup facilities
+
+among other things. All of these areas are common *across* applications, unique to none of
+them.
+
+To come up with this list, I surveyed a number of existing open systems such as Linux
+distributions, desktop environments, and so on, plus (in a limited way) Android and Mac OS,
+looking for commonalities and differences. That is, the list was developed in a largely
+informal way. Despite this, I've found it a fruitful starting point for an investigation of the
+properties of system layers in general. I welcome additional perspectives that others might
+bring.
+
+In the remainder of this document, I'll use each of the topics in the list above as a
+perspective from which to examine existing software. I'll then attempt a synthesis of the
+results of this analysis into a firmer idea of what form a system layer could and perhaps
+should take.
+
+## Service management and system reactivity
+
+An *extremely* common recurring pair of related themes in system layers of all sorts is
+**service management** and **system reactivity**. That is, the system layer takes on the tasks
+of starting and stopping services in response to static or dynamic demand, and of monitoring
+and reacting to changes in system state. While the kernel offers raw sense data plus a
+low-level vocabulary for managing the collection of running processes on a system, applications
+and users need a higher-level vocabulary for managing running software in terms of services and
+service relationships.
+
+These tasks can be broken down into smaller, but still general, pieces:
+
+ - primitive ability to start and stop service instances
+ - declaration of singleton service instances, service classes, and instances of service classes
+ - declaration of relationships (including runtime dependencies) among services
+ - facility for managing service names and connecting service names to service instances
+ - user interface for examining the service namespace and the collection of running and runnable services
+ - facility for noticing changes in system state, and a medium for publishing and subscribing to those changes
+
+Concrete examples include:
+
+ - starting services in response to statically-configured runlevels (OpenRC, systemd, SysV init, etc.)
+ - starting dependencies before dependent services (OpenRC, systemd, SysV init, etc.)
+ - restarting terminated or failed services in a supervision hierarchy (daemontools, s6, etc.; Erlang/OTP)
+ - starting services by service name on demand (DBus, etc.)
+ - starting services by socket activation (systemd, etc.)
+ - virtual-machine and container lifecycles, including supervision and restart of containers (docker, docker-compose, etc.)
+ - reacting to hotplugging of a device by installing a driver or starting a program (udevd, etc.)
+ - reacting to system metrics (e.g. temperature, load average, memory pressure) by changing something
+ - reacting to network connectivity changes (NetworkManager, etc.)
+ - setup and naming of devices and network routes (udevd, NetworkManager, etc.)
+
+Laurent Bercot has produced an excellent [comparison
+table](https://skarnet.com/projects/service-manager.html#comparison) in a page describing [a
+new service manager for Linux
+distributions](https://skarnet.com/projects/service-manager.html).
+
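As a rough illustration of "starting dependencies before dependent services", the following Python sketch computes a start order from declared dependencies using the standard library's `graphlib`. The service names and the `DEPENDENCIES` table are hypothetical, and real service managers also handle readiness, failure, and parallel startup, none of which is modelled here.

```python
from graphlib import TopologicalSorter

# Hypothetical services; each one maps to the set of services it depends on.
DEPENDENCIES = {
    "udev": set(),
    "dbus": set(),
    "network": {"udev"},
    "sshd": {"network"},
    "display-manager": {"dbus", "udev"},
}


def start_order(dependencies):
    """Return a start order in which every service follows its dependencies."""
    # TopologicalSorter treats each mapping value as the node's predecessors,
    # so static_order() yields dependencies before their dependents.
    return list(TopologicalSorter(dependencies).static_order())


order = start_order(DEPENDENCIES)
print(order)
```

Any order the sorter returns is acceptable as long as it respects the declared edges; a cycle in the declarations raises `graphlib.CycleError`, which is one place a service manager would report a configuration error.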
+## Higher-level perspectives on and control over system state and resources
+
+An essential system layer task is to give users and applications higher-level perspectives on
+system state, resource availability and resource consumption than those offered by the kernel.
+
+For example, the kernel's [`NETLINK_ROUTE`](https://en.wikipedia.org/wiki/Netlink) sockets
+allow processes to observe changes in network interface and routing configuration, but
+applications often do not need the fine detail on offer: instead, they need higher-level
+knowledge such as "a usable default route for IPv4 exists", or "IPv4 connectivity is available,
+but metered".
+
+Breaking this task down into smaller pieces yields:
+
+ - access to low-level descriptions of system state, resource availability, and resource usage
+ - ability to either poll for or subscribe to changes in such state
+ - ability to compute relevant higher-level perspectives on the state
+ - a medium for communicating such changes to users and applications
+
+Concrete examples include:
+
+ - computing default-route availability from `NETLINK_ROUTE` events over `netlink` sockets, as discussed
+ - use of `NETLINK_KOBJECT_UEVENT` by udev to configure and expose hotplugged devices to userland
+ - interrogation of disk devices and partition tables to provide views on and control over available filesystems (gnome-disks, etc.)
+ - interrogation of audio devices and audio routing options to provide high-level views and control over audio setup (pipewire, pulseaudio, etc.), e.g. volume level display and volume controls, mute, select input/output channel, play/pause, skip, rewind etc.
+ - high-level perspectives on devices such as displays, printers, mice, keyboards, touchpads, accelerometers, proximity sensors, temperature monitors and so on (GNOME, XFCE4, KDE, cups, etc.), communicated via DBus and friends
+ - system configuration databases (`/etc`, Windows' Registry, GNOME configuration databases)
+ - location services mapping from low-level GPS and wifi information to medium-level concrete location coordinates to high-level "you are at home", "you are in the office"-style knowledge about location
+ - telephony services exposing high-level call management interfaces backed by low-level modem operations
+
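The default-route example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `RouteEvent` and `DefaultRouteMonitor` are hypothetical names, and the events are hand-built records standing in for what a real program would decode from a `NETLINK_ROUTE` socket. The point is the shape of the computation: many low-level events in, a single high-level fact ("a default IPv4 route exists") out, with subscribers told only when that fact changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteEvent:
    """Hypothetical, simplified stand-in for a decoded route notification."""
    action: str       # "add" or "del"
    destination: str  # "0.0.0.0/0" denotes the IPv4 default route
    interface: str


class DefaultRouteMonitor:
    """Reduces low-level route events to one fact: is there a default route?"""

    def __init__(self, on_change):
        self._default_interfaces = set()
        self._on_change = on_change  # called with True/False when the fact changes

    def handle(self, event):
        if event.destination != "0.0.0.0/0":
            return  # fine detail that subscribers do not need
        before = bool(self._default_interfaces)
        if event.action == "add":
            self._default_interfaces.add(event.interface)
        elif event.action == "del":
            self._default_interfaces.discard(event.interface)
        after = bool(self._default_interfaces)
        if before != after:
            self._on_change(after)


changes = []
monitor = DefaultRouteMonitor(changes.append)
monitor.handle(RouteEvent("add", "192.168.1.0/24", "eth0"))
monitor.handle(RouteEvent("add", "0.0.0.0/0", "eth0"))
monitor.handle(RouteEvent("add", "0.0.0.0/0", "wlan0"))
monitor.handle(RouteEvent("del", "0.0.0.0/0", "eth0"))
monitor.handle(RouteEvent("del", "0.0.0.0/0", "wlan0"))
```

Five low-level events produce only two high-level notifications: the default route appeared, and later it disappeared.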
+Slightly harder to see, but still certainly an example of the subject of this section, is the
+collection of userland tools commonly associated with Unix-like operating systems more
+generally. The file system, for example, is firmly a systems concern and not an
+application-level concern, so the system layer provides general tools for manipulating,
+examining, and repairing the file system. This includes not only tools such as `fsck`, `df`,
+and `mount`, but facilities such as automounting, mounting and `fsck`ing at boot, scanning and
+manipulating partition tables, configuring `lvm`, and even the humble `ls`, `cp` and friends.
+On systems such as Mac OS, the Finder and Disk Utility programs and their associated underlying
+system services are analogous parts of the system layer.
+
+## Access control mechanisms and policies, security, and privacy
+
+ - offer access control mechanisms and enforce access control policies
+ - offer a coherent, system-wide approach to security and privacy
+
+ - *access control*
+    - resource allocation services
+    - ACL-based access control for system services and DBus objects
+
+### Security and privacy
+
+Existing system layers rely on single-machine approaches to security
+and securability that do not scale well: for example, Unix ACLs and
+user- and group-ID-based permissions. The theory of object
+capabilities (“ocaps”), exemplified in languages such as E and
+programming models such as Actors, offers a fine-grained approach that
+can be made to scale further than a single machine. However, ocaps
+only control access to shared programs. Access controls for shared
+data are left implicit. In addition, ideas of location and system
+boundary are left implicit in ocap systems.
+
+I will adapt ocaps to syndicated actors. Because the Syndicated Actor model includes a
+first-class notion of shared data as well as a layered conception of locations and location
+boundaries, syndicated capabilities will reflect these ideas directly. I will generalize the
+Syndicated Actor model’s existing notions of place, connecting capabilities not to individual
+actors but to individual places and the data held therein. I will draw on existing ocap
+literature, including in particular the recent notion of Macaroons ([Birgisson et al 2014][])
+and older ideas from SPKI/SDSI ([Ylonen et al 1999][]; [Ellison 1999][]).
+
+**Q. How do you feel dataspaces would most enhance privacy or trust?**
+
+Capability technology offers strong, flexible control over access to any given dataspace
+without getting lost in the weeds of identity management: identity is an application-local,
+application-private concern.
+
+Dataspaces default to being closed, "invite-only" networks, meaning casual observation of
+activity in a dataspace is not possible. But the necessary extension of the capability model to
+handle the data-sharing aspects of dataspaces gives benefit in terms of privacy and trust that
+goes beyond the already considerable benefits a traditional capability model offers.
+
+Traditional capabilities directly control access to behavioural objects, and only indirectly
+control access to data held within such objects. Syndicated capabilities, by contrast, directly
+control access to shared data held within a space, changes to which may trigger activity in
+"objects" participating in the dataspace.
+
+In other words, traditional capabilities encode data access controls in terms of object access
+controls; syndicated capabilities, vice versa.
+
+This ability to directly express access to shared data gives system designers a powerful tool
+for thinking about permitted information flows, including questions of privacy. Furthermore,
+*attenuating* the authority of syndicated capabilities before passing them on to some other
+principal allows for strong partitioning of access within a dataspace, offering fine-grained,
+local, compositional decisions about access to shared data. Finally, it becomes possible to
+expose capabilities to end-users (roughly analogous to URLs), putting that power in their hands
+also.
+
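To make the idea of attenuation concrete, here is a toy Python sketch. It is not the Syndicate implementation or its API: `Dataspace`, `Capability`, `attenuate`, `publish` and `observe` are names invented for this illustration, records are plain dictionaries, and Python does not enforce the encapsulation a real capability system relies on. It shows only the core property that a capability over shared data can be narrowed before being handed on, and never widened.

```python
class Dataspace:
    """A toy shared store of records (plain dicts)."""

    def __init__(self):
        self.records = []


class Capability:
    """Grants access to those records in a dataspace that satisfy every caveat."""

    def __init__(self, space, caveats=()):
        self._space = space
        self._caveats = tuple(caveats)

    def _permits(self, record):
        return all(caveat(record) for caveat in self._caveats)

    def attenuate(self, caveat):
        """Return a weaker capability; caveats can be added but never removed."""
        return Capability(self._space, self._caveats + (caveat,))

    def publish(self, record):
        if not self._permits(record):
            raise PermissionError("record is outside this capability's authority")
        self._space.records.append(record)

    def observe(self):
        return [r for r in self._space.records if self._permits(r)]


space = Dataspace()
root = Capability(space)
root.publish({"kind": "contact", "name": "Alice"})
root.publish({"kind": "calendar", "title": "Dentist"})

# Hand a narrower capability to some other principal.
contacts_only = root.attenuate(lambda record: record["kind"] == "contact")
print(contacts_only.observe())
```

The holder of `contacts_only` sees and can publish contact records only; the calendar entry is invisible to it, and any capability it derives by further attenuation is at most as powerful.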
+I should also mention that dataspaces can scale from managing activity within a single OS
+process up to coordinating activity between machines around the world. A distributed dataspace
+could be an excellent foundation for collaborative applications, where privacy concerns come to
+the forefront. In effect, a dataspace can become a richly-structured "VPN", containing
+application-specific shared data and with application- or schema-specific access controls.
+
+
+## Inter-process communication and networking
+
+ - offer inter-process communication media
+
+ - *inter-process communication*
+    - DBus as a program-to-program communication bus
+    - email for use by system services
+
+X11 for IPC
+
+## Name-binding, name-resolution, and namespaces
+
+ - provide name-binding and name resolution services
+
+udev - /dev namespace
+
+ - *naming services*
+    - publishing names for intra-machine services on this system
+    - publishing names for LAN services on this system
+    - resolving names of intra-machine services on this system
+    - resolving names of services on other systems[^libc-resolver]
+
+
+## Job queueing and job scheduling
+
+ - provide job-queueing and -scheduling services, including calendar-like and time-based scheduling
+
+cron
+at
+systemd timers
+
+cups, lpd
+
+mail queue management?
+
+## User interface
+
+ - provide user interface facilities
+
+(TO APPLICATIONS but I guess also for the system layer itself)
+
+ - provide system-wide "cut-and-paste" services for user-controlled IPC
+
+email for talking to users
+notifications - system tray
+
+ - ui facilities
+    - the thing that asks for user input during apt configuration
+    - the alert/prompt boxes in a web browser (?)
+    - notifications
+    - system tray, applets
+
+## System configuration and user preferences
+
+ - provide system configuration and user preference databases
+
+ - system configuration database
+    - system settings manager
+
+## Software management
+
+ - support software package installation, upgrade, and removal
+
+cc
+apt
+apk
+
+## State replication and data backup
+
+ - offer state replication services
+ - provide data backup facilities
+
+ - state replication services
+    - contact book, address book
+    - file replication across machines
+    - sticky-notes, google keep
+    - todo list
+
+ - backup facilities
+    - Time Machine
+
+## Synthesis, or, Toward a Complete Vision of a System Layer
+
+I want to make it *easy* to integrate portions of a system layer together. The core of the core
+has to be good IPC and state-management and -introspection.
+
+ - systemd/udev/D-Bus/NetworkManager/dhcpcd/etc., as sketched above
+ - init/inetd/crond/etc., the traditional Unix system layer
+ - daemontools/runit/s6: service supervision software
+ - OpenRC/[s6-rc](https://skarnet.com/projects/service-manager.html):
+   service manager and supervisor used in Alpine
+ - Android architecture components
+ - Erlang's OTP, the system layer for the Erlang virtual operating system
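+
+Column key, following the section order above: SM = service management; RX = system
+reactivity; HL = higher-level perspectives; AC = access control; PR = security and privacy
+(presumably); IPC = inter-process communication; NS = naming; JQ = job queueing and
+scheduling; UI = user interface; CF = configuration; RR = state replication (presumably);
+BK = backup.
+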
+| Component            | SM | RX | HL | AC | PR | IPC | NS | JQ | UI | CF | RR | BK |
+|----------------------|----|----|----|----|----|-----|----|----|----|----|----|----|
+| Linux kernel         | ✓  | ✓  | ✓  | ✓  | ✓  | ✓   | ✓  | ✓  |    |    |    |    |
+| udev                 |    | ✓  |    | ✓  |    |     | ✓  |    |    |    |    |    |
+| D-Bus                | ✓  |    |    | ✓  |    | ✓   | ✓  |    |    |    |    |    |
+| NetworkManager       |    | ✓  | ✓  | ✓  |    |     |    |    |    |    |    |    |
+| dhcpcd               |    |    |    |    |    |     |    |    |    |    |    |    |
+| systemd              | ✓  | ✓  |    |    |    |     | ✓  | ✓  |    |    |    |    |
+| daemontools/runit/s6 | ✓  |    |    |    |    |     |    |    |    |    |    |    |
+| OpenRC               | ✓  |    |    |    |    |     |    |    |    |    |    |    |
+| OTP (Erlang)         | ✓  |    |    |    |    | ✓   | ✓  | ✓  | ✓  |    |    |    |
+| X11                  |    |    |    | ✓  |    | ✓   | ✓  |    | ✓  |    |    |    |
+| Time Machine         |    |    |    |    |    |     |    |    |    |    |    | ✓  |
+| Nextcloud            |    |    |    | ✓  |    | ✓   | ✓  |    | ✓  |    | ✓  |    |
+| Syncthing            |    |    |    | ✓  |    |     | ✓  |    |    |    | ✓  |    |
+| Windows Registry     |    |    |    |    |    |     |    |    |    | ✓  |    |    |
+| GNOME                |    | ✓  | ✓  | ✓  |    |     |    |    | ✓  | ✓  |    |    |
+| Android              | ✓  | ✓  | ✓  | ✓  | ✓  | ✓   | ✓  | ✓  | ✓  | ✓  |    |    |
+
+
+
+## References
+
+[Bass et al 1998]: #ref:bass98
+[**Bass et al 1998**] <span id="ref:bass98"> Bass, Len, Paul Clements, and Rick
+Kazman. Software Architecture in Practice. Addison-Wesley, 1998.</span>
+
+[Birgisson et al 2014]: #ref:birgisson14
+[**Birgisson et al 2014**] <span id="ref:birgisson14"> Birgisson, Arnar, Joe Gibbs Politz,
+Úlfar Erlingsson, Ankur Taly, Michael Vrable, and Mark Lentczner. “Macaroons: Cookies with
+Contextual Caveats for Decentralized Authorization in the Cloud.” In Network and Distributed
+System Security Symposium. San Diego, California: Internet Society, 2014.</span>
+
+[Clements et al 2001]: #ref:clements01
+[**Clements et al 2001**] <span id="ref:clements01"> Clements, Paul, Rick Kazman, and Mark
+Klein. Evaluating Software Architectures: Methods and Case Studies. Addison-Wesley,
+2001.</span>
+
+[Corbet 2019]: #ref:corbet19
+[**Corbet 2019**] <span id="ref:corbet19"> Corbet, Jonathan. “Systemd as Tragedy.” LWN.Net,
+January 28, 2019. <https://lwn.net/Articles/777595/>.</span>
+
+[Ellison 1999]: #ref:ellison99
+[**Ellison 1999**] <span id="ref:ellison99"> Ellison, Carl. SPKI Requirements. Request for
+Comments 2692. RFC Editor, 1999. <https://doi.org/10.17487/RFC2692>.</span>
+
+[Rice 2019]: #ref:rice19
+[**Rice 2019**] <span id="ref:rice19"> Rice, Benno. “The Tragedy of Systemd.” Conference
+Presentation at linux.conf.au, Christchurch, New Zealand, January 24, 2019.
+<https://www.youtube.com/watch?v=o_AIw9bGogo>.</span>
+
+[Ylonen et al 1999]: #ref:ylonen99
+[**Ylonen et al 1999**] <span id="ref:ylonen99"> Ylonen, Tatu, Brian Thomas, Butler Lampson,
+Carl Ellison, Ronald L. Rivest, and William S. Frantz. SPKI Certificate Theory. Request for
+Comments 2693. RFC Editor, 1999. <https://doi.org/10.17487/RFC2693>.</span>
+
+---
+
+#### Notes
+
+[^qualities]: Known in the literature as “-ilities”; see e.g.
+    [Bass et al 1998][] or
+    [Clements et al 2001][].
+
+[^as-far-as-i-know]: I wrote to Benno Rice to ask him about the term. He replied that he
+    doesn't know of any earlier use of "system layer" for this particular bundle of ideas.
+    Quoted (with permission) from his email to me: <q>I’m not going to claim to be the first
+    who thought of the idea but the name was something I came up with to describe the services
+    that run in userspace but provide system-level services. I’m happy to own it if nobody else
+    had the idea first. 🙃</q> It looks to me, then, like the term originated with him in 2019.
+
+[^cleaned-up-automated-transcript]: I cut and pasted the automated
+    YouTube transcript of the talk, and then cleaned it up.
+    (Emphasis mine.)
+
+[^libc-resolver]: The resolver built in to libc plays the major part in this; but things like
+    dnsmasq play a role too, especially when combined with virtual machines running within a
+    host.