More system layer stuff

2022-10-21 15:41:20 +02:00 · 2022-10-21 15:41:20 +02:00 · f95e8d90fe
parent 4d360e65a6
commit f95e8d90fe
2 changed files with 225 additions and 114 deletions
--- a/src/synit-as-system-layer.md
+++ b/src/synit-as-system-layer.md
@@ -12,3 +12,55 @@ at the correct level to effectively capture the concepts intrinsic to
 a system layer.

 - Protocols capturing a synthesis of system layer behaviours, based on the analysis
+
+
+
+The theory of object
+capabilities (“ocaps”), exemplified in languages such as E and
+programming models such as Actors, offers a fine-grained approach that
+can be made to scale further than a single machine. However, ocaps
+only control access to shared programs. Access controls for shared
+data are left implicit. In addition, ideas of location and system
+boundary are left implicit in ocap systems.
+
+I will adapt ocaps to syndicated actors. Because the Syndicated Actor model includes a
+first-class notion of shared data as well as a layered conception of locations and location
+boundaries, syndicated capabilities will reflect these ideas directly. I will generalize the
+Syndicated Actor model’s existing notions of place, connecting capabilities not to individual
+actors but to individual places and the data held therein. I will draw on existing ocap
+literature, including in particular the recent notion of Macaroons ([Birgisson et al 2014][])
+and older ideas from SPKI/SDSI ([Ylonen et al 1999][]; [Ellison 1999][]).
+
+
+**Q. How do you feel dataspaces would most enhance privacy or trust?**
+
+Capability technology offers strong, flexible control over access to any given dataspace
+without getting lost in the weeds of identity management: identity is an application-local,
+application-private concern.
+
+Dataspaces default to being closed, "invite-only" networks, meaning casual observation of
+activity in a dataspace is not possible. But the necessary extension of the capability model to
+handle the data-sharing aspects of dataspaces gives benefit in terms of privacy and trust that
+goes beyond the already considerable benefits a traditional capability model offers.
+
+Traditional capabilities directly control access to behavioural objects, and only indirectly
+control access to data held within such objects. Syndicated capabilities, by contrast, directly
+control access to shared data held within a space - changes to which may trigger activity in
+"objects" participating in the dataspace.
+
+In other words, traditional capabilities encode data access controls in terms of object access
+controls; syndicated capabilities, vice versa.
+
+This ability to directly express access to shared data gives system designers a powerful tool
+for thinking about permitted information flows, including questions of privacy. Furthermore,
+*attenuating* the authority of syndicated capabilities before passing them on to some other
+principal allows for strong partitioning of access within a dataspace, offering fine-grained,
+local, compositional decisions about access to shared data. Finally, it becomes possible to
+expose capabilities to end-users (roughly analogous to URLs), putting that power in their hands
+also.
+
+I should also mention that dataspaces can scale from managing activity within a single OS
+process up to coordinating activity between machines around the world. A distributed dataspace
+could be an excellent foundation for collaborative applications, where privacy concerns come to
+the forefront. In effect, a dataspace can become a richly-structured "VPN", containing
+application-specific shared data and with application- or schema-specific access controls.
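
The attenuation step described above can be illustrated with a minimal sketch in the style of the chained-HMAC construction used by Macaroons. This is not the Syndicated Actor model's actual capability format: the token layout, the caveat strings, and the names `mint`, `attenuate` and `verify` are all assumptions made for illustration.

```python
import hashlib
import hmac


def _chain(key: bytes, message: bytes) -> bytes:
    """One HMAC-SHA256 step of the signature chain."""
    return hmac.new(key, message, hashlib.sha256).digest()


def mint(root_key: bytes, identifier: bytes) -> dict:
    """Create an unrestricted token (hypothetical layout) for one dataspace."""
    return {"id": identifier, "caveats": [], "sig": _chain(root_key, identifier)}


def attenuate(token: dict, caveat: bytes) -> dict:
    """Return a weaker token. Only the token is needed, not the root key."""
    return {
        "id": token["id"],
        "caveats": token["caveats"] + [caveat],
        "sig": _chain(token["sig"], caveat),
    }


def verify(root_key: bytes, token: dict, allowed) -> bool:
    """Recompute the chain, then check that every caveat permits the request."""
    sig = _chain(root_key, token["id"])
    for caveat in token["caveats"]:
        sig = _chain(sig, caveat)
    return hmac.compare_digest(sig, token["sig"]) and all(
        allowed(caveat) for caveat in token["caveats"]
    )


# Usage: the holder of `full` hands a narrower token to another principal.
root = b"secret held by the dataspace"
full = mint(root, b"dataspace-1")
read_only = attenuate(full, b"op = observe")
```

Because each signature is keyed by the previous one, a holder can add caveats locally but cannot strip them off again without the root key, which is what makes the decision local and compositional.
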
--- a/src/system-layer.md
+++ b/src/system-layer.md
@@ -9,7 +9,7 @@ importance lies in its role as the technical foundation for many qualities[^qual
 to system security, resilience, connectivity, maintainability and usability.

 In the Linux world, existing system layer realizations cross-cut many, many projects:
-NetworkManager, GNOME, DBus, systemd, OpenRC, apt, apk, and so on. Each project has its own
+NetworkManager, GNOME, D-Bus, systemd, OpenRC, apt, apk, and so on. Each project has its own
 role in the overall system layer, and none takes a strong stance on the overall architecture
 that results from their combination. However, there are a group of basic concepts involved in a
 system layer that transcend individual subprojects, relating to issues of IPC, discovery, and
@@ -124,9 +124,9 @@ These tasks can be broken down into smaller, but still general, pieces:
 Concrete examples include:

 - starting services in response to statically-configured runlevels (OpenRC, systemd, SysV init, etc.)
- - starting dependencies before dependent services (OpenRC, systemd, SysV init, etc.)
+ - starting dependencies before dependent services (OpenRC, systemd, SysV init, etc.), including readiness-detection and -signalling
 - restarting terminated or failed services in a supervision hierarchy (daemontools, s6, etc.; Erlang/OTP)
- - starting services by service name on demand (DBus, etc.)
+ - starting services by service name on demand (D-Bus, etc.)
 - starting services by socket activation (systemd, etc.)
 - virtual-machine and container lifecycles, including supervision and restart of containers (docker, docker-compose, etc.)
 - reacting to hotplugging of a device by installing a driver or starting a program (udevd, etc.)
@@ -141,14 +141,22 @@ distributions](https://skarnet.com/projects/service-manager.html).

 ## Higher-level perspectives on and control over system state and resources

-An essential system layer task is to give users and applications higher-level perspectives on
-system state, resource availability and resource consumption than those offered by the kernel.
+An essential system layer task is to give users and applications **higher-level perspectives**
+on system state, resource availability and resource consumption than those offered by the
+kernel. This has two parts: refining low-level information about system state into higher-level
+knowledge, and reflecting user (or application) preferences expressed in terms of the
+higher-level perspective back into concrete actions to perform at the lower level.

-For example, the kernel's [`NETLINK_ROUTE`](https://en.wikipedia.org/wiki/Netlink) sockets
-allow processes to observe changes in network interface and routing configuration, but
-applications often do not need the fine detail on offer: instead, they need higher-level
-knowledge such as "a usable default route for IPv4 exists", or "IPv4 connectivity is available,
-but metered".
+As an example of the first, the kernel's
+[`NETLINK_ROUTE`](https://en.wikipedia.org/wiki/Netlink) sockets allow processes to observe
+changes in network interface and routing configuration, but applications often do not need the
+fine detail on offer: instead, they need higher-level knowledge such as "a usable default route
+for IPv4 exists", or "IPv4 connectivity is available, but metered".
+
+As an example of the second, NetworkManager allows users to set policy for wifi connection
+establishment in terms of a priority ordering over SSIDs and conditions for when and whether to
+use a particular network. NetworkManager's job is to translate this into a sequence of
+low-level wifi scans, associations and disconnections.

 Breaking this task down into smaller pieces yields:

@@ -156,6 +164,8 @@ Breaking this task down into smaller pieces yields:
 - ability to either poll for or subscribe to changes in such state
 - ability to compute relevant higher-level perspectives on the state
 - a medium for communicating such changes to users and applications
+ - a medium for retrieving preferences and actions from users and applications
+ - ability to perform actions on low-level system resources
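
As a rough illustration of the "compute relevant higher-level perspectives" piece, the sketch below reduces a list of low-level route records to a single connectivity fact. The `Route` record and its fields are hypothetical; a real implementation would decode them from `NETLINK_ROUTE` messages, and the rule for when connectivity counts as metered is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical low-level route record, e.g. decoded from a netlink dump."""
    family: str        # "inet" or "inet6"
    destination: str   # "default" or a CIDR prefix
    interface: str
    link_up: bool
    metered: bool = False


def ipv4_connectivity(routes: list) -> str:
    """Summarise many route records as one high-level fact."""
    usable = [
        r for r in routes
        if r.family == "inet" and r.destination == "default" and r.link_up
    ]
    if not usable:
        return "unavailable"
    # Assumed policy: metered only if every usable default route is metered.
    if all(r.metered for r in usable):
        return "available, metered"
    return "available"
```

A subscriber would re-run a function like this whenever the low-level state changes and publish the result only when the summary itself changes.
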

 Concrete examples include:

@@ -163,7 +173,7 @@ Concrete examples include:
 - use of `NETLINK_KOBJECT_UEVENT` by udev to configure and expose hotplugged devices to userland
 - interrogation of disk devices and partition tables to provide views on and control over available filesystems (gnome-disks, etc.)
 - interrogation of audio devices and audio routing options to provide high-level views and control over audio setup (pipewire, pulseaudio, etc.), e.g. volume level display and volume controls, mute, select input/output channel, play/pause, skip, rewind etc.
- - high-level perspectives on devices such as displays, printers, mice, keyboards, touchpads, accelerometers, proximity sensors, temperature monitors and so on (GNOME, XFCE4, KDE, cups, etc.), communicated via DBus and friends
+ - high-level perspectives on devices such as displays, printers, mice, keyboards, touchpads, accelerometers, proximity sensors, temperature monitors and so on (GNOME, XFCE4, KDE, cups, etc.), communicated via D-Bus and friends
 - system configuration databases (`/etc`, Windows' Registry, GNOME configuration databases)
 - location services mapping from low-level GPS and wifi information to medium-level concrete location coordinates to high-level "you are at home", "you are in the office"-style knowledge about location
 - telephony services exposing high-level call management interfaces backed by low-level modem operations
@@ -180,147 +190,187 @@ system services are analogous parts of the system layer.

 ## Access control mechanisms and policies, security, and privacy

- - offer access control mechanisms and enforce access control policies
- - offer a coherent, system-wide approach to security and privacy
+An inescapable concern when composing software across trust domains is **access control**.
+System layers provide mechanisms for controlling access to software resources and data, allow
+users and applications to specify access control policies, and enforce those policies on their
+behalf.

- - *access control*
-    - resource allocation services
-    - ACL-based access control for system services and DBus objects
+Given the increasingly blurry lines between local and cloud-based personal computing, the scope
+of access controls can be broad, including confidentiality and integrity protections for user
+data and careful control over user privacy.

-### Security and privacy
+Multiple trust domains appear even in a single-user personal computing system: the kernel is
+its own trust domain; its daemon representatives within the system layer are at least one
+other; the user is a trust domain, and its system-layer representatives another; and each
+application is a trust domain, particularly when it is a third-party application acting on
+behalf of a user, perhaps bringing cloud services into the picture. Moving from a single- to a
+multiple-user system then adds only minor complexity.

-Existing system layers rely on single-machine approaches to security
-and securability that do not scale well: for example, Unix ACLs and
-user- and group-ID-based permissions. The theory of object
-capabilities (“ocaps”), exemplified in languages such as E and
-programming models such as Actors, offers a fine-grained approach that
-can be made to scale further than a single machine. However, ocaps
-only control access to shared programs. Access controls for shared
-data are left implicit. In addition, ideas of location and system
-boundary are left implicit in ocap systems.
+Existing system layer realizations, at least within the Linux world, tend to address access
+control, security and particularly privacy at a relatively primitive level, relying on
+single-machine approaches to security and securability that do not scale well: for example,
+Unix [ACLs](https://en.wikipedia.org/wiki/Access-control_list) and user- and group-ID-based
+permissions.

-I will adapt ocaps to syndicated actors. Because the Syndicated Actor model includes a
-first-class notion of shared data as well as a layered conception of locations and location
-boundaries, syndicated capabilities will reflect these ideas directly. I will generalize the
-Syndicated Actor model’s existing notions of place, connecting capabilities not to individual
-actors but to individual places and the data held therein. I will draw on existing ocap
-literature, including in particular the recent notion of Macaroons ([Birgisson et al 2014][])
-and older ideas from SPKI/SDSI ([Ylonen et al 1999][]; [Ellison 1999][]).
+ - Debian, Alpine, and other Unix-like Linux distributions offer little or no access controls
+   other than those provided by the kernel

-**Q. How do you feel dataspaces would most enhance privacy or trust?**
+ - Android uses the kernel user ID mechanism in a different way, giving an effective
+   improvement in separation between trust domains when compared to traditional Unix approaches

-Capability technology offers strong, flexible control over access to any given dataspace
-without getting lost in the weeds of identity management: identity is an application-local,
-application-private concern.
-
-Dataspaces default to being closed, "invite-only" networks, meaning casual observation of
-activity in a dataspace is not possible. But the necessary extension of the capability model to
-handle the data-sharing aspects of dataspaces gives benefit in terms of privacy and trust that
-goes beyond the already considerable benefits a traditional capability model offers.
-
-Traditional capabilities directly control access to behavioural objects, and only indirectly
-control access to data held within such objects. Syndicated capabilities, by contrast, directly
-control access to shared data held within a space - changes to which may trigger activity in
-"objects" participating in the dataspace.
-
-In other words, traditional capabilities encode data access controls in terms of object access
-controls; syndicated capabilities, vice versa.
-
-This ability to directly express access to shared data gives system designers a powerful tool
-for thinking about permitted information flows, including questions of privacy. Furthermore,
-*attenuating* the authority of syndicated capabilities before passing them on to some other
-principal allows for strong partitioning of access within a dataspace, offering fine-grained,
-local, compositional decisions about access to shared data. Finally, it becomes possible to
-expose capabilities to end-users (roughly analogous to URLs), putting that power in their hands
-also.
-
-I should also mention that dataspaces can scale from managing activity within a single OS
-process up to coordinating activity between machines around the world. A distributed dataspace
-could be an excellent foundation for collaborative applications, where privacy concerns come to
-the forefront. In effect, a dataspace can become a richly-structured "VPN", containing
-application-specific shared data and with application- or schema-specific access controls.
+ - D-Bus authenticates each connection separately, usually mapping principal identities onto
+   Unix user IDs; within the scope of a connection, it uses ACLs to make authorization
+   decisions

+ - Some isolation among trust domains can be achieved with careful use of [kernel
+   namespaces](https://en.wikipedia.org/wiki/Linux_namespaces); however, namespaces are not
+   fine-grained and are awkward to use for privacy-protection purposes. They see use primarily
+   for resource isolation in containerization systems.

 ## Inter-process communication and networking

- - offer inter-process communication media
+> Networking is interprocess communication.
+> *—Robert Metcalfe, 1972, quoted in [Day 2008][]*

- - *inter-process communication*
-    - DBus as a program-to-program communication bus
-    - email for use by system services
+A key part of an operating system is the selection of communications media it offers its
+applications. The kernel itself offers a plethora of communication channels, from the file
+system itself through SysV IPC, shared memory, and pipes up to sockets in multiple flavours.

-X11 for IPC
+System layers need richer facilities in order to handle the reactivity, publish-subscribe,
+name-discovery and -management and access control needs previously discussed. In addition, the
+concept of an "address" within a system layer is often more complex than the low-level endpoint
+addresses on offer by the kernel: for example, D-Bus object names, email addresses and aliases,
+and Docker container names do not fit easily into kernel constructs, and this applies double
+for the addresses of fine-grained resources (e.g. single objects) within a process.
+
+ - Traditional Unix-like system layers configure *email* for use by system services, primarily
+   for system-to-user communication but also in principle for program-to-program communication.
+
+ - D-Bus is a coarse-grained, ACL-based message bus with an ad-hoc object model and
+   publish-subscribe mechanism. It has been used as the foundation for a lot of system layer
+   software such as the components in the GNOME desktop environment and the building-blocks of
+   NetworkManager and similar services.
+
+ - X11 offers multiple methods by which clients can communicate with each other. Primary
+   applications include shared clipboard management and window management, but the selection
+   and property change notification mechanisms are general-purpose and could in principle form
+   an interesting substrate for organising software components.
+
+ - <span id="binder"></span>Android IPC is (if I understand correctly!) primarily based around
+   [binder](https://elinux.org/Android_Binder) and layers a number of communication
+   "personalities" on top of it (such as
+   [AIDL](https://developer.android.com/guide/components/aidl),
+   [Broadcasts](https://developer.android.com/guide/components/broadcasts), and
+   [Messenger](https://developer.android.com/reference/android/os/Messenger)s). Binder is
+   apparently ([1](https://elinux.org/Android_Binder), [2](https://lkml.org/lkml/2009/6/25/3),
+   [3](https://lwn.net/Articles/466304/)) a (mostly) object-capability ("ocap") system, with
+   fine-grained object passing, failure-signalling (a "link to death" facility, much like
+   Erlang's [links and
+   monitors](https://www.erlang.org/docs/22/reference_manual/processes.html#links)), and
+   distributed garbage-collection[^binder-vs-syndicate] that is extremely widely used in
+   Android.
+
+   From a [2009 email from Dianne Hackborn](https://lkml.org/lkml/2009/6/25/3):
+   <q>For a rough idea of the scope of the binder's use in Android, here is a list of the basic
+   system services that are implemented on top of it: package manager, telephony manager, app
+   widgets, audio services, search manager, location manager, notification manager,
+   accessibility manager, connectivity manager, wifi manager, input method manager, clipboard,
+   status bar, window manager, sensor service, alarm manager, content service, activity
+   manager, power manager, surface compositor.</q>

 ## Name-binding, name-resolution, and namespaces

- - provide name-binding and name resolution services
-
-udev - /dev namespace
-
- - *naming services*
-    - publishing names for intra-machine services on this system
-    - publishing names for LAN services on this system
-    - resolving names of intra-machine services on this system
-    - resolving names of services on other systems[^libc-resolver]
-
+Many of the services offered by a system layer involve management and querying of mappings
+between high-level *names* and (zero or more) lower-level *addresses* ([Day 2008][]). These
+appear in many different guises, from the directories in the file system, to DNS names (mDNS
+services like [avahi](https://www.avahi.org/); the libc resolver; services like dnsmasq), to
+device names (managed by udev), to object names (D-Bus), to service names, to preconfigured
+connection settings (NetworkManager), to user and group names and so on. Namespace management
+is a core feature of a system layer.

 ## Job queueing and job scheduling

- - provide job-queueing and -scheduling services, including calendar-like and time-based scheduling
+System layers frequently provide job-queueing and -scheduling services, including calendar-like
+and time-based scheduling. As a corollary, they also provide job- and schedule-management
+interfaces.

-cron
-at
-systemd timers
+ - Traditional Unix has `cron` and `at` for job scheduling.

-cups, lpd
+ - Android has system [alarm services](https://developer.android.com/reference/android/app/AlarmManager).

-mail queue management?
+ - systemd has [timers](https://www.freedesktop.org/software/systemd/man/systemd.timer.html) as
+   a replacement for `cron`.
+
+ - systemd also has a [job
+   engine](https://www.freedesktop.org/software/systemd/man/systemd-run.html) (see also
+   [here](https://www.freedesktop.org/software/systemd/man/systemctl.html#Job%20Commands) and
+   [here](https://bl33pbl0p.github.io/systemd.html)) for decoupling work in space and time.
+
+ - Print queues like `lpd` and `cups` are job management engines at heart.
+
+ - You can even see the mail queue as a kind of job queue (and if you squint *very* hard, you
+   can see all the intermediate buffers in a networking or IPC system as job queues; cf. [Day
+   2008][]).
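
The common core of these facilities, a time-ordered queue plus a management interface, can be sketched in a few lines. The `JobQueue` class and its method names are hypothetical and do not correspond to any of the tools listed above; times are plain numbers standing in for timestamps.

```python
import heapq
import itertools


class JobQueue:
    """Time-ordered job queue; times are plain numbers (e.g. seconds)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def schedule(self, run_at: float, name: str) -> None:
        """Queue a job to run at or after the given time."""
        heapq.heappush(self._heap, (run_at, next(self._counter), name))

    def due(self, now: float) -> list:
        """Pop and return the names of all jobs whose time has arrived."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[2])
        return ready

    def pending(self) -> list:
        """Management interface: list outstanding jobs in run order."""
        return [(run_at, name) for run_at, _, name in sorted(self._heap)]
```

A print spooler or mail queue fits the same shape if every job is scheduled for "now" and `due` is drained as capacity becomes available.
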

 ## User interface

- - provide user interface facilities
+The user interface is a classic example of a system facility that cross-cuts individual
+applications and tasks. A system layer must provide some kind of user interface service to
+applications (and to its own system services).

-(TO APPLICATIONS but I guess also for the system layer itself)
+ - At a minimum, Unix-like kernels offer `tty`s. Access to a system via `ssh` is a natural next
+   step.

- - provide system-wide "cut-and-paste" services for user-controlled IPC
+ - X11 is the traditional Unix user interface, with its own IPC protocol and ad-hoc object
+   model; wayland is a recent entrant into a similar space, also with its own IPC protocol and
+   ad-hoc object model. Android offers [SurfaceFlinger and
+   WindowManager](https://source.android.com/docs/core/graphics/surfaceflinger-windowmanager)
+   along with a large library of user interface widgets; the underlying IPC is presumably
+   binder ([see above](#binder)).

-email for talking to users
-notifications - system tray
+ - In Smalltalk-80-derived systems (like [squeak](https://squeak.org/)), the user interface is
+   tightly integrated with the multiprocessing and IPC facilities (such as they are). Squeak
+   also offers simple, quick-and-dirty "alert" and "prompt" APIs to applications, similar to
+   the
+   [`alert`](https://developer.mozilla.org/en-US/docs/Web/API/Window/alert)/[`prompt`](https://developer.mozilla.org/en-US/docs/Web/API/Window/prompt)/[`confirm`](https://developer.mozilla.org/en-US/docs/Web/API/Window/confirm)
+   functions included in web browsers.

- - ui facilities
-    - the thing that asks for user input during apt configuration
-    - the alert/prompt boxes in a web browser (?)
-    - notifications
-    - system tray, applets
+ - Many, but not all, system layers provide a system-wide "cut and paste" service as part of
+   their user interface, for *user-controlled* IPC. X11 applications have a clipboard
+   convention; Mac OS, Windows, Android etc. have a standard clipboard.

-## System configuration and user preferences
+ - System-level *email* can be seen as a form of user interface for reaching users (system
+   administrators).

- - provide system configuration and user preference databases
+ - Many desktop environments include *notifications* and some form of *system tray* giving
+   quick reference to high-level perspectives on system status as previously discussed.

- - system configuration database
-    - system settings manager
+ - Some system-layer administration tasks require user interface: for example, user input
+   during `apt` package configuration.

 ## Software management

- - support software package installation, upgrade, and removal
-
-cc
-apt
-apk
+Software management involves upgrade of system code and installation, management and removal of
+application code. Android has a solid story around software management. Linux distributions
+tend to have package management tools (e.g. `apt`, `apk`, `yum`). Stretching a little
+further, one might include the system programming language and its development environment as
+part of the software management portion of a system layer: for example, many Unix-like systems
+include `cc`, and Smalltalk systems make the system programming language (Smalltalk) available
+from any text input field.

 ## State replication and data backup

- - offer state replication services
- - provide data backup facilities
+The notion of state replication appears in many different contexts. For example, user
+contact/address databases must often be replicated and accessible across devices. System
+configuration data is often shared across servers in a cloud deployment (ansible, puppet). Many
+add-on applications like Dropbox, NextCloud, Syncthing etc. add file replication to a system.
+Applications like Google Keep, to-do list applications, and other sticky-notes/reminder apps
+replicate their databases across machines. Very few system layer realizations offer a coherent
+data replication facility, despite its clear cross-application utility.

- - state replication services
-    - contact book, address book
-    - file replication across machines
-    - sticky-notes, google keep
-    - todo list
-
- - backup facilities
-    - Time Machine
+Relatedly, preserving user data in case of calamity is a core operating system feature. Despite
+this, few whole systems offer a coherent data backup facility. Exceptions include Apple's Time
+Machine and Google's Android backup support libraries.

 ## Synthesis, or, Toward a Complete Vision of a System Layer

@@ -354,6 +404,8 @@ to be good IPC and state-management and -introspection.
 | GNOME                |    | ✓  | ✓  | ✓  |    |     |    |    | ✓  | ✓  |    |    |
 | Android              | ✓  | ✓  | ✓  | ✓  | ✓  | ✓   | ✓  | ✓  | ✓  | ✓  |    |    |

+ - Ideally, a system layer's security mechanisms would offer a coherent, system-wide approach
+   to security and privacy. Few do so.


 ## References
@@ -377,6 +429,10 @@ Klein. Evaluating Software Architectures: Methods and Case Studies. Addison-Wesl
 [**Corbet 2019**] <span id="ref:corbet19"> Corbet, Jonathan. “Systemd as Tragedy.” LWN.Net,
 January 28, 2019. <https://lwn.net/Articles/777595/>.</span>

+[Day 2008]: #ref:day08
+[**Day 2008**] <span id="ref:day08"> Day, John. Patterns in Network Architecture: A Return to
+Fundamentals. Prentice Hall, 2008.</span>
+
 [Ellison 1999]: #ref:ellison99
 [**Ellison 1999**] <span id="ref:ellison99"> Ellison, Carl. SPKI Requirements. Request for
 Comments 2692. RFC Editor, 1999. <https://doi.org/10.17487/RFC2692>.</span>
@@ -413,3 +469,6 @@ Comments 2693. RFC Editor, 1999. <https://doi.org/10.17487/RFC2693>.</span>
 [^libc-resolver]: The resolver built in to libc plays the major part in this; but things like
    dnsmasq play a role too, especially when combined with virtual machines running within a
    host.
+
+[^binder-vs-syndicate]: Looking at binder, I see *strong* similarities with the [Syndicated
+    Actor Model](syndicated-actor-model.md) and its [protocol](protocol.md)!