More system layer stuff

Tony Garnock-Jones 2022-10-21 15:41:20 +02:00
parent 4d360e65a6
commit f95e8d90fe
2 changed files with 225 additions and 114 deletions


@ -12,3 +12,55 @@ at the correct level to effectively capture the concepts intrinsic to
a system layer.
- Protocols capturing a synthesis of system layer behaviours, based on the analysis
The theory of object
capabilities (“ocaps”), exemplified in languages such as E and
programming models such as Actors, offers a fine-grained approach that
can be made to scale further than a single machine. However, ocaps
only control access to shared programs. Access controls for shared
data are left implicit. In addition, ideas of location and system
boundary are left implicit in ocap systems.
I will adapt ocaps to syndicated actors. Because the Syndicated Actor model includes a
first-class notion of shared data as well as a layered conception of locations and location
boundaries, syndicated capabilities will reflect these ideas directly. I will generalize the
Syndicated Actor model's existing notions of place, connecting capabilities not to individual
actors but to individual places and the data held therein. I will draw on existing ocap
literature, including in particular the recent notion of Macaroons ([Birgisson et al 2014][])
and older ideas from SPKI/SDSI ([Ylonen et al 1999][]; [Ellison 1999][]).
**Q. How do you feel dataspaces would most enhance privacy or trust?**
Capability technology offers strong, flexible control over access to any given dataspace
without getting lost in the weeds of identity management: identity is an application-local,
application-private concern.
Dataspaces default to being closed, "invite-only" networks, meaning casual observation of
activity in a dataspace is not possible. But the necessary extension of the capability model to
handle the data-sharing aspects of dataspaces gives benefit in terms of privacy and trust that
goes beyond the already considerable benefits a traditional capability model offers.
Traditional capabilities directly control access to behavioural objects, and only indirectly
control access to data held within such objects. Syndicated capabilities, by contrast, directly
control access to shared data held within a space - changes to which may trigger activity in
"objects" participating in the dataspace.
In other words, traditional capabilities encode data access controls in terms of object access
controls; syndicated capabilities, vice versa.
This ability to directly express access to shared data gives system designers a powerful tool
for thinking about permitted information flows, including questions of privacy. Furthermore,
*attenuating* the authority of syndicated capabilities before passing them on to some other
principal allows for strong partitioning of access within a dataspace, offering fine-grained,
local, compositional decisions about access to shared data. Finally, it becomes possible to
expose capabilities to end-users (roughly analogous to URLs), putting that power in their hands
also.
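
To make the idea of *attenuation* concrete, here is a minimal sketch loosely in the style of Macaroons: each caveat is folded into an HMAC chain, so a holder can narrow a capability before passing it on but cannot widen it again. The names `Capability`, `mint`, `attenuate`, `verify` and `permits`, and the `label=...` caveat format, are assumptions made for illustration; they are not the Syndicated Actor model's actual capability format.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Hypothetical capability: a target dataspace plus a chain of caveats."""
    target: str
    caveats: tuple = ()
    signature: bytes = b""


def _sign(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def mint(root_key: bytes, target: str) -> Capability:
    """Create an unrestricted capability for a dataspace (issuer only)."""
    return Capability(target, (), _sign(root_key, target))


def attenuate(cap: Capability, caveat: str) -> Capability:
    """Add a caveat. The new signature is keyed by the old one, so any
    holder can narrow a capability without knowing the root key."""
    return Capability(cap.target, cap.caveats + (caveat,),
                      _sign(cap.signature, caveat))


def verify(root_key: bytes, cap: Capability) -> bool:
    """Recompute the HMAC chain from the root key and compare."""
    sig = _sign(root_key, cap.target)
    for caveat in cap.caveats:
        sig = _sign(sig, caveat)
    return hmac.compare_digest(sig, cap.signature)


def permits(cap: Capability, label: str) -> bool:
    """Assumed caveat semantics: every caveat must admit the record label."""
    return all(caveat == "label=" + label for caveat in cap.caveats)
```

For example, `attenuate(mint(key, "chat"), "label=presence")` yields a capability that still verifies against `key` but only admits records labelled `presence`. Dropping the caveat from the tuple invalidates the signature, which is what makes the restriction stick.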
I should also mention that dataspaces can scale from managing activity within a single OS
process up to coordinating activity between machines around the world. A distributed dataspace
could be an excellent foundation for collaborative applications, where privacy concerns come to
the forefront. In effect, a dataspace can become a richly-structured "VPN", containing
application-specific shared data and with application- or schema-specific access controls.


@ -9,7 +9,7 @@ importance lies in its role as the technical foundation for many qualities[^qual
to system security, resilience, connectivity, maintainability and usability.
In the Linux world, existing system layer realizations cross-cut many, many projects:
NetworkManager, GNOME, DBus, systemd, OpenRC, apt, apk, and so on. Each project has its own
NetworkManager, GNOME, D-Bus, systemd, OpenRC, apt, apk, and so on. Each project has its own
role in the overall system layer, and none takes a strong stance on the overall architecture
that results from their combination. However, there is a group of basic concepts involved in a
system layer that transcend individual subprojects, relating to issues of IPC, discovery, and
@ -124,9 +124,9 @@ These tasks can be broken down into smaller, but still general, pieces:
Concrete examples include:
- starting services in response to statically-configured runlevels (OpenRC, systemd, SysV init, etc.)
- starting dependencies before dependent services (OpenRC, systemd, SysV init, etc.)
- starting dependencies before dependent services (OpenRC, systemd, SysV init, etc.), including readiness-detection and -signalling
- restarting terminated or failed services in a supervision hierarchy (daemontools, s6, etc.; Erlang/OTP)
- starting services by service name on demand (DBus, etc.)
- starting services by service name on demand (D-Bus, etc.)
- starting services by socket activation (systemd, etc.)
- virtual-machine and container lifecycles, including supervision and restart of containers (docker, docker-compose, etc.)
- reacting to hotplugging of a device by installing a driver or starting a program (udevd, etc.)
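
As an illustration of "starting dependencies before dependent services", the ordering problem can be sketched as a topological sort. This uses Python's standard `graphlib` module (Python 3.9+); the service names and the `start_order` helper are hypothetical, and real service managers additionally handle readiness detection, failures and parallel start-up.

```python
from graphlib import TopologicalSorter


def start_order(dependencies):
    """Return a start order in which every service comes after the
    services it depends on.

    `dependencies` maps each service name to the set of services that
    must be started first. A dependency cycle raises graphlib.CycleError.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical service graph, for illustration only.
services = {
    "udev": set(),
    "syslog": set(),
    "network": {"udev"},
    "sshd": {"network", "syslog"},
}
order = start_order(services)
```

Several valid orders usually exist; the only guarantee is that dependencies precede their dependents.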
@ -141,14 +141,22 @@ distributions](https://skarnet.com/projects/service-manager.html).
## Higher-level perspectives on and control over system state and resources
An essential system layer task is to give users and applications higher-level perspectives on
system state, resource availability and resource consumption than those offered by the kernel.
An essential system layer task is to give users and applications **higher-level perspectives**
on system state, resource availability and resource consumption than those offered by the
kernel. This has two parts: refining low-level information about system state into higher-level
knowledge, and reflecting user (or application) preferences expressed in terms of the
higher-level perspective back into concrete actions to perform at the lower level.
For example, the kernel's [`NETLINK_ROUTE`](https://en.wikipedia.org/wiki/Netlink) sockets
allow processes to observe changes in network interface and routing configuration, but
applications often do not need the fine detail on offer: instead, they need higher-level
knowledge such as "a usable default route for IPv4 exists", or "IPv4 connectivity is available,
but metered".
As an example of the first, the kernel's
[`NETLINK_ROUTE`](https://en.wikipedia.org/wiki/Netlink) sockets allow processes to observe
changes in network interface and routing configuration, but applications often do not need the
fine detail on offer: instead, they need higher-level knowledge such as "a usable default route
for IPv4 exists", or "IPv4 connectivity is available, but metered".
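
The refinement step can be sketched as a pure function from low-level facts to a higher-level statement. The `Route` and `Interface` records below are simplified stand-ins assumed for illustration, not the actual `NETLINK_ROUTE` message layout, and the three result strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    name: str
    up: bool
    metered: bool = False


@dataclass(frozen=True)
class Route:
    family: str        # "inet" for IPv4, "inet6" for IPv6
    destination: str   # "default" or a CIDR prefix
    interface: str     # name of the outgoing interface


def ipv4_connectivity(routes, interfaces) -> str:
    """Summarise low-level route and interface facts as one statement."""
    by_name = {iface.name: iface for iface in interfaces}
    # A default route is usable only if its interface exists and is up.
    usable = [
        by_name[r.interface]
        for r in routes
        if r.family == "inet"
        and r.destination == "default"
        and r.interface in by_name
        and by_name[r.interface].up
    ]
    if not usable:
        return "unavailable"
    if all(iface.metered for iface in usable):
        return "available, metered"
    return "available"
```

In a running system this function would be re-evaluated whenever a subscription reports a route or interface change, and only changes in its result would be published to applications.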
As an example of the second, NetworkManager allows users to set policy for wifi connection
establishment in terms of a priority ordering over SSIDs and conditions for when and whether to
use a particular network. NetworkManager's job is to translate this into a sequence of
low-level wifi scans, associations and disconnections.
Breaking this task down into smaller pieces yields:
@ -156,6 +164,8 @@ Breaking this task down into smaller pieces yields:
- ability to either poll for or subscribe to changes in such state
- ability to compute relevant higher-level perspectives on the state
- a medium for communicating such changes to users and applications
- a medium for retrieving preferences and actions from users and applications
- ability to perform actions on low-level system resources
Concrete examples include:
@ -163,7 +173,7 @@ Concrete examples include:
- use of `NETLINK_KOBJECT_UEVENT` by udev to configure and expose hotplugged devices to userland
- interrogation of disk devices and partition tables to provide views on and control over available filesystems (gnome-disks, etc.)
- interrogation of audio devices and audio routing options to provide high-level views and control over audio setup (pipewire, pulseaudio, etc.), e.g. volume level display and volume controls, mute, select input/output channel, play/pause, skip, rewind etc.
- high-level perspectives on devices such as displays, printers, mice, keyboards, touchpads, accelerometers, proximity sensors, temperature monitors and so on (GNOME, XFCE4, KDE, cups, etc.), communicated via DBus and friends
- high-level perspectives on devices such as displays, printers, mice, keyboards, touchpads, accelerometers, proximity sensors, temperature monitors and so on (GNOME, XFCE4, KDE, cups, etc.), communicated via D-Bus and friends
- system configuration databases (`/etc`, Windows' Registry, GNOME configuration databases)
- location services mapping from low-level GPS and wifi information to medium-level concrete location coordinates to high-level "you are at home", "you are in the office"-style knowledge about location
- telephony services exposing high-level call management interfaces backed by low-level modem operations
@ -180,147 +190,187 @@ system services are analogous parts of the system layer.
## Access control mechanisms and policies, security, and privacy
- offer access control mechanisms and enforce access control policies
- offer a coherent, system-wide approach to security and privacy
An inescapable concern when composing software across trust domains is **access control**.
System layers provide mechanisms for controlling access to software resources and data, allow
users and applications to specify access control policies, and enforce those policies on their
behalf.
- *access control*
- resource allocation services
- ACL-based access control for system services and DBus objects
Given the increasingly blurry lines between local and cloud-based personal computing, the scope
of access controls can be broad, including confidentiality and integrity protections for user
data and careful control over user privacy.
### Security and privacy
Multiple trust domains appear even in a single-user personal computing system: the kernel is
its own trust domain; its daemon representatives within the system layer are at least one
other; the user is a trust domain, and its system-layer representatives another; and each
application is a trust domain, particularly when it is a third-party application acting on
behalf of a user, perhaps bringing cloud services into the picture. Moving from a single- to a
multiple-user system then adds only minor complexity.
Existing system layers rely on single-machine approaches to security
and securability that do not scale well: for example, Unix ACLs and
user- and group-ID-based permissions. The theory of object
capabilities (“ocaps”), exemplified in languages such as E and
programming models such as Actors, offers a fine-grained approach that
can be made to scale further than a single machine. However, ocaps
only control access to shared programs. Access controls for shared
data are left implicit. In addition, ideas of location and system
boundary are left implicit in ocap systems.
Existing system layer realizations, at least within the Linux world, tend to address access
control, security and particularly privacy at a relatively primitive level, relying on
single-machine approaches to security and securability that do not scale well: for example,
Unix [ACLs](https://en.wikipedia.org/wiki/Access-control_list) and user- and group-ID-based
permissions.
I will adapt ocaps to syndicated actors. Because the Syndicated Actor model includes a
first-class notion of shared data as well as a layered conception of locations and location
boundaries, syndicated capabilities will reflect these ideas directly. I will generalize the
Syndicated Actor model's existing notions of place, connecting capabilities not to individual
actors but to individual places and the data held therein. I will draw on existing ocap
literature, including in particular the recent notion of Macaroons ([Birgisson et al 2014][])
and older ideas from SPKI/SDSI ([Ylonen et al 1999][]; [Ellison 1999][]).
- Debian, Alpine, and other Unix-like Linux distributions offer little or no access controls
other than those provided by the kernel
**Q. How do you feel dataspaces would most enhance privacy or trust?**
- Android uses the kernel user ID mechanism in a different way, giving an effective
improvement in separation between trust domains when compared to traditional Unix approaches
Capability technology offers strong, flexible control over access to any given dataspace
without getting lost in the weeds of identity management: identity is an application-local,
application-private concern.
Dataspaces default to being closed, "invite-only" networks, meaning casual observation of
activity in a dataspace is not possible. But the necessary extension of the capability model to
handle the data-sharing aspects of dataspaces gives benefit in terms of privacy and trust that
goes beyond the already considerable benefits a traditional capability model offers.
Traditional capabilities directly control access to behavioural objects, and only indirectly
control access to data held within such objects. Syndicated capabilities, by contrast, directly
control access to shared data held within a space - changes to which may trigger activity in
"objects" participating in the dataspace.
In other words, traditional capabilities encode data access controls in terms of object access
controls; syndicated capabilities, vice versa.
This ability to directly express access to shared data gives system designers a powerful tool
for thinking about permitted information flows, including questions of privacy. Furthermore,
*attenuating* the authority of syndicated capabilities before passing them on to some other
principal allows for strong partitioning of access within a dataspace, offering fine-grained,
local, compositional decisions about access to shared data. Finally, it becomes possible to
expose capabilities to end-users (roughly analogous to URLs), putting that power in their hands
also.
I should also mention that dataspaces can scale from managing activity within a single OS
process up to coordinating activity between machines around the world. A distributed dataspace
could be an excellent foundation for collaborative applications, where privacy concerns come to
the forefront. In effect, a dataspace can become a richly-structured "VPN", containing
application-specific shared data and with application- or schema-specific access controls.
- D-Bus authenticates each connection separately, usually mapping principal identities onto
Unix user IDs; within the scope of a connection, it uses ACLs to make authorization
decisions
- Some isolation among trust domains can be achieved with careful use of [kernel
namespaces](https://en.wikipedia.org/wiki/Linux_namespaces); however, namespaces are not
fine-grained and are awkward to use for privacy-protection purposes. They see use primarily
for resource isolation in containerization systems.
## Inter-process communication and networking
- offer inter-process communication media
> Networking is interprocess communication.
> *—Robert Metcalfe, 1972, quoted in [Day 2008][]*
- *inter-process communication*
- DBus as a program-to-program communication bus
- email for use by system services
A key part of an operating system is the selection of communications media it offers its
applications. The kernel itself offers a plethora of communication channels, from the file
system itself through SysV IPC, shared memory, and pipes up to sockets in multiple flavours.
X11 for IPC
System layers need richer facilities in order to handle the reactivity, publish-subscribe,
name-discovery and -management and access control needs previously discussed. In addition, the
concept of an "address" within a system layer is often more complex than the low-level endpoint
addresses on offer by the kernel: for example, D-Bus object names, email addresses and aliases,
and Docker container names do not fit easily into kernel constructs, and this applies double
for the addresses of fine-grained resources (e.g. single objects) within a process.
- Traditional Unix-like system layers configure *email* for use by system services, primarily
for system-to-user communication but also in principle for program-to-program communication.
- D-Bus is a coarse-grained, ACL-based message bus with an ad-hoc object model and
publish-subscribe mechanism. It has been used as the foundation for a lot of system layer
software such as the components in the GNOME desktop environment and the building-blocks of
NetworkManager and similar services.
- X11 offers multiple methods by which clients can communicate with each other. Primary
applications include shared clipboard management and window management, but the selection
and property change notification mechanisms are general-purpose and could in principle form
an interesting substrate for organising software components.
- <span id="binder"></span>Android IPC is (if I understand correctly!) primarily based around
[binder](https://elinux.org/Android_Binder) and layers a number of communication
"personalities" on top of it (such as
[AIDL](https://developer.android.com/guide/components/aidl),
[Broadcasts](https://developer.android.com/guide/components/broadcasts), and
[Messenger](https://developer.android.com/reference/android/os/Messenger)s). Binder is
apparently ([1](https://elinux.org/Android_Binder), [2](https://lkml.org/lkml/2009/6/25/3),
[3](https://lwn.net/Articles/466304/)) a (mostly) object-capability ("ocap") system, with
fine-grained object passing, failure-signalling (a "link to death" facility, much like
Erlang's [links and
monitors](https://www.erlang.org/docs/22/reference_manual/processes.html#links)), and
distributed garbage-collection[^binder-vs-syndicate] that is extremely widely used in
Android.
From a [2009 email from Dianne Hackborn](https://lkml.org/lkml/2009/6/25/3):
<q>For a rough idea of the scope of the binder's use in Android, here is a list of the basic
system services that are implemented on top of it: package manager, telephony manager, app
widgets, audio services, search manager, location manager, notification manager,
accessibility manager, connectivity manager, wifi manager, input method manager, clipboard,
status bar, window manager, sensor service, alarm manager, content service, activity
manager, power manager, surface compositor.</q>
## Name-binding, name-resolution, and namespaces
- provide name-binding and name resolution services
udev - /dev namespace
- *naming services*
- publishing names for intra-machine services on this system
- publishing names for LAN services on this system
- resolving names of intra-machine services on this system
- resolving names of services on other systems[^libc-resolver]
Many of the services offered by a system layer involve management and querying of mappings
between high-level *names* and (zero or more) lower-level *addresses* ([Day 2008][]). These
appear in many different guises, from the directories in the file system, to DNS names (mDNS
services like [avahi](https://www.avahi.org/); the libc resolver; services like dnsmasq), to
device names (managed by udev), to object names (D-Bus), to service names, to preconfigured
connection settings (NetworkManager), to user and group names and so on. Namespace management
is a core feature of a system layer.
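
The common shape behind all of these guises, a mapping from a name to zero or more addresses, can be sketched in a few lines. `NameRegistry` and the example names and addresses in the usage note are hypothetical; real naming services add scoping, access control and change notification on top.

```python
class NameRegistry:
    """Maps each high-level name to zero or more lower-level addresses."""

    def __init__(self):
        self._bindings = {}

    def bind(self, name, address):
        self._bindings.setdefault(name, set()).add(address)

    def unbind(self, name, address):
        addresses = self._bindings.get(name)
        if addresses is not None:
            addresses.discard(address)
            if not addresses:
                del self._bindings[name]

    def resolve(self, name):
        """Return all addresses bound to `name`; empty if there are none."""
        return sorted(self._bindings.get(name, ()))
```

Binding `"printer"` to both `"ipp://10.0.0.7:631"` and `"usb:001/004"` makes `resolve("printer")` return both addresses, while resolving an unbound name yields an empty list rather than an error.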
## Job queueing and job scheduling
- provide job-queueing and -scheduling services, including calendar-like and time-based scheduling
System layers frequently provide job-queueing and -scheduling services, including calendar-like
and time-based scheduling. As a corollary, they also provide job- and schedule-management
interfaces.
cron
at
systemd timers
- Traditional Unix has `cron` and `at` for job scheduling.
cups, lpd
- Android has system [alarm services](https://developer.android.com/reference/android/app/AlarmManager).
mail queue management?
- systemd has [timers](https://www.freedesktop.org/software/systemd/man/systemd.timer.html) as
a replacement for `cron`.
- systemd also has a [job
engine](https://www.freedesktop.org/software/systemd/man/systemd-run.html) (see also
[here](https://www.freedesktop.org/software/systemd/man/systemctl.html#Job%20Commands) and
[here](https://bl33pbl0p.github.io/systemd.html)) for decoupling work in space and time.
- Print queues like `lpd` and `cups` are job management engines at heart.
- You can even see the mail queue as a kind of job queue (and if you squint *very* hard, you
can see all the intermediate buffers in a networking or IPC system as job queues; cf. [Day
2008][]).
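
What the examples above share is a queue of jobs ordered by due time. A minimal sketch of that core follows, using a binary heap; the `JobQueue` class and the job names in the usage note are assumptions for illustration and do not reflect how `cron`, `at` or systemd timers are implemented.

```python
import heapq
import itertools


class JobQueue:
    """Time-ordered job queue: jobs become runnable once their time is due."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so that jobs with equal times keep insertion order.
        self._counter = itertools.count()

    def schedule(self, run_at, name):
        heapq.heappush(self._heap, (run_at, next(self._counter), name))

    def due(self, now):
        """Remove and return the names of all jobs due at or before `now`."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            _, _, name = heapq.heappop(self._heap)
            ready.append(name)
        return ready

    def pending(self):
        """Names of jobs not yet run, earliest first."""
        return [name for _, _, name in sorted(self._heap)]
```

After scheduling `backup` at time 30, `rotate-logs` at 10 and `send-mail` at 20, a call to `due(20)` hands back `rotate-logs` and `send-mail` in that order and leaves `backup` pending. The `pending` method is a stand-in for the job- and schedule-management interfaces mentioned above.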
## User interface
- provide user interface facilities
The user interface is a classic example of a system facility that cross-cuts individual
applications and tasks. A system layer must provide some kind of user interface service to
applications (and to its own system services).
(TO APPLICATIONS but I guess also for the system layer itself)
- At a minimum, Unix-like kernels offer `tty`s. Access to a system via `ssh` is a natural next
step.
- provide system-wide "cut-and-paste" services for user-controlled IPC
- X11 is the traditional Unix user interface, with its own IPC protocol and ad-hoc object
model; wayland is a recent entrant into a similar space, also with its own IPC protocol and
ad-hoc object model. Android offers [SurfaceFlinger and
WindowManager](https://source.android.com/docs/core/graphics/surfaceflinger-windowmanager)
along with a large library of user interface widgets; the underlying IPC is presumably
binder ([see above](#binder)).
email for talking to users
notifications - system tray
- In Smalltalk-80-derived systems (like [squeak](https://squeak.org/)), the user interface is
tightly integrated with the multiprocessing and IPC facilities (such as they are). Squeak
also offers simple, quick-and-dirty "alert" and "prompt" APIs to applications, similar to
the
[`alert`](https://developer.mozilla.org/en-US/docs/Web/API/Window/alert)/[`prompt`](https://developer.mozilla.org/en-US/docs/Web/API/Window/prompt)/[`confirm`](https://developer.mozilla.org/en-US/docs/Web/API/Window/confirm)
functions included in web browsers.
- ui facilities
- the thing that asks for user input during apt configuration
- the alert/prompt boxes in a web browser (?)
- notifications
- system tray, applets
- Many, but not all, system layers provide a system-wide "cut and paste" service as part of
their user interface, for *user-controlled* IPC. X11 applications have a clipboard
convention; Mac OS, Windows, Android etc. have a standard clipboard.
## System configuration and user preferences
- System-level *email* can be seen as a form of user interface for reaching users (system
administrators).
- provide system configuration and user preference databases
- Many desktop environments include *notifications* and some form of *system tray* giving
quick reference to high-level perspectives on system status as previously discussed.
- system configuration database
- system settings manager
- Some system-layer administration tasks require user interface: for example, user input
during `apt` package configuration.
## Software management
- support software package installation, upgrade, and removal
cc
apt
apk
System management involves upgrade of system code and installation, management and removal of
application code. Android has a solid story around software management. Linux distributions
tend to have package management tools (e.g. `apt`, `apk`, `yum` etc.). Stretching a little
further, one might include the system programming language and its development environment as
part of the software management portion of a system layer: for example, many Unix-like systems
include `cc`, and Smalltalk systems make the system programming language (Smalltalk) available
from any text input field.
## State replication and data backup
- offer state replication services
- provide data backup facilities
The notion of state replication appears in many different contexts. For example, user
contact/address databases must often be replicated and accessible across devices. System
configuration data is often shared across servers in a cloud deployment (ansible, puppet). Many
add-on applications like Dropbox, NextCloud, Syncthing etc. add file replication to a system.
Applications like Google Keep, to-do list applications, and other sticky-notes/reminder apps
replicate their databases across machines. Very few system layer realizations offer a coherent
data replication facility, despite its clear cross-application utility.
- state replication services
- contact book, address book
- file replication across machines
- sticky-notes, google keep
- todo list
- backup facilities
- Time Machine
Relatedly, preserving user data in case of calamity is a core operating system feature. Despite
this, few whole systems offer a coherent data backup facility. Exceptions include Apple's Time
Machine and Google's Android backup support libraries.
## Synthesis, or, Toward a Complete Vision of a System Layer
@ -354,6 +404,8 @@ to be good IPC and state-management and -introspection.
| GNOME | | ✓ | ✓ | ✓ | | | | | ✓ | ✓ | | |
| Android | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | |
- Ideally, a system layer's security mechanisms would offer a coherent, system-wide approach
to security and privacy. Few do so.
## References
@ -377,6 +429,10 @@ Klein. Evaluating Software Architectures: Methods and Case Studies. Addison-Wesl
[**Corbet 2019**] <span id="ref:corbet19"> Corbet, Jonathan. “Systemd as Tragedy.” LWN.Net,
January 28, 2019. <https://lwn.net/Articles/777595/>.</span>
[Day 2008]: #ref:day08
[**Day 2008**] <span id="ref:day08"> Day, John. Patterns in Network Architecture: A Return to
Fundamentals. Prentice Hall, 2008.</span>
[Ellison 1999]: #ref:ellison99
[**Ellison 1999**] <span id="ref:ellison99"> Ellison, Carl. SPKI Requirements. Request for
Comments 2692. RFC Editor, 1999. <https://doi.org/10.17487/RFC2692>.</span>
@ -413,3 +469,6 @@ Comments 2693. RFC Editor, 1999. <https://doi.org/10.17487/RFC2693>.</span>
[^libc-resolver]: The resolver built in to libc plays the major part in this; but things like
dnsmasq play a role too, especially when combined with virtual machines running within a
host.
[^binder-vs-syndicate]: Looking at binder, I see *strong* similarities with the [Syndicated
Actor Model](syndicated-actor-model.md) and its [protocol](protocol.md)!