# Syndicate Protocol
Actors that share a local [scope](./glossary.md#scope) can communicate directly. To communicate
further afield, scopes are *connected* using [relay
actors](./glossary.md#relay).[^analogy-to-subnets] Relays allow *indirect* communication:
distant entities can be addressed as if they were local.
Relays exchange *Syndicate Protocol* messages across a [transport](./glossary.md#transport). A
*transport* is the underlying medium connecting one relay to its counterpart(s). For example, a
TLS-on-TCP/IP socket may connect a pair of relays to one another, or a UDP multicast socket may
connect an entire group of relays across an ethernet.[^relaying-over-syndicate]
<!--
```ditaa protocol-transport
+-------------+ +-------------+
| | transport | |
| +-----+ +------------------------+ +-----+ |
|. . . |Relay|<-->| TLS/TCP/IP socket |<-->|Relay| . . . |
| +-----+ +------------------------+ +-----+ |
| | | |
+-------------+ +-------------+
```
-->
## Transport requirements
Transports must
- be able to carry [Preserves](./glossary.md#preserves) values back and forth,
- be reliable and in-order,
- have a well-defined session lifecycle (created → connected → disconnected), and
- assure confidentiality, integrity, authenticity, and replay-resistance.
This document focuses primarily on point-to-point transports, discussing multicast and
in-memory variations briefly toward the end.
## Roles and session lifecycle
The protocol is completely symmetric, aside from [certain conventions detailed
below](#well-known-oids) about the entities available for use immediately upon connection
establishment. It is *not* a client/server protocol.
**Session startup.** To begin a session on a newly-established point-to-point link, a relay
simply starts sending packets. Each peer starts the session with an empty entity reference map
([see below](#membranes)) and with no assertions in either the outbound (on behalf of local
entities) or inbound (on behalf of the remote peer) direction.
**Session teardown.** At the end of a session, terminated normally or abnormally, cleanly or
through involuntary transport disconnection, all published assertions are
retracted.[^automatic-when-implemented-with-sam] This is in keeping with the essence of the
[Syndicated Actor Model (SAM)](./glossary.md#syndicated-actor-model).
## Packet definitions
Packets exchanged by relays are [Preserves](./glossary.md#preserves) values defined using
Preserves [schema](./glossary.md#schema).
```preserves-schema
Packet = Turn / Error / Extension .
```
A packet may be a *turn*, an *error*, or an *extension*.
Packets are neither commands nor responses; they are *events*.
### Extension packets
```preserves-schema
Extension = <<rec> @label any @fields [any ...]> .
```
An extension packet must be a Preserves [record](./glossary.md#record), but is otherwise
unconstrained.
**Handling.**
Peers MUST ignore extensions that they do not understand.[^no-extensions-yet]
### Error packets
```preserves-schema
Error = <error @message string @detail any>.
```
**Handling.**
An error packet describes something that went wrong on the other end of the connection. Error
packets are primarily intended for debugging.
Receipt of an error packet denotes that the sender has terminated (crashed) and will not
respond further; the connection will usually be closed shortly thereafter.
Error packets are *optional*: connections may simply be closed without comment.
### Turn packets
```preserves-schema
Turn = [TurnEvent ...].
TurnEvent = [@oid Oid @event Event].
Event = Assert / Retract / Message / Sync .
Assert = <A @assertion Assertion @handle Handle>.
Retract = <R @handle Handle>.
Message = <M @body Assertion>.
Sync = <S @peer #:#t>.
Assertion = any .
Handle = int .
Oid = int .
```
A `Turn` is the most important packet variant. It directly reflects the
[SAM](./glossary.md#syndicated-actor-model) notion of a [turn](./glossary.md#turn).
**Handling.** Each `Turn` carries [events](./glossary.md#event) to be delivered to
[entities](./glossary.md#entity) residing in the scope at the receiving end of the transport.
Each event is either publication of an assertion, retraction of a previously-published
assertion, delivery of a single message, or a [synchronization](./glossary.md#synchronization)
event.
Upon receipt of a `Turn`, the sequence of `TurnEvent`s is examined. The
[OID](./glossary.md#oid) in each `TurnEvent` selects an entity known to the recipient. If a
particular `TurnEvent`'s OID is not mapped to an entity, the `TurnEvent` is silently ignored,
and the remaining `TurnEvent`s in the `Turn` are processed.
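
The dispatch rule above can be sketched in a few lines of Python. This is only an illustration, not taken from any Syndicate implementation: the in-memory shape of a decoded `Turn` (a list of `(oid, event)` pairs) and the names `dispatch_turn` and `entities` are assumptions made for the example.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical decoded form of a Turn: a list of (oid, event) pairs.
TurnEvent = Tuple[int, Any]


def dispatch_turn(turn: List[TurnEvent],
                  entities: Dict[int, Callable[[Any], None]]) -> int:
    """Deliver each event to the entity selected by its OID.

    A TurnEvent whose OID is unmapped is skipped silently; the remaining
    TurnEvents are still processed. Returns the number of events delivered.
    """
    delivered = 0
    for oid, event in turn:
        entity = entities.get(oid)
        if entity is None:
            continue  # unknown OID: ignore this TurnEvent only
        entity(event)
        delivered += 1
    return delivered


log: List[Any] = []
turn = [(0, ("M", "hi")), (7, ("M", "lost")), (0, ("R", 3))]
delivered = dispatch_turn(turn, {0: log.append})
```

Here OID 7 is not mapped, so its event is dropped while both events for OID 0 are delivered in order.
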
The `assertion` fields of `Assert` events and the `body` fields of `Message` events may contain
any Preserves value, including embedded entity references. On the wire, these will always be
formatted [as described below](#capabilities-on-the-wire). As each `Assert` or `Message` is
processed, embedded references are mapped to internal references. Symmetrically, internal
references are mapped to their external form prior to transmission. [The mapping procedure to
follow is detailed below.](#membranes)
**Turn boundaries.** In the case that the receiving party is structured internally using the
SAM, it is important to preserve turn boundaries. Since turn boundaries are a
per-[actor](./glossary.md#actor) concept, but a `Turn` mentions only entities, the receiver
must map entities to actors, group `TurnEvent`s into per-actor queues, and deliver those queues
to each actor in a single SAM turn for each actor.
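
One way a receiver might perform this grouping is sketched below. The `actor_of` table (OID to owning actor) and the function name are hypothetical; a real relay would consult its export membrane and its own actor bookkeeping instead.

```python
from typing import Any, Dict, Hashable, List, Tuple

TurnEvent = Tuple[int, Any]  # hypothetical decoded (oid, event) pair


def group_by_actor(turn: List[TurnEvent],
                   actor_of: Dict[int, Hashable]) -> Dict[Hashable, List[TurnEvent]]:
    """Split one incoming Turn into per-actor queues, preserving event order.

    Each resulting queue is meant to be delivered to its actor within a
    single SAM turn. Events for unmapped OIDs are dropped.
    """
    queues: Dict[Hashable, List[TurnEvent]] = {}
    for oid, event in turn:
        actor = actor_of.get(oid)
        if actor is None:
            continue
        queues.setdefault(actor, []).append((oid, event))
    return queues


actor_of = {1: "actor-a", 2: "actor-b", 3: "actor-a"}
turn = [(1, "e1"), (2, "e2"), (3, "e3"), (9, "e4")]
queues = group_by_actor(turn, actor_of)
```
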
**Uniqueness.** `Handle`s are meaningful only within the scope of a particular transport
connection. Each `Handle` refers to at most one published assertion at a time, within that
connection. Each `Assert` event causes its `Handle` to denote the corresponding `Assertion`;
the `Handle` MUST be unused at the time of processing of the event. Similarly, each `Retract`
event unbinds its `Handle`; the `Handle` MUST denote an assertion at the time of processing.
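
The uniqueness rules can be checked dynamically with a small per-connection table, sketched here in Python. The class and exception names are invented for the example, and raising an exception on violation is this sketch's choice: the text above states the requirement but does not prescribe the reaction.

```python
from typing import Any, Dict


class ProtocolError(Exception):
    """Raised by this sketch when a peer violates the Handle rules."""


class HandleTable:
    """Tracks which Handles currently denote assertions on one connection."""

    def __init__(self) -> None:
        self._assertions: Dict[int, Any] = {}

    def on_assert(self, handle: int, assertion: Any) -> None:
        if handle in self._assertions:
            raise ProtocolError(f"handle {handle} is already in use")
        self._assertions[handle] = assertion

    def on_retract(self, handle: int) -> Any:
        if handle not in self._assertions:
            raise ProtocolError(f"handle {handle} denotes no assertion")
        return self._assertions.pop(handle)


table = HandleTable()
table.on_assert(1093, "please-reply-to")
```
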
## Capabilities on the wire
References embedded in `Turn` packets denote *capabilities* for interacting with some entity.
For example, assertion of a capability-bearing record could appear as the following `Event`:
```preserves
<A <please-reply-to #:[0 555]> 1093>
```
The `#:[0 555]` is [concrete Preserves text syntax](./guide/preserves.md#concrete-syntax) for
an embedded (`#:`) value (`[0 555]`).
In the Syndicate Protocol, these embedded values MUST conform to the `WireRef`
schema:[^slightly-silly-wireref]
```preserves-schema
WireRef = @mine [0 @oid Oid] / @yours [1 @oid Oid @attenuation Caveat ...].
Oid = int .
```
The `mine` variant denotes capability references managed by the *sender* of a given packet; the
`yours` variant, the *receiver* of the packet. A relay receiving a packet mentioning `#:[0
555]` will use `#:[1 555]` in later responses that refer to that same entity, and *vice versa*.
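
As a small illustration of the `mine`/`yours` perspective switch, the following Python sketch models a `WireRef` as a plain list. The function name is hypothetical, and the sketch deliberately handles only unattenuated references; attenuated ones follow the membrane rules described later.

```python
from typing import List


def as_named_by_receiver(wire_ref: List[int]) -> List[int]:
    """Given an unattenuated WireRef as it arrived in a packet, return the
    WireRef the receiver would use for the same entity in its own packets."""
    if len(wire_ref) != 2 or wire_ref[0] not in (0, 1):
        raise ValueError("expected an unattenuated [0 oid] or [1 oid]")
    tag, oid = wire_ref
    return [1 - tag, oid]  # mine (0) becomes yours (1), and vice versa


reply_form = as_named_by_receiver([0, 555])
```
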
### Attenuation of authority
A `yours`-variant capability may include a request[^attenuation-is-not-enforceable] to impose
additional conditions on the receiver's use of its own capability, known as an
[attenuation](./glossary.md#attenuation) of the capability's authority.
An attenuation is a chain of `Caveat`s.[^caveat-terminology-macaroon] A `Caveat` acts as a
function that, given a Preserves value representing an assertion or message body, yields either
a possibly-rewritten value, or no value at all.[^affine-caveats] In the latter case, the value
has been *rejected*. In the former case, the rewritten value is used as input to the next
`Caveat` in the chain, or as the final assertion or message body for delivery to the entity
backing the capability.
The chain of `Caveat`s in an attenuation is written down in *reverse* order: newer `Caveat`s
are appended to the sequence, and each `Caveat`'s output is fed into the input of the next
leftward `Caveat` in the sequence. If no `Caveat`s are present, the capability is unattenuated,
and inputs are passed through to the backing capability unmodified.
```preserves-schema
Caveat = Rewrite / Alts / Reject / @unknown any .
Rewrite = <rewrite @pattern Pattern @template Template> .
Reject = <reject @pattern Pattern> .
Alts = <or @alternatives [Rewrite ...]>.
```
A `Caveat` can be:
- a single `Rewrite`[^single-rewrite-meaning], or a sequence of alternative possible rewrites
`Alts`, to be tried in left-to-right order until one of them accepts the input or there are
none left to try;
- a `Reject`, which passes all inputs unmodified except those matching the contained pattern; or
- an `unknown` caveat, which rejects all inputs.
Each `Rewrite` applies its `Pattern` to its input. If the `Pattern` matches, the bindings
captured by the pattern are gathered together and used in instantiation of the `Rewrite`'s
`Template`, yielding the output from the `Caveat`. If the pattern does not match, the `Rewrite`
has rejected the input, and other `alternatives` are tried until none remain, at which point
the whole `Caveat` has rejected the input and processing of the triggering event stops.
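
The evaluation order of a chain can be sketched as follows. To keep the example short, each caveat is modelled as a Python function from a value to either a value or the sentinel `REJECTED`; real caveats are the `Pattern`/`Template` structures defined below. All names here are assumptions made for illustration.

```python
from typing import Any, Callable, Sequence

REJECTED = object()  # sentinel standing for "no value at all"
Caveat = Callable[[Any], Any]  # returns a (possibly rewritten) value or REJECTED


def alts(*rewrites: Caveat) -> Caveat:
    """Try alternatives left to right until one accepts the input."""
    def caveat(value: Any) -> Any:
        for rewrite in rewrites:
            out = rewrite(value)
            if out is not REJECTED:
                return out
        return REJECTED
    return caveat


def reject(matches: Callable[[Any], bool]) -> Caveat:
    """Pass every input unmodified except those that match."""
    return lambda value: REJECTED if matches(value) else value


def unknown(value: Any) -> Any:
    """An unrecognised caveat rejects all inputs."""
    return REJECTED


def apply_attenuation(caveats: Sequence[Caveat], value: Any) -> Any:
    """Run the chain right to left: the newest (rightmost) caveat sees the
    input first, and each output feeds the next caveat to its left."""
    for caveat in reversed(caveats):
        value = caveat(value)
        if value is REJECTED:
            return REJECTED
    return value


def tag(label: str) -> Caveat:
    return lambda value: value + [label]


chain = [tag("older"), tag("newer")]  # "newer" was appended last
result = apply_attenuation(chain, [])
```
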
### Patterns
A `Pattern` within a rewrite can be any of the following variants:
```preserves-schema
Pattern = PDiscard / PAtom / PEmbedded / PBind / PAnd / PNot / Lit / PCompound .
```
**Wildcard.** `PDiscard` matches any value:
```preserves-schema
PDiscard = <_>.
```
**Atomic type.** `PAtom` requires that a matched value be a boolean, a double-precision float, an
integer, a string, a binary blob, or a symbol, respectively:
```preserves-schema
PAtom = =Boolean / =Double / =SignedInteger / =String / =ByteString / =Symbol .
```
**Embedded value.** `PEmbedded` requires that a matched value be an embedded capability:
```preserves-schema
PEmbedded = =Embedded .
```
**Binding.** `PBind` first *captures* the matched value, adding it to the bindings vector, and then applies
the nested `pattern`. If the subpattern matches, the `PBind` succeeds; otherwise, it fails:
```preserves-schema
PBind = <bind @pattern Pattern>.
```
**Conjunction.** `PAnd` is a conjunction of patterns; every pattern in `patterns` must match for the `PAnd` to
match:
```preserves-schema
PAnd = <and @patterns [Pattern ...]>.
```
**Negation.** `PNot` is a pattern negation: if `pattern` matches, the `PNot` fails to match, and *vice
versa*. It is an error for `pattern` to include any `PBind` subpatterns.
```preserves-schema
PNot = <not @pattern Pattern>.
```
**Literal.** `Lit` is an exact match pattern. If the matched value is exactly equal to `value` (according to
Preserves' own built-in equivalence relation), the match succeeds; otherwise, it fails:
```preserves-schema
Lit = <lit @value any>.
```
**Compound.** Finally, `PCompound` patterns match compound data structures. The `rec` variant demands that a
matched value be a record, with label exactly equal to `label` and fields one-for-one matching
the `Pattern`s in `fields`; the `arr` variant demands a sequence, with each element matching
the corresponding element of `items`; and `dict` demands a dictionary having *at least* entries
named by the keys of the `entries` dictionary, each matching the corresponding `Pattern`.
```preserves-schema
PCompound =
/ @rec <rec @label any @fields [Pattern ...]>
/ @arr <arr @items [Pattern ...]>
/ @dict <dict @entries { any: Pattern ...:... }> .
```
### Bindings
Matching notionally produces a sequence of values, one for each `PBind` in the pattern.
When a `PBind` pattern is seen, the matcher *first* appends the matched value to the binding
sequence and *then* recurses on the nested subpattern. This makes binding *indexes* appear in
left-to-right order as a `Pattern` is read.
**Example.** Given the pattern `<bind <arr [<bind <_>>, <bind <_>>]>>` and the matched value
`["a" "b"]`, the resulting captured values are, in order, `["a" "b"]`, `"a"`, and `"b"`; the
template `<ref 0>` will be instantiated to `["a" "b"]`, `<ref 1>` to `"a"`, and `<ref 2>` to
`"b"`.
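
A matcher following these rules might look like the Python sketch below. The encoding is an assumption made for the example: patterns are tuples such as `("bind", p)` or `("arr", [p, ...])`, atom patterns are strings such as `"String"`, and Preserves values are approximated by Python values plus the hypothetical `Record`, `Symbol` and `Embedded` types. It also assumes that `arr` requires equal lengths, and it uses Python's `==` for `Lit`, which only approximates Preserves equality (for instance, Python treats `1 == True`).

```python
from collections import namedtuple

Record = namedtuple("Record", "label fields")
Symbol = namedtuple("Symbol", "name")
Embedded = namedtuple("Embedded", "ref")

ATOM_CHECKS = {
    "Boolean": lambda v: isinstance(v, bool),
    "Double": lambda v: isinstance(v, float),
    "SignedInteger": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "String": lambda v: isinstance(v, str),
    "ByteString": lambda v: isinstance(v, bytes),
    "Symbol": lambda v: isinstance(v, Symbol),
    "Embedded": lambda v: isinstance(v, Embedded),
}


def match_items(patterns, values, bindings):
    return (len(patterns) == len(values)
            and all(match(p, v, bindings) for p, v in zip(patterns, values)))


def match(pattern, value, bindings):
    """Return True on a match, appending captures to `bindings`."""
    if isinstance(pattern, str):  # PAtom or PEmbedded
        return ATOM_CHECKS[pattern](value)
    tag = pattern[0]
    if tag == "_":
        return True
    if tag == "bind":
        bindings.append(value)  # capture first ...
        return match(pattern[1], value, bindings)  # ... then recurse
    if tag == "and":
        return all(match(p, value, bindings) for p in pattern[1])
    if tag == "not":
        return not match(pattern[1], value, [])  # no PBind allowed inside
    if tag == "lit":
        return pattern[1] == value
    if tag == "rec":
        return (isinstance(value, Record) and value.label == pattern[1]
                and match_items(pattern[2], list(value.fields), bindings))
    if tag == "arr":
        return isinstance(value, list) and match_items(pattern[1], value, bindings)
    if tag == "dict":  # extra entries in the value are permitted
        return isinstance(value, dict) and all(
            k in value and match(p, value[k], bindings)
            for k, p in pattern[1].items())
    raise ValueError(f"unknown pattern {pattern!r}")


pattern = ("bind", ("arr", [("bind", ("_",)), ("bind", ("_",))]))
captured = []
matched = match(pattern, ["a", "b"], captured)
```

Running the pattern from the example against `["a", "b"]` leaves `captured` holding the whole array first, followed by its two elements.
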
### Templates
A `Template` within a rewrite produces a concrete Preserves value when instantiated with a
vector of captured binding values. Template instantiation may fail, yielding no value.
A given `Template` may be any of the following variants:
```preserves-schema
Template = TAttenuate / TRef / Lit / TCompound .
```
`TAttenuate` first instantiates the sub-`template`. If it yields a value, and if that value is
an embedded reference (i.e. a capability), the `Caveat`s in `attenuation` are appended to the
(possibly-empty) sequence of `Caveat`s already present in the embedded capability. The
resulting possibly-attenuated capability is the final result of instantiation of the
`TAttenuate`.
```preserves-schema
TAttenuate = <attenuate @template Template @attenuation [Caveat ...]>.
```
`TRef` retrieves the entry at (0-based) index `binding` of the bindings vector, yielding the
associated captured value as the result of instantiation. It is an error if `binding` is less
than zero, or greater than or equal to the number of bindings in the bindings vector.
```preserves-schema
TRef = <ref @binding int>.
```
`Lit` (the same definition as used in the grammar for `Pattern` above) instantiates to exactly
its `value` argument:
```preserves-schema
Lit = <lit @value any>.
```
Finally, `TCompound` instantiates to compound data. The `rec` variant produces a record with
the given `label` and `fields`; `arr` produces an array; and `dict` a dictionary:
```preserves-schema
TCompound =
/ @rec <rec @label any @fields [Template ...]>
/ @arr <arr @items [Template ...]>
/ @dict <dict @entries { any: Template ...:... }> .
```
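
Template instantiation can be sketched in the same style. As before, the tuple encoding of templates and the `Record` and `Embedded` types are assumptions made for this example; in particular, an embedded capability is modelled as a pair of an opaque ref and a tuple of caveats, and failure is signalled with an exception rather than an absent value.

```python
from collections import namedtuple

Record = namedtuple("Record", "label fields")
Embedded = namedtuple("Embedded", "ref attenuation")  # attenuation: tuple of caveats


class InstantiationError(Exception):
    """Instantiation failed and yields no value."""


def instantiate(template, bindings):
    tag = template[0]
    if tag == "ref":
        index = template[1]
        if not 0 <= index < len(bindings):
            raise InstantiationError(f"binding {index} out of range")
        return bindings[index]
    if tag == "lit":
        return template[1]
    if tag == "rec":
        return Record(template[1], [instantiate(t, bindings) for t in template[2]])
    if tag == "arr":
        return [instantiate(t, bindings) for t in template[1]]
    if tag == "dict":
        return {k: instantiate(t, bindings) for k, t in template[1].items()}
    if tag == "attenuate":
        target = instantiate(template[1], bindings)
        if not isinstance(target, Embedded):
            raise InstantiationError("attenuate needs an embedded reference")
        # New caveats are appended to those already on the capability.
        return Embedded(target.ref, target.attenuation + tuple(template[2]))
    raise ValueError(f"unknown template {template!r}")


bindings = [["a", "b"], "a", "b"]
swapped = instantiate(("rec", "pair", [("ref", 2), ("ref", 1)]), bindings)
cap = Embedded("entity", ("c1",))
narrowed = instantiate(("attenuate", ("ref", 0), ["c2"]), [cap])
```
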
### Validity of Caveats
The above definitions imply some *validity constraints* on `Caveat`s.
- All `TRef`s must be bound: each index referred to must be the index of some `PBind` in the
pattern corresponding to the template.
- Binding under negation is forbidden: a `pattern` within a `PNot` may not include any `PBind`
constructors.
- The value produced by instantiation of `template` within a `TAttenuate` must be an embedded
reference (a capability).
Implementations MUST enforce these constraints (either statically or dynamically).
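
The first two constraints lend themselves to a static check, sketched below using the same hypothetical tuple encoding as the earlier examples (`("bind", p)`, `("ref", n)`, and so on). The third constraint depends on runtime values and is not covered here, and caveats nested inside a `TAttenuate` would need to be checked separately.

```python
class InvalidCaveat(Exception):
    pass


def count_binds(pattern, under_not=False):
    """Count PBinds in a pattern, refusing any PBind beneath a PNot."""
    if isinstance(pattern, str):  # PAtom or PEmbedded
        return 0
    tag = pattern[0]
    if tag == "bind":
        if under_not:
            raise InvalidCaveat("PBind under PNot")
        return 1 + count_binds(pattern[1], under_not)
    if tag == "not":
        count_binds(pattern[1], True)
        return 0
    if tag in ("and", "arr"):
        subs = pattern[1]
    elif tag == "rec":
        subs = pattern[2]
    elif tag == "dict":
        subs = list(pattern[1].values())
    else:  # "_" and "lit" have no subpatterns
        subs = []
    return sum(count_binds(p, under_not) for p in subs)


def template_refs(template):
    """Yield every binding index mentioned by a template."""
    tag = template[0]
    if tag == "ref":
        yield template[1]
    elif tag == "attenuate":
        yield from template_refs(template[1])
    elif tag == "rec":
        for t in template[2]:
            yield from template_refs(t)
    elif tag == "arr":
        for t in template[1]:
            yield from template_refs(t)
    elif tag == "dict":
        for t in template[1].values():
            yield from template_refs(t)


def check_rewrite(pattern, template):
    available = count_binds(pattern)
    for index in template_refs(template):
        if not 0 <= index < available:
            raise InvalidCaveat(f"TRef {index} is unbound")


check_rewrite(("bind", ("_",)), ("arr", [("ref", 0), ("lit", "x")]))  # valid
```
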
## Membranes
Every relay maintains two stateful objects called *membranes*. A membrane is a bidirectional
mapping between [OID](./glossary.md#oid) and relay-internal entity pointer. Membranes connect
embedded references on the wire to entity references local to the relay.
- The *import membrane* connects OIDs managed by the *remote* peer to local [relay
entities](#relay-entities) which proxy access to an "imported" remote entity.
- The *export membrane* connects OIDs managed by the *local* peer to any local "exported"
entities accessible to the peer.
```ditaa membranes
|
|
Export Membrane | Import Membrane
|
+-+ | +-+
Pointer | | ID | ID | |
0x1234 <-+-+-> "my 7" | "your 7"<-+-+-> 0x9abc
| | | | |
^ | | ^ | ^ | | ^
| | | | -+- | | | |
V | | | | | | V
/------\ | | \-------------/ | | /------\
|Entity| | | | | |Relay |<-- ...
\------/ | | | | |Entity|
0x1234 | | --------> | | \------/
=-------------+-+---= packets | | 0x9abc
0x462e | | <-------- =---+-+-------------=
/------\ | | | | 0xa043
|Relay | | | | | /------\
... -->|Entity| | | /-------------\ | | |Entity|
\------/ | | | | | | \------/
^ | | | -+- | | | ^
| | | | | | | | |
V | | V | V | | V
| | | | |
0x462e <-+-+->"your 3" | "my 3" <-+-+-> 0xa043
Pointer | | ID | ID | | Pointer
+-+ | +-+
|
Import Membrane | Export Membrane
|
|
```
<!--
```ditaa one-sided-membrane
---------------------------------------+
|
Membrane |
----+---- |
/--\ | | |
|A1|<----+ Pointer | ID +-+-------------transport---
\--/ | |
+---0x7f1065218700<-+-> 7
| <---- from remote
/--\ | peer
|A2|<-----0x7f1065229780<-+-> 11
\--/ |
| ----> to remote
+--0x7f10652fe7c0<-+-> 13 peer
/--\ | |
|A3|<----+ | +-+-------------------------
\--/ | |
|
|
----------------------------------------+
```
-->
<!--
Each relay rewrites the embedded references in the messages it sends and receives. It maps back
and forth between one scope's name for an entity and the other scope's name for the same
entity.
```ditaa protocol-scope-chains
(Illustrative Example)
Browser syndicateserver
+----------------+ WebSocket +-------+-----------------------+
|Inbrowser scope|<----------->| Relay |<-\ |
+----------------+ (2) +-------+ | +---------+ |
| ^ \->|Dataspace| |
| | +---------+ |
| V ^ ^ ^ (1) |
+----------------+ TCP/IP +-------+ | | | |
|Remote Syndicate|<----------->| Relay |<----+-/ | | |
|server/dataspace| (3) +-------+ | | | |
+----------------+ | V V V |
| +-------+ +-------+ |
| | Relay | | Relay | |
+----------+-------+-+-------+--+
^ ^
LAN multicast (4) | | UNIX
... <-------/---------/---------/---------/ | socket
| | | | (5)
v v v v
+-------+ +-------+ +-------+ +-------+
|. . . | |. . . | |. . . | |. . . |
+-------+ +-------+ +-------+ +-------+
```
In the diagram above, networks (scopes) 1 and 4 are *multicast*, while networks 2, 3 and 5 are
*point-to-point*. Four relays bridge scope 1 to scopes 2 through 5. Within each scope, peers
are able to interact with each other directly. Each point-to-point scope contains exactly two
peers.
-->
Logically, a membrane's state can be represented as a set of `WireSymbol` structures: a
`WireSymbol` is a triple of an OID, a local reference pointer (its *ref*), and a reference
count. There is never more than one `WireSymbol` associated with an OID or a ref.
A `WireSymbol` exists only so long as some assertion mentioning its OID exists across the relay
link. When the last assertion mentioning an OID is retracted, its `WireSymbol` is deleted.
Assertions mentioning a particular OID can come from *either side* of the relay link:
initially, a local reference is sent to the peer in an assertion, but then the peer may assert
something *back*, either targeting or mentioning the same entity. Care must be taken not to
release an OID entry prematurely in such situations.
For example, at least the following contribute to a `WireSymbol`'s reference count:
- The initial entry mapping a local entity ref to a well-known OID for use at session startup
([see below](#well-known-oids)) contributes a permanent reference.
- Mention of an OID in a received or sent `TurnEvent` adds one to the OID's reference count
for the duration of processing of the event. For `Assert` events in either direction, the
duration of processing is until the assertion is later retracted. For received `Message`
events, the duration of processing is until the incoming message has been forwarded on to
the target ref.
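
A membrane with reference-counted `WireSymbol`s can be sketched as two dictionaries kept in step. The class layout and method names below are assumptions made for illustration; refs are stood in for by strings so that the example stays self-contained.

```python
class WireSymbol:
    def __init__(self, oid, ref):
        self.oid = oid
        self.ref = ref
        self.count = 0


class Membrane:
    """Bidirectional OID <-> ref mapping with reference counts."""

    def __init__(self):
        self.by_oid = {}
        self.by_ref = {}

    def insert(self, oid, ref):
        if oid in self.by_oid or ref in self.by_ref:
            raise ValueError("OID or ref already has a WireSymbol")
        symbol = WireSymbol(oid, ref)
        self.by_oid[oid] = symbol
        self.by_ref[ref] = symbol
        return symbol

    def acquire(self, symbol):
        symbol.count += 1

    def release(self, symbol):
        symbol.count -= 1
        if symbol.count == 0:  # last mention gone: forget the mapping
            del self.by_oid[symbol.oid]
            del self.by_ref[symbol.ref]


exports = Membrane()
well_known = exports.insert(0, "gatekeeper")
exports.acquire(well_known)   # permanent reference, never released

entity = exports.insert(7, "entity-7")
exports.acquire(entity)       # we assert something mentioning OID 7
exports.acquire(entity)       # the peer asserts something back at OID 7
exports.release(entity)       # our assertion is retracted
still_mapped = 7 in exports.by_oid
exports.release(entity)       # the peer's assertion is retracted
```

The entry for OID 7 survives the first retraction because the peer's assertion still mentions it, which is the situation the text warns about.
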
**"Transient" references.** Embedded references in `Message` event bodies are special. Because
messages, unlike assertions, have no notion of lifetime—they are forwarded and forgotten—it is
not possible for a message to cause establishment of a long-lived entry in a membrane's
`WireSymbol` set. Therefore, messages MUST NOT embed any reference not previously known to the
peer (a "transient reference"). In other words, only after using an *assertion* to introduce a
reference, associating a conversational context with its lifetime, is it permitted to discuss
the reference using *messages*. A relay receiving a message bearing a transient reference MUST
terminate the session with an error. A relay about to send such a message SHOULD preemptively
refuse to do so.
### <span id="inbound-rewriting"></span>Rewriting embedded references upon receipt
When processing a `Value` *v* in a received `Assert` or `Message` event, embedded references in
*v* are decoded from their [on-the-wire `WireRef` form](#capabilities-on-the-wire) to in-memory
ref-pointer form.
The value is recursively traversed. As the relay comes across each embedded `WireRef`,
- If it is of `mine` variant, it refers to an entity exported by the remote, sending peer. Its
OID is looked up in the import membrane.
- If no `WireSymbol` exists in the import membrane, one is created, mapping the OID to a
fresh [relay entity](#relay-entities) for the OID.
- If a `WireSymbol` is already present, its associated ref is substituted into *v*.
- If it is of `yours` variant, it refers to an entity previously exported by the local,
receiving peer. Its OID is looked up in the export membrane.
- If no `WireSymbol` exists for the OID, one is created, associating the OID with a dummy
inert entity ref. The dummy ref is substituted into *v*. It will later be released once
the reference count of the `WireSymbol` drops to zero.
- If a `WireSymbol` exists for the OID, and the `WireRef` is not
[attenuated](#attenuation-of-authority), the associated ref is substituted into *v*. If
the `WireRef` is attenuated, the associated ref is wrapped with the `Caveat`s from the
`WireRef` before its substitution into *v*.
- In each case, the `WireSymbol` associated with the OID has its reference count incremented
(if an `Assert` is being processed).
In addition, for `Assert` events, the `WireSymbol` (necessarily in the export membrane)
associated with the OID to which the incoming `Assert` is targeted has its reference count
incremented.
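
The traversal above might be sketched as follows. Everything named here is hypothetical: `Wire` stands for a decoded `WireRef`, each membrane is reduced to a dictionary from OID to a small `{"ref", "count"}` entry, and `make_proxy` and `attenuate` are caller-supplied helpers. The sketch walks only lists and dictionaries, applies the transient-reference rule by raising for unknown OIDs in messages, and leaves out the extra increment for the target OID of an `Assert`.

```python
from collections import namedtuple

Wire = namedtuple("Wire", "tag oid caveats")  # tag 0 = mine, 1 = yours
INERT = "inert-entity"  # stand-in for the dummy inert entity ref


class TransientReference(Exception):
    """A Message mentioned an OID that no assertion had introduced."""


def rewrite_in(v, imports, exports, make_proxy, attenuate, is_assert):
    if isinstance(v, Wire):
        membrane = imports if v.tag == 0 else exports
        symbol = membrane.get(v.oid)
        if symbol is None:
            if not is_assert:
                raise TransientReference(v.oid)
            fresh = make_proxy(v.oid) if v.tag == 0 else INERT
            symbol = membrane[v.oid] = {"ref": fresh, "count": 0}
            ref = fresh
        elif v.tag == 1 and v.caveats:
            ref = attenuate(symbol["ref"], v.caveats)  # wrap with the caveats
        else:
            ref = symbol["ref"]
        if is_assert:
            symbol["count"] += 1
        return ref
    if isinstance(v, list):
        return [rewrite_in(x, imports, exports, make_proxy, attenuate, is_assert)
                for x in v]
    if isinstance(v, dict):
        return {k: rewrite_in(x, imports, exports, make_proxy, attenuate, is_assert)
                for k, x in v.items()}
    return v


imports = {}
exports = {3: {"ref": "local-3", "count": 1}}
value = ["hello", Wire(0, 555, ()), {"k": Wire(1, 3, ("c",))}]
decoded = rewrite_in(value, imports, exports,
                     make_proxy=lambda oid: f"proxy-{oid}",
                     attenuate=lambda ref, caveats: (ref, caveats),
                     is_assert=True)
```
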
### <span id="outbound-rewriting"></span>Rewriting embedded references for transmission
When transmitting a `Value` *v* in an `Assert` or `Message` event, embedded references in *v*
are encoded from their in-memory ref-pointer form to [on-the-wire `WireRef`
form](#capabilities-on-the-wire).
The value is recursively traversed. As the relay comes across each embedded reference:
- The reference is first looked up in the export membrane. If an associated `WireSymbol` is
present in the export membrane, its OID is substituted as a `mine`-variant `WireRef` into
*v*.
- Otherwise, it is looked up in the import membrane. If *no* associated `WireSymbol` exists
there, a fresh OID and `WireSymbol` are placed in the export membrane, and the new OID is
substituted as a `mine`-variant `WireRef` into *v*. If a `WireSymbol` exists in the import
membrane, however, the embedded reference must be a local [relay entity](#relay-entities)
referencing a previously-imported remote entity:
- If the local entity reference has not been attenuated subsequent to its import, the OID it
was imported under is substituted as a `yours`-variant `WireRef` into *v* with an empty
attenuation.
- If it has been attenuated, [the relay may choose whether to trust the remote party to
enforce an attenuation request](#attenuation-of-authority). If it trusts the peer to
honour attenuation requests, it substitutes a `yours`-variant `WireRef` with non-empty
attenuation into *v*. Otherwise, a fresh OID and `WireSymbol` are placed in the export
membrane, with ref denoting the attenuated local reference, and the new OID is substituted
as a `mine`-variant `WireRef` into *v*.
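
The decision procedure for a single outgoing reference can be sketched like this. The model is an assumption made for the example: a local reference is a `Cap` pairing an underlying target with a tuple of caveats, the export membrane is a dictionary from `Cap` to OID, the import membrane a dictionary from relay-entity target to OID, and `trust_peer` stands for the relay's policy on attenuation requests. Reference counting and the recursive traversal of *v* are omitted.

```python
from collections import namedtuple
from itertools import count

Cap = namedtuple("Cap", "target caveats")  # caveats: tuple, empty if unattenuated


def encode_ref(cap, exports, imports, fresh_oids, trust_peer):
    """Return the WireRef (as a list) to transmit for one local reference."""
    def export_fresh():
        oid = next(fresh_oids)
        exports[cap] = oid
        return [0, oid]

    if cap in exports:                  # already exported: mine
        return [0, exports[cap]]
    if cap.target not in imports:       # purely local entity: export it
        return export_fresh()
    oid = imports[cap.target]           # a relay entity for a remote entity
    if not cap.caveats:
        return [1, oid]                 # yours, empty attenuation
    if trust_peer:
        return [1, oid, *cap.caveats]   # yours, with attenuation request
    return export_fresh()               # enforce the attenuation locally


exports = {Cap("local", ()): 4}
imports = {"proxy-9": 9}
oids = count(100)

known = encode_ref(Cap("local", ()), exports, imports, oids, False)
fresh = encode_ref(Cap("new", ()), exports, imports, oids, False)
again = encode_ref(Cap("new", ()), exports, imports, oids, False)
plain = encode_ref(Cap("proxy-9", ()), exports, imports, oids, False)
trusted = encode_ref(Cap("proxy-9", ("c",)), exports, imports, oids, True)
untrusted = encode_ref(Cap("proxy-9", ("c",)), exports, imports, oids, False)
```
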
## Relay entities
A relay entity is a local proxy for an entity at the other side of a relay link. It forwards
events delivered to it—`assert`, `retract`, `message` and `sync`—across the link to its
counterpart at the other end. It holds two pieces of state: a pointer to the relay link, and
the OID of the remote entity it represents. It packages all received events into `TurnEvent`s
which are then sent across the transport.
**Turn boundaries.** When the relay is structured internally using the SAM, it is important to
preserve turn boundaries. When all the relay entities of a given relay instance are managed by
a single actor, this will be natural: a single turn can deliver events to a group of entities
in the actor, so if the relay entity enqueues its `TurnEvent`s in a buffer which is flushed
into a `Turn` packet sent across the transport at the conclusion of the turn, the correct turn
boundaries will be preserved.
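
The buffering arrangement just described can be sketched as below. `RelayLink`, `RelayEntity`, the `send` callback and the `end_of_turn` hook are names invented for the example; how a relay learns that a turn has concluded depends on the surrounding SAM implementation.

```python
class RelayLink:
    """Collects TurnEvents during a turn and flushes them as one Turn."""

    def __init__(self, send):
        self._send = send        # hypothetical callback that transmits a Turn
        self._pending = []

    def enqueue(self, oid, event):
        self._pending.append([oid, event])

    def end_of_turn(self):
        if self._pending:
            packet, self._pending = self._pending, []
            self._send(packet)


class RelayEntity:
    """Local proxy holding only its link and the remote entity's OID."""

    def __init__(self, link, oid):
        self.link = link
        self.oid = oid

    def assert_(self, assertion, handle):
        self.link.enqueue(self.oid, ("A", assertion, handle))

    def retract(self, handle):
        self.link.enqueue(self.oid, ("R", handle))

    def message(self, body):
        self.link.enqueue(self.oid, ("M", body))

    def sync(self, peer):
        self.link.enqueue(self.oid, ("S", peer))


sent = []
link = RelayLink(sent.append)
a, b = RelayEntity(link, 3), RelayEntity(link, 5)
a.assert_("present", 1)
b.message("ping")
link.end_of_turn()
link.end_of_turn()  # nothing pending, so no second packet is sent
```

Both events, although addressed to different relay entities, leave in a single `Turn`, so the receiver sees the same turn boundary as the sender.
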
## <span id="well-known-oids"></span>Client and server roles
While the protocol itself is symmetric, in many cases there will be one active ("client") and
one passive ("server") party during the establishment of a transport connection.
As an optional convention, a "server" MAY have a single entity exposed as *well-known OID* 0 at
the establishment of a connection, and a "client" MAY likewise expect OID 0 to resolve to some
pre-arranged entity. It is frequently useful for the pre-arranged entity to be a [gatekeeper
service](./builtin/gatekeeper.md), but direct exposure of a
[dataspace](./glossary.md#dataspace) or even some domain-specific object can also be useful.
Either or both parties to a connection may play one role, the other, neither, or both.
APIs for making use of relays in programs should permit programs to supply to a
newly-constructed relay an (optional) *initial ref*, to be exposed as well-known OID 0; an
(optional) *initial OID*, to denote a remote well-known OID and to be immediately proxied by a
local relay entity; or both.
In the case of TCP/IP, the "client" role is often played by a `connect`ing party, and the
"server" by a `listen`ing party, but the opposite arrangement is also useful from time to time.
## Security considerations
The security considerations for this protocol fall into two categories: those having to do with
particular transports for relay instances, and those having to do with the protocol itself.
### Transport security
The security of an instance of the protocol depends on the security characteristics of its
transport.
**Confidentiality.** Parties outwith the communicating peers must not be able to deduce the
contents of packets sent back and forth: some of the packets may contain secrets. For example,
a `Resolve` message sent to a [gatekeeper service](./builtin/gatekeeper.md) contains a "bearer
capability", which conveys authority to any holder able to present it to the gatekeeper.
**Integrity.** Packets delivered to peers must be proof against tampering or other in-flight
damage.
**Authenticity.** Each packet delivered to a peer must have genuinely originated with another
party, and must have genuinely originated in the same session. Forgery of packets must be
prevented.
**Replay-resistance.** Each packet delivered to a peer must be delivered exactly once within
the context of the transport session. That is, replay of otherwise-authentic packets must not
be possible from outside the session.
### Protocol security
The protocol builds on, and directly reflects, the [object-capability security
model](./glossary.md#object-capability-model) of the SAM. Entities are accessed via unforgeable
references (OIDs). OIDs are meaningful only within the context of their transport session; in
this way, they are analogous to Unix file descriptors, which are small integers that
meaningfully denote objects only within the context of a single Unix process. If the transport
is secure, so is the reference.
Entities can only obtain references to other entities by the [standard methods by which
"connectivity begets
connectivity"](http://www.erights.org/elib/capability/ode/ode-capabilities.html); namely:
- *By initial conditions.* The relevant initial conditions here are the state of the relays at
the moment a transport session is established, including any mappings from [well-known
OIDs](#well-known-oids) to their underlying refs.
- *By parenthood and by endowment.* No direct provision is made for creation of new entities
in this protocol, so these do not apply.
- *By introduction.* Transmission of OIDs in `Turn` packets, and the associated [rules for
managing the mappings between OIDs and references](#membranes), are the normal method by
which references pass from one entity to another.
While transport confidentiality is important for preserving secrecy of secrets such as bearer
capabilities, OIDs do not need this kind of protection. An attacker able to observe OIDs
communicated via a transport does not gain authority to deliver events to the denoted entity.
At most, the attacker may glean information on patterns of interconnectivity among entities
communicating across a transport link.
## Relation to CapTP
This protocol is *strikingly* similar to a family of protocols known as
[CapTP](http://www.erights.org/elib/distrib/captp/index.html) (see, for example,
[here](http://www.erights.org/elib/distrib/captp/index.html),
[here](https://spritelyproject.org/news/what-is-captp.html) and
[here](https://github.com/ocapn/ocapn)). This is no accident: the Syndicated Actor Model draws
heavily on the actor model, and has over the years been incrementally evolving to be closer and
closer to the actor model as it appears in the [E programming
language](http://www.erights.org/). However, the Syndicate protocol described in this document
was developed based on the needs of the Syndicated Actor Model, without particular reference to
CapTP. This makes it all the more striking that the similarities should be so strong. No doubt
I have been subconsciously as well as consciously influenced by E's design, but perhaps there
might also be a Platonic form awaiting discovery somewhere nearby.
For example:
- CapTP has the notion of a "c-list [capability list] index", cognate with our OID. A c-list
index is meaningful only within the context of a transport connection, just like an OID is.
A given c-list index maps to a "live-ref", an in-memory pointer to an object, in the same
way that an OID maps to a ref via a `WireSymbol`.
- CapTP has "[the four tables](http://www.erights.org/elib/distrib/captp/4tables.html)" at
each end of a connection; each of our relays has two [membranes](#membranes), each having
two unidirectional mapping tables.
- Syndicate [gatekeeper services](./builtin/gatekeeper.md) borrow the concept of a
[SturdyRef](http://wiki.erights.org/wiki/SturdyRef) directly from CapTP. However, the notion
of a gatekeeper entity at well-known OID 0 is an example of convergent evolution in action:
in the CapTP world, the [analogous
service](http://www.erights.org/elib/distrib/captp/NonceLocator.html) happens also to be
available at c-list index 0, by convention.
A notable difference is that this protocol completely lacks support for the promises/futures of
CapTP. CapTP c-list indices are just one part of a framework of
[descriptors](http://www.erights.org/elib/distrib/captp/index.html) (*desc*s) denoting various
kinds of remote object and eventual remote-procedure-call (RPC) result. The SAM handles RPC in
a different, more low-level way.
## Specific transport mappings
For now, this document focuses on `SOCK_STREAM`-like transports: reliable, in-order,
bidirectional, connection-oriented, full-duplex byte streams. While these transports naturally
have a certain level of integrity assurance and replay-resistance associated with them, special
care should be taken in the case of non-cryptographic transport protocols like plain TCP/IP.
To use such a transport for this protocol, establish a connection and begin transmitting
[`Packet`s](#packet-definitions) encoded as Preserves values using either the Preserves [text
syntax](https://preserves.dev/preserves-text.html) or the Preserves
[machine-oriented syntax](https://preserves.dev/preserves-binary.html).
The session starts with the first packet and ends with transport disconnection. If either peer
in a connection detects a syntax error, it MUST disconnect the transport. A responding server
MUST support the binary syntax, and MAY also support the text syntax. It can autodetect the
syntax variant by following [the rules in the
specification](https://preserves.dev/preserves-binary.html#appendix-autodetection-of-textual-or-binary-syntax):
the first byte of a valid binary-syntax Preserves document is guaranteed not to be
interpretable as the start of a valid UTF-8 sequence.
`Packet`s encoded in either binary or text syntax are self-delimiting. However, peers using
text syntax MAY choose to insert whitespace (e.g. newline) after each transmitted packet.
Some domain-specific details are also relevant:
- **Unix-domain sockets.** An additional layer of authentication checks can be made based on
process-ID and user-ID credentials associated with each Unix-domain socket.
- **TCP/IP sockets.** Plain TCP/IP sockets offer only weak message integrity and
replay-resistance guarantees, and offer no authenticity or confidentiality guarantees at
all. Plain TCP/IP sockets SHOULD NOT be used; consider using TLS sockets instead.
- **TLS atop TCP/IP.** An additional layer of authentication checks can be made based on the
signatures and certificates exchanged during TLS setup.
  > TODO: concretely develop some recommendations for ordinary use of TLS certificates,
  > including referencing a domain name in a `SturdyRef`, checking the presented certificate,
  > and requiring SNI at the server end.
- **WebSockets atop HTTP 1.x.** These suffer from flaws similar to those of plain TCP/IP
sockets and SHOULD NOT be used.
- **WebSockets atop HTTPS 1.x.** Similar considerations to the use of TLS sockets apply
regarding authentication checks. WebSocket messages are self-delimiting; peers MUST place
exactly one `Packet` in each WebSocket message. A server socket that handles plain Preserves
binary and/or text syntax for `Packet`s can also autodetect a WebSocket protocol multiplexed
on the same socket, because (a) WebSockets are established after a standard HTTP(S) message
header exchange, (b) every HTTP(S) request header starts with an ASCII letter, and (c) every
`Packet` in text syntax begins with either the ASCII "`<`" character (an `Error` or
`Extension` record) or the ASCII "`[`" character (a `Turn`, which the schema in the appendix
defines as a sequence). A first byte that is an ASCII character between "`A`" and "`Z`" or
"`a`" and "`z`" must therefore be HTTP, an ASCII "`<`" or "`[`" must be Preserves text syntax,
and any byte with the high bit set must be Preserves binary syntax.
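
The first-byte rule above can be sketched in a few lines of Python. This is only an
illustration: the function name `classify_first_byte` and the returned labels are hypothetical,
not part of the protocol. The sketch accepts "`[`" as well as "`<`" as the start of text syntax,
on the assumption that a `Turn` packet, being a sequence in the schema in the appendix, opens
with a bracket.

```python
def classify_first_byte(first: int) -> str:
    """Guess how a freshly accepted connection is framed from its first byte.

    The labels returned here are illustrative names, not protocol constants.
    """
    if not 0 <= first <= 0xFF:
        raise ValueError("expected a single byte value")
    if 0x41 <= first <= 0x5A or 0x61 <= first <= 0x7A:
        return "http"              # ASCII letter: start of an HTTP request line
    if first in (0x3C, 0x5B):
        return "preserves-text"    # "<" opens a record, "[" opens a sequence
    if first & 0x80:
        return "preserves-binary"  # high bit set: binary syntax
    return "unknown"               # anything else: handling is left to the server


print(classify_first_byte(ord("G")))  # http
print(classify_first_byte(ord("<")))  # preserves-text
print(classify_first_byte(0xB4))      # preserves-binary
```

A server might peek at the first byte without consuming it, dispatch on the result, and
disconnect when the result is `"unknown"`.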
## Appendix: Complete schema of the protocol
The following is a consolidated form of the definitions from the text above.
### Protocol packets
The authoritative
version of this schema is
[`[syndicate-protocols]/schemas/protocol.prs`](https://git.syndicate-lang.org/syndicate-lang/syndicate-protocols/src/branch/main/schemas/protocol.prs).
```preserves-schema
version 1 .
Packet = Turn / Error / Extension .
Extension = <<rec> @label any @fields [any ...]> .
Error = <error @message string @detail any>.
Assertion = any .
Handle = int .
Event = Assert / Retract / Message / Sync .
Oid = int .
Turn = [TurnEvent ...].
TurnEvent = [@oid Oid @event Event].
Assert = <A @assertion Assertion @handle Handle>.
Retract = <R @handle Handle>.
Message = <M @body Assertion>.
Sync = <S @peer #:#t>.
```
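
To make the packet shapes concrete, here is a minimal Python sketch that builds a `Turn` and an
`Error` following the schema above and renders them in Preserves text syntax. The `Symbol` and
`Record` classes, the `to_text` renderer, and the helper names are all hypothetical; the
renderer covers only booleans, integers, simple strings, plain symbols, records, and sequences,
and the `greeting` assertion is invented for the example.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Symbol:
    """A Preserves symbol; only plain identifier-like names are handled here."""
    name: str


@dataclass(frozen=True)
class Record:
    """A Preserves record: a label followed by zero or more fields."""
    label: Any
    fields: Tuple[Any, ...] = ()


def to_text(value: Any) -> str:
    """Render a small subset of Preserves values in text syntax."""
    if isinstance(value, bool):  # must precede int: bool is a subclass of int
        return "#t" if value else "#f"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(value, Symbol):
        return value.name
    if isinstance(value, Record):
        parts = [value.label, *value.fields]
        return "<" + " ".join(to_text(p) for p in parts) + ">"
    if isinstance(value, (list, tuple)):
        return "[" + " ".join(to_text(v) for v in value) + "]"
    raise TypeError(f"unsupported value: {value!r}")


def assert_event(assertion: Any, handle: int) -> Record:
    return Record(Symbol("A"), (assertion, handle))


def retract_event(handle: int) -> Record:
    return Record(Symbol("R"), (handle,))


def message_event(body: Any) -> Record:
    return Record(Symbol("M"), (body,))


def error_packet(message: str, detail: Any) -> Record:
    return Record(Symbol("error"), (message, detail))


def turn(*turn_events: Tuple[int, Record]) -> list:
    """A Turn is a sequence of [oid, event] pairs."""
    return [[oid, event] for oid, event in turn_events]


greeting = Record(Symbol("greeting"), ("hello",))
packet = turn((0, assert_event(greeting, 1)), (0, retract_event(1)))
print(to_text(packet))                             # [[0 <A <greeting "hello"> 1>] [0 <R 1>]]
print(to_text(error_packet("bad packet", False)))  # <error "bad packet" #f>
```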
### Capabilities, WireRefs, and attenuations
The authoritative version of this schema is
[`[syndicate-protocols]/schemas/sturdy.prs`](https://git.syndicate-lang.org/syndicate-lang/syndicate-protocols/src/branch/main/schemas/sturdy.prs).
```preserves-schema
version 1 .
Caveat = Rewrite / Alts / Reject / @unknown any .
Rewrite = <rewrite @pattern Pattern @template Template> .
Reject = <reject @pattern Pattern> .
Alts = <or @alternatives [Rewrite ...]>.
Oid = int .
WireRef = @mine [0 @oid Oid] / @yours [1 @oid Oid @attenuation Caveat ...].
Lit = <lit @value any>.
Pattern = PDiscard / PAtom / PEmbedded / PBind / PAnd / PNot / Lit / PCompound .
PDiscard = <_>.
PAtom = =Boolean / =Double / =SignedInteger / =String / =ByteString / =Symbol .
PEmbedded = =Embedded .
PBind = <bind @pattern Pattern>.
PAnd = <and @patterns [Pattern ...]>.
PNot = <not @pattern Pattern>.
PCompound =
/ @rec <rec @label any @fields [Pattern ...]>
/ @arr <arr @items [Pattern ...]>
/ @dict <dict @entries { any: Pattern ...:... }> .
Template = TAttenuate / TRef / Lit / TCompound .
TAttenuate = <attenuate @template Template @attenuation [Caveat ...]>.
TRef = <ref @binding int>.
TCompound =
/ @rec <rec @label any @fields [Template ...]>
/ @arr <arr @items [Template ...]>
/ @dict <dict @entries { any: Template ...:... }> .
```
## Appendix: Pseudocode for attenuation, pattern matching, and template instantiation
### Attenuation
```python
def attenuate(caveats, value):
    # Caveats are applied from the last entry in the list to the first.
    for caveat in reversed(caveats):
        value = applyCaveat(caveat, value)
        if value is None:
            return None
    return value

def applyCaveat(caveat, value):
    if caveat is 'Alts' variant:
        # The first alternative that matches wins.
        for rewrite in caveat.alternatives:
            possibleResult = tryRewrite(rewrite, value)
            if possibleResult is not None:
                return possibleResult
        return None
    if caveat is 'Rewrite' variant:
        return tryRewrite(caveat, value)
    if caveat is 'Reject' variant:
        if applyPattern(caveat.pattern, value) is None:
            return value
        else:
            return None
    if caveat is 'unknown' variant:
        return None

def tryRewrite(rewrite, value):
    bindings = applyPattern(rewrite.pattern, value)
    if bindings is None:
        return None
    else:
        return instantiate(rewrite.template, bindings)
```
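
The control flow of the attenuation pseudocode can be exercised on its own by abstracting away
patterns and templates. In the sketch below, which is an illustration rather than the
reference algorithm, each caveat is modelled as a plain Python function returning either a
rewritten value or `None`; the helpers `alts`, `reject`, `double_ints`, `upper_strings`, and
`tag_outermost` are hypothetical names introduced for the example.

```python
from typing import Any, Callable, Optional, Sequence

# Modelling assumption: a caveat maps a value to a rewritten value, or to None to drop it.
Caveat = Callable[[Any], Optional[Any]]


def attenuate(caveats: Sequence[Caveat], value: Any) -> Optional[Any]:
    """Apply caveats from the last list entry to the first; None means 'rejected'."""
    for caveat in reversed(caveats):
        value = caveat(value)
        if value is None:
            return None
    return value


def alts(*rewrites: Caveat) -> Caveat:
    """Like the 'Alts' variant: the first rewrite that yields a result wins."""
    def apply(value: Any) -> Optional[Any]:
        for rewrite in rewrites:
            result = rewrite(value)
            if result is not None:
                return result
        return None
    return apply


def reject(predicate: Callable[[Any], bool]) -> Caveat:
    """Like the 'Reject' variant: drop values that match, pass the rest unchanged."""
    return lambda value: None if predicate(value) else value


def double_ints(value: Any) -> Optional[Any]:
    is_int = isinstance(value, int) and not isinstance(value, bool)
    return value * 2 if is_int else None


def upper_strings(value: Any) -> Optional[Any]:
    return value.upper() if isinstance(value, str) else None


def tag_outermost(value: Any) -> Any:
    return ("outermost", value)


# The LAST entry runs first: reject, then the alternatives, then the tag.
chain = [tag_outermost, alts(double_ints, upper_strings), reject(lambda v: v == 0)]

print(attenuate(chain, 21))    # ('outermost', 42)
print(attenuate(chain, "hi"))  # ('outermost', 'HI')
print(attenuate(chain, 0))     # None: rejected by the last caveat in the list
print(attenuate(chain, 1.5))   # None: no alternative matched
```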
### Pattern matching
```python
def match(pattern, value, bindings):
    if pattern is 'PDiscard' variant:
        return True
    if pattern is 'PAtom' variant:
        return True if value is of the appropriate atomic class else False
    if pattern is 'PEmbedded' variant:
        return True if value is a capability else False
    if pattern is 'PBind' variant:
        append value to bindings
        return match(pattern.pattern, value, bindings)
    if pattern is 'PAnd' variant:
        for p in pattern.patterns:
            if not match(p, value, bindings):
                return False
        return True
    if pattern is 'PNot' variant:
        return False if match(pattern.pattern, value, bindings) else True
    if pattern is 'Lit' variant:
        return (pattern.value == value)
    if pattern is 'PCompound' variant:
        if pattern is 'rec' variant:
            if value is not a record: return False
            if value.label is not equal to pattern.label: return False
            if value.fields.length is not equal to pattern.fields.length: return False
            for i in [0 .. pattern.fields.length):
                if not match(pattern.fields[i], value.fields[i], bindings):
                    return False
            return True
        if pattern is 'arr' variant:
            if value is not a sequence: return False
            if value.length is not equal to pattern.items.length: return False
            for i in [0 .. pattern.items.length):
                if not match(pattern.items[i], value[i], bindings):
                    return False
            return True
        if pattern is 'dict' variant:
            if value is not a dictionary: return False
            for k in keys of pattern.entries:
                if k not in keys of value: return False
                if not match(pattern.entries[k], value[k], bindings):
                    return False
            return True

# Entry point used by the attenuation pseudocode: the list of bindings
# on success, or None if the pattern does not match.
def applyPattern(pattern, value):
    bindings = []
    if match(pattern, value, bindings):
        return bindings
    else:
        return None
```
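
For readers who want something executable, the following is one possible Python rendering of
the matcher. It is a sketch under several assumptions that are not part of the specification:
patterns are encoded as tagged tuples such as `("bind", p)` or `("rec", label, [p, ...])`,
record labels are plain Python strings standing in for symbols, `Cap` is a stand-in for an
embedded capability, and the `Symbol` atom class is omitted for brevity. Literal comparison
uses Python's `==`, which is looser than Preserves equality (for example, `1 == 1.0`).

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    label: Any
    fields: Tuple[Any, ...] = ()


class Cap:
    """Stand-in for an embedded capability (hypothetical)."""


ATOM_CLASSES = {"Boolean": bool, "Double": float, "SignedInteger": int,
                "String": str, "ByteString": bytes}


def is_atom(kind: str, value: Any) -> bool:
    if kind == "SignedInteger" and isinstance(value, bool):
        return False  # Python's bool is a subclass of int; keep them apart
    return isinstance(value, ATOM_CLASSES[kind])


def match(pattern: tuple, value: Any, bindings: List[Any]) -> bool:
    tag = pattern[0]
    if tag == "discard":
        return True
    if tag == "atom":
        return is_atom(pattern[1], value)
    if tag == "embedded":
        return isinstance(value, Cap)
    if tag == "bind":
        bindings.append(value)  # captured before the sub-pattern is checked
        return match(pattern[1], value, bindings)
    if tag == "and":
        return all(match(p, value, bindings) for p in pattern[1])
    if tag == "not":
        return not match(pattern[1], value, bindings)
    if tag == "lit":
        return pattern[1] == value
    if tag == "rec":
        _, label, fields = pattern
        return (isinstance(value, Record) and value.label == label
                and len(value.fields) == len(fields)
                and all(match(p, v, bindings) for p, v in zip(fields, value.fields)))
    if tag == "arr":
        items = pattern[1]
        return (isinstance(value, (list, tuple)) and len(value) == len(items)
                and all(match(p, v, bindings) for p, v in zip(items, value)))
    if tag == "dict":
        # Extra keys in the value are permitted; every pattern key must be present.
        return (isinstance(value, dict)
                and all(k in value and match(p, value[k], bindings)
                        for k, p in pattern[1].items()))
    raise ValueError(f"unknown pattern tag: {tag!r}")


def apply_pattern(pattern: tuple, value: Any) -> Optional[List[Any]]:
    """Return the captured bindings on success, or None on failure."""
    bindings: List[Any] = []
    return bindings if match(pattern, value, bindings) else None


point = ("rec", "point", [("bind", ("atom", "SignedInteger")), ("discard",)])
print(apply_pattern(point, Record("point", (3, 4))))    # [3]
print(apply_pattern(point, Record("point", ("x", 4))))  # None
print(apply_pattern(("dict", {"name": ("bind", ("atom", "String"))}),
                    {"name": "relay", "extra": 1}))     # ['relay']
```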
### Template instantiation
```python
def instantiate(template, bindings):
    if template is 'TAttenuate' variant:
        c = instantiate(template.template, bindings)
        if c is not a capability: raise an exception
        c = c with the caveats in template.attenuation appended to the existing
            attenuation in c
        return c
    if template is 'TRef' variant:
        if 0 <= template.binding < bindings.length:
            return bindings[template.binding]
        else:
            raise an exception
    if template is 'Lit' variant:
        return template.value
    if template is 'TCompound' variant:
        if template is 'rec' variant:
            return Record(label=template.label,
                          fields=[instantiate(t, bindings) for t in template.fields])
        if template is 'arr' variant:
            return [instantiate(t, bindings) for t in template.items]
        if template is 'dict' variant:
            result = {}
            for k in keys of template.entries:
                result[k] = instantiate(template.entries[k], bindings)
            return result
```
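
Template instantiation can likewise be sketched as runnable Python. The encoding is an
assumption made for illustration only: templates are tagged tuples such as `("ref", 0)` or
`("rec", label, [t, ...])`, labels are plain strings standing in for symbols, and `Cap` is a
hypothetical capability type with a `target` name and an `attenuation` tuple. The example at
the end rebuilds a record around a captured binding and then appends a caveat to a captured
capability.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Record:
    label: Any
    fields: Tuple[Any, ...] = ()


@dataclass(frozen=True)
class Cap:
    """Hypothetical capability: a target name plus its accumulated caveats."""
    target: str
    attenuation: Tuple[Any, ...] = ()


def instantiate(template: tuple, bindings: List[Any]) -> Any:
    tag = template[0]
    if tag == "attenuate":
        _, inner, caveats = template
        cap = instantiate(inner, bindings)
        if not isinstance(cap, Cap):
            raise TypeError("attenuate template did not produce a capability")
        # New caveats are appended after the capability's existing attenuation.
        return Cap(cap.target, cap.attenuation + tuple(caveats))
    if tag == "ref":
        index = template[1]
        if 0 <= index < len(bindings):
            return bindings[index]
        raise IndexError(f"binding index {index} out of range")
    if tag == "lit":
        return template[1]
    if tag == "rec":
        _, label, fields = template
        return Record(label, tuple(instantiate(t, bindings) for t in fields))
    if tag == "arr":
        return [instantiate(t, bindings) for t in template[1]]
    if tag == "dict":
        return {k: instantiate(t, bindings) for k, t in template[1].items()}
    raise ValueError(f"unknown template tag: {tag!r}")


labelled = ("rec", "labelled", [("lit", "x"), ("ref", 0)])
print(instantiate(labelled, [3]))  # Record(label='labelled', fields=('x', 3))

narrowed = instantiate(("attenuate", ("ref", 0), [("reject", ("discard",))]), [Cap("ds")])
print(narrowed.attenuation)        # (('reject', ('discard',)),)
```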
---
#### Notes
[^analogy-to-subnets]: Strictly speaking, scope *subnets* are connected by relay actors. The
situation is directly analogous to IP subnets being connected by IP routers.
[^relaying-over-syndicate]: In fact, it makes perfect sense to run the relay protocol between
actors that are *already connected* in some scope: this is like running a VPN, tunnelling
IP over IP. A variation of the Syndicate Protocol like this gives [federated
dataspaces](https://syndicate-lang.org/about/history/#postdoc).
[^automatic-when-implemented-with-sam]: This process of assertion-retraction on termination is
largely automatic when relay actors are structured internally using the SAM: simply
terminating a SAM actor automatically retracts its published assertions.
[^no-extensions-yet]: This specification does not define any extensions, but future revisions
could, for example, use extensions to perform version-negotiation. Another potential future
use could be to propagate provenance information for tracing/debugging.
[^slightly-silly-wireref]: The syntax for `WireRef`s is slightly silly, using tuples as
quasi-records with `0` and `1` acting as quasi-labels. It would probably be better to use
real records, like `<my @oid Oid>` and `<yours @oid Oid @attenuation [Caveat ...]>`. Pros:
less cryptic. Cons: slightly more verbose on the wire. TODO: should we revise the spec in
this regard?
[^attenuation-is-not-enforceable]: Such conditions can only ever be requests: after all, every
`yours`-capability is already completely accessible to the recipient of the packet.
Similarly, it does not make sense to include an attenuation description on a
`my`-capability. However, in every case, if a party wishes to *enforce* an attenuation on a
`my`- or `yours`-capability, it may record the attenuation against the underlying
capability internally, issuing to its peers a fresh `my`-capability denoting the attenuated
capability.
[^caveat-terminology-macaroon]: This terminology, "caveat", is lifted from the excellent paper
on [Macaroons](./glossary.md#macaroon), where it is used to describe a more general
mechanism. Future versions of this specification may opt to include some of this
generality.
[^affine-caveats]: `Caveat`s are thus *affine*.
[^single-rewrite-meaning]: A single `Rewrite` *R* is equivalent to `<or [`*R*`]>`.