# Syndicate Protocol

Actors that share a local [scope](./glossary.md#scope) can communicate directly. To communicate further afield, scopes are *connected* using [relay actors](./glossary.md#relay).[^analogy-to-subnets] Relays allow *indirect* communication: distant entities can be addressed as if they were local.

Relays exchange *Syndicate Protocol* messages across a [transport](./glossary.md#transport). A *transport* is the underlying medium connecting one relay to its counterpart(s). For example, a TLS-on-TCP/IP socket may connect a pair of relays to one another, or a UDP multicast socket may connect an entire group of relays across an ethernet.[^relaying-over-syndicate]

## Transport requirements

Transports must

- be able to carry [Preserves](./glossary.md#preserves) values back and forth,
- be reliable and in-order,
- have a well-defined session lifecycle (created → connected → disconnected), and
- assure confidentiality, integrity, authenticity, and replay-resistance.

This document focuses primarily on point-to-point transports, discussing multicast and in-memory variations briefly toward the end.

## Roles and session lifecycle

The protocol is completely symmetric, aside from [certain conventions detailed below](#well-known-oids) about the entities available for use immediately upon connection establishment. It is *not* a client/server protocol.

**Session startup.** To begin a session on a newly-established point-to-point link, a relay simply starts sending packets. Each peer starts the session with an empty entity reference map ([see below](#membranes)) and with no assertions in either the outbound (on behalf of local entities) or inbound (on behalf of the remote peer) direction.

**Session teardown.** At the end of a session, whether it terminates normally or abnormally, cleanly or through involuntary transport disconnection, all published assertions are retracted.[^automatic-when-implemented-with-sam] This is in keeping with the essence of the [Syndicated Actor Model (SAM)](./glossary.md#syndicated-actor-model).

## Packet definitions

Packets exchanged by relays are [Preserves](./glossary.md#preserves) values defined using Preserves [schema](./glossary.md#schema).

```preserves-schema
Packet = Turn / Error / Extension .
```

A packet may be a *turn*, an *error*, or an *extension*. Packets are neither commands nor responses; they are *events*.

### Extension packets

```preserves-schema
Extension = <<rec> @label any @fields [any ...]> .
```

An extension packet must be a Preserves [record](./glossary.md#record), but is otherwise unconstrained.

**Handling.** Peers MUST ignore extensions that they do not understand.[^no-extensions-yet]

### Error packets

```preserves-schema
Error = <error @message string @detail any>.
```

**Handling.** An error packet describes something that went wrong on the other end of the connection. Error packets are primarily intended for debugging. Receipt of an error packet denotes that the sender has terminated (crashed) and will not respond further; the connection will usually be closed shortly thereafter. Error packets are *optional*: connections may simply be closed without comment.

### Turn packets

```preserves-schema
Turn = [TurnEvent ...].
TurnEvent = [@oid Oid @event Event].
Event = Assert / Retract / Message / Sync .
Assert = <assert @assertion Assertion @handle Handle>.
Retract = <retract @handle Handle>.
Message = <message @body Assertion>.
Sync = <sync @peer #:#t>.
Assertion = any .
Handle = int .
Oid = int .
```

A `Turn` is the most important packet variant. It directly reflects the [SAM](./glossary.md#syndicated-actor-model) notion of a [turn](./glossary.md#turn).

**Handling.** Each `Turn` carries [events](./glossary.md#event) to be delivered to [entities](./glossary.md#entity) residing in the scope at the receiving end of the transport.
Each event is either publication of an assertion, retraction of a previously-published assertion, delivery of a single message, or a [synchronization](./glossary.md#synchronization) event. Upon receipt of a `Turn`, the sequence of `TurnEvent`s is examined. The [OID](./glossary.md#oid) in each `TurnEvent` selects an entity known to the recipient. If a particular `TurnEvent`'s OID is not mapped to an entity, the `TurnEvent` is silently ignored, and the remaining `TurnEvent`s in the `Turn` are processed. The `assertion` fields of `Assert` events and the `body` fields of `Message` events may contain any Preserves value, including embedded entity references. On the wire, these will always be formatted [as described below](#capabilities-on-the-wire). As each `Assert` or `Message` is processed, embedded references are mapped to internal references. Symmetrically, internal references are mapped to their external form prior to transmission. [The mapping procedure to follow is detailed below.](#membranes) **Turn boundaries.** In the case that the receiving party is structured internally using the SAM, it is important to preserve turn boundaries. Since turn boundaries are a per-[actor](./glossary.md#actor) concept, but a `Turn` mentions only entities, the receiver must map entities to actors, group `TurnEvent`s into per-actor queues, and deliver those queues to each actor in a single SAM turn for each actor. **Uniqueness.** `Handle`s are meaningful only within the scope of a particular transport connection. Each `Handle` refers to at most one published assertion at a time, within that connection. Each `Assert` event causes its `Handle` to denote the corresponding `Assertion`; the `Handle` MUST be unused at the time of processing of the event. Similarly, each `Retract` event unbinds its `Handle`; the `Handle` MUST denote an assertion at the time of processing. 
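
The handling and uniqueness rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the `Session` class, the `ProtocolError` exception, and the tuple encoding of events (`("assert", assertion, handle)`, `("retract", handle)`, `("message", body)`) are hypothetical names chosen for the example, and plain lists stand in for real entities.

```python
from dataclasses import dataclass, field


class ProtocolError(Exception):
    """Raised when a peer violates the Handle uniqueness rules."""


@dataclass
class Session:
    # OID -> local entity; a plain list stands in for a real entity here.
    entities: dict = field(default_factory=dict)
    # Handle -> (OID, assertion) for assertions currently published by the peer.
    inbound: dict = field(default_factory=dict)

    def handle_turn(self, turn):
        """Process one Turn, given as a list of (oid, event) pairs."""
        for oid, event in turn:
            if oid not in self.entities:
                continue  # unmapped OID: ignore this TurnEvent, keep going
            kind = event[0]
            if kind == "assert":
                _, assertion, handle = event
                if handle in self.inbound:
                    raise ProtocolError(f"handle {handle} is already in use")
                self.inbound[handle] = (oid, assertion)
            elif kind == "retract":
                _, handle = event
                if handle not in self.inbound:
                    raise ProtocolError(f"handle {handle} denotes no assertion")
                del self.inbound[handle]
            # Deliver the event to the entity selected by the OID.
            self.entities[oid].append(event)
```

A receiver built this way skips `TurnEvent`s for unknown OIDs but continues with the rest of the `Turn`, and treats reuse of a live `Handle` or retraction of an unbound `Handle` as a protocol violation. Grouping events into per-actor queues to preserve turn boundaries is left out of the sketch.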

## Capabilities on the wire

References embedded in `Turn` packets denote *capabilities* for interacting with some entity.

For example, assertion of a capability-bearing record could appear as the following `Event`:

```preserves
<assert <please-reply-to #:[0 555]> 1093>
```

The `#:[0 555]` is [concrete Preserves text syntax](./guide/preserves.md#concrete-syntax) for an embedded (`#:`) value (`[0 555]`).

In the Syndicate Protocol, these embedded values MUST conform to the `WireRef` schema:[^slightly-silly-wireref]

```preserves-schema
WireRef = @mine [0 @oid Oid] / @yours [1 @oid Oid @attenuation Caveat ...].
Oid = int .
```

The `mine` variant denotes capability references managed by the *sender* of a given packet; the `yours` variant, those managed by the *receiver* of the packet. A relay receiving a packet mentioning `#:[0 555]` will use `#:[1 555]` in later responses that refer to that same entity, and *vice versa*.

### Attenuation of authority

A `yours`-variant capability may include a request[^attenuation-is-not-enforceable] to impose additional conditions on the receiver's use of its own capability, known as an [attenuation](./glossary.md#attenuation) of the capability's authority.

An attenuation is a chain of `Caveat`s.[^caveat-terminology-macaroon] A `Caveat` acts as a function that, given a Preserves value representing an assertion or message body, yields either a possibly-rewritten value, or no value at all.[^affine-caveats] In the latter case, the value has been *rejected*. In the former case, the rewritten value is used as input to the next `Caveat` in the chain, or as the final assertion or message body for delivery to the entity backing the capability.

The chain of `Caveat`s in an attenuation is written down in *reverse* order: newer `Caveat`s are appended to the sequence, and each `Caveat`'s output is fed into the input of the next leftward `Caveat` in the sequence. If no `Caveat`s are present, the capability is unattenuated, and inputs are passed through to the backing capability unmodified.

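
The right-to-left evaluation order is easy to get backwards, so here is a small runnable sketch of it. Caveats are modelled as ordinary Python functions returning either a value or `None` (for "rejected"); this modelling, and the names `only_strings` and `tag`, are assumptions made for illustration and are not part of the protocol.

```python
def attenuate(caveats, value):
    """Apply a chain of caveats right-to-left; None means 'rejected'."""
    for caveat in reversed(caveats):
        value = caveat(value)
        if value is None:
            return None
    return value


# Hypothetical caveats, modelled as plain functions for illustration only.
def only_strings(value):
    """Pass strings through unchanged; reject everything else."""
    return value if isinstance(value, str) else None


def tag(value):
    """Rewrite any input into a tagged pair."""
    return ("tagged", value)


# `only_strings` was added first (leftmost); `tag` was appended later,
# so `tag` sees the input first and `only_strings` sees tag's output.
newer_rewrites_first = [only_strings, tag]

# Here the order is swapped: `only_strings` filters the raw input,
# and `tag` then rewrites whatever survives.
newer_filters_first = [tag, only_strings]
```

With these definitions, `attenuate(newer_rewrites_first, "x")` yields `None`, because the tuple produced by `tag` is not a string, whereas `attenuate(newer_filters_first, "x")` yields `("tagged", "x")`. An empty chain returns its input unchanged.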
```preserves-schema
Caveat = Rewrite / Alts / Reject / @unknown any .
Rewrite = <rewrite @pattern Pattern @template Template> .
Reject = <reject @pattern Pattern> .
Alts = <or @alternatives [Rewrite ...]>.
```

A `Caveat` can be:

- a single `Rewrite`[^single-rewrite-meaning], or a sequence of alternative possible rewrites `Alts`, to be tried in left-to-right order until one of them accepts the input or there are none left to try;
- a `Reject`, which passes all inputs unmodified except those matching the contained pattern; or
- an `unknown` caveat, which rejects all inputs.

Each `Rewrite` applies its `Pattern` to its input. If the `Pattern` matches, the bindings captured by the pattern are gathered together and used in instantiation of the `Rewrite`'s `Template`, yielding the output from the `Caveat`. If the pattern does not match, the `Rewrite` has rejected the input, and other `alternatives` are tried until none remain, at which point the whole `Caveat` has rejected the input and processing of the triggering event stops.

### Patterns

A `Pattern` within a rewrite can be any of the following variants:

```preserves-schema
Pattern = PDiscard / PAtom / PEmbedded / PBind / PAnd / PNot / Lit / PCompound .
```

**Wildcard.** `PDiscard` matches any value:

```preserves-schema
PDiscard = <_>.
```

**Atomic type.** `PAtom` requires that a matched value be a boolean, a double-precision float, an integer, a string, a binary blob, or a symbol, respectively:

```preserves-schema
PAtom = =Boolean / =Double / =SignedInteger / =String / =ByteString / =Symbol .
```

**Embedded value.** `PEmbedded` requires that a matched value be an embedded capability:

```preserves-schema
PEmbedded = =Embedded .
```

**Binding.** `PBind` first *captures* the matched value, adding it to the bindings vector, and then applies the nested `pattern`. If the subpattern matches, the `PBind` succeeds; otherwise, it fails:

```preserves-schema
PBind = <bind @pattern Pattern>.
```

**Conjunction.** `PAnd` is a conjunction of patterns; every pattern in `patterns` must match for the `PAnd` to match:

```preserves-schema
PAnd = <and @patterns [Pattern ...]>.
```

**Negation.** `PNot` is a pattern negation: if `pattern` matches, the `PNot` fails to match, and *vice versa*. It is an error for `pattern` to include any `PBind` subpatterns.

```preserves-schema
PNot = <not @pattern Pattern>.
```

**Literal.** `Lit` is an exact match pattern. If the matched value is exactly equal to `value` (according to Preserves' own built-in equivalence relation), the match succeeds; otherwise, it fails:

```preserves-schema
Lit = <lit @value any>.
```

**Compound.** Finally, `PCompound` patterns match compound data structures. The `rec` variant demands that a matched value be a record, with label exactly equal to `label` and fields one-for-one matching the `Pattern`s in `fields`; the `arr` variant demands a sequence, with each element matching the corresponding element of `items`; and `dict` demands a dictionary having *at least* entries named by the keys of the `entries` dictionary, each matching the corresponding `Pattern`.

```preserves-schema
PCompound =
    / @rec <rec @label any @fields [Pattern ...]>
    / @arr <arr @items [Pattern ...]>
    / @dict <dict @entries { any: Pattern ...:... }> .
```

### Bindings

Matching notionally produces a sequence of values, one for each `PBind` in the pattern. When a `PBind` pattern is seen, the matcher *first* appends the matched value to the binding sequence and *then* recurses on the nested subpattern. This makes binding *indexes* appear in left-to-right order as a `Pattern` is read.

**Example.** Given the pattern `<bind <arr [<bind <_>>, <bind <_>>]>>` and the matched value `["a" "b"]`, the resulting captured values are, in order, `["a" "b"]`, `"a"`, and `"b"`; the template `<ref 0>` will be instantiated to `["a" "b"]`, `<ref 1>` to `"a"`, and `<ref 2>` to `"b"`.

### Templates

A `Template` within a rewrite produces a concrete Preserves value when instantiated with a vector of captured binding values. Template instantiation may fail, yielding no value.

A given `Template` may be any of the following variants:

```preserves-schema
Template = TAttenuate / TRef / Lit / TCompound .
```

`TAttenuate` first instantiates the sub-`template`. If it yields a value, and if that value is an embedded reference (i.e.
a capability), the `Caveat`s in `attenuation` are appended to the (possibly-empty) sequence of `Caveat`s already present in the embedded capability. The resulting possibly-attenuated capability is the final result of instantiation of the `TAttenuate`.

```preserves-schema
TAttenuate = <attenuate @template Template @attenuation [Caveat ...]>.
```

`TRef` retrieves the value at (0-based) index `binding` of the bindings vector, yielding that captured value as the result of instantiation. It is an error if `binding` is less than zero, or greater than or equal to the number of bindings in the bindings vector.

```preserves-schema
TRef = <ref @binding int>.
```

`Lit` (the same definition as used in the grammar for `Pattern` above) instantiates to exactly its `value` argument:

```preserves-schema
Lit = <lit @value any>.
```

Finally, `TCompound` instantiates to compound data. The `rec` variant produces a record with the given `label` and `fields`; `arr` produces an array; and `dict` a dictionary:

```preserves-schema
TCompound =
    / @rec <rec @label any @fields [Template ...]>
    / @arr <arr @items [Template ...]>
    / @dict <dict @entries { any: Template ...:... }> .
```

### Validity of Caveats

The above definitions imply some *validity constraints* on `Caveat`s.

- All `TRef`s must be bound: the index referred to must relate to the index associated with some `PBind` in the pattern corresponding to the template.
- Binding under negation is forbidden: a `pattern` within a `PNot` may not include any `PBind` constructors.
- The value produced by instantiation of `template` within a `TAttenuate` must be an embedded reference (a capability).

Implementations MUST enforce these constraints (either statically or dynamically).

## Membranes

Every relay maintains two stateful objects called *membranes*. A membrane is a bidirectional mapping between [OID](./glossary.md#oid) and relay-internal entity pointer. Membranes connect embedded references on the wire to entity references local to the relay.

- The *import membrane* connects OIDs managed by the *remote* peer to local [relay entities](#relay-entities) which proxy access to an "imported" remote entity.
- The *export membrane* connects OIDs managed by the *local* peer to any local "exported" entities accessible to the peer.

*Figure (ditaa, `membranes`): the two peers of a relay link, each with an export membrane and an import membrane. Each membrane pairs a local pointer with an ID: for example, pointer `0x1234` (an entity) is exported by one peer as "my 7" and imported by the other as "your 7", where it maps to pointer `0x9abc` (a relay entity); in the other direction, entity `0xa043` is exported as "my 3" and imported as "your 3", mapping to relay entity `0x462e`. Packets flow in both directions between the peers.*

Logically, a membrane's state can be represented as a set of `WireSymbol` structures: a `WireSymbol` is a triple of an OID, a local reference pointer (its *ref*), and a reference count. There is never more than one `WireSymbol` associated with an OID or a ref.

A `WireSymbol` exists only so long as some assertion mentioning its OID exists across the relay link. When the last assertion mentioning an OID is retracted, its `WireSymbol` is deleted.

Assertions mentioning a particular OID can come from *either side* of the relay link: initially, a local reference is sent to the peer in an assertion, but then the peer may assert something *back*, either targeting or mentioning the same entity. Care must be taken not to release an OID entry prematurely in such situations.

For example, at least the following contribute to a `WireSymbol`'s reference count:

- The initial entry mapping a local entity ref to a well-known OID for use at session startup ([see below](#well-known-oids)) contributes a permanent reference.
- Mention of an OID in a received or sent `TurnEvent` adds one to the OID's reference count for the duration of processing of the event. For `Assert` events in either direction, the duration of processing is until the assertion is later retracted. For received `Message` events, the duration of processing is until the incoming message has been forwarded on to the target ref.

**"Transient" references.** Embedded references in `Message` event bodies are special. Because messages, unlike assertions, have no notion of lifetime (they are forwarded and forgotten), it is not possible for a message to cause establishment of a long-lived entry in a membrane's `WireSymbol` set. Therefore, messages MUST NOT embed any reference not previously known to the peer (a "transient reference"). In other words, only after using an *assertion* to introduce a reference, associating a conversational context with its lifetime, is it permitted to discuss the reference using *messages*.
A relay receiving a message bearing a transient reference MUST terminate the session with an error. A relay about to send such a message SHOULD preemptively refuse to do so.

### Rewriting embedded references upon receipt

When processing a `Value` *v* in a received `Assert` or `Message` event, embedded references in *v* are decoded from their [on-the-wire `WireRef` form](#capabilities-on-the-wire) to in-memory ref-pointer form.

The value is recursively traversed. As the relay comes across each embedded `WireRef`,

- If it is of `mine` variant, it refers to an entity exported by the remote, sending peer. Its OID is looked up in the import membrane.
  - If no `WireSymbol` exists in the import membrane, one is created, mapping the OID to a fresh [relay entity](#relay-entities) for the OID, and that entity's ref is substituted into *v*.
  - If a `WireSymbol` is already present, its associated ref is substituted into *v*.
- If it is of `yours` variant, it refers to an entity previously exported by the local, receiving peer. Its OID is looked up in the export membrane.
  - If no `WireSymbol` exists for the OID, one is created, associating the OID with a dummy inert entity ref. The dummy ref is substituted into *v*. It will later be released once the reference count of the `WireSymbol` drops to zero.
  - If a `WireSymbol` exists for the OID, and the `WireRef` is not [attenuated](#attenuation-of-authority), the associated ref is substituted into *v*. If the `WireRef` is attenuated, the associated ref is wrapped with the `Caveat`s from the `WireRef` before its substitution into *v*.
- In each case, the `WireSymbol` associated with the OID has its reference count incremented (if an `Assert` is being processed).

In addition, for `Assert` events, the `WireSymbol` (necessarily in the export membrane) associated with the OID to which the incoming `Assert` is targeted has its reference count incremented.

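
The receipt procedure above can be sketched as a recursive traversal. The following Python is an illustrative approximation under several assumptions: Preserves values are modelled with plain lists, tuples and dicts; each membrane is reduced to a dict from OID to `WireSymbol`; and the names `Embedded`, `RelayEntity`, `Attenuated`, `INERT` and `decode` are hypothetical. References embedded in dictionary keys, the extra count on the target OID of an `Assert`, and release of symbols whose count drops to zero are all omitted.

```python
from dataclasses import dataclass


@dataclass
class WireSymbol:
    oid: int
    ref: object
    count: int = 0


@dataclass(frozen=True)
class Embedded:
    """An embedded WireRef as received: (0, oid) or (1, oid, *caveats)."""
    wire: tuple


@dataclass
class RelayEntity:
    oid: int  # local proxy for the remote entity with this OID


@dataclass
class Attenuated:
    ref: object
    caveats: tuple


INERT = object()  # dummy ref for `yours` OIDs that were never exported


def decode(value, imports, exports, is_assert):
    """Replace each Embedded WireRef in `value` with a local ref."""
    if isinstance(value, Embedded):
        tag, oid, *caveats = value.wire
        table = imports if tag == 0 else exports  # mine -> import membrane
        ws = table.get(oid)
        fresh = ws is None
        if fresh:
            if not is_assert:
                # Messages may not introduce references ("transient" refs).
                raise ValueError(f"transient reference to OID {oid} in message")
            ws = table[oid] = WireSymbol(oid, RelayEntity(oid) if tag == 0 else INERT)
        if is_assert:
            ws.count += 1
        if tag == 1 and caveats and not fresh:
            return Attenuated(ws.ref, tuple(caveats))
        return ws.ref
    if isinstance(value, (list, tuple)):
        return type(value)(decode(v, imports, exports, is_assert) for v in value)
    if isinstance(value, dict):
        return {k: decode(v, imports, exports, is_assert) for k, v in value.items()}
    return value
```

Raising `ValueError` stands in for terminating the session with an error; a real relay would presumably send an `Error` packet and disconnect instead.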
### Rewriting embedded references for transmission When transmitting a `Value` *v* in an `Assert` or `Message` event, embedded references in *v* are encoded from their in-memory ref-pointer form to [on-the-wire `WireRef` form](#capabilities-on-the-wire). The value is recursively traversed. As the relay comes across each embedded reference: - The reference is first looked up in the export membrane. If an associated `WireSymbol` is present in the export membrane, its OID is substituted as a `mine`-variant `WireRef` into *v*. - Otherwise, it is looked up in the import membrane. If *no* associated `WireSymbol` exists there, a fresh OID and `WireSymbol` are placed in the export membrane, and the new OID is substituted as a `mine`-variant `WireRef` into *v*. If a `WireSymbol` exists in the import membrane, however, the embedded reference must be a local [relay entity](#relay-entities) referencing a previously-imported remote entity: - If the local entity reference has not been attenuated subsequent to its import, the OID it was imported under is substituted as a `yours`-variant `WireRef` into *v* with an empty attenuation. - If it has been attenuated, [the relay may choose whether to trust the remote party to enforce an attenuation request](#attenuation-of-authority). If it trusts the peer to honour attenuation requests, it substitutes a `yours`-variant `WireRef` with non-empty attenuation into *v*. Otherwise, a fresh OID and `WireSymbol` are placed in the export membrane, with ref denoting the attenuated local reference, and the new OID is substituted as a `mine`-variant `WireRef` into *v*. ## Relay entities A relay entity is a local proxy for an entity at the other side of a relay link. It forwards events delivered to it—`assert`, `retract`, `message` and `sync`—across the link to its counterpart at the other end. It holds two pieces of state: a pointer to the relay link, and the OID of the remote entity it represents. 
It packages all received events into `TurnEvent`s which are then sent across the transport. **Turn boundaries.** When the relay is structured internally using the SAM, it is important to preserve turn boundaries. When all the relay entities of a given relay instance are managed by a single actor, this will be natural: a single turn can deliver events to a group of entities in the actor, so if the relay entity enqueues its `TurnEvent`s in a buffer which is flushed into a `Turn` packet sent across the transport at the conclusion of the turn, the correct turn boundaries will be preserved. ## Client and server roles While the protocol itself is symmetric, in many cases there will be one active ("client") and one passive ("server") party during the establishment of a transport connection. As an optional convention, a "server" MAY have a single entity exposed as *well-known OID* 0 at the establishment of a connection, and a "client" MAY likewise expect OID 0 to resolve to some pre-arranged entity. It is frequently useful for the pre-arranged entity to be a [gatekeeper service](./builtin/gatekeeper.md), but direct exposure of a [dataspace](./glossary.md#dataspace) or even some domain-specific object can also be useful. Either or both party to a connection may play one role, the other, neither, or both. APIs for making use of relays in programs should permit programs to supply to a newly-constructed relay an (optional) *initial ref*, to be exposed as well-known OID 0; an (optional) *initial OID*, to denote a remote well-known OID and to be immediately proxied by a local relay entity; or both. In the case of TCP/IP, the "client" role is often played by a `connect`ing party, and the "server" by a `listen`ing party, but the opposite arrangement is also useful from time to time. 
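
To make the relay-entity buffering and the optional *initial ref* / *initial OID* parameters described above more concrete, here is a minimal Python sketch. Every name in it (`Relay`, `RelayEntity`, `send_packet`, `end_of_turn`, `initial_ref`, `initial_oid`, `initial_proxy`) is hypothetical, events use an ad hoc tuple encoding, and the sketch assumes that all relay entities are managed by a single actor which calls `end_of_turn` when its turn concludes.

```python
class Relay:
    """One end of a relay link; buffers TurnEvents until the turn ends."""

    def __init__(self, send_packet, initial_ref=None, initial_oid=None):
        self.send_packet = send_packet  # callable taking one Turn packet
        self.pending = []               # TurnEvents queued during this turn
        self.exports = {}               # OID -> local ref (export side only)
        if initial_ref is not None:
            self.exports[0] = initial_ref  # expose as well-known OID 0
        # Immediately proxy the remote well-known OID, if one was requested.
        self.initial_proxy = (
            RelayEntity(self, initial_oid) if initial_oid is not None else None
        )

    def enqueue(self, oid, event):
        self.pending.append([oid, event])

    def end_of_turn(self):
        """Flush everything queued during the turn as a single Turn packet."""
        if self.pending:
            turn, self.pending = self.pending, []
            self.send_packet(turn)


class RelayEntity:
    """Local proxy: holds the relay link and the OID of the remote entity."""

    def __init__(self, relay, oid):
        self.relay = relay
        self.oid = oid

    def assert_(self, assertion, handle):
        self.relay.enqueue(self.oid, ("assert", assertion, handle))

    def retract(self, handle):
        self.relay.enqueue(self.oid, ("retract", handle))

    def message(self, body):
        self.relay.enqueue(self.oid, ("message", body))

    def sync(self, peer):
        self.relay.enqueue(self.oid, ("sync", peer))
```

Because nothing is transmitted until `end_of_turn`, all events produced during one local turn arrive at the peer in one `Turn` packet, which is what preserves turn boundaries across the link. Rewriting of embedded references through the membranes before transmission is deliberately left out here.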
## Security considerations The security considerations for this protocol fall into two categories: those having to do with particular transports for relay instances, and those having to do with the protocol itself. ### Transport security The security of an instance of the protocol depends on the security characteristics of its transport. **Confidentiality.** Parties outwith the communicating peers must not be able to deduce the contents of packets sent back and forth: some of the packets may contain secrets. For example, a `Resolve` message sent to a [gatekeeper service](./builtin/gatekeeper.md) contains a "bearer capability", which conveys authority to any holder able to present it to the gatekeeper. **Integrity.** Packets delivered to peers must be proof from tampering or other in-flight damage. **Authenticity.** Each packet delivered to a peer must have genuinely originated with another party, and must have genuinely originated in the same session. Forgery of packets must be prevented. **Replay-resistance.** Each packet delivered to a peer must be delivered exactly once within the context of the transport session. That is, replay of otherwise-authentic packets must not be possible from outside the session. ### Protocol security The protocol builds on, and directly reflects, the [object-capability security model](./glossary.md#object-capability-model) of the SAM. Entities are accessed via unforgeable references (OIDs). OIDs are meaningful only within the context of their transport session; in this way, they are analogous to Unix file descriptors, which are small integers that meaningfully denote objects only within the context of a single Unix process. If the transport is secure, so is the reference. 
Entities can only obtain references to other entities by the [standard methods by which "connectivity begets connectivity"](http://www.erights.org/elib/capability/ode/ode-capabilities.html); namely: - *By initial conditions.* The relevant initial conditions here are the state of the relays at the moment a transport session is established, including any mappings from [well-known OIDs](#well-known-oids) to their underlying refs. - *By parenthood and by endowment.* No direct provision is made for creation of new entities in this protocol, so these do not apply. - *By introduction.* Transmission of OIDs in `Turn` packets, and the associated [rules for managing the mappings between OIDs and references](#membranes), are the normal method by which references pass from one entity to another. While transport confidentiality is important for preserving secrecy of secrets such as bearer capabilities, OIDs do not need this kind of protection. An attacker able to observe OIDs communicated via a transport does not gain authority to deliver events to the denoted entity. At most, the attacker may glean information on patterns of interconnectivity among entities communicating across a transport link. ## Relation to CapTP This protocol is *strikingly* similar to a family of protocols known as [CapTP](http://www.erights.org/elib/distrib/captp/index.html) (see, for example, [here](http://www.erights.org/elib/distrib/captp/index.html), [here](https://spritelyproject.org/news/what-is-captp.html) and [here](https://github.com/ocapn/ocapn)). This is no accident: the Syndicated Actor Model draws heavily on the actor model, and has over the years been incrementally evolving to be closer and closer to the actor model as it appears in the [E programming language](http://www.erights.org/). However, the Syndicate protocol described in this document was developed based on the needs of the Syndicated Actor Model, without particular reference to CapTP. 
This makes it all the more striking that the similarities should be so strong. No doubt I have been subconsciously as well as consciously influenced by E's design, but perhaps there might also be a Platonic form awaiting discovery somewhere nearby. For example: - CapTP has the notion of a "c-list [capability list] index", cognate with our OID. A c-list index is meaningful only within the context of a transport connection, just like an OID is. A given c-list index maps to a "live-ref", an in-memory pointer to an object, in the same way that an OID maps to a ref via a `WireSymbol`. - CapTP has "[the four tables](http://www.erights.org/elib/distrib/captp/4tables.html)" at each end of a connection; each of our relays has two [membranes](#membranes), each having two unidirectional mapping tables. - Syndicate [gatekeeper services](./builtin/gatekeeper.md) borrow the concept of a [SturdyRef](http://wiki.erights.org/wiki/SturdyRef) directly from CapTP. However, the notion of a gatekeeper entity at well-known OID 0 is an example of convergent evolution in action: in the CapTP world, the [analogous service](http://www.erights.org/elib/distrib/captp/NonceLocator.html) happens also to be available at c-list index 0, by convention. A notable difference is that this protocol completely lacks support for the promises/futures of CapTP. CapTP c-list indices are just one part of a framework of [descriptors](http://www.erights.org/elib/distrib/captp/index.html) (*desc*s) denoting various kinds of remote object and eventual remote-procedure-call (RPC) result. The SAM handles RPC in a different, more low-level way. ## Specific transport mappings For now, this document focuses on `SOCK_STREAM`-like transports: reliable, in-order, bidirectional, connection-oriented, fully-duplex byte streams. 
While these transports naturally have a certain level of integrity assurance and replay-resistance associated with them, special care should be taken in the case of non-cryptographic transport protocols like plain TCP/IP. To use such a transport for this protocol, establish a connection and begin transmitting [`Packet`s](#packet-definitions) encoded as Preserves values using either the Preserves [text syntax](https://preserves.dev/preserves-text.html) or the Preserves [machine-oriented syntax](https://preserves.dev/preserves-binary.html). The session starts with the first packet and ends with transport disconnection. If either peer in a connection detects a syntax error, it MUST disconnect the transport. A responding server MUST support the binary syntax, and MAY also support the text syntax. It can autodetect the syntax variant by following [the rules in the specification](https://preserves.dev/preserves-binary.html#appendix-autodetection-of-textual-or-binary-syntax): the first byte of a valid binary-syntax Preserves document is guaranteed not to be interpretable as the start of a valid UTF-8 sequence. `Packet`s encoded in either binary or text syntax are self-delimiting. However, peers using text syntax MAY choose to insert whitespace (e.g. newline) after each transmitted packet. Some domain-specific details are also relevant: - **Unix-domain sockets.** An additional layer of authentication checks can be made based on process-ID and user-ID credentials associated with each Unix-domain socket. - **TCP/IP sockets.** Plain TCP/IP sockets offer only weak message integrity and replay-resistance guarantees, and offer no authenticity or confidentiality guarantees at all. Plain TCP/IP sockets SHOULD NOT be used; consider using TLS sockets instead. - **TLS atop TCP/IP.** An additional layer of authentication checks can be made based on the signatures and certificates exchanged during TLS setup. 

  > TODO: concretely develop some recommendations for ordinary use of TLS certificates,
  > including referencing a domain name in a `SturdyRef`, checking the presented certificate,
  > and requiring SNI at the server end.

- **WebSockets atop HTTP 1.x.** These suffer from flaws similar to those of plain TCP/IP sockets and SHOULD NOT be used.
- **WebSockets atop HTTPS 1.x.** Similar considerations to the use of TLS sockets apply regarding authentication checks. WebSocket messages are self-delimiting; peers MUST place exactly one `Packet` in each WebSocket message.

Since (a) WebSockets are established after a standard HTTP(S) message header exchange, (b) every HTTP(S) request header starts with an ASCII letter, and (c) every `Packet` in text syntax begins with the ASCII "`<`" character, it is possible to autodetect use of a WebSocket protocol multiplexed on a server socket that is also able to handle plain Preserves binary and/or text syntax for `Packet`s: any ASCII character between "`A`" and "`Z`" or "`a`" and "`z`" must be HTTP, an ASCII "`<`" must be Preserves text syntax, and any byte with the high bit set must be Preserves binary syntax.

## Appendix: Complete schema of the protocol

The following is a consolidated form of the definitions from the text above.

### Protocol packets

The authoritative version of this schema is [`[syndicate-protocols]/schemas/protocol.prs`](https://git.syndicate-lang.org/syndicate-lang/syndicate-protocols/src/branch/main/schemas/protocol.prs).

```preserves-schema
version 1 .

Packet = Turn / Error / Extension .

Extension = <<rec> @label any @fields [any ...]> .

Error = <error @message string @detail any>.

Assertion = any .
Handle = int .
Event = Assert / Retract / Message / Sync .
Oid = int .
Turn = [TurnEvent ...].
TurnEvent = [@oid Oid @event Event].

Assert = <assert @assertion Assertion @handle Handle>.
Retract = <retract @handle Handle>.
Message = <message @body Assertion>.
Sync = <sync @peer #:#t>.
``` ### Capabilities, WireRefs, and attenuations The authoritative version of this schema is [`[syndicate-protocols]/schemas/sturdy.prs`](https://git.syndicate-lang.org/syndicate-lang/syndicate-protocols/src/branch/main/schemas/sturdy.prs). ```preserves-schema version 1 . Caveat = Rewrite / Alts / Reject / @unknown any . Rewrite = . Reject = . Alts = . Oid = int . WireRef = @mine [0 @oid Oid] / @yours [1 @oid Oid @attenuation Caveat ...]. Lit = . Pattern = PDiscard / PAtom / PEmbedded / PBind / PAnd / PNot / Lit / PCompound . PDiscard = <_>. PAtom = =Boolean / =Double / =SignedInteger / =String / =ByteString / =Symbol . PEmbedded = =Embedded . PBind = . PAnd = . PNot = . PCompound = / @rec / @arr / @dict . Template = TAttenuate / TRef / Lit / TCompound . TAttenuate = . TRef = . TCompound = / @rec / @arr / @dict . ``` ## Appendix: Pseudocode for attenuation, pattern matching, and template instantiation ### Attenuation ```python def attenuate(caveats, value): for caveat in reversed(caveats): value = applyCaveat(caveat, value) if value is None: return None return value def applyCaveat(caveat, value): if caveat is 'Alts' variant: for rewrite in caveat.alternatives: possibleResult = tryRewrite(rewrite, value); if possibleResult is not None: return possibleResult return None if caveat is 'Rewrite' variant: return tryRewrite(caveat, value) if caveat is 'Reject' variant: if applyPattern(caveat.pattern, value) is None: return value else: return None if caveat is 'unknown' variant: return None def tryRewrite(rewrite, value): bindings = applyPattern(rewrite.pattern, value) if bindings is None: return None else: return instantiateTemplate(rewrite.template, bindings) ``` ### Pattern matching ```python def match(pattern, value, bindings): if pattern is 'PDiscard' variant: return True if pattern is 'PAtom' variant: return True if value is of the appropriate atomic class else False if pattern is 'PEmbedded' variant: return True if value is a capability else False if pattern is 
'PBind' variant: append value to bindings return match(pattern.pattern, value, bindings) if pattern is 'PAnd' variant: for p in pattern.patterns: if not match(p, value, bindings): return False return True if pattern is 'PNot' variant: return False if match(pattern.pattern, value, bindings) else True if pattern is 'Lit' variant: return (pattern.value == value) if pattern is 'PCompound' variant: if pattern is 'rec' variant: if value is not a record: return False if value.label is not equal to pattern.label: return False if value.fields.length is not equal to pattern.fields.length: return False for i in [0 .. pattern.fields.length): if not match(pattern.fields[i], value.fields[i], bindings): return False return True if pattern is 'arr' variant: if value is not a sequence: return False if value.length is not equal to pattern.items.length: return False for i in [0 .. pattern.items.length): if not match(pattern.items[i], value[i], bindings): return False return True if pattern is 'dict' variant: if value is not a dictionary: return False for k in keys of pattern.entries: if k not in keys of value: return False if not match(pattern.entries[k], value[k], bindings): return False return True ``` ### Template instantiation ```python def instantiate(template, bindings): if template is 'TAttenuate' variant: c = instantiate(template.template, bindings) if c is not a capability: raise an exception c′ = c with the caveats in template.attenuation appended to the existing attenuation in c return c′ if template is 'TRef' variant: if 0 ≤ template.binding < bindings.length: return bindings[template.binding] else: raise an exception if template is 'Lit' variant: return template.value if template is 'TCompound' variant: if template is 'rec' variant: return Record(label=template.label, fields=[instantiate(t, bindings) for t in template.fields]) if template is 'arr' variant: return [instantiate(t, bindings) for t in template.items] if template is 'dict' variant: result = {} for k in keys 
of template.entries: result[k] = instantiate(template.entries[k], bindings) return result ``` --- #### Notes [^analogy-to-subnets]: Strictly speaking, scope *subnets* are connected by relay actors. The situation is directly analogous to IP subnets being connected by IP routers. [^relaying-over-syndicate]: In fact, it makes perfect sense to run the relay protocol between actors that are *already connected* in some scope: this is like running a VPN, tunnelling IP over IP. A variation of the Syndicate Protocol like this gives [federated dataspaces](https://syndicate-lang.org/about/history/#postdoc). [^automatic-when-implemented-with-sam]: This process of assertion-retraction on termination is largely automatic when relay actors are structured internally using the SAM: simply terminating a SAM actor automatically retracts its published assertions. [^no-extensions-yet]: This specification does not define any extensions, but future revisions could, for example, use extensions to perform version-negotiation. Another potential future use could be to propagate provenance information for tracing/debugging. [^slightly-silly-wireref]: The syntax for `WireRef`s is slightly silly, using tuples as quasi-records with `0` and `1` acting as quasi-labels. It would probably be better to use real records, like `` and ``. Pros: less cryptic. Cons: slightly more verbose on the wire. TODO: should we revise the spec in this regard? [^attenuation-is-not-enforceable]: Such conditions can only ever be requests: after all, every `yours`-capability is already completely accessible to the recipient of the packet. Similarly, it does not make sense to include an attenuation description on a `my`-capability. However, in every case, if a party wishes to *enforce* an attenuation on a `my`- or `yours`-capability, it may record the attenuation against the underlying capability internally, issuing to its peers a fresh `my`-capability denoting the attenuated capability. 
[^caveat-terminology-macaroon]: This terminology, "caveat", is lifted from the excellent paper on [Macaroons](./glossary.md#macaroon), where it is used to describe a more general mechanism. Future versions of this specification may opt to include some of this generality.

[^affine-caveats]: `Caveat`s are thus *affine*.

[^single-rewrite-meaning]: A single `Rewrite` *R* is equivalent to ``.
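
The first-byte autodetection rule for multiplexed server sockets, described under the WebSocket transport details above, can be sketched in Python as follows. This is a minimal illustration, not part of the specification: the function name `detect_framing` and the returned labels are hypothetical, and bytes that the rule does not cover are reported as `unknown` so that the caller can decide how to react.

```python
def detect_framing(first_byte: int) -> str:
    """Classify a new connection by the first byte received on it.

    Hypothetical helper: the name and the returned labels are
    illustrative only and do not come from the specification.
    """
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("first_byte must be in the range 0..255")
    if first_byte & 0x80:
        # Any byte with the high bit set: Preserves binary syntax.
        return "preserves-binary"
    ch = chr(first_byte)
    if "A" <= ch <= "Z" or "a" <= ch <= "z":
        # An ASCII letter: the start of an HTTP(S) request header,
        # which may then upgrade to a WebSocket.
        return "http"
    if ch == "<":
        # Every Packet in text syntax begins with "<".
        return "preserves-text"
    # Not covered by the rule in the text; left to the caller.
    return "unknown"


if __name__ == "__main__":
    for sample in (b"GET", b"<", bytes([0xB4]), b" "):
        print(sample[:1], detect_framing(sample[0]))
```

A server would typically peek at the first byte of a fresh connection, call a function like this once, and then hand the connection to the matching handler without consuming the byte.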