diff --git a/src/glossary.md b/src/glossary.md index 7e3bd7e..23fa4b7 100644 --- a/src/glossary.md +++ b/src/glossary.md @@ -93,6 +93,9 @@ Often abbreviated **SAM**. Source [entities](#entity) running within an [actor](#actor) publish [assertions](#assertion) and send [messages](#message) to target entities, possibly in other actors. +Essential idea: state replication is more useful than message-passing. (Message-passing +protocols usually end up simulating it, badly, anyway.) + ## System Layer ## System Dataspace ## Transport diff --git a/src/protocol.md b/src/protocol.md index 205597c..3bfc7b7 100644 --- a/src/protocol.md +++ b/src/protocol.md @@ -36,26 +36,31 @@ Transports must This document focuses primarily on point-to-point transports, discussing multicast and in-memory variations briefly toward the end. -## Roles and session establishment +## Roles and session lifecycle The protocol is completely symmetric, aside from [certain conventions detailed below](#well-known-oids) about the entities available for use immediately upon connection establishment. It is *not* a client/server protocol. -To begin a session on a newly-established point-to-point link, a relay simply starts sending -packets. Each peer starts the session with an empty entity reference map ([see -below](#membranes)) and making no assertions in either the outbound (on behalf of local +**Session startup.** To begin a session on a newly-established point-to-point link, a relay +simply starts sending packets. Each peer starts the session with an empty entity reference map +([see below](#membranes)) and making no assertions in either the outbound (on behalf of local entities) or inbound (on behalf of the remote peer) directions. 
+**Session teardown.** At the end of a session, terminated normally or abnormally, cleanly or +through involuntary transport disconnection, all published assertions are +retracted.[^automatic-when-implemented-with-sam] This is in keeping with the essence of the +[Syndicated Actor Model (SAM)](./glossary.md#syndicated-actor-model). + ## Packet definitions +Packets exchanged by relays are [Preserves](./glossary.md#preserves) values defined using +Preserves [schema](./glossary.md#schema). + ```preserves-schema Packet = Turn / Error / Extension . ``` -Packets exchanged by relays are [Preserves](./glossary.md#preserves) values defined using -Preserves [schema](./glossary.md#schema). - A packet may be a *turn*, an *error*, or an *extension*. Packets are neither commands nor responses; they are *events*. @@ -108,33 +113,32 @@ A `Turn` is the most important packet variant. It directly reflects the [SAM](./glossary.md#syndicated-actor-model) notion of a [turn](./glossary.md#turn). **Handling.** Each `Turn` carries [events](./glossary.md#event) to be delivered to -[entities](./glossary.md#entity) residing at the receiving end of the transport. +[entities](./glossary.md#entity) residing in the scope at the receiving end of the transport. -The `assertion` fields of `Assert` events and the `body` fields of `Message` events may contain -any Preserves value, including embedded entity references. On the wire, these will always be -formatted [as described below](#capabilities-on-the-wire). Upon receipt of a `Turn`, embedded -references are first mapped to internal references. Prior to transmission, internal references -are mapped to their external form. [The mapping procedure to follow is detailed -below.](#membranes) - -After reference rewriting is complete, the sequence of `TurnEvent`s is examined. The +Upon receipt of a `Turn`, the sequence of `TurnEvent`s is examined. The [OID](./glossary.md#oid) in each `TurnEvent` selects an entity known to the recipient. 
Each `Event` is either publication of an assertion, retraction of a previously-published assertion, delivery of a single message, or a [synchronization](./glossary.md#synchronization) event. -In the case that the receiving party is structured internally using the SAM, it is important to -preserve turn boundaries. Since turn boundaries are a per-[actor](./glossary.md#actor) concept, -but a `Turn` mentions only entities, the receiver must map entities to actors, group -`TurnEvent`s into per-actor queues, and deliver those queues to each actor in a single SAM turn -for each actor. +The `assertion` fields of `Assert` events and the `body` fields of `Message` events may contain +any Preserves value, including embedded entity references. On the wire, these will always be +formatted [as described below](#capabilities-on-the-wire). As each `Assert` or `Message` is +processed, embedded references are mapped to internal references. Symmetrically, internal +references are mapped to their external form prior to transmission. [The mapping procedure to +follow is detailed below.](#membranes) -The `Handle`s used to refer to published assertions MUST be unique within the scope of the -transport connection. +**Turn boundaries.** In the case that the receiving party is structured internally using the +SAM, it is important to preserve turn boundaries. Since turn boundaries are a +per-[actor](./glossary.md#actor) concept, but a `Turn` mentions only entities, the receiver +must map entities to actors, group `TurnEvent`s into per-actor queues, and deliver those queues +to each actor in a single SAM turn for each actor. + +**Uniqueness.** The `Handle`s used to refer to published assertions MUST be unique within the +scope of the transport connection. ## Capabilities on the wire -Packets sent and received on a point-to-point transport frequently include embedded references. -These references denote *capabilities* for interacting with some entity. 
+References embedded in `Turn` packets denote *capabilities* for interacting with some entity. For example, assertion of a capability-bearing record could appear as the following `Event`: @@ -154,9 +158,8 @@ Oid = int . ``` The `mine` variant denotes capability references managed by the *sender* of a given packet; the -`yours` variant, the *receiver* of the packet. Accordingly, if a relay receives a packet -mentioning `#![0 555]`, it will later use `#![1 555]` if it needs to send a packet to refer to -that same entity. +`yours` variant, the *receiver* of the packet. A relay receiving a packet mentioning `#![0 +555]` will use `#![1 555]` in later responses that refer to that same entity, and *vice versa*. ### Attenuation of authority @@ -166,10 +169,10 @@ additional conditions on the receiver's use of its own capability, known as an An attenuation is a chain of `Caveat`s.[^caveat-terminology-macaroon] A `Caveat` acts as a function that, given a Preserves value representing an assertion or message body, yields either -a possibly-rewritten value, or no value at all. In the latter case, the value has been -*rejected*. In the former case, the rewritten value is used as input to the next `Caveat` in -the chain, or as the final assertion or message body for delivery to the entity backing the -capability. +a possibly-rewritten value, or no value at all.[^zero-or-more] In the latter case, the value +has been *rejected*. In the former case, the rewritten value is used as input to the next +`Caveat` in the chain, or as the final assertion or message body for delivery to the entity +backing the capability. 
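The value-to-optional-value behaviour of a chain of `Caveat`s can be sketched in Python as follows. This is only an illustration, not the specification's algorithm: `run_caveats`, `only_strings` and `tag` are hypothetical names, each `Caveat` is modelled as a plain callable, `None` stands in for "no value at all", and the caveats are assumed to be supplied already arranged in the order in which they are to be applied.

```python
from typing import Callable, Optional, Sequence

# Assumption: any Python object may stand for a Preserves value, and `None`
# (which is not a Preserves value) signals rejection.
Value = object
Caveat = Callable[[Value], Optional[Value]]


def run_caveats(caveats: Sequence[Caveat], value: Value) -> Optional[Value]:
    """Feed `value` through each caveat in turn, stopping at the first rejection."""
    for caveat in caveats:
        result = caveat(value)
        if result is None:
            return None        # rejected: nothing is delivered to the entity
        value = result         # possibly rewritten: input to the next caveat
    return value               # final assertion or message body for delivery


def only_strings(v: Value) -> Optional[Value]:
    """Hypothetical caveat: accept strings unchanged, reject everything else."""
    return v if isinstance(v, str) else None


def tag(v: Value) -> Optional[Value]:
    """Hypothetical caveat: wrap every value in a two-element sequence."""
    return ["tagged", v]
```

An empty chain leaves the value untouched, which corresponds to an unattenuated capability.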
The chain of `Caveats` in an attenuation is written down in *reverse* order: newer `Caveat`s are appended to the sequence, and each `Caveat`'s output is fed into the input of the next @@ -193,7 +196,7 @@ captured by the pattern are gathered together and used in instantiation of the ` has rejected the input, and other `alternatives` are tried until none remain, at which point the whole `Caveat` has rejected the input and processing of the triggering event stops. -#### Patterns +### Patterns A `Pattern` within a rewrite can be any of the following variants: @@ -201,54 +204,54 @@ A `Pattern` within a rewrite can be any of the following variants: Pattern = PDiscard / PAtom / PEmbedded / PBind / PAnd / PNot / Lit / PCompound . ``` -`PDiscard` matches any value: +**Wildcard.** `PDiscard` matches any value: ```preserves-schema PDiscard = <_>. ``` -`PAtom` requires that a matched value be a boolean, a single- or double-precision float, an +**Atomic type.** `PAtom` requires that a matched value be a boolean, a single- or double-precision float, an integer, a string, a binary blob, or a symbol, respectively: ```preserves-schema PAtom = =Boolean / =Float / =Double / =SignedInteger / =String / =ByteString / =Symbol . ``` -`PEmbedded` requires that a matched value be an embedded capability: +**Embedded value.** `PEmbedded` requires that a matched value be an embedded capability: ```preserves-schema PEmbedded = =Embedded . ``` -`PBind` first *captures* the matched value, adding it to the bindings vector, and then applies +**Binding.** `PBind` first *captures* the matched value, adding it to the bindings vector, and then applies the nested `pattern`. If the subpattern matches, the `PBind` succeeds; otherwise, it fails: ```preserves-schema PBind = . 
``` -`PAnd` is a conjunction of patterns; every pattern in `patterns` must match for the `PAnd` to +**Conjunction.** `PAnd` is a conjunction of patterns; every pattern in `patterns` must match for the `PAnd` to match: ```preserves-schema PAnd = . ``` -`PNot` is a pattern negation: if `pattern` matches, the `PNot` fails to match, and *vice +**Negation.** `PNot` is a pattern negation: if `pattern` matches, the `PNot` fails to match, and *vice versa*. It is an error for `pattern` to include any `PBind` subpatterns. ```preserves-schema PNot = . ``` -`Lit` is an exact match pattern. If the matched value is exactly equal to `value` (according to +**Literal.** `Lit` is an exact match pattern. If the matched value is exactly equal to `value` (according to Preserves' own built-in equivalence relation), the match succeeds; otherwise, it fails: ```preserves-schema Lit = . ``` -Finally, `PCompound` patterns match compound data structures. The `rec` variant demands that a +**Compound.** Finally, `PCompound` patterns match compound data structures. The `rec` variant demands that a matched value be a record, with label exactly equal to `label` and fields one-for-one matching the `Pattern`s in `fields`; the `arr` variant demands a sequence, with each element matching the corresponding element of `items`; and `dict` demands a dictionary having *at least* entries @@ -261,18 +264,20 @@ PCompound = / @dict . ``` -#### Bindings +### Bindings -Bindings resulting from matching are stored as a sequence of values. +Matching notionally produces a sequence of values, one for each `PBind` in the pattern. -During matching, when a `PBind` pattern is seen, the matcher *first* appends the matched value -to the binding sequence and *then* recurses on the nested subpattern. This makes binding -*indexes* appear in left-to-right order as a `Pattern` is read. 
+When a `PBind` pattern is seen, the matcher *first* appends the matched value to the binding +sequence and *then* recurses on the nested subpattern. This makes binding *indexes* appear in +left-to-right order as a `Pattern` is read. -For example, given the pattern `>, >]>>` and the matched value -`[1 2]`, the resulting captured values will be, in order, `[1 2]`, `1`, and `2`. +**Example.** Given the pattern `>, >]>>` and the matched value +`["a" "b"]`, the resulting captured values are, in order, `["a" "b"]`, `"a"`, and `"b"`; the +template `` will be instantiated to `["a" "b"]`, `` to `"a"`, and `` to +`"b"`. -#### Templates +### Templates A `Template` within a rewrite produces a concrete Preserves value when instantiated with a vector of captured binding values. Template instantiation may fail, yielding no value. @@ -318,7 +323,7 @@ TCompound = / @dict . ``` -#### Validity of Caveats +### Validity of Caveats The above definitions imply some *validity constraints* on `Caveat`s. @@ -335,12 +340,12 @@ Implementations MUST enforce these constraints (either statically or dynamically ## Membranes -In order to correctly map between embedded references on the wire and entity references local -to the relay, the relay maintains two stateful objects called *membranes*. A membrane is a -bidirectional mapping between [OID](./glossary.md#oid) and relay-internal entity pointer. +Every relay maintains two stateful objects called *membranes*. A membrane is a bidirectional +mapping between [OID](./glossary.md#oid) and relay-internal entity pointer. Membranes connect +embedded references on the wire to entity references local to the relay. - - The *import membrane* connects OIDs managed by the *remote* peer to local *relay entities* - which proxy access to an "imported" remote entity. + - The *import membrane* connects OIDs managed by the *remote* peer to local [relay + entities](#relay-entities) which proxy access to an "imported" remote entity. 
- The *export membrane* connects OIDs managed by the *local* peer to any local "exported" entities accessible to the peer. @@ -380,6 +385,7 @@ bidirectional mapping between [OID](./glossary.md#oid) and relay-internal entity | ``` + +Logically, a membrane's state can be represented as a set of `WireSymbol` structures: a +`WireSymbol` is a triple of an OID, a local reference pointer (its *ref*), and a reference +count. There is never more than one `WireSymbol` associated with an OID or a ref. + +A `WireSymbol` exists only so long as some assertion mentioning its OID exists across the relay +link. When the last assertion mentioning an OID is retracted, its `WireSymbol` is deleted. +Assertions mentioning a particular OID can come from *either side* of the relay link: +initially, a local reference is sent to the peer in an assertion, but then the peer may assert +something *back*, either targeting or mentioning the same entity. Care must be taken not to +release an OID entry prematurely in such situations. + +For example, at least the following contribute to a `WireSymbol`'s reference count: + + - The initial entry mapping a local entity ref to a well-known OID for use at session startup + ([see below](#well-known-oids)) contributes a permanent reference. + + - Mention of an OID in a received *or sent* `TurnEvent` adds one to the OID's reference count + for the duration of processing of the event. For `Assert` events in either direction, the + duration of processing is until the assertion is later retracted. For received `Message` + events, the duration of processing is until the incoming message has been forwarded on to + the target ref. + +**"Transient" references.** Embedded references in `Message` event bodies are special. Because +messages, unlike assertions, have no notion of lifetime (they are forwarded and forgotten), it is +not possible for a message to cause establishment of a long-lived entry in a membrane's +`WireSymbol` set. 
Therefore, messages MUST NOT embed any reference not previously known to the +peer (a "transient reference"). In other words, only after using an *assertion* to introduce a +reference, associating a conversational context with its lifetime, is it permitted to discuss +the reference using *messages*. A relay receiving a message bearing a transient reference MUST +terminate the session with an error. A relay about to send such a message SHOULD preemptively +refuse to do so. + +### Rewriting embedded references upon receipt + +When processing a `Value` *v* in a received `Assert` or `Message` event, embedded references in +*v* are decoded from their [on-the-wire `WireRef` form](#capabilities-on-the-wire) to in-memory +ref-pointer form. + +The value is recursively traversed. As the relay comes across each embedded `WireRef`, + + - If it is of `mine` variant, it refers to an entity exported by the remote, sending peer. Its + OID is looked up in the import membrane. + + - If no `WireSymbol` exists in the import membrane, one is created, mapping the OID to a + fresh [relay entity](#relay-entities). + + - If a `WireSymbol` is already present, its associated ref is substituted into *v*. + + - If it is of `yours` variant, it refers to an entity previously exported by the local, + receiving peer. Its OID is looked up in the export membrane. + + - If no `WireSymbol` exists for the OID, one is created, associating the OID with a dummy + inert entity ref. The dummy ref is substituted into *v*. + + - If a `WireSymbol` exists for the OID, and the `WireRef` is not + [attenuated](#attenuation-of-authority), the associated ref is substituted into *v*. If + the `WireRef` *is* attenuated, the associated ref is wrapped with the `Caveat`s from the + `WireRef` before its substitution into *v*. + + - In each case, the `WireSymbol` associated with the OID has its reference count incremented + (if an `Assert` is being processed). 
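The receipt-side procedure above can be sketched in Python roughly as follows. Everything named here is an assumption made for illustration: `WireRef`, `WireSymbol`, `RelayEntity`, `Inert`, `Attenuated` and `TransientReference` are hypothetical stand-ins, each membrane is reduced to a plain dictionary from OID to `WireSymbol`, and Preserves values are approximated by nested Python lists and dictionaries. Raising an exception for an unknown reference inside a `Message` is one reading of the "transient reference" rule, not a prescribed mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class WireRef:                 # hypothetical decoded form of an on-the-wire reference
    variant: str               # "mine" or "yours", from the sender's point of view
    oid: int
    attenuation: list = field(default_factory=list)


@dataclass
class WireSymbol:              # (OID, ref, reference count) triple
    oid: int
    ref: object
    count: int = 0


@dataclass
class RelayEntity:             # stand-in for a proxy to an imported remote entity
    oid: int


class Inert:                   # stand-in for the dummy inert entity
    pass


@dataclass
class Attenuated:              # a local ref wrapped with the Caveats from a WireRef
    ref: object
    caveats: list


class TransientReference(Exception):
    """A Message mentioned a reference with no existing WireSymbol."""


def decode_ref(w, imports, exports, is_assert):
    # `mine` refs were exported by the sender, so they live in our import membrane;
    # `yours` refs were exported by us, so they live in our export membrane.
    membrane = imports if w.variant == "mine" else exports
    sym = membrane.get(w.oid)
    created = sym is None
    if created:
        if not is_assert:
            raise TransientReference(w.oid)
        fresh = RelayEntity(w.oid) if w.variant == "mine" else Inert()
        sym = membrane[w.oid] = WireSymbol(w.oid, fresh)
    if is_assert:
        sym.count += 1         # held until the assertion is later retracted
    if w.variant == "yours" and w.attenuation and not created:
        return Attenuated(sym.ref, list(w.attenuation))
    return sym.ref


def decode_value(v, imports, exports, is_assert):
    """Recursively replace every WireRef in `v` with an in-memory ref."""
    if isinstance(v, WireRef):
        return decode_ref(v, imports, exports, is_assert)
    if isinstance(v, list):
        return [decode_value(x, imports, exports, is_assert) for x in v]
    if isinstance(v, dict):
        return {k: decode_value(x, imports, exports, is_assert) for k, x in v.items()}
    return v
```

The sketch leaves out decrementing the count on retraction and deleting a `WireSymbol` once its count reaches zero.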
+ +### Rewriting embedded references for transmission + +When transmitting a `Value` *v* in an `Assert` or `Message` event, embedded references in *v* +are encoded from their in-memory ref-pointer form to [on-the-wire `WireRef` +form](#capabilities-on-the-wire). + +The value is recursively traversed. As the relay comes across each embedded reference: + + - The reference is first looked up in the export membrane. If an associated `WireSymbol` is + present in the export membrane, its OID is substituted as a `mine`-variant `WireRef` into + *v*. + + - Otherwise, it is looked up in the import membrane. If *no* associated `WireSymbol` exists + there, a fresh OID and `WireSymbol` are placed in the export membrane, and the new OID is + substituted as a `mine`-variant `WireRef` into *v*. + + - Otherwise, it refers to a previously-imported entity. + + - If the local entity reference has not been attenuated subsequent to its import, the OID it + was imported under is substituted as a `yours`-variant `WireRef` into *v* with an empty + attenuation. + + - If it has been attenuated, [the relay may choose whether to trust the remote party to + enforce an attenuation request](#attenuation-of-authority). If it trusts the peer to + honour attenuation requests, it substitutes a `yours`-variant `WireRef` with non-empty + attenuation into *v*. Otherwise, a fresh OID and `WireSymbol` are placed in the export + membrane, with ref denoting the attenuated local reference, and the new OID is + substituted as a `mine`-variant `WireRef` into *v*. + +## Relay entities + + ## Client and server roles ## Well-known OIDs @@ -453,6 +554,21 @@ OID 0, initial ref, initial oid ## Security considerations +((Tease out into Related Work section?)) + +OIDs are locally-meaningful only, so if the transport is secure, so is the reference. Can't +steal one and put it on a different transport: it's like taking fd 6 from another process and +trying to use fd 6 locally to mean what the other process means. 
Extensive related work and +prior art here. + +http://www.erights.org/elib/distrib/captp/index.html + +Relate terms here to captp terms: + - Hah, `NonceLocator` vs `Gatekeeper` + - well-known "positions" (??) (vs "OID"s?) + - OID = "index", "capability-list index", "c-list index" + - @cwebber says "c-list is the structure mapping descriptors to live-refs" + ### Secrecy ### Privacy @@ -658,6 +774,10 @@ def instantiate(template, bindings): IP over IP. A variation of the Syndicate Protocol like this gives [federated dataspaces](https://syndicate-lang.org/about/history/#postdoc). +[^automatic-when-implemented-with-sam]: This process of assertion-retraction on termination is + largely automatic when relay actors are structured internally using the SAM: simply + terminating a SAM actor automatically retracts its published assertions. + [^no-extensions-yet]: This specification does not define any extensions, but future revisions could, for example, use extensions to perform version-negotiation. Another potential future use could be to propagate provenance information for tracing/debugging. @@ -680,3 +800,11 @@ def instantiate(template, bindings): on [Macaroons](./glossary.md#macaroon), where it is used to describe a more general mechanism. Future versions of this specification may opt to include some of this generality. + +[^zero-or-more]: TODO: It might be better to have a `Caveat` yield *zero or more* values? That + way they can act as filters. I've sometimes wanted the multiple-value case, though I've so + far been able to work around its lack. TODO: Perhaps it would also make sense to have a + `Caveat` map an *event* to zero or more *events*, rather than to values? Tricky corners + there include ensuring that carried authority isn't misused; macaroons are a very elegant + solution to this problem, of course, so maybe the macaroon design idea could be adapted to + this. For now, `Value`→`Option` is probably OK.
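As a companion to the "Rewriting embedded references for transmission" section added by this change, the lookup order it describes might be sketched in Python like this. All names are hypothetical (`Entity`, `Attenuated`, `WireRef`, `Membrane`, `Relay`, `trust_peer_attenuation`), membranes are keyed by object identity, the fresh-OID allocation scheme is an arbitrary choice, and reference counting is left out entirely.

```python
from dataclasses import dataclass, field
from itertools import count


class Entity:                        # stand-in for any in-memory entity ref
    pass


@dataclass(eq=False)
class Attenuated:                    # a ref attenuated locally after import
    ref: object
    caveats: list


@dataclass
class WireRef:                       # hypothetical on-the-wire reference
    variant: str                     # "mine" or "yours"
    oid: int
    attenuation: list = field(default_factory=list)


class Membrane:
    """Bidirectional OID <-> ref mapping, keyed by object identity."""

    def __init__(self):
        self.by_oid = {}             # also keeps refs alive so id() stays valid
        self.by_ref = {}

    def add(self, oid, ref):
        self.by_oid[oid] = ref
        self.by_ref[id(ref)] = oid

    def oid_for(self, ref):
        return self.by_ref.get(id(ref))


class Relay:
    def __init__(self, trust_peer_attenuation=False):
        self.imports = Membrane()
        self.exports = Membrane()
        self.trust_peer_attenuation = trust_peer_attenuation
        self._fresh = count(1)       # assumption: any unused OID will do

    def _export(self, ref):
        oid = next(self._fresh)
        self.exports.add(oid, ref)
        return WireRef("mine", oid)

    def encode_ref(self, ref):
        oid = self.exports.oid_for(ref)
        if oid is not None:                      # already exported
            return WireRef("mine", oid)
        attenuated = isinstance(ref, Attenuated)
        base = ref.ref if attenuated else ref
        caveats = ref.caveats if attenuated else []
        oid = self.imports.oid_for(base)
        if oid is None:                          # purely local: export it afresh
            return self._export(ref)
        if not caveats or self.trust_peer_attenuation:
            return WireRef("yours", oid, list(caveats))
        return self._export(ref)                 # enforce the attenuation locally

    def encode_value(self, v):
        """Recursively replace every in-memory ref in `v` with a WireRef."""
        if isinstance(v, (Entity, Attenuated)):
            return self.encode_ref(v)
        if isinstance(v, list):
            return [self.encode_value(x) for x in v]
        if isinstance(v, dict):
            return {k: self.encode_value(x) for k, x in v.items()}
        return v
```

Exporting the attenuated wrapper under a fresh OID keeps enforcement on the local side, at the cost of routing the peer's traffic for that entity back through this relay.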