Progress on protocol spec

Tony Garnock-Jones 2022-02-24 16:01:10 +01:00
parent bc5136c9e9
commit b16ea848b4
2 changed files with 186 additions and 55 deletions

@ -93,6 +93,9 @@ Often abbreviated **SAM**.
Source [entities](#entity) running within an [actor](#actor) publish [assertions](#assertion) Source [entities](#entity) running within an [actor](#actor) publish [assertions](#assertion)
and send [messages](#message) to target entities, possibly in other actors. and send [messages](#message) to target entities, possibly in other actors.
Essential idea: state replication is more useful than message-passing. (Message-passing
protocols usually end up simulating it, badly, anyway.)
## System Layer ## System Layer
## System Dataspace ## System Dataspace
## Transport ## Transport

@ -36,26 +36,31 @@ Transports must
This document focuses primarily on point-to-point transports, discussing multicast and This document focuses primarily on point-to-point transports, discussing multicast and
in-memory variations briefly toward the end. in-memory variations briefly toward the end.
## Roles and session establishment ## Roles and session lifecycle
The protocol is completely symmetric, aside from [certain conventions detailed The protocol is completely symmetric, aside from [certain conventions detailed
below](#well-known-oids) about the entities available for use immediately upon connection below](#well-known-oids) about the entities available for use immediately upon connection
establishment. It is *not* a client/server protocol. establishment. It is *not* a client/server protocol.
To begin a session on a newly-established point-to-point link, a relay simply starts sending **Session startup.** To begin a session on a newly-established point-to-point link, a relay
packets. Each peer starts the session with an empty entity reference map ([see simply starts sending packets. Each peer starts the session with an empty entity reference map
below](#membranes)) and making no assertions in either the outbound (on behalf of local ([see below](#membranes)) and making no assertions in either the outbound (on behalf of local
entities) or inbound (on behalf of the remote peer) directions. entities) or inbound (on behalf of the remote peer) directions.
**Session teardown.** At the end of a session, terminated normally or abnormally, cleanly or
through involuntary transport disconnection, all published assertions are
retracted.[^automatic-when-implemented-with-sam] This is in keeping with the essence of the
[Syndicated Actor Model (SAM)](./glossary.md#syndicated-actor-model).
## Packet definitions ## Packet definitions
Packets exchanged by relays are [Preserves](./glossary.md#preserves) values defined using
Preserves [schema](./glossary.md#schema).
```preserves-schema ```preserves-schema
Packet = Turn / Error / Extension . Packet = Turn / Error / Extension .
``` ```
Packets exchanged by relays are [Preserves](./glossary.md#preserves) values defined using
Preserves [schema](./glossary.md#schema).
A packet may be a *turn*, an *error*, or an *extension*. A packet may be a *turn*, an *error*, or an *extension*.
Packets are neither commands nor responses; they are *events*. Packets are neither commands nor responses; they are *events*.
@ -108,33 +113,32 @@ A `Turn` is the most important packet variant. It directly reflects the
[SAM](./glossary.md#syndicated-actor-model) notion of a [turn](./glossary.md#turn). [SAM](./glossary.md#syndicated-actor-model) notion of a [turn](./glossary.md#turn).
**Handling.** Each `Turn` carries [events](./glossary.md#event) to be delivered to **Handling.** Each `Turn` carries [events](./glossary.md#event) to be delivered to
[entities](./glossary.md#entity) residing at the receiving end of the transport. [entities](./glossary.md#entity) residing in the scope at the receiving end of the transport.
The `assertion` fields of `Assert` events and the `body` fields of `Message` events may contain Upon receipt of a `Turn`, the sequence of `TurnEvent`s is examined. The
any Preserves value, including embedded entity references. On the wire, these will always be
formatted [as described below](#capabilities-on-the-wire). Upon receipt of a `Turn`, embedded
references are first mapped to internal references. Prior to transmission, internal references
are mapped to their external form. [The mapping procedure to follow is detailed
below.](#membranes)
After reference rewriting is complete, the sequence of `TurnEvent`s is examined. The
[OID](./glossary.md#oid) in each `TurnEvent` selects an entity known to the recipient. Each [OID](./glossary.md#oid) in each `TurnEvent` selects an entity known to the recipient. Each
`Event` is either publication of an assertion, retraction of a previously-published assertion, `Event` is either publication of an assertion, retraction of a previously-published assertion,
delivery of a single message, or a [synchronization](./glossary.md#synchronization) event. delivery of a single message, or a [synchronization](./glossary.md#synchronization) event.
In the case that the receiving party is structured internally using the SAM, it is important to The `assertion` fields of `Assert` events and the `body` fields of `Message` events may contain
preserve turn boundaries. Since turn boundaries are a per-[actor](./glossary.md#actor) concept, any Preserves value, including embedded entity references. On the wire, these will always be
but a `Turn` mentions only entities, the receiver must map entities to actors, group formatted [as described below](#capabilities-on-the-wire). As each `Assert` or `Message` is
`TurnEvent`s into per-actor queues, and deliver those queues to each actor in a single SAM turn processed, embedded references are mapped to internal references. Symmetrically, internal
for each actor. references are mapped to their external form prior to transmission. [The mapping procedure to
follow is detailed below.](#membranes)
The `Handle`s used to refer to published assertions MUST be unique within the scope of the **Turn boundaries.** In the case that the receiving party is structured internally using the
transport connection. SAM, it is important to preserve turn boundaries. Since turn boundaries are a
per-[actor](./glossary.md#actor) concept, but a `Turn` mentions only entities, the receiver
must map entities to actors, group `TurnEvent`s into per-actor queues, and deliver those queues
to each actor in a single SAM turn for each actor.
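
The grouping step described under "Turn boundaries" can be sketched as follows. This is a minimal illustration, not part of the specification: `group_by_actor`, the `(oid, event)` pair representation and the `actor_of` lookup are all hypothetical names chosen for the example.

```python
def group_by_actor(turn_events, actor_of):
    """Group (oid, event) pairs into per-actor queues.

    `actor_of` is an assumed callable mapping an OID to the actor that owns
    the selected entity. Arrival order is preserved within each queue.
    """
    queues = {}
    for oid, event in turn_events:
        queues.setdefault(actor_of(oid), []).append((oid, event))
    return queues


# Hypothetical usage: three events addressed to entities owned by two actors.
owners = {1: "actor-a", 2: "actor-b", 3: "actor-a"}
turn = [(1, "assert x"), (2, "message y"), (3, "retract x")]
queues = group_by_actor(turn, owners.__getitem__)
```

Each resulting queue would then be delivered to its actor as a single SAM turn.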
**Uniqueness.** The `Handle`s used to refer to published assertions MUST be unique within the
scope of the transport connection.
## Capabilities on the wire ## Capabilities on the wire
Packets sent and received on a point-to-point transport frequently include embedded references. References embedded in `Turn` packets denote *capabilities* for interacting with some entity.
These references denote *capabilities* for interacting with some entity.
For example, assertion of a capability-bearing record could appear as the following `Event`: For example, assertion of a capability-bearing record could appear as the following `Event`:
@ -154,9 +158,8 @@ Oid = int .
``` ```
The `mine` variant denotes capability references managed by the *sender* of a given packet; the The `mine` variant denotes capability references managed by the *sender* of a given packet; the
`yours` variant, the *receiver* of the packet. Accordingly, if a relay receives a packet `yours` variant, the *receiver* of the packet. A relay receiving a packet mentioning `#![0
mentioning `#![0 555]`, it will later use `#![1 555]` if it needs to send a packet to refer to 555]` will use `#![1 555]` in later responses that refer to that same entity, and *vice versa*.
that same entity.
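
The change of perspective between `mine` and `yours` can be sketched as a simple flip. This sketch assumes, following the `#![0 555]` / `#![1 555]` example, that variant tag 0 stands for `mine` and 1 for `yours`; the tuple representation and the `flip` helper are hypothetical, and attenuations are ignored.

```python
MINE, YOURS = 0, 1  # assumed tags, inferred from the #![0 555] / #![1 555] example


def flip(wire_ref):
    """Return the form a receiver would use to refer back to the same entity."""
    variant, oid = wire_ref
    return (YOURS if variant == MINE else MINE, oid)
```

Applying `flip` twice yields the original reference, reflecting the symmetry of the protocol.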
### Attenuation of authority ### Attenuation of authority
@ -166,10 +169,10 @@ additional conditions on the receiver's use of its own capability, known as an
An attenuation is a chain of `Caveat`s.[^caveat-terminology-macaroon] A `Caveat` acts as a An attenuation is a chain of `Caveat`s.[^caveat-terminology-macaroon] A `Caveat` acts as a
function that, given a Preserves value representing an assertion or message body, yields either function that, given a Preserves value representing an assertion or message body, yields either
a possibly-rewritten value, or no value at all. In the latter case, the value has been a possibly-rewritten value, or no value at all.[^zero-or-more] In the latter case, the value
*rejected*. In the former case, the rewritten value is used as input to the next `Caveat` in has been *rejected*. In the former case, the rewritten value is used as input to the next
the chain, or as the final assertion or message body for delivery to the entity backing the `Caveat` in the chain, or as the final assertion or message body for delivery to the entity
capability. backing the capability.
The chain of `Caveat`s in an attenuation is written down in *reverse* order: newer `Caveat`s The chain of `Caveat`s in an attenuation is written down in *reverse* order: newer `Caveat`s
are appended to the sequence, and each `Caveat`'s output is fed into the input of the next are appended to the sequence, and each `Caveat`'s output is fed into the input of the next
@ -193,7 +196,7 @@ captured by the pattern are gathered together and used in instantiation of the `
has rejected the input, and other `alternatives` are tried until none remain, at which point has rejected the input, and other `alternatives` are tried until none remain, at which point
the whole `Caveat` has rejected the input and processing of the triggering event stops. the whole `Caveat` has rejected the input and processing of the triggering event stops.
#### Patterns ### Patterns
A `Pattern` within a rewrite can be any of the following variants: A `Pattern` within a rewrite can be any of the following variants:
@ -201,54 +204,54 @@ A `Pattern` within a rewrite can be any of the following variants:
Pattern = PDiscard / PAtom / PEmbedded / PBind / PAnd / PNot / Lit / PCompound . Pattern = PDiscard / PAtom / PEmbedded / PBind / PAnd / PNot / Lit / PCompound .
``` ```
`PDiscard` matches any value: **Wildcard.** `PDiscard` matches any value:
```preserves-schema ```preserves-schema
PDiscard = <_>. PDiscard = <_>.
``` ```
`PAtom` requires that a matched value be a boolean, a single- or double-precision float, an **Atomic type.** `PAtom` requires that a matched value be a boolean, a single- or double-precision float, an
integer, a string, a binary blob, or a symbol, respectively: integer, a string, a binary blob, or a symbol, respectively:
```preserves-schema ```preserves-schema
PAtom = =Boolean / =Float / =Double / =SignedInteger / =String / =ByteString / =Symbol . PAtom = =Boolean / =Float / =Double / =SignedInteger / =String / =ByteString / =Symbol .
``` ```
`PEmbedded` requires that a matched value be an embedded capability: **Embedded value.** `PEmbedded` requires that a matched value be an embedded capability:
```preserves-schema ```preserves-schema
PEmbedded = =Embedded . PEmbedded = =Embedded .
``` ```
`PBind` first *captures* the matched value, adding it to the bindings vector, and then applies **Binding.** `PBind` first *captures* the matched value, adding it to the bindings vector, and then applies
the nested `pattern`. If the subpattern matches, the `PBind` succeeds; otherwise, it fails: the nested `pattern`. If the subpattern matches, the `PBind` succeeds; otherwise, it fails:
```preserves-schema ```preserves-schema
PBind = <bind @pattern Pattern>. PBind = <bind @pattern Pattern>.
``` ```
`PAnd` is a conjunction of patterns; every pattern in `patterns` must match for the `PAnd` to **Conjunction.** `PAnd` is a conjunction of patterns; every pattern in `patterns` must match for the `PAnd` to
match: match:
```preserves-schema ```preserves-schema
PAnd = <and @patterns [Pattern ...]>. PAnd = <and @patterns [Pattern ...]>.
``` ```
`PNot` is a pattern negation: if `pattern` matches, the `PNot` fails to match, and *vice **Negation.** `PNot` is a pattern negation: if `pattern` matches, the `PNot` fails to match, and *vice
versa*. It is an error for `pattern` to include any `PBind` subpatterns. versa*. It is an error for `pattern` to include any `PBind` subpatterns.
```preserves-schema ```preserves-schema
PNot = <not @pattern Pattern>. PNot = <not @pattern Pattern>.
``` ```
`Lit` is an exact match pattern. If the matched value is exactly equal to `value` (according to **Literal.** `Lit` is an exact match pattern. If the matched value is exactly equal to `value` (according to
Preserves' own built-in equivalence relation), the match succeeds; otherwise, it fails: Preserves' own built-in equivalence relation), the match succeeds; otherwise, it fails:
```preserves-schema ```preserves-schema
Lit = <lit @value any>. Lit = <lit @value any>.
``` ```
Finally, `PCompound` patterns match compound data structures. The `rec` variant demands that a **Compound.** Finally, `PCompound` patterns match compound data structures. The `rec` variant demands that a
matched value be a record, with label exactly equal to `label` and fields one-for-one matching matched value be a record, with label exactly equal to `label` and fields one-for-one matching
the `Pattern`s in `fields`; the `arr` variant demands a sequence, with each element matching the `Pattern`s in `fields`; the `arr` variant demands a sequence, with each element matching
the corresponding element of `items`; and `dict` demands a dictionary having *at least* entries the corresponding element of `items`; and `dict` demands a dictionary having *at least* entries
@ -261,18 +264,20 @@ PCompound =
/ @dict <dict @entries { any: Pattern ...:... }> . / @dict <dict @entries { any: Pattern ...:... }> .
``` ```
#### Bindings ### Bindings
Bindings resulting from matching are stored as a sequence of values. Matching notionally produces a sequence of values, one for each `PBind` in the pattern.
During matching, when a `PBind` pattern is seen, the matcher *first* appends the matched value When a `PBind` pattern is seen, the matcher *first* appends the matched value to the binding
to the binding sequence and *then* recurses on the nested subpattern. This makes binding sequence and *then* recurses on the nested subpattern. This makes binding *indexes* appear in
*indexes* appear in left-to-right order as a `Pattern` is read. left-to-right order as a `Pattern` is read.
For example, given the pattern `<bind <arr [<bind <_>>, <bind <_>>]>>` and the matched value **Example.** Given the pattern `<bind <arr [<bind <_>>, <bind <_>>]>>` and the matched value
`[1 2]`, the resulting captured values will be, in order, `[1 2]`, `1`, and `2`. `["a" "b"]`, the resulting captured values are, in order, `["a" "b"]`, `"a"`, and `"b"`; the
template `<ref 0>` will be instantiated to `["a" "b"]`, `<ref 1>` to `"a"`, and `<ref 2>` to
`"b"`.
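
The capture order in the example can be reproduced with a small matcher. This is a sketch only: it models patterns as Python tuples (`("bind", p)`, `("discard",)`, `("arr", [p, ...])`), covers just those three variants, and assumes that `arr` requires the value to have exactly as many elements as `items`. The names `match` and `captures` are hypothetical.

```python
def match(pattern, value, bindings):
    """Match `value` against `pattern`, appending captures to `bindings`."""
    tag = pattern[0]
    if tag == "discard":
        return True
    if tag == "bind":
        bindings.append(value)  # capture first ...
        return match(pattern[1], value, bindings)  # ... then recurse
    if tag == "arr":
        items = pattern[1]
        if not isinstance(value, list) or len(value) != len(items):
            return False
        return all(match(p, v, bindings) for p, v in zip(items, value))
    raise ValueError(f"unsupported pattern variant: {tag!r}")


def captures(pattern, value):
    """Return the binding sequence, or None if the pattern rejects the value."""
    bindings = []
    return bindings if match(pattern, value, bindings) else None


pattern = ("bind", ("arr", [("bind", ("discard",)), ("bind", ("discard",))]))
result = captures(pattern, ["a", "b"])  # [["a", "b"], "a", "b"]
```

Because the outer `bind` captures before recursing, index 0 holds the whole sequence and indexes 1 and 2 hold its elements, matching the left-to-right reading of the pattern.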
#### Templates ### Templates
A `Template` within a rewrite produces a concrete Preserves value when instantiated with a A `Template` within a rewrite produces a concrete Preserves value when instantiated with a
vector of captured binding values. Template instantiation may fail, yielding no value. vector of captured binding values. Template instantiation may fail, yielding no value.
@ -318,7 +323,7 @@ TCompound =
/ @dict <dict @entries { any: Template ...:... }> . / @dict <dict @entries { any: Template ...:... }> .
``` ```
#### Validity of Caveats ### Validity of Caveats
The above definitions imply some *validity constraints* on `Caveat`s. The above definitions imply some *validity constraints* on `Caveat`s.
@ -335,12 +340,12 @@ Implementations MUST enforce these constraints (either statically or dynamically
## Membranes ## Membranes
In order to correctly map between embedded references on the wire and entity references local Every relay maintains two stateful objects called *membranes*. A membrane is a bidirectional
to the relay, the relay maintains two stateful objects called *membranes*. A membrane is a mapping between [OID](./glossary.md#oid) and relay-internal entity pointer. Membranes connect
bidirectional mapping between [OID](./glossary.md#oid) and relay-internal entity pointer. embedded references on the wire to entity references local to the relay.
- The *import membrane* connects OIDs managed by the *remote* peer to local *relay entities* - The *import membrane* connects OIDs managed by the *remote* peer to local [relay
which proxy access to an "imported" remote entity. entities](#relay-entities) which proxy access to an "imported" remote entity.
- The *export membrane* connects OIDs managed by the *local* peer to any local "exported" - The *export membrane* connects OIDs managed by the *local* peer to any local "exported"
entities accessible to the peer. entities accessible to the peer.
@ -380,6 +385,7 @@ bidirectional mapping between [OID](./glossary.md#oid) and relay-internal entity
| |
``` ```
<!--
```ditaa one-sided-membrane ```ditaa one-sided-membrane
---------------------------------------+ ---------------------------------------+
@ -404,6 +410,7 @@ bidirectional mapping between [OID](./glossary.md#oid) and relay-internal entity
----------------------------------------+ ----------------------------------------+
``` ```
-->
<!-- <!--
Each relay rewrites the embedded references in the messages it sends and receives. It maps back Each relay rewrites the embedded references in the messages it sends and receives. It maps back
@ -445,6 +452,100 @@ peers.
--> -->
Logically, a membrane's state can be represented as a set of `WireSymbol` structures: a
`WireSymbol` is a triple of an OID, a local reference pointer (its *ref*), and a reference
count. There is never more than one `WireSymbol` associated with an OID or a ref.
A `WireSymbol` exists only so long as some assertion mentioning its OID exists across the relay
link. When the last assertion mentioning an OID is retracted, its `WireSymbol` is deleted.
Assertions mentioning a particular OID can come from *either side* of the relay link:
initially, a local reference is sent to the peer in an assertion, but then the peer may assert
something *back*, either targeting or mentioning the same entity. Care must be taken not to
release an OID entry prematurely in such situations.
For example, at least the following contribute to a `WireSymbol`'s reference count:
- The initial entry mapping a local entity ref to a well-known OID for use at session startup
([see below](#well-known-oids)) contributes a permanent reference.
- Mention of an OID in a received *or sent* `TurnEvent` adds one to the OID's reference count
for the duration of processing of the event. For `Assert` events in either direction, the
duration of processing is until the assertion is later retracted. For received `Message`
events, the duration of processing is until the incoming message has been forwarded on to
the target ref.
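
One way to picture the reference-counted `WireSymbol` set is sketched below. The class and method names (`Membrane`, `insert`, `grab`, `drop`) are hypothetical, and the sketch assumes local refs are hashable so that they can be indexed.

```python
class WireSymbol:
    """Triple of OID, local ref and reference count."""

    def __init__(self, oid, ref):
        self.oid, self.ref, self.count = oid, ref, 0


class Membrane:
    """Bidirectional OID <-> ref mapping with at most one symbol per OID or ref."""

    def __init__(self):
        self.by_oid = {}
        self.by_ref = {}

    def insert(self, oid, ref):
        symbol = WireSymbol(oid, ref)
        self.by_oid[oid] = symbol
        self.by_ref[ref] = symbol
        return symbol

    def grab(self, symbol):
        symbol.count += 1

    def drop(self, symbol):
        symbol.count -= 1
        if symbol.count == 0:  # last mention gone: delete the entry
            del self.by_oid[symbol.oid]
            del self.by_ref[symbol.ref]
```

In this sketch, an assertion sent to the peer and an assertion received back that mentions the same OID would each call `grab`, so the entry survives until both have been retracted.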
**"Transient" references.** Embedded references in `Message` event bodies are special. Because
messages, unlike assertions, have no notion of lifetime—they are forwarded and forgotten—it is
not possible for a message to cause establishment of a long-lived entry in a membrane's
`WireSymbol` set. Therefore, messages MUST NOT embed any reference not previously known to the
peer (a "transient reference"). In other words, only after using an *assertion* to introduce a
reference, associating a conversational context with its lifetime, is it permitted to discuss
the reference using *messages*. A relay receiving a message bearing a transient reference MUST
terminate the session with an error. A relay about to send such a message SHOULD preemptively
refuse to do so.
### Rewriting embedded references upon receipt
When processing a `Value` *v* in a received `Assert` or `Message` event, embedded references in
*v* are decoded from their [on-the-wire `WireRef` form](#capabilities-on-the-wire) to in-memory
ref-pointer form.
The value is recursively traversed. As the relay comes across each embedded `WireRef`:
- If it is of `mine` variant, it refers to an entity exported by the remote, sending peer. Its
  OID is looked up in the import membrane.
  - If no `WireSymbol` exists in the import membrane, one is created, mapping the OID to a
    fresh [relay entity](#relay-entities). The fresh ref is substituted into *v*.
  - If a `WireSymbol` is already present, its associated ref is substituted into *v*.
- If it is of `yours` variant, it refers to an entity previously exported by the local,
  receiving peer. Its OID is looked up in the export membrane.
  - If no `WireSymbol` exists for the OID, one is created, associating the OID with a dummy
    inert entity ref. The dummy ref is substituted into *v*.
  - If a `WireSymbol` exists for the OID, and the `WireRef` is not
    [attenuated](#attenuation-of-authority), the associated ref is substituted into *v*. If
    the `WireRef` *is* attenuated, the associated ref is wrapped with the `Caveat`s from the
    `WireRef` before its substitution into *v*.
- In each case, the `WireSymbol` associated with the OID has its reference count incremented
  (if an `Assert` is being processed).
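
The receipt-side procedure above can be sketched as a single function over one embedded reference. Everything here is illustrative: `WireRef`s are modelled as `(variant, oid, caveats)` tuples, each membrane as a plain dict from OID to a `{"ref", "count"}` record (the ref-to-OID index is omitted), and `RelayEntity`, `InertEntity`, `Attenuated` and `decode_ref` are hypothetical names.

```python
class RelayEntity:
    """Stand-in for a local proxy to an imported remote entity."""

    def __init__(self, oid):
        self.oid = oid


class InertEntity:
    """Stand-in for the dummy inert entity."""


class Attenuated:
    """A local ref wrapped with the caveats carried by a WireRef."""

    def __init__(self, ref, caveats):
        self.ref, self.caveats = ref, caveats


def decode_ref(wire_ref, imports, exports, is_assert):
    variant, oid, caveats = wire_ref
    if variant == "mine":  # exported by the remote, sending peer
        symbol = imports.get(oid)
        if symbol is None:
            symbol = imports[oid] = {"ref": RelayEntity(oid), "count": 0}
        ref = symbol["ref"]
    elif variant == "yours":  # previously exported by the local peer
        symbol = exports.get(oid)
        if symbol is None:
            symbol = exports[oid] = {"ref": InertEntity(), "count": 0}
            ref = symbol["ref"]
        elif caveats:
            ref = Attenuated(symbol["ref"], caveats)
        else:
            ref = symbol["ref"]
    else:
        raise ValueError(f"unknown WireRef variant: {variant!r}")
    if is_assert:  # only assertions hold a long-lived reference
        symbol["count"] += 1
    return ref
```

A full implementation would apply `decode_ref` to every embedded reference found while traversing *v*, and would release the counts again when the corresponding assertion is retracted.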
### Rewriting embedded references for transmission
When transmitting a `Value` *v* in an `Assert` or `Message` event, embedded references in *v*
are encoded from their in-memory ref-pointer form to [on-the-wire `WireRef`
form](#capabilities-on-the-wire).
The value is recursively traversed. As the relay comes across each embedded reference:
- The reference is first looked up in the export membrane. If an associated `WireSymbol` is
  present in the export membrane, its OID is substituted as a `mine`-variant `WireRef` into
  *v*.
- Otherwise, it is looked up in the import membrane. If *no* associated `WireSymbol` exists
  there, a fresh OID and `WireSymbol` are placed in the export membrane, and the new OID is
  substituted as a `mine`-variant `WireRef` into *v*.
- Otherwise, it refers to a previously-imported entity.
  - If the local entity reference has not been attenuated subsequent to its import, the OID it
    was imported under is substituted as a `yours`-variant `WireRef` into *v* with an empty
    attenuation.
  - If it has been attenuated, [the relay may choose whether to trust the remote party to
    enforce an attenuation request](#attenuation-of-authority). If it trusts the peer to
    honour attenuation requests, it substitutes a `yours`-variant `WireRef` with non-empty
    attenuation into *v*. Otherwise, a fresh OID and `WireSymbol` are placed in the export
    membrane, with ref denoting the attenuated local reference, and the new OID is
    substituted as a `mine`-variant `WireRef` into *v*.
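
The transmission-side decision tree can be sketched in the same spirit. The representation is an assumption made for the example: a local reference is a `Ref` pairing a target with a tuple of caveats, the export membrane is a dict from `Ref` to OID, the import membrane is a dict from imported target to OID, and reference counting is left out. `encode_ref`, `fresh_oids` and `trust_peer` are hypothetical names.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Ref:
    target: object       # must be hashable in this sketch
    caveats: tuple = ()  # non-empty if attenuated after import


def encode_ref(ref, exports, imports, fresh_oids, trust_peer):
    if ref in exports:  # already exported
        return ("mine", exports[ref])
    if ref.target not in imports:  # purely local: export under a fresh OID
        oid = exports[ref] = next(fresh_oids)
        return ("mine", oid)
    oid = imports[ref.target]  # previously imported entity
    if not ref.caveats:
        return ("yours", oid, ())
    if trust_peer:  # ask the peer to enforce the attenuation
        return ("yours", oid, ref.caveats)
    new_oid = exports[ref] = next(fresh_oids)  # enforce attenuation locally
    return ("mine", new_oid)


exports, imports, oids = {}, {"remote-entity": 7}, count(1)
first = encode_ref(Ref("local"), exports, imports, oids, trust_peer=True)
```

Exporting the attenuated reference under a fresh OID keeps enforcement on the local side, at the cost of routing the peer's later traffic back through this relay.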
## Relay entities
## Client and server roles ## Client and server roles
## Well-known OIDs ## Well-known OIDs
@ -453,6 +554,21 @@ OID 0, initial ref, initial oid
## Security considerations ## Security considerations
((Tease out into Related Work section?))
OIDs are locally-meaningful only, so if the transport is secure, so is the reference. Can't
steal one and put it on a different transport: it's like taking fd 6 from another process and
trying to use fd 6 locally to mean what the other process means. Extensive related work and
prior art here.
http://www.erights.org/elib/distrib/captp/index.html
Relate terms here to captp terms:
- Hah, `NonceLocator` vs `Gatekeeper`
- well-known "positions" (??) (vs "OID"s?)
- OID = "index", "capability-list index", "c-list index"
- @cwebber says "c-list is the structure mapping descriptors to live-refs"
### Secrecy ### Secrecy
### Privacy ### Privacy
@ -658,6 +774,10 @@ def instantiate(template, bindings):
IP over IP. A variation of the Syndicate Protocol like this gives [federated IP over IP. A variation of the Syndicate Protocol like this gives [federated
dataspaces](https://syndicate-lang.org/about/history/#postdoc). dataspaces](https://syndicate-lang.org/about/history/#postdoc).
[^automatic-when-implemented-with-sam]: This process of assertion-retraction on termination is
largely automatic when relay actors are structured internally using the SAM: simply
terminating a SAM actor automatically retracts its published assertions.
[^no-extensions-yet]: This specification does not define any extensions, but future revisions [^no-extensions-yet]: This specification does not define any extensions, but future revisions
could, for example, use extensions to perform version-negotiation. Another potential future could, for example, use extensions to perform version-negotiation. Another potential future
use could be to propagate provenance information for tracing/debugging. use could be to propagate provenance information for tracing/debugging.
@ -680,3 +800,11 @@ def instantiate(template, bindings):
on [Macaroons](./glossary.md#macaroon), where it is used to describe a more general on [Macaroons](./glossary.md#macaroon), where it is used to describe a more general
mechanism. Future versions of this specification may opt to include some of this mechanism. Future versions of this specification may opt to include some of this
generality. generality.
[^zero-or-more]: TODO: It might be better to have a `Caveat` yield *zero or more* values? That
way they can act as filters. I've sometimes wanted the multiple-value case, though I've so
far been able to work around its lack. TODO: Perhaps it would also make sense to have a
`Caveat` map an *event* to zero or more *events*, rather than to values? Tricky corners
there include ensuring that carried authority isn't misused; macaroons are a very elegant
solution to this problem, of course, so maybe the macaroon design idea could be adapted to
this. For now, `Value`→`Option<Value>` is probably OK.