First complete (?) draft of the protocol spec
parent
b16ea848b4
commit
6b2540dc12
183
src/protocol.md
|
@ -31,7 +31,7 @@ Transports must
|
||||||
- be able to carry [Preserves](./glossary.md#preserves) values back and forth,
|
- be able to carry [Preserves](./glossary.md#preserves) values back and forth,
|
||||||
- be reliable and in-order,
|
- be reliable and in-order,
|
||||||
- have a well-defined session lifecycle (created → connected → disconnected), and
|
- have a well-defined session lifecycle (created → connected → disconnected), and
|
||||||
- assure confidentiality, integrity and replay-resistance.
|
- assure confidentiality, integrity, authenticity, and replay-resistance.
|
||||||
|
|
||||||
This document focuses primarily on point-to-point transports, discussing multicast and
|
This document focuses primarily on point-to-point transports, discussing multicast and
|
||||||
in-memory variations briefly toward the end.
|
in-memory variations briefly toward the end.
|
||||||
|
@ -545,46 +545,185 @@ The value is recursively traversed. As the relay comes across each embedded refe
|
||||||
|
|
||||||
## Relay entities
|
## Relay entities
|
||||||
|
|
||||||
|
A relay entity is a local proxy for an entity at the other side of a relay link. It forwards
|
||||||
|
events delivered to it—`assert`, `retract`, `message` and `sync`—across the link to its
|
||||||
|
counterpart at the other end. It holds two pieces of state: a pointer to the relay link, and
|
||||||
|
the OID of the remote entity it represents. It packages all received events into `TurnEvent`s
|
||||||
|
which are then sent across the transport.
|
||||||
|
|
||||||
## Client and server roles
|
**Turn boundaries.** When the relay is structured internally using the SAM, it is important to
|
||||||
|
preserve turn boundaries. When all the relay entities of a given relay instance are managed by
|
||||||
|
a single actor, this will be natural: a single turn can deliver events to a group of entities
|
||||||
|
in the actor, so if the relay entity enqueues its `TurnEvent`s in a buffer which is flushed
|
||||||
|
into a `Turn` packet sent across the transport at the conclusion of the turn, the correct turn
|
||||||
|
boundaries will be preserved.
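
The buffering strategy described above can be sketched roughly as follows. This is a minimal illustration in Python and not part of the specification: the names `RelayLink`, `RelayEntity`, `enqueue` and `end_of_turn` are hypothetical, events are modelled as plain tuples rather than real `TurnEvent` values, and a list stands in for the transport.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple

# Hypothetical stand-in for a TurnEvent: (remote OID, event tuple).
TurnEvent = Tuple[int, Tuple[Any, ...]]


@dataclass
class RelayLink:
    """Illustrative relay link; `transport` is a list standing in for the wire."""
    transport: List[List[TurnEvent]] = field(default_factory=list)
    buffer: List[TurnEvent] = field(default_factory=list)

    def enqueue(self, oid: int, event: Tuple[Any, ...]) -> None:
        # Events accumulate here for the duration of the current turn.
        self.buffer.append((oid, event))

    def end_of_turn(self) -> None:
        # Flush everything gathered during the turn as one Turn packet.
        if self.buffer:
            self.transport.append(self.buffer)
            self.buffer = []


@dataclass
class RelayEntity:
    """Local proxy holding only the relay link and the remote entity's OID."""
    link: RelayLink
    oid: int

    def on_assert(self, value: Any, handle: int) -> None:
        self.link.enqueue(self.oid, ("assert", value, handle))

    def on_retract(self, handle: int) -> None:
        self.link.enqueue(self.oid, ("retract", handle))

    def on_message(self, body: Any) -> None:
        self.link.enqueue(self.oid, ("message", body))

    def on_sync(self, peer: Any) -> None:
        self.link.enqueue(self.oid, ("sync", peer))
```

In this sketch, events delivered to several relay entities during one turn end up in a single list appended to `transport` when `end_of_turn` is called, which is the property the paragraph above asks for.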
|
||||||
|
|
||||||
## Well-known OIDs
|
## <span id="well-known-oids">Client and server roles</span>
|
||||||
|
|
||||||
OID 0, initial ref, initial oid
|
While the protocol itself is symmetric, in many cases there will be one active ("client") and
|
||||||
|
one passive ("server") party during the establishment of a transport connection.
|
||||||
|
|
||||||
|
As an optional convention, a "server" MAY have a single entity exposed as *well-known OID* 0 at
|
||||||
|
the establishment of a connection, and a "client" MAY likewise expect OID 0 to resolve to some
|
||||||
|
pre-arranged entity. It is frequently useful for the pre-arranged entity to be a [gatekeeper
|
||||||
|
service](./builtin/gatekeeper.md), but direct exposure of a
|
||||||
|
[dataspace](./glossary.md#dataspace) or even some domain-specific object can also be useful.
|
||||||
|
Either or both parties to a connection may play one role, the other, neither, or both.
|
||||||
|
|
||||||
|
APIs for making use of relays in programs should permit programs to supply to a
|
||||||
|
newly-constructed relay an (optional) *initial ref*, to be exposed as well-known OID 0; an
|
||||||
|
(optional) *initial OID*, to denote a remote well-known OID and to be immediately proxied by a
|
||||||
|
local relay entity; or both.
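
One possible shape for such an API is sketched below. It is an assumption made for illustration only: `Relay`, `RemoteProxy`, `initial_ref`, `initial_oid`, `exported` and `imported` are hypothetical names, and the two dictionaries merely stand in for whatever OID-to-reference mappings a real relay maintains.

```python
from typing import Any, Dict, Optional


class RemoteProxy:
    """Hypothetical local relay entity standing for a remote OID."""

    def __init__(self, relay: "Relay", oid: int) -> None:
        self.relay = relay
        self.oid = oid


class Relay:
    """Illustrative relay constructor; not the API of any real implementation."""

    def __init__(self, initial_ref: Optional[Any] = None,
                 initial_oid: Optional[int] = None) -> None:
        self.exported: Dict[int, Any] = {}          # OID -> local ref
        self.imported: Dict[int, RemoteProxy] = {}  # OID -> local proxy
        self.initial_proxy: Optional[RemoteProxy] = None
        if initial_ref is not None:
            # "Server" convention: expose the supplied ref as well-known OID 0.
            self.exported[0] = initial_ref
        if initial_oid is not None:
            # "Client" convention: proxy the remote well-known OID immediately.
            self.initial_proxy = RemoteProxy(self, initial_oid)
            self.imported[initial_oid] = self.initial_proxy
```

Under these assumptions, `Relay(initial_ref=gatekeeper)` plays the "server" role, `Relay(initial_oid=0)` plays the "client" role, and supplying both arguments or neither covers the remaining cases.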
|
||||||
|
|
||||||
|
In the case of TCP/IP, the "client" role is often played by a `connect`ing party, and the
|
||||||
|
"server" by a `listen`ing party, but the opposite arrangement is also useful from time to time.
|
||||||
|
|
||||||
## Security considerations
|
## Security considerations
|
||||||
|
|
||||||
((Tease out into Related Work section?))
|
The security considerations for this protocol fall into two categories: those having to do with
|
||||||
|
particular transports for relay instances, and those having to do with the protocol itself.
|
||||||
|
|
||||||
OIDs are locally-meaningful only, so if the transport is secure, so is the reference. Can't
|
### Transport security
|
||||||
steal one and put it on a different transport: it's like taking fd 6 from another process and
|
|
||||||
trying to use fd 6 locally to mean what the other process means. Extensive related work and
|
|
||||||
prior art here.
|
|
||||||
|
|
||||||
http://www.erights.org/elib/distrib/captp/index.html
|
The security of an instance of the protocol depends on the security characteristics of its
|
||||||
|
transport.
|
||||||
|
|
||||||
Relate terms here to captp terms:
|
**Confidentiality.** Parties outwith the communicating peers must not be able to deduce the
|
||||||
- Hah, `NonceLocator` vs `Gatekeeper`
|
contents of packets sent back and forth: some of the packets may contain secrets. For example,
|
||||||
- well-known "positions" (??) (vs "OID"s?)
|
a `Resolve` message sent to a [gatekeeper service](./builtin/gatekeeper.md) contains a "bearer
|
||||||
- OID = "index", "capability-list index", "c-list index"
|
capability", which conveys authority to any holder able to present it to the gatekeeper.
|
||||||
- @cwebber says "c-list is the structure mapping descriptors to live-refs"
|
|
||||||
|
|
||||||
### Secrecy
|
**Integrity.** Packets delivered to peers must be proof against tampering or other in-flight
|
||||||
|
damage.
|
||||||
|
|
||||||
### Privacy
|
**Authenticity.** Each packet delivered to a peer must have genuinely originated with another
|
||||||
|
party, and must have genuinely originated in the same session. Forgery of packets must be
|
||||||
|
prevented.
|
||||||
|
|
||||||
|
**Replay-resistance.** Each packet delivered to a peer must be delivered exactly once within
|
||||||
|
the context of the transport session. That is, replay of otherwise-authentic packets must not
|
||||||
|
be possible from outside the session.
|
||||||
|
|
||||||
|
### Protocol security
|
||||||
|
|
||||||
|
The protocol builds on, and directly reflects, the [object-capability security
|
||||||
|
model](./glossary.md#object-capability-model) of the SAM. Entities are accessed via unforgeable
|
||||||
|
references (OIDs). OIDs are meaningful only within the context of their transport session; in
|
||||||
|
this way, they are analogous to Unix file descriptors, which are small integers that
|
||||||
|
meaningfully denote objects only within the context of a single Unix process. If the transport
|
||||||
|
is secure, so is the reference.
|
||||||
|
|
||||||
|
Entities can only obtain references to other entities by the [standard methods by which
|
||||||
|
"connectivity begets
|
||||||
|
connectivity"](http://www.erights.org/elib/capability/ode/ode-capabilities.html); namely:
|
||||||
|
|
||||||
|
- *By initial conditions.* The relevant initial conditions here are the state of the relays at
|
||||||
|
the moment a transport session is established, including any mappings from [well-known
|
||||||
|
OIDs](#well-known-oids) to their underlying refs.
|
||||||
|
|
||||||
|
- *By parenthood and by endowment.* No direct provision is made for creation of new entities
|
||||||
|
in this protocol, so these do not apply.
|
||||||
|
|
||||||
|
- *By introduction.* Transmission of OIDs in `Turn` packets, and the associated [rules for
|
||||||
|
managing the mappings between OIDs and references](#membranes), are the normal method by
|
||||||
|
which references pass from one entity to another.
|
||||||
|
|
||||||
|
While transport confidentiality is important for preserving secrecy of secrets such as bearer
|
||||||
|
capabilities, OIDs do not need this kind of protection. An attacker able to observe OIDs
|
||||||
|
communicated via a transport does not gain authority to deliver events to the denoted entity.
|
||||||
|
At most, the attacker may glean information on patterns of interconnectivity among entities
|
||||||
|
communicating across a transport link.
|
||||||
|
|
||||||
|
## Relation to CapTP
|
||||||
|
|
||||||
|
This protocol is *strikingly* similar to a family of protocols known as
|
||||||
|
[CapTP](http://www.erights.org/elib/distrib/captp/index.html) (see, for example,
|
||||||
|
[here](http://www.erights.org/elib/distrib/captp/index.html),
|
||||||
|
[here](https://spritelyproject.org/news/what-is-captp.html) and
|
||||||
|
[here](https://github.com/ocapn/ocapn)). This is no accident: the Syndicated Actor Model draws
|
||||||
|
heavily on the actor model, and has over the years been incrementally evolving to be closer and
|
||||||
|
closer to the actor model as it appears in the [E programming
|
||||||
|
language](http://www.erights.org/). However, the Syndicate protocol described in this document
|
||||||
|
was developed based on the needs of the Syndicated Actor Model, without particular reference to
|
||||||
|
CapTP. This makes it all the more striking that the similarities should be so strong. No doubt
|
||||||
|
I have been subconsciously as well as consciously influenced by E's design, but perhaps there
|
||||||
|
might also be a Platonic form awaiting discovery somewhere nearby.
|
||||||
|
|
||||||
|
For example:
|
||||||
|
|
||||||
|
- CapTP has the notion of a "c-list [capability list] index", cognate with our OID. A c-list
|
||||||
|
index is meaningful only within the context of a transport connection, just like an OID is.
|
||||||
|
A given c-list index maps to a "live-ref", an in-memory pointer to an object, in the same
|
||||||
|
way that an OID maps to a ref via a `WireSymbol`.
|
||||||
|
|
||||||
|
- CapTP has "[the four tables](http://www.erights.org/elib/distrib/captp/4tables.html)" at
|
||||||
|
each end of a connection; each of our relays has two [membranes](#membranes), each having
|
||||||
|
two unidirectional mapping tables.
|
||||||
|
|
||||||
|
- Syndicate [gatekeeper services](./builtin/gatekeeper.md) borrow the concept of a
|
||||||
|
[SturdyRef](http://wiki.erights.org/wiki/SturdyRef) directly from CapTP. However, the notion
|
||||||
|
of a gatekeeper entity at well-known OID 0 is an example of convergent evolution in action:
|
||||||
|
in the CapTP world, the [analogous
|
||||||
|
service](http://www.erights.org/elib/distrib/captp/NonceLocator.html) happens also to be
|
||||||
|
available at c-list index 0, by convention.
|
||||||
|
|
||||||
|
A notable difference is that this protocol completely lacks support for the promises/futures of
|
||||||
|
CapTP. CapTP c-list indices are just one part of a framework of
|
||||||
|
[descriptors](http://www.erights.org/elib/distrib/captp/index.html) (*desc*s) denoting various
|
||||||
|
kinds of remote object and eventual remote-procedure-call (RPC) result. The SAM handles RPC in
|
||||||
|
a different, more low-level way.
|
||||||
|
|
||||||
## Specific transport mappings
|
## Specific transport mappings
|
||||||
|
|
||||||
TCP/IP
|
For now, this document focuses on `SOCK_STREAM`-like transports: reliable, in-order,
|
||||||
|
bidirectional, connection-oriented, full-duplex byte streams. While these transports naturally
|
||||||
|
have a certain level of integrity assurance and replay-resistance associated with them, special
|
||||||
|
care should be taken in the case of non-cryptographic transport protocols like plain TCP/IP.
|
||||||
|
|
||||||
TLS TCP/IP
|
To use such a transport for this protocol, establish a connection and begin transmitting
|
||||||
|
[`Packet`s](#packet-definitions) encoded as Preserves values using either the Preserves [text
|
||||||
|
syntax](https://preserves.gitlab.io/preserves/preserves.html#textual-syntax) or the Preserves
|
||||||
|
[binary syntax](https://preserves.gitlab.io/preserves/preserves.html#compact-binary-syntax).
|
||||||
|
The session starts with the first packet and ends with transport disconnection. A peer MUST
|
||||||
|
disconnect the transport upon syntax error. A responding server MUST support the binary syntax,
|
||||||
|
and MAY also support the text syntax. It can autodetect the syntax variant by following [the
|
||||||
|
rules in the
|
||||||
|
specification](https://preserves.gitlab.io/preserves/preserves.html#appendix-autodetection-of-textual-or-binary-syntax):
|
||||||
|
in short, the first byte of a valid binary-syntax Preserves document is guaranteed not to be
|
||||||
|
interpretable as the start of a valid UTF-8 sequence.
|
||||||
|
|
||||||
WebSockets
|
`Packet`s encoded in either binary or text syntax are self-delimiting. However, peers using
|
||||||
|
text syntax MAY choose to insert whitespace (e.g. newline) after each transmitted packet.
|
||||||
|
|
||||||
|
Some domain-specific details are also relevant:
|
||||||
|
|
||||||
## Other kinds of medium
|
- **Unix-domain sockets.** An additional layer of authentication checks can be made based on
|
||||||
|
process-ID and user-ID credentials associated with each Unix-domain socket.
|
||||||
|
|
||||||
|
- **TCP/IP sockets.** Plain TCP/IP sockets offer only weak message integrity and
|
||||||
|
replay-resistance guarantees, and offer no authenticity or confidentiality guarantees at
|
||||||
|
all. Plain TCP/IP sockets SHOULD NOT be used; consider using TLS sockets instead.
|
||||||
|
|
||||||
|
- **TLS atop TCP/IP.** An additional layer of authentication checks can be made based on the
|
||||||
|
signatures and certificates exchanged during TLS setup.
|
||||||
|
|
||||||
|
> TODO: concretely develop some recommendations for ordinary use of TLS certificates,
|
||||||
|
> including referencing a domain name in a `SturdyRef`, checking the presented certificate,
|
||||||
|
> and requiring SNI at the server end.
|
||||||
|
|
||||||
|
- **WebSockets atop HTTP 1.x.** These suffer similar flaws to plain TCP/IP sockets and SHOULD NOT
|
||||||
|
be used.
|
||||||
|
|
||||||
|
- **WebSockets atop HTTPS 1.x.** Similar considerations to the use of TLS sockets apply
|
||||||
|
regarding authentication checks. WebSocket messages are self-delimiting; peers MUST place
|
||||||
|
exactly one `Packet` in each WebSocket message. Since (a) WebSockets are established after a
|
||||||
|
standard HTTP(S) message header exchange, (b) every HTTP(S) request header starts with an
|
||||||
|
ASCII letter, and (c) every `Packet` in text syntax begins with the ASCII "`<`" character,
|
||||||
|
it is possible to autodetect use of a WebSocket protocol multiplexed on a server socket that
|
||||||
|
is also able to handle plain Preserves binary and/or text syntax for `Packet`s: any ASCII
|
||||||
|
character between "`A`" and "`Z`" or "`a`" and "`z`" must be HTTP, an ASCII "`<`" must be
|
||||||
|
Preserves text syntax, and any byte with the high bit set must be Preserves binary syntax.
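
The first-byte rule in the last item can be sketched as a small classifier. This is a minimal sketch under stated assumptions: the function name `classify_first_byte` and the returned labels are hypothetical, and the `"unknown"` fallback is a choice made here, since the text above does not say how other bytes should be handled.

```python
def classify_first_byte(first: int) -> str:
    """Guess the protocol variant from the first byte of a new connection."""
    if first >= 0x80:
        # High bit set: taken to be Preserves binary syntax.
        return "preserves-binary"
    if first == ord("<"):
        # A text-syntax Packet begins with "<".
        return "preserves-text"
    if ord("A") <= first <= ord("Z") or ord("a") <= first <= ord("z"):
        # An ASCII letter: taken to be the start of an HTTP request.
        return "http"
    # Assumption: anything else is left for the caller to reject.
    return "unknown"
```

A server multiplexing all three variants on one socket could peek at the first received byte, pass it to a function like this, and hand the connection to the matching handler.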
|
||||||
|
|
||||||
Multicast/broadcast, in-memory
|
|
||||||
|
|
||||||
## Appendix: Complete schema of the protocol
|
## Appendix: Complete schema of the protocol
|
||||||
|
|
||||||
|
|