First complete (?) draft of the protocol spec
parent
b16ea848b4
commit
6b2540dc12
183
src/protocol.md
@@ -31,7 +31,7 @@ Transports must
- be able to carry [Preserves](./glossary.md#preserves) values back and forth,
- be reliable and in-order,
- have a well-defined session lifecycle (created → connected → disconnected), and
- assure confidentiality, integrity and replay-resistance.
- assure confidentiality, integrity, authenticity, and replay-resistance.

This document focuses primarily on point-to-point transports, discussing multicast and
in-memory variations briefly toward the end.

@@ -545,46 +545,185 @@ The value is recursively traversed. As the relay comes across each embedded refe

## Relay entities

A relay entity is a local proxy for an entity at the other side of a relay link. It forwards
events delivered to it (`assert`, `retract`, `message` and `sync`) across the link to its
counterpart at the other end. It holds two pieces of state: a pointer to the relay link, and
the OID of the remote entity it represents. It packages all received events into `TurnEvent`s
which are then sent across the transport.

## Client and server roles
**Turn boundaries.** When the relay is structured internally using the SAM, it is important to
preserve turn boundaries. When all the relay entities of a given relay instance are managed by
a single actor, this will be natural: a single turn can deliver events to a group of entities
in the actor, so if the relay entity enqueues its `TurnEvent`s in a buffer which is flushed
into a `Turn` packet sent across the transport at the conclusion of the turn, the correct turn
boundaries will be preserved.
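
The buffering described above can be sketched roughly as follows. This is a minimal illustration, not the reference implementation: `RelayLink`, `RelayEntity`, `enqueue`, `end_turn` and `deliver` are hypothetical names, and the tuple used for a `TurnEvent` only approximates the shape given by the packet definitions.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple

# Assumed shape for illustration: (target OID, (event kind, event body)).
TurnEvent = Tuple[int, Tuple[str, Any]]


@dataclass
class RelayLink:
    """Hypothetical stand-in for one end of a relay link."""
    pending: List[TurnEvent] = field(default_factory=list)
    sent_packets: List[Tuple[str, List[TurnEvent]]] = field(default_factory=list)

    def enqueue(self, event: TurnEvent) -> None:
        # Buffer the event; nothing is transmitted until the turn concludes.
        self.pending.append(event)

    def end_turn(self) -> None:
        # Flush every event buffered during this turn as one Turn packet.
        if self.pending:
            self.sent_packets.append(("Turn", self.pending))
            self.pending = []


@dataclass
class RelayEntity:
    """Local proxy holding the two pieces of state named in the text."""
    link: RelayLink  # pointer to the relay link
    oid: int         # OID of the remote entity this proxy represents

    def deliver(self, kind: str, body: Any) -> None:
        if kind not in ("assert", "retract", "message", "sync"):
            raise ValueError(f"unknown event kind: {kind}")
        self.link.enqueue((self.oid, (kind, body)))
```

Because both proxies share one link, events delivered to different relay entities within a single turn end up in the same `Turn` packet, which is the property the paragraph above relies on.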

## Well-known OIDs
## <span id="well-known-oids"></span>Client and server roles

OID 0, initial ref, initial oid
While the protocol itself is symmetric, in many cases there will be one active ("client") and
one passive ("server") party during the establishment of a transport connection.

As an optional convention, a "server" MAY have a single entity exposed as *well-known OID* 0 at
the establishment of a connection, and a "client" MAY likewise expect OID 0 to resolve to some
pre-arranged entity. It is frequently useful for the pre-arranged entity to be a [gatekeeper
service](./builtin/gatekeeper.md), but direct exposure of a
[dataspace](./glossary.md#dataspace) or even some domain-specific object can also be useful.
Either or both parties to a connection may play one role, the other, neither, or both.

APIs for making use of relays in programs should permit programs to supply to a
newly-constructed relay an (optional) *initial ref*, to be exposed as well-known OID 0; an
(optional) *initial OID*, to denote a remote well-known OID and to be immediately proxied by a
local relay entity; or both.
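
One possible shape for such an API is sketched below. The names `Relay`, `RelayProxy`, `initial_ref`, `initial_oid`, `exported` and `peer` are all hypothetical, chosen only to mirror the terms used above; real implementations may differ considerably.

```python
from typing import Any, Dict, Optional


class RelayProxy:
    """Hypothetical local relay entity standing in for a remote OID."""

    def __init__(self, relay: "Relay", oid: int) -> None:
        self.relay = relay
        self.oid = oid


class Relay:
    """Hypothetical relay constructor taking the two optional arguments."""

    def __init__(self,
                 initial_ref: Optional[Any] = None,
                 initial_oid: Optional[int] = None) -> None:
        # Local entities exposed to the peer, keyed by OID.
        self.exported: Dict[int, Any] = {}
        # Proxy for the peer's well-known entity, if one was requested.
        self.peer: Optional[RelayProxy] = None
        if initial_ref is not None:
            # "Server" role: expose the initial ref as well-known OID 0.
            self.exported[0] = initial_ref
        if initial_oid is not None:
            # "Client" role: immediately proxy the remote well-known OID.
            self.peer = RelayProxy(self, initial_oid)


server = Relay(initial_ref="gatekeeper")         # plays the "server" role only
client = Relay(initial_oid=0)                     # plays the "client" role only
both = Relay(initial_ref="dataspace", initial_oid=0)
```

Keeping the two arguments independent reflects the point that a party may play one role, the other, neither, or both.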

In the case of TCP/IP, the "client" role is often played by a `connect`ing party, and the
"server" by a `listen`ing party, but the opposite arrangement is also useful from time to time.

## Security considerations

((Tease out into Related Work section?))
The security considerations for this protocol fall into two categories: those having to do with
particular transports for relay instances, and those having to do with the protocol itself.

OIDs are locally-meaningful only, so if the transport is secure, so is the reference. Can't
steal one and put it on a different transport: it's like taking fd 6 from another process and
trying to use fd 6 locally to mean what the other process means. Extensive related work and
prior art here.
### Transport security

http://www.erights.org/elib/distrib/captp/index.html
The security of an instance of the protocol depends on the security characteristics of its
transport.

Relate terms here to captp terms:
- Hah, `NonceLocator` vs `Gatekeeper`
- well-known "positions" (??) (vs "OID"s?)
- OID = "index", "capability-list index", "c-list index"
- @cwebber says "c-list is the structure mapping descriptors to live-refs"
**Confidentiality.** Parties outwith the communicating peers must not be able to deduce the
contents of packets sent back and forth: some of the packets may contain secrets. For example,
a `Resolve` message sent to a [gatekeeper service](./builtin/gatekeeper.md) contains a "bearer
capability", which conveys authority to any holder able to present it to the gatekeeper.

### Secrecy
**Integrity.** Packets delivered to peers must be proof against tampering or other in-flight
damage.

### Privacy
**Authenticity.** Each packet delivered to a peer must have genuinely originated with another
party, and must have genuinely originated in the same session. Forgery of packets must be
prevented.

**Replay-resistance.** Each packet delivered to a peer must be delivered exactly once within
the context of the transport session. That is, replay of otherwise-authentic packets must not
be possible from outside the session.

### Protocol security

The protocol builds on, and directly reflects, the [object-capability security
model](./glossary.md#object-capability-model) of the SAM. Entities are accessed via unforgeable
references (OIDs). OIDs are meaningful only within the context of their transport session; in
this way, they are analogous to Unix file descriptors, which are small integers that
meaningfully denote objects only within the context of a single Unix process. If the transport
is secure, so is the reference.
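
The file-descriptor analogy can be made concrete with a toy model. The `Session` class and its `export` and `resolve` methods are hypothetical, and the sequential OID allocation is an assumption made purely for illustration; the actual rules for managing OID mappings are those given for membranes.

```python
from typing import Any, Dict, Optional


class Session:
    """Toy per-session OID table; each transport session owns its own."""

    def __init__(self) -> None:
        self._table: Dict[int, Any] = {}
        self._next_oid = 0

    def export(self, entity: Any) -> int:
        # Allocate the next small integer, much as a process allocates fds.
        oid = self._next_oid
        self._next_oid += 1
        self._table[oid] = entity
        return oid

    def resolve(self, oid: int) -> Optional[Any]:
        # An OID is looked up only in this session's own table.
        return self._table.get(oid)


s1, s2 = Session(), Session()
oid_a = s1.export("entity A")
oid_b = s1.export("entity B")
oid_c = s2.export("entity C")
```

Here `oid_a` and `oid_c` are the same number yet denote different entities, and `oid_b` resolves to nothing in `s2`: an OID observed on one session conveys no authority on another.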

Entities can only obtain references to other entities by the [standard methods by which
"connectivity begets
connectivity"](http://www.erights.org/elib/capability/ode/ode-capabilities.html); namely:

- *By initial conditions.* The relevant initial conditions here are the state of the relays at
  the moment a transport session is established, including any mappings from [well-known
  OIDs](#well-known-oids) to their underlying refs.

- *By parenthood and by endowment.* No direct provision is made for creation of new entities
  in this protocol, so these do not apply.

- *By introduction.* Transmission of OIDs in `Turn` packets, and the associated [rules for
  managing the mappings between OIDs and references](#membranes), are the normal method by
  which references pass from one entity to another.

While transport confidentiality is important for preserving secrecy of secrets such as bearer
capabilities, OIDs do not need this kind of protection. An attacker able to observe OIDs
communicated via a transport does not gain authority to deliver events to the denoted entity.
At most, the attacker may glean information on patterns of interconnectivity among entities
communicating across a transport link.

## Relation to CapTP

This protocol is *strikingly* similar to a family of protocols known as
[CapTP](http://www.erights.org/elib/distrib/captp/index.html) (see, for example,
[here](http://www.erights.org/elib/distrib/captp/index.html),
[here](https://spritelyproject.org/news/what-is-captp.html) and
[here](https://github.com/ocapn/ocapn)). This is no accident: the Syndicated Actor Model draws
heavily on the actor model, and has over the years been incrementally evolving to be closer and
closer to the actor model as it appears in the [E programming
language](http://www.erights.org/). However, the Syndicate protocol described in this document
was developed based on the needs of the Syndicated Actor Model, without particular reference to
CapTP. This makes it all the more striking that the similarities should be so strong. No doubt
I have been subconsciously as well as consciously influenced by E's design, but perhaps there
might also be a Platonic form awaiting discovery somewhere nearby.

For example:

- CapTP has the notion of a "c-list [capability list] index", cognate with our OID. A c-list
  index is meaningful only within the context of a transport connection, just like an OID is.
  A given c-list index maps to a "live-ref", an in-memory pointer to an object, in the same
  way that an OID maps to a ref via a `WireSymbol`.

- CapTP has "[the four tables](http://www.erights.org/elib/distrib/captp/4tables.html)" at
  each end of a connection; each of our relays has two [membranes](#membranes), each having
  two unidirectional mapping tables.

- Syndicate [gatekeeper services](./builtin/gatekeeper.md) borrow the concept of a
  [SturdyRef](http://wiki.erights.org/wiki/SturdyRef) directly from CapTP. However, the notion
  of a gatekeeper entity at well-known OID 0 is an example of convergent evolution in action:
  in the CapTP world, the [analogous
  service](http://www.erights.org/elib/distrib/captp/NonceLocator.html) happens also to be
  available at c-list index 0, by convention.

A notable difference is that this protocol completely lacks support for the promises/futures of
CapTP. CapTP c-list indices are just one part of a framework of
[descriptors](http://www.erights.org/elib/distrib/captp/index.html) (*desc*s) denoting various
kinds of remote object and eventual remote-procedure-call (RPC) result. The SAM handles RPC in
a different, more low-level way.

## Specific transport mappings

TCP/IP
For now, this document focuses on `SOCK_STREAM`-like transports: reliable, in-order,
bidirectional, connection-oriented, full-duplex byte streams. While these transports naturally
have a certain level of integrity assurance and replay-resistance associated with them, special
care should be taken in the case of non-cryptographic transport protocols like plain TCP/IP.

TLS TCP/IP
To use such a transport for this protocol, establish a connection and begin transmitting
[`Packet`s](#packet-definitions) encoded as Preserves values using either the Preserves [text
syntax](https://preserves.gitlab.io/preserves/preserves.html#textual-syntax) or the Preserves
[binary syntax](https://preserves.gitlab.io/preserves/preserves.html#compact-binary-syntax).
The session starts with the first packet and ends with transport disconnection. A peer MUST
disconnect the transport upon syntax error. A responding server MUST support the binary syntax,
and MAY also support the text syntax. It can autodetect the syntax variant by following [the
rules in the
specification](https://preserves.gitlab.io/preserves/preserves.html#appendix-autodetection-of-textual-or-binary-syntax):
in short, the first byte of a valid binary-syntax Preserves document is guaranteed not to be
interpretable as the start of a valid UTF-8 sequence.

WebSockets
`Packet`s encoded in either binary or text syntax are self-delimiting. However, peers using
text syntax MAY choose to insert whitespace (e.g. newline) after each transmitted packet.

Some domain-specific details are also relevant:

## Other kinds of medium
- **Unix-domain sockets.** An additional layer of authentication checks can be made based on
  process-ID and user-ID credentials associated with each Unix-domain socket.

- **TCP/IP sockets.** Plain TCP/IP sockets offer only weak message integrity and
  replay-resistance guarantees, and offer no authenticity or confidentiality guarantees at
  all. Plain TCP/IP sockets SHOULD NOT be used; consider using TLS sockets instead.

- **TLS atop TCP/IP.** An additional layer of authentication checks can be made based on the
  signatures and certificates exchanged during TLS setup.

  > TODO: concretely develop some recommendations for ordinary use of TLS certificates,
  > including referencing a domain name in a `SturdyRef`, checking the presented certificate,
  > and requiring SNI at the server end.

- **WebSockets atop HTTP 1.x.** These suffer similar flaws to plain TCP/IP sockets and SHOULD NOT
  be used.

- **WebSockets atop HTTPS 1.x.** Similar considerations to the use of TLS sockets apply
  regarding authentication checks. WebSocket messages are self-delimiting; peers MUST place
  exactly one `Packet` in each WebSocket message. Since (a) WebSockets are established after a
  standard HTTP(S) message header exchange, (b) every HTTP(S) request header starts with an
  ASCII letter, and (c) every `Packet` in text syntax begins with the ASCII "`<`" character,
  it is possible to autodetect use of a WebSocket protocol multiplexed on a server socket that
  is also able to handle plain Preserves binary and/or text syntax for `Packet`s: any ASCII
  character between "`A`" and "`Z`" or "`a`" and "`z`" must be HTTP, an ASCII "`<`" must be
  Preserves text syntax, and any byte with the high bit set must be Preserves binary syntax.
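
The first-byte rule in the last item above might be sketched as follows. The function name `classify_first_byte` and the returned labels are hypothetical, and treating every other byte as `"unknown"` is an assumption of this sketch rather than something the text specifies.

```python
def classify_first_byte(b: int) -> str:
    """Guess the protocol on a freshly accepted connection from its first byte."""
    if not 0 <= b <= 0xFF:
        raise ValueError("expected a single byte value")
    if 0x41 <= b <= 0x5A or 0x61 <= b <= 0x7A:
        # ASCII letter: the start of an HTTP(S) request header.
        return "http"
    if b == 0x3C:
        # ASCII "<": a Packet in Preserves text syntax.
        return "preserves-text"
    if b & 0x80:
        # High bit set: Preserves binary syntax.
        return "preserves-binary"
    # Anything else is left to the caller (assumption of this sketch).
    return "unknown"
```

A server multiplexing all three on one listening socket would peek at the first received byte, call a function like this, and hand the connection to the matching handler.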

Multicast/broadcast, in-memory

## Appendix: Complete schema of the protocol