More manual

Tony Garnock-Jones 2022-02-11 23:21:50 +01:00
parent 4c61507658
commit b2194acdbf
2 changed files with 61 additions and 45 deletions

@ -76,7 +76,7 @@
- [Overview](./guide/index.md)
- [Preserves](./guide/preserves.md)
- [Schemas and schema bundles]()
- [Working with schemas]()
- [Language-neutral protocols]()
- [Rust]()
- [Python]()

@ -25,25 +25,24 @@ document](https://preserves.gitlab.io/preserves/preserves.html):
## Why does Synit rely on Preserves?
There are five aspects of Preserves that make it particularly relevant to Synit:
There are four aspects of Preserves that make it particularly relevant to Synit:
- the core Preserves [data language](#grammar-of-values) has a robust *semantics*;
- Preserves values may have [capability references() embedded within them;
- Preserves has a [schema language](#schemas) useful for specifying protocols among actors;
- a [canonical form](#canonical-form) exists for every Preserves value; and
- Preserves has a [query language](#preserves-path) for extracting portions of a Preserves value.
- a [canonical form](#canonical-form) exists for every Preserves value;
- Preserves values may have [capability references](#capabilities) embedded within them; and
- Preserves has a [schema language](#schemas) useful for specifying protocols among actors.
## Grammar of values
The main reason Preserves is useful for Synit is that it has *semantics*: the specification
defines a language-independent *equivalence relation* over Preserves
values.[^preserves-ordering-exists-too] This makes it a solid foundation for a multi-language,
multi-process, potentially distributed system like Synit.
[^dataspaces-need-data-with-semantics]
Preserves has programming-language-independent *semantics*: the specification defines an
*equivalence relation* over Preserves values.[^preserves-ordering-exists-too] This makes it a
solid foundation for a multi-language, multi-process, potentially distributed system like
Synit. [^dataspaces-need-data-with-semantics]
### Abstract syntax: Values
The *abstract syntax* of Preserves values is as follows (from the specification):
The *abstract syntax* of Preserves values includes a few basic atomic types, plus sequence,
set, dictionary, and record compound types. From the specification:
Value = Atom Atom = Boolean
| Compound | Float
@ -56,24 +55,25 @@ The *abstract syntax* of Preserves values is as follows (from the specification)
### Concrete syntax
Because Preserves has semantics independent of its syntax, we are free to define *syntax*
appropriate for its use in different settings. Values can be automatically, *losslessly*
translated from one syntax to another. The core Preserves specification defines both a
*text-based*, human-readable, JSON-like syntax, that is a syntactic superset of JSON, and a
completely equivalent compact *binary* syntax, crucial to the definition of [canonical
form](#canonical-form) for Preserves values.[^syrup]
Because Preserves' semantics are independent of its syntax, we may use different syntax for it
in different settings. Values can be automatically, losslessly translated from one syntax to
another.
The core Preserves specification defines a text-based, human-readable, JSON-like syntax that
is a syntactic superset of JSON, and a completely equivalent compact binary syntax that is
crucial to the definition of [canonical form](#canonical-form) for Preserves values.[^syrup]
Here are a few example values, written using the text syntax (see [the
specification](https://preserves.gitlab.io/preserves/preserves.html#textual-syntax) for the
grammar):
Boolean : #t, #f
Float : 1.0f, 10.4e3f, -100.6f
Double : 1.0, 10.4e3, -100.6
Integer : 1, 0, -100
Boolean : #t #f
Float : 1.0f 10.4e3f -100.6f
Double : 1.0 10.4e3 -100.6
Integer : 1 0 -100
String : "Hello, world!\n"
ByteString : #"bin\x00str\x00", #[YmluAHN0cgA], #x"62696e0073747200"
Symbol : hello-world, |hello world|, =, !, hello?, ||, ...
ByteString : #"bin\x00str\x00" #[YmluAHN0cgA] #x"62696e0073747200"
Symbol : hello-world |hello world| = ! hello? || ...
Record : <label field1 field2 ...>
Sequence : [value1 value2 ...]
Set : #{value1 value2 ...}
@ -89,37 +89,51 @@ syntax](https://preserves.gitlab.io/preserves/preserves.html#compact-binary-synt
[a few simple rules](https://preserves.gitlab.io/preserves/canonical-binary.html) about
serialization ordering of elements in sets and keys in dictionaries.
Having a canonical form means that, for example, a SHA-512 (or other secure) digest of the
canonical serialization of a value can be used as a unique, short name for the value.
Having a canonical form means that, for example, a cryptographic hash of a value's canonical
serialization can be used as a unique fingerprint for the value.
For example, the value
For example, the SHA-512 digest of the canonical serialization of the value
```preserves
<sms-delivery <address international "31653131313">
<sms-delivery <address international "31653131313">
<address international "31655512345">
<rfc3339 "2022-02-09T08:18:29.88847+01:00">
"This is a test SMS message">
```
serializes canonically to
00000000: b4b3 0c73 6d73 2d64 656c 6976 6572 79b4 ...sms-delivery.
00000010: b307 6164 6472 6573 73b3 0d69 6e74 6572 ..address..inter
00000020: 6e61 7469 6f6e 616c b10b 3331 3635 3331 national..316531
00000030: 3331 3331 3384 b4b3 0761 6464 7265 7373 31313....address
00000040: b30d 696e 7465 726e 6174 696f 6e61 6cb1 ..international.
00000050: 0b33 3136 3535 3531 3233 3435 84b4 b307 .31655512345....
00000060: 7266 6333 3333 39b1 1f32 3032 322d 3032 rfc3339..2022-02
00000070: 2d30 3954 3038 3a31 383a 3239 2e38 3838 -09T08:18:29.888
00000080: 3437 2b30 313a 3030 84b1 1a54 6869 7320 47+01:00...This
00000090: 6973 2061 2074 6573 7420 534d 5320 6d65 is a test SMS me
000000a0: 7373 6167 6584 ssage.
which has SHA-512 hash
is
bfea9bd5ddf7781e34b6ca7e146ba2e442ef8ce04fd5ff912f889359945d0e2967a77a13
c86b13959dcce7e8ba3950d303832b825648609447b3d147677163ce
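As a rough illustration of how such a fingerprint might be computed, here is a minimal Python sketch that handles only records, symbols, and strings. The tag bytes are read off the hex dump of this example (`0xb4` opens a record, `0xb3` introduces a symbol, `0xb1` a string, `0x84` closes a record) rather than taken from the specification. The `Symbol`, `Record`, `encode`, and `address` names are hypothetical helpers, not part of any Preserves library, and lengths are assumed to fit in a single byte.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str


@dataclass(frozen=True)
class Record:
    label: object
    fields: tuple


def _length(n: int) -> bytes:
    # Simplification: this sketch only handles single-byte lengths (< 128).
    if not 0 <= n < 128:
        raise ValueError("length too large for this sketch")
    return bytes([n])


def encode(value) -> bytes:
    """Encode a record, symbol, or string using the tag bytes seen in the dump."""
    if isinstance(value, Symbol):
        data = value.name.encode("utf-8")
        return b"\xb3" + _length(len(data)) + data
    if isinstance(value, str):
        data = value.encode("utf-8")
        return b"\xb1" + _length(len(data)) + data
    if isinstance(value, Record):
        body = b"".join(encode(f) for f in value.fields)
        return b"\xb4" + encode(value.label) + body + b"\x84"
    raise TypeError(f"unsupported value: {value!r}")


def address(number: str) -> Record:
    return Record(Symbol("address"), (Symbol("international"), number))


sms = Record(Symbol("sms-delivery"), (
    address("31653131313"),
    address("31655512345"),
    Record(Symbol("rfc3339"), ("2022-02-09T08:18:29.88847+01:00",)),
    "This is a test SMS message",
))

encoded = encode(sms)
fingerprint = hashlib.sha512(encoded).hexdigest()
print(len(encoded), fingerprint)
```

Because the canonical form fixes one byte sequence per value, any two parties that serialize an equivalent value this way would hash the same bytes. A real implementation would also need the ordering rules for sets and dictionary keys, which this sketch omits.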
### Capabilities
Preserves values can include *embedded references*, written as values with a `#!` prefix. For
example, a command adding `<some-setting>` to the user settings database might be sent to the
root dataspace as follows:
```preserves
<user-settings-command <assert <some-setting>> #![0 123]>
```
The `user-settings-command` structure includes the `assert` command itself, plus an embedded
capability reference, `#![0 123]`, which encodes a transport-specific reference to an object.
> TODO: Link to documentation for `sturdy.prs`.
The syntax of values under `#!` differs depending on the medium carrying the message:
point-to-point transports need to be able to refer to "my references" (`#![0 `*n*`]`) and "your
references" (`#![1 `*n*`]`); multicast/broadcast media (like Ethernet) need to be able to name
references within specific, named conversational participants (`#![<udp [192 168 1 10] 5999>
`*n*`]`); in-memory representations use direct pointers (`#!140425190562944`); and so on. In
every case, the references themselves function very similarly to Unix file descriptors: an
integer or similar that unforgeably denotes, in a local context, some complex data structure on
the other side of a trust boundary.
When capability-bearing Preserves values are read off a transport, the capabilities are
automatically rewritten into references to in-memory proxy objects. The reverse process of
rewriting capability references happens when an in-memory value is serialized for transmission.
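The rewriting step described above can be sketched in Python roughly as follows. This is an illustrative model only: `Embedded`, `Proxy`, `Transport`, and `rewrite` are hypothetical names, records are modelled as plain tuples, and the sketch does not attempt the "my references"/"your references" distinction that a real point-to-point transport would need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Embedded:
    """A `#!`-prefixed reference; `ref` is whatever the current medium uses."""
    ref: object


class Proxy:
    """Stand-in for an in-memory object representing a remote capability."""

    def __init__(self, wire_ref):
        self.wire_ref = wire_ref


def rewrite(value, fn):
    """Return a copy of `value` with `fn` applied to every Embedded reference."""
    if isinstance(value, Embedded):
        return fn(value)
    if isinstance(value, (list, tuple)):
        return type(value)(rewrite(v, fn) for v in value)
    return value


class Transport:
    def __init__(self):
        self.proxies = {}  # wire reference (as a tuple) -> Proxy

    def read(self, wire_value):
        """Replace wire references with in-memory proxies, reusing known ones."""
        def to_proxy(e):
            key = tuple(e.ref)
            if key not in self.proxies:
                self.proxies[key] = Proxy(key)
            return Embedded(self.proxies[key])
        return rewrite(wire_value, to_proxy)

    def write(self, memory_value):
        """Replace in-memory proxies with their wire references."""
        def to_wire(e):
            return Embedded(list(e.ref.wire_ref))
        return rewrite(memory_value, to_wire)


transport = Transport()
wire = ("user-settings-command", ("assert", ("some-setting",)), Embedded([0, 123]))
in_memory = transport.read(wire)
round_tripped = transport.write(in_memory)
```

The table in `Transport` plays a role similar to a file descriptor table: the small wire-level name only means something relative to that one connection, while the rest of the program sees ordinary object references.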
## Schemas
Preserves comes with a schema language suitable for defining protocols among actors/programs in
@ -306,7 +320,7 @@ Here's the Preserves value equivalent to the example above, expressed using the
#### Notes
[^preserves-ordering-exists-too]: The specification defines a *total order relation* over
[^preserves-ordering-exists-too]: The specification defines a total order relation over
Preserves values as well.
[^dataspaces-need-data-with-semantics]: In particular, *dataspaces* need the assertion data
@ -335,4 +349,6 @@ Here's the Preserves value equivalent to the example above, expressed using the
[^including-json]: Including JSON values, of course!
[^lose-compatibility]: By doing so, we of course lose compatibility with the Serde structures.
[^lose-compatibility]: By doing so, we lose compatibility with the Serde structures, but the
point is to show the kinds of schemas available to us once we move away from strict
compatibility with existing data formats.