Preserves stuff for the manual
parent
c7e2abe1fb
commit
3ef314bc21
|
@ -12,13 +12,13 @@
|
||||||
|
|
||||||
- [System overview](./operation/index.md)
|
- [System overview](./operation/index.md)
|
||||||
- [The System Bus: syndicate-server](./operation/system-bus.md)
|
- [The System Bus: syndicate-server](./operation/system-bus.md)
|
||||||
|
- [Configuration language](./operation/scripting.md)
|
||||||
- [Services and service dependencies](./operation/service.md)
|
- [Services and service dependencies](./operation/service.md)
|
||||||
- [Built-in services and service classes](./operation/builtin/index.md)
|
- [Built-in services and service classes](./operation/builtin/index.md)
|
||||||
- [Gatekeeper](./operation/builtin/gatekeeper.md)
|
- [Gatekeeper](./operation/builtin/gatekeeper.md)
|
||||||
- [TCP/IP and Unix-socket Transports](./operation/builtin/relay-listener.md)
|
- [TCP/IP and Unix-socket Transports](./operation/builtin/relay-listener.md)
|
||||||
- [Configuration watcher](./operation/builtin/config-watcher.md)
|
- [Configuration watcher](./operation/builtin/config-watcher.md)
|
||||||
- [Daemons and external programs](./operation/builtin/daemon.md)
|
- [Daemons and external programs](./operation/builtin/daemon.md)
|
||||||
- [Configuration scripting language](./operation/scripting.md)
|
|
||||||
- [Configuration files and directories]()
|
- [Configuration files and directories]()
|
||||||
- [The boot layer]()
|
- [The boot layer]()
|
||||||
- [Logging]()
|
- [Logging]()
|
||||||
|
|
|
@ -1,9 +1,334 @@
|
||||||
# Preserves
|
# Preserves
|
||||||
|
|
||||||
an S-expression-like language that is a syntactic superset of
|
Synit makes **extensive** use of *Preserves*, a programming-language-independent language for
|
||||||
JSON. Like JSON, Preserves is not specifically tied to any
|
data.
|
||||||
particular programming language. Unlike JSON, Preserves has a
|
|
||||||
robust semantics, designed specifically to be a solid foundation
|
- [Preserves homepage](https://preserves.gitlab.io/)
|
||||||
for networked communication.
|
- [Preserves specification](https://preserves.gitlab.io/preserves/preserves.html)
|
||||||
|
- [Preserves schema-language specification](https://preserves.gitlab.io/preserves/preserves-schema.html)
|
||||||
|
- [Source code](https://gitlab.com/preserves/preserves) for many (not all) of the implementations
|
||||||
|
- Implementations for
|
||||||
|
[Nim](https://git.sr.ht/~ehmry/preserves-nim),
|
||||||
|
[Python](https://pypi.org/project/preserves/),
|
||||||
|
[Racket](https://pkgs.racket-lang.org/package/preserves),
|
||||||
|
[Rust](https://docs.rs/preserves/latest/preserves/),
|
||||||
|
[Squeak Smalltalk](https://squeaksource.com/Preserves.html),
|
||||||
|
[TypeScript/JavaScript](https://www.npmjs.com/org/preserves)
|
||||||
|
|
||||||
|
The Preserves data language is in many ways comparable to JSON, XML, S-expressions, CBOR, ASN.1
|
||||||
|
BER, and so on. From the [specification
|
||||||
|
document](https://preserves.gitlab.io/preserves/preserves.html):
|
||||||
|
|
||||||
|
> Preserves supports *records* with user-defined *labels*, embedded *references*, and the usual
|
||||||
|
> suite of atomic and compound data types, including *binary* data as a distinct type from text
|
||||||
|
> strings.
|
||||||
|
|
||||||
|
## Why does Synit rely on Preserves?
|
||||||
|
|
||||||
|
There are five aspects of Preserves that make it particularly relevant to Synit:
|
||||||
|
|
||||||
|
- the core Preserves [data language](#grammar-of-values) has a robust *semantics*;
|
||||||
|
- Preserves values may have [capability references]() embedded within them;
|
||||||
|
- Preserves has a [schema language](#schemas) useful for specifying protocols among actors;
|
||||||
|
- a [canonical form](#canonical-form) exists for every Preserves value; and
|
||||||
|
- Preserves has a [query language](#preserves-path) for extracting portions of a Preserves value.
|
||||||
|
|
||||||
|
## Grammar of values
|
||||||
|
|
||||||
|
The main reason Preserves is useful for Synit is that it has *semantics*: the specification
|
||||||
|
defines a language-independent *equivalence relation* over Preserves
|
||||||
|
values.[^preserves-ordering-exists-too] This makes it a solid foundation for a multi-language,
|
||||||
|
multi-process, potentially distributed system like Synit.
|
||||||
|
[^dataspaces-need-data-with-semantics]
|
||||||
|
|
||||||
|
### Abstract syntax: Values
|
||||||
|
|
||||||
|
The *abstract syntax* of Preserves values is as follows (from the specification):
|
||||||
|
|
||||||
|
    Value    = Atom              Atom = Boolean
|
||||||
|
             | Compound               | Float
|
||||||
|
             | Embedded               | Double
|
||||||
|
                                      | SignedInteger
|
||||||
|
    Compound = Record                 | String
|
||||||
|
             | Sequence               | ByteString
|
||||||
|
             | Set                    | Symbol
|
||||||
|
             | Dictionary
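
As an illustration only, the abstract syntax above maps naturally onto an algebraic data type. The following Rust enum is a minimal sketch and not the type used by the real `preserves` crate: the names `Value` and `example_address` are hypothetical, integers are narrowed to `i128` although the specification's integers are unbounded, and sets and dictionaries are plain vectors with no deduplication or ordering. The derived `PartialEq` is also only an approximation of the specification's equivalence relation (for instance, it follows IEEE rules for floating-point values).

```rust
/// Sketch of the Preserves abstract syntax as a Rust enum.
/// `E` is the type of embedded values (for example, capability references).
#[derive(Debug, Clone, PartialEq)]
pub enum Value<E> {
    // Atoms
    Boolean(bool),
    Float(f32),
    Double(f64),
    SignedInteger(i128), // simplification: the specification allows any size
    String(String),
    ByteString(Vec<u8>),
    Symbol(String),
    // Compounds
    Record { label: Box<Value<E>>, fields: Vec<Value<E>> },
    Sequence(Vec<Value<E>>),
    Set(Vec<Value<E>>),                    // simplification: no deduplication
    Dictionary(Vec<(Value<E>, Value<E>)>), // simplification: association list
    // Embedded
    Embedded(E),
}

/// Hypothetical helper: builds `<address international "31653131313">`.
pub fn example_address() -> Value<()> {
    Value::Record {
        label: Box::new(Value::Symbol("address".to_string())),
        fields: vec![
            Value::Symbol("international".to_string()),
            Value::String("31653131313".to_string()),
        ],
    }
}
```

Note that a record's label is itself a `Value`, so labels are not restricted to symbols, even though symbols are the common case.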
|
||||||
|
|
||||||
|
### Concrete syntax
|
||||||
|
|
||||||
|
Because Preserves has semantics independent of its syntax, we are free to define *syntax*
|
||||||
|
appropriate for its use in different settings. Values can be automatically, *losslessly*
|
||||||
|
translated from one syntax to another. The core Preserves specification defines both a
|
||||||
|
*text-based*, human-readable, JSON-like syntax, that is a syntactic superset of JSON, and a
|
||||||
|
completely equivalent compact *binary* syntax, crucial to the definition of [canonical
|
||||||
|
form](#canonical-form) for Preserves values.[^syrup]
|
||||||
|
|
||||||
|
Here are a few example values, written using the text syntax (see [the
|
||||||
|
specification](https://preserves.gitlab.io/preserves/preserves.html#textual-syntax) for the
|
||||||
|
grammar):
|
||||||
|
|
||||||
|
Boolean : #t, #f
|
||||||
|
Float : 1.0f, 10.4e3f, -100.6f
|
||||||
|
Double : 1.0, 10.4e3, -100.6
|
||||||
|
Integer : 1, 0, -100
|
||||||
|
String : "Hello, world!\n"
|
||||||
|
ByteString : #"bin\x00str\x00", #[YmluAHN0cgA], #x"62696e0073747200"
|
||||||
|
Symbol : hello-world, |hello world|, =, !, hello?, ||, ...
|
||||||
|
Record : <label field1 field2 ...>
|
||||||
|
Sequence : [value1 value2 ...]
|
||||||
|
Set : #{value1 value2 ...}
|
||||||
|
Dictionary : {key1: value1 key2: value2 ...: ...}
|
||||||
|
Embedded : #!value
|
||||||
|
|
||||||
|
Commas are optional in sequences, sets, and dictionaries.
|
||||||
|
|
||||||
|
### Canonical form
|
||||||
|
|
||||||
|
Every Preserves value can be serialized into a *canonical form* using the [binary
|
||||||
|
syntax](https://preserves.gitlab.io/preserves/preserves.html#compact-binary-syntax) along with
|
||||||
|
[a few simple rules](https://preserves.gitlab.io/preserves/canonical-binary.html) about
|
||||||
|
serialization ordering of elements in sets and keys in dictionaries.
|
||||||
|
|
||||||
|
Having a canonical form means that, for example, a SHA-512 (or other secure) digest of the
|
||||||
|
canonical serialization of a value can be used as a unique, short name for the value.
|
||||||
|
|
||||||
|
For example, the value
|
||||||
|
|
||||||
|
```preserves
|
||||||
|
<sms-delivery <address international "31653131313">
|
||||||
|
<address international "31655512345">
|
||||||
|
<rfc3339 "2022-02-09T08:18:29.88847+01:00">
|
||||||
|
"This is a test SMS message">
|
||||||
|
```
|
||||||
|
|
||||||
|
serializes canonically to
|
||||||
|
|
||||||
|
00000000: b4b3 0c73 6d73 2d64 656c 6976 6572 79b4 ...sms-delivery.
|
||||||
|
00000010: b307 6164 6472 6573 73b3 0d69 6e74 6572 ..address..inter
|
||||||
|
00000020: 6e61 7469 6f6e 616c b10b 3331 3635 3331 national..316531
|
||||||
|
00000030: 3331 3331 3384 b4b3 0761 6464 7265 7373 31313....address
|
||||||
|
00000040: b30d 696e 7465 726e 6174 696f 6e61 6cb1 ..international.
|
||||||
|
00000050: 0b33 3136 3535 3531 3233 3435 84b4 b307 .31655512345....
|
||||||
|
00000060: 7266 6333 3333 39b1 1f32 3032 322d 3032 rfc3339..2022-02
|
||||||
|
00000070: 2d30 3954 3038 3a31 383a 3239 2e38 3838 -09T08:18:29.888
|
||||||
|
00000080: 3437 2b30 313a 3030 84b1 1a54 6869 7320 47+01:00...This
|
||||||
|
00000090: 6973 2061 2074 6573 7420 534d 5320 6d65 is a test SMS me
|
||||||
|
000000a0: 7373 6167 6584 ssage.
|
||||||
|
|
||||||
|
which has SHA-512 hash
|
||||||
|
|
||||||
|
bfea9bd5ddf7781e34b6ca7e146ba2e442ef8ce04fd5ff912f889359945d0e2967a77a13
|
||||||
|
c86b13959dcce7e8ba3950d303832b825648609447b3d147677163ce
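
The byte layout in the dump above can be reproduced by hand. The following Rust sketch is an illustration under stated assumptions, not the encoder from any Preserves implementation: the tag bytes (`0xb4` for the start of a record, `0xb3` for a symbol, `0xb1` for a string, `0x84` for the end of a compound) are read off the hex dump, only payloads shorter than 128 bytes are handled so that each length fits in one byte, and the names `Value`, `encode`, and `sms_example` are hypothetical.

```rust
/// Minimal value type: just enough for the SMS example.
pub enum Value {
    Symbol(&'static str),
    Str(&'static str),
    Record(Box<Value>, Vec<Value>),
}

/// Appends the binary encoding of `v` to `out`.
pub fn encode(v: &Value, out: &mut Vec<u8>) {
    match v {
        Value::Symbol(s) => encode_bytes(0xb3, s.as_bytes(), out),
        Value::Str(s) => encode_bytes(0xb1, s.as_bytes(), out),
        Value::Record(label, fields) => {
            out.push(0xb4); // start of record
            encode(label, out);
            for field in fields {
                encode(field, out);
            }
            out.push(0x84); // end of compound
        }
    }
}

/// Writes a tag, a one-byte length, and the payload.
fn encode_bytes(tag: u8, payload: &[u8], out: &mut Vec<u8>) {
    // Assumption of this sketch: longer payloads need a multi-byte length.
    assert!(payload.len() < 128, "sketch handles single-byte lengths only");
    out.push(tag);
    out.push(payload.len() as u8);
    out.extend_from_slice(payload);
}

/// Builds a record whose label is a symbol.
fn record(label: &'static str, fields: Vec<Value>) -> Value {
    Value::Record(Box::new(Value::Symbol(label)), fields)
}

/// Encodes the `sms-delivery` example value from the text.
pub fn sms_example() -> Vec<u8> {
    let v = record("sms-delivery", vec![
        record("address", vec![Value::Symbol("international"), Value::Str("31653131313")]),
        record("address", vec![Value::Symbol("international"), Value::Str("31655512345")]),
        record("rfc3339", vec![Value::Str("2022-02-09T08:18:29.88847+01:00")]),
        Value::Str("This is a test SMS message"),
    ]);
    let mut out = Vec::new();
    encode(&v, &mut out);
    out
}
```

Calling `sms_example()` yields 166 bytes, matching the length of the dump (`0xa6`). Feeding those bytes to any SHA-512 implementation gives the short name for the value. This example contains no sets or dictionaries, so the canonical ordering rules do not come into play.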
|
||||||
|
|
||||||
## Schemas
|
## Schemas
|
||||||
|
|
||||||
|
Preserves comes with a schema language suitable for defining protocols among actors/programs in
|
||||||
|
Synit. Because Preserves is a superset of JSON, its schemas can be used for parsing JSON just
|
||||||
|
as well as for native Preserves values. From the [schema
|
||||||
|
specification](https://preserves.gitlab.io/preserves/preserves-schema.html):
|
||||||
|
|
||||||
|
> A Preserves schema connects Preserves Values to host-language data
|
||||||
|
> structures. Each definition within a schema can be processed by a
|
||||||
|
> compiler to produce
|
||||||
|
>
|
||||||
|
> - a host-language *type definition*;
|
||||||
|
> - a partial *parsing* function from Values to instances of the
|
||||||
|
> produced type; and
|
||||||
|
> - a total *serialization* function from instances of the type to
|
||||||
|
> Values.
|
||||||
|
>
|
||||||
|
> Every parsed Value retains enough information to always be able to
|
||||||
|
> be serialized again, and every instance of a host-language data
|
||||||
|
> structure contains, by construction, enough information to be
|
||||||
|
> successfully serialized.
|
||||||
|
|
||||||
|
Instead of taking host-language data structure definitions as primary, in the way that systems
|
||||||
|
like [Serde](https://serde.rs/) do, Preserves schemas take *the shape of the serialized data*
|
||||||
|
as primary.
|
||||||
|
|
||||||
|
To see the difference, let's look at an example.
|
||||||
|
|
||||||
|
### Example: Book Outline
|
||||||
|
|
||||||
|
Systems like [Serde](https://serde.rs/) concentrate on defining (de)serializers for
|
||||||
|
host-language type definitions.
|
||||||
|
|
||||||
|
Serde starts from definitions like the following[^this-example-from-mdbook]. It generates
|
||||||
|
(de)serialization code for various different *data languages* (such as JSON, XML, CBOR, etc.)
|
||||||
|
in a single *programming language*: Rust.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct BookOutline {
|
||||||
|
pub sections: Vec<BookItem>,
|
||||||
|
}
|
||||||
|
pub enum BookItem {
|
||||||
|
Chapter(Chapter),
|
||||||
|
Separator,
|
||||||
|
PartTitle(String),
|
||||||
|
}
|
||||||
|
pub struct Chapter {
|
||||||
|
pub name: String,
|
||||||
|
pub sub_items: Vec<BookItem>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The (de)serializers are able to produce and understand values such as the following JSON
|
||||||
|
document, converting them to and from in-memory representations. The focus is on Rust:
|
||||||
|
interpreting the produced documents from other languages is out of scope for Serde.
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"sections": [
|
||||||
|
{ "PartTitle": "Part I" },
|
||||||
|
"Separator",
|
||||||
|
{
|
||||||
|
"Chapter": {
|
||||||
|
"name": "Chapter One",
|
||||||
|
"sub_items": []
|
||||||
|
}
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"Chapter": {
|
||||||
|
"name": "Chapter Two",
|
||||||
|
"sub_items": []
|
||||||
|
}
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
By contrast, Preserves schemas focus on Preserves values[^including-json] only.
|
||||||
|
|
||||||
|
Each Preserves schema compiler generates type definitions and (de)serialization code for a
|
||||||
|
single *programming language* able to understand common *data*. The grammar of the data itself
|
||||||
|
is language-independent.
|
||||||
|
|
||||||
|
For example, a Preserves schema able to parse values compatible with those produced by Serde
|
||||||
|
for the type definitions above is the following:
|
||||||
|
|
||||||
|
```preserves
|
||||||
|
version 1 .
|
||||||
|
BookOutline = {
|
||||||
|
"sections": @sections [BookItem ...],
|
||||||
|
} .
|
||||||
|
BookItem = @chapter { "Chapter": @value Chapter }
|
||||||
|
/ @separator "Separator"
|
||||||
|
/ @partTitle { "PartTitle": @value string } .
|
||||||
|
Chapter = {
|
||||||
|
"name": @name string,
|
||||||
|
"sub_items": @sub_items [BookItem ...],
|
||||||
|
} .
|
||||||
|
```
|
||||||
|
|
||||||
|
Using the Rust schema compiler, we see types such as the following, which are *similar to* but
|
||||||
|
not the *same* as the original Rust types above:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct BookOutline {
|
||||||
|
pub sections: std::vec::Vec<BookItem>
|
||||||
|
}
|
||||||
|
pub enum BookItem {
|
||||||
|
Chapter { value: std::boxed::Box<Chapter> },
|
||||||
|
Separator,
|
||||||
|
PartTitle { value: std::string::String }
|
||||||
|
}
|
||||||
|
pub struct Chapter {
|
||||||
|
pub name: std::string::String,
|
||||||
|
pub sub_items: std::vec::Vec<BookItem>
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Using the TypeScript schema compiler, we see
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
export type BookOutline = {"sections": Array<BookItem>};
|
||||||
|
export type BookItem = (
|
||||||
|
{"_variant": "chapter", "value": Chapter} |
|
||||||
|
{"_variant": "separator"} |
|
||||||
|
{"_variant": "partTitle", "value": string}
|
||||||
|
);
|
||||||
|
export type Chapter = {"name": string, "sub_items": Array<BookItem>};
|
||||||
|
```
|
||||||
|
|
||||||
|
Using the Racket schema compiler, we see
|
||||||
|
|
||||||
|
```racket
|
||||||
|
(struct BookOutline (sections))
|
||||||
|
(define (BookItem? p)
|
||||||
|
(or (BookItem-chapter? p)
|
||||||
|
(BookItem-separator? p)
|
||||||
|
(BookItem-partTitle? p)))
|
||||||
|
(struct BookItem-chapter (value))
|
||||||
|
(struct BookItem-separator ())
|
||||||
|
(struct BookItem-partTitle (value))
|
||||||
|
(struct Chapter (name sub_items))
|
||||||
|
```
|
||||||
|
|
||||||
|
and so on.
|
||||||
|
|
||||||
|
### Example: Book Outline redux, using Records
|
||||||
|
|
||||||
|
The schema for book outlines above accepts Preserves (JSON) documents compatible with the
|
||||||
|
(de)serializers produced by Serde for a Rust-native type.
|
||||||
|
|
||||||
|
Instead, we might choose to define a Preserves-native data definition, and to work from
|
||||||
|
that:[^lose-compatibility]
|
||||||
|
|
||||||
|
```preserves
|
||||||
|
version 1 .
|
||||||
|
BookOutline = <book-outline @sections [BookItem ...]> .
|
||||||
|
BookItem = Chapter / =separator / @partTitle string .
|
||||||
|
Chapter = <chapter @name string @sub_items [BookItem ...]> .
|
||||||
|
```
|
||||||
|
|
||||||
|
The schema compilers produce **exactly the same** type definitions for this variation!
|
||||||
|
|
||||||
|
The differences are in the (de)serialization code only.
|
||||||
|
|
||||||
|
Here's the Preserves value equivalent to the example above, expressed using the Preserves-native schema:
|
||||||
|
|
||||||
|
```preserves
|
||||||
|
<book-outline [
|
||||||
|
"Part I"
|
||||||
|
separator
|
||||||
|
<chapter "Chapter One" []>
|
||||||
|
<chapter "Chapter Two" []>
|
||||||
|
]>
|
||||||
|
```
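
To make the split between type definitions and (de)serialization code concrete, here is a hand-written Rust sketch of a parser and serializer for `BookItem` under the Preserves-native schema. It is not the output of the Rust schema compiler. The cut-down `Value` type and the function names `parse_book_item` and `unparse_book_item` are hypothetical, and record labels are simplified to plain strings. The `BookItem` and `Chapter` types mirror the generated definitions shown earlier.

```rust
/// Cut-down value type: just enough for book outlines.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Symbol(String),
    Str(String),
    Seq(Vec<Value>),
    Record(String, Vec<Value>), // simplification: the label is always a symbol
}

#[derive(Debug, Clone, PartialEq)]
pub enum BookItem {
    Chapter { value: Box<Chapter> },
    Separator,
    PartTitle { value: String },
}

#[derive(Debug, Clone, PartialEq)]
pub struct Chapter {
    pub name: String,
    pub sub_items: Vec<BookItem>,
}

/// Partial: returns `None` for values that do not match the schema.
pub fn parse_book_item(v: &Value) -> Option<BookItem> {
    match v {
        Value::Symbol(s) if s == "separator" => Some(BookItem::Separator),
        Value::Str(s) => Some(BookItem::PartTitle { value: s.clone() }),
        Value::Record(label, fields) if label == "chapter" && fields.len() == 2 => {
            match (&fields[0], &fields[1]) {
                (Value::Str(name), Value::Seq(items)) => {
                    // Fails as a whole if any nested item fails to parse.
                    let sub_items = items
                        .iter()
                        .map(parse_book_item)
                        .collect::<Option<Vec<_>>>()?;
                    Some(BookItem::Chapter {
                        value: Box::new(Chapter { name: name.clone(), sub_items }),
                    })
                }
                _ => None,
            }
        }
        _ => None,
    }
}

/// Total: every `BookItem` has a serialized form.
pub fn unparse_book_item(item: &BookItem) -> Value {
    match item {
        BookItem::Separator => Value::Symbol("separator".to_string()),
        BookItem::PartTitle { value } => Value::Str(value.clone()),
        BookItem::Chapter { value } => Value::Record(
            "chapter".to_string(),
            vec![
                Value::Str(value.name.clone()),
                Value::Seq(value.sub_items.iter().map(unparse_book_item).collect()),
            ],
        ),
    }
}
```

Supporting the JSON-compatible schema instead would change only the bodies of these two functions (matching dictionaries with `"Chapter"` and `"PartTitle"` keys); the `BookItem` and `Chapter` types would stay as they are.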
|
||||||
|
|
||||||
|
## Preserves Path
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
#### Notes
|
||||||
|
|
||||||
|
[^preserves-ordering-exists-too]: The specification defines a *total order relation* over
|
||||||
|
Preserves values as well.
|
||||||
|
|
||||||
|
[^dataspaces-need-data-with-semantics]: In particular, *dataspaces* need the assertion data
|
||||||
|
they contain to have a sensible equivalence predicate in order to be useful at all. If you
|
||||||
|
can't reliably tell whether two values are the same or different, how are you supposed to
|
||||||
|
use them to look things up in anything database-like?
|
||||||
|
Languages like JSON, which [don't have a well-defined equivalence
|
||||||
|
relation](https://preserves.gitlab.io/preserves/why-not-json.html#json-syntax-doesnt-mean-anything),
|
||||||
|
aren't good enough. When programs communicate with each other, they need to be sure that
|
||||||
|
their peers will understand the information they receive exactly as it was sent.
|
||||||
|
|
||||||
|
[^syrup]: Besides the two core syntaxes, other serialization syntaxes are in use in other
|
||||||
|
systems. For example, the [Spritely](https://gitlab.com/spritely)
|
||||||
|
[Goblins](https://gitlab.com/spritely/goblins) actor library uses a serialization syntax
|
||||||
|
called [Syrup](https://github.com/ocapn/syrup#pseudo-specification), reminiscent of
|
||||||
|
[`bencode`](https://en.wikipedia.org/wiki/Bencode).
|
||||||
|
|
||||||
|
[^this-example-from-mdbook]: This example is a simplified form of the preprocessor type
|
||||||
|
definitions for
|
||||||
|
[mdBook](https://rust-lang.github.io/mdBook/for_developers/preprocessors.html), the system
|
||||||
|
used to render this manual. I use a real [Preserves schema
|
||||||
|
definition](https://git.syndicate-lang.org/synit/synit/src/branch/main/manual/book.prs) for
|
||||||
|
parsing and producing Serde's JSON representation of mdBook `Book` structures in order to
|
||||||
|
[preprocess the manual's source
|
||||||
|
code](https://git.syndicate-lang.org/synit/synit/src/branch/main/manual/mdbook-ditaa).
|
||||||
|
|
||||||
|
[^including-json]: Including JSON values, of course!
|
||||||
|
|
||||||
|
[^lose-compatibility]: By doing so, we of course lose compatibility with the Serde structures.
|
||||||
|
|
|
@ -11,17 +11,17 @@ It provides:
|
||||||
1. A **[root system bus](#the-root-system-bus)** service for use by other programs. In this way, it is
|
1. A **[root system bus](#the-root-system-bus)** service for use by other programs. In this way, it is
|
||||||
analogous to D-Bus.
|
analogous to D-Bus.
|
||||||
|
|
||||||
2. A general-purpose **[service dependency tracking facility](./service.md)**.
|
2. A **[configuration language](./scripting.md)** suitable for programming
|
||||||
|
[dataspaces](../glossary.md#dataspace) with simple reactive behaviours.
|
||||||
|
|
||||||
3. A [**gatekeeper** service](./builtin/gatekeeper.md), for exposing
|
3. A general-purpose **[service dependency tracking facility](./service.md)**.
|
||||||
|
|
||||||
|
4. A [**gatekeeper** service](./builtin/gatekeeper.md), for exposing
|
||||||
[capabilities](../glossary.md#capability) to running objects as (potentially long-lived)
|
[capabilities](../glossary.md#capability) to running objects as (potentially long-lived)
|
||||||
[macaroon](../glossary.md#macaroon)-style "sturdy references", plus TCP/IP- and
|
[macaroon](../glossary.md#macaroon)-style "sturdy references", plus TCP/IP- and
|
||||||
Unix-socket-based **[transports](./builtin/relay-listener.md)** for accessing capabilities
|
Unix-socket-based **[transports](./builtin/relay-listener.md)** for accessing capabilities
|
||||||
through the gatekeeper.
|
through the gatekeeper.
|
||||||
|
|
||||||
4. A limited **[configuration scripting language](./scripting.md)** suitable for
|
|
||||||
programming [dataspaces](../glossary.md#dataspace) with simple reactive behaviours.
|
|
||||||
|
|
||||||
5. An [`inotify`](https://en.wikipedia.org/wiki/Inotify)-based **[configuration
|
5. An [`inotify`](https://en.wikipedia.org/wiki/Inotify)-based **[configuration
|
||||||
loader](./builtin/config-watcher.md)** which loads and executes configuration files written
|
loader](./builtin/config-watcher.md)** which loads and executes configuration files written
|
||||||
in the scripting language.
|
in the scripting language.
|
||||||
|
|