# Preserves

Synit makes **extensive** use of *Preserves*, a programming-language-independent language for
data.

 - [Preserves homepage](https://preserves.dev/)
 - [Preserves specification](https://preserves.dev/preserves.html)
 - [Preserves Schema specification](https://preserves.dev/preserves-schema.html)
 - [Source code](https://gitlab.com/preserves/preserves) for many (not all) of the implementations
 - Implementations for
   [Nim](https://git.sr.ht/~ehmry/preserves-nim),
   [Python](https://pypi.org/project/preserves/),
   [Racket](https://pkgs.racket-lang.org/package/preserves),
   [Rust](https://docs.rs/preserves/latest/preserves/),
   [Squeak Smalltalk](https://squeaksource.com/Preserves.html),
   [TypeScript/JavaScript](https://www.npmjs.com/org/preserves)

The Preserves data language is in many ways comparable to JSON, XML, S-expressions, CBOR, ASN.1
BER, and so on. From the [specification
document](https://preserves.dev/preserves.html):

> Preserves supports *records* with user-defined *labels*, embedded *references*, and the usual
> suite of atomic and compound data types, including *binary* data as a distinct type from text
> strings.

## Why does Synit rely on Preserves?

There are four aspects of Preserves that make it particularly relevant to Synit:

 - the core Preserves [data language](#grammar-of-values) has a robust semantics;
 - a [canonical form](#canonical-form) exists for every Preserves value;
 - Preserves values may have [capability references](#capabilities) embedded within them; and
 - Preserves has a [schema language](#schemas) useful for specifying protocols among actors.

## Grammar of values

Preserves has programming-language-independent *semantics*: the specification defines an
*equivalence relation* over Preserves values.[^preserves-ordering-exists-too] This makes it a
solid foundation for a multi-language, multi-process, potentially distributed system like
Synit.[^dataspaces-need-data-with-semantics]

### Values and Types

Preserves values come in various *types*: a few basic atomic types, plus sequence, set,
dictionary, and record compound types. From the specification:

                        Value = Atom           Atom = Boolean
                              | Compound            | Double
                              | Embedded            | SignedInteger
                                                    | String
                     Compound = Record              | ByteString
                              | Sequence            | Symbol
                              | Set
                              | Dictionary
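
As a rough illustration, the grammar above can be transcribed into Rust enums. This is a hand-written sketch, not the data model of any actual Preserves implementation: the type names simply mirror the grammar, `symbol` and `type_name` are hypothetical helpers, and several simplifications (machine-sized integers, `Vec`-backed sets and dictionaries, an opaque integer for embedded values) are assumptions made for brevity.

```rust
/// Hand-written model of the grammar above; illustrative only.
/// The derived `PartialEq` is structural and is NOT the equivalence
/// relation defined by the Preserves specification.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Atom(Atom),
    Compound(Compound),
    /// An embedded reference, modelled here as an opaque handle (assumption).
    Embedded(u64),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Atom {
    Boolean(bool),
    Double(f64),
    /// Simplification: a fixed-width integer stands in for SignedInteger.
    SignedInteger(i64),
    String(String),
    ByteString(Vec<u8>),
    Symbol(String),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Compound {
    Record { label: Box<Value>, fields: Vec<Value> },
    Sequence(Vec<Value>),
    /// Simplification: stored as a list; duplicates are not rejected.
    Set(Vec<Value>),
    /// Simplification: stored as a list of key/value pairs.
    Dictionary(Vec<(Value, Value)>),
}

/// Hypothetical convenience constructor for symbols.
pub fn symbol(name: &str) -> Value {
    Value::Atom(Atom::Symbol(name.to_string()))
}

/// Name of the grammar production that a value belongs to.
pub fn type_name(v: &Value) -> &'static str {
    match v {
        Value::Atom(Atom::Boolean(_)) => "Boolean",
        Value::Atom(Atom::Double(_)) => "Double",
        Value::Atom(Atom::SignedInteger(_)) => "SignedInteger",
        Value::Atom(Atom::String(_)) => "String",
        Value::Atom(Atom::ByteString(_)) => "ByteString",
        Value::Atom(Atom::Symbol(_)) => "Symbol",
        Value::Compound(Compound::Record { .. }) => "Record",
        Value::Compound(Compound::Sequence(_)) => "Sequence",
        Value::Compound(Compound::Set(_)) => "Set",
        Value::Compound(Compound::Dictionary(_)) => "Dictionary",
        Value::Embedded(_) => "Embedded",
    }
}
```

With these definitions, a record is built by combining a label with its fields (for example a `symbol("address")` label and a two-element `fields` vector), and `type_name` reports `"Record"` for it.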

### Concrete syntax

Preserves offers *multiple* syntaxes, each useful in different settings. Values are
automatically, losslessly translatable from one syntax to another because Preserves' semantics
are syntax-independent.

The core Preserves specification defines a text-based, human-readable syntax that is a
syntactic superset of JSON, along with a completely equivalent, compact, machine-oriented syntax
that is crucial to the definition of [canonical form](#canonical-form) for Preserves values.[^syrup]

Here are a few example values, written using the text syntax (see [the
specification](https://preserves.dev/preserves-text.html) for the
grammar):

    Boolean    : #t #f
    Double     : 1.0 10.4e3 -100.6
    Integer    : 1 0 -100
    String     : "Hello, world!\n"
    ByteString : #"bin\x00str\x00" #[YmluAHN0cgA] #x"62696e0073747200"
    Symbol     : hello-world |hello world| = ! hello? || ...
    Record     : <label field1 field2 ...>
    Sequence   : [value1 value2 ...]
    Set        : #{value1 value2 ...}
    Dictionary : {key1: value1 key2: value2 ...: ...}
    Embedded   : #:value

Commas are optional in sequences, sets, and dictionaries.
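
The three `ByteString` notations in the table above (escaped text, base64, and hexadecimal) all denote the same eight bytes. The following sketch checks that claim with two small hand-written decoders; `decode_hex` and `decode_base64` are hypothetical helper names, not part of any Preserves library, and the leniency of the base64 decoder (both common alphabets, optional padding) is an assumption of this sketch rather than a statement about the text-syntax grammar.

```rust
/// Decode a hexadecimal string such as "62696e00" into bytes.
pub fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // All characters are ASCII, so slicing on byte offsets is safe.
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// Decode base64 text, with or without trailing '=' padding.
/// Leftover bits at the end of the input are ignored, not validated.
pub fn decode_base64(s: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut acc: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of pending bits held in `acc`
    for c in s.trim_end_matches('=').bytes() {
        let v = match c {
            b'A'..=b'Z' => c - b'A',
            b'a'..=b'z' => c - b'a' + 26,
            b'0'..=b'9' => c - b'0' + 52,
            b'+' | b'-' => 62,
            b'/' | b'_' => 63,
            _ => return None,
        };
        acc = (acc << 6) | u32::from(v);
        bits += 6;
        if bits >= 8 {
            // Emit the top eight pending bits and keep the remainder.
            bits -= 8;
            out.push((acc >> bits) as u8);
            acc &= (1 << bits) - 1;
        }
    }
    Some(out)
}
```

Decoding `62696e0073747200` with `decode_hex` and `YmluAHN0cgA` with `decode_base64` both yield the bytes of the Rust literal `b"bin\x00str\x00"`.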

### Canonical form

Every Preserves value can be serialized into a *canonical form* using the [machine-oriented
syntax](https://preserves.dev/preserves-binary.html) along with [a few simple
rules](https://preserves.dev/canonical-binary.html) about serialization ordering of elements in
sets and keys in dictionaries.

Having a canonical form means that, for example, a cryptographic hash of a value's canonical
serialization can be used as a unique fingerprint for the value.

For example, the SHA-512 digest of the canonical serialization of the value

```preserves
<sms-delivery <address international "31653131313">
              <address international "31655512345">
              <rfc3339 "2022-02-09T08:18:29.88847+01:00">
              "This is a test SMS message">
```

is

    bfea9bd5ddf7781e34b6ca7e146ba2e442ef8ce04fd5ff912f889359945d0e2967a77a13
    c86b13959dcce7e8ba3950d303832b825648609447b3d147677163ce
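
To see why an ordering rule is what makes fingerprints stable, here is a minimal sketch using a toy encoding invented for this example. It is emphatically *not* the Preserves machine-oriented syntax: the tag bytes, the length prefixes, and the cut-down `Value` type are all assumptions. The only idea carried over is that dictionary entries are emitted in an order derived from their encoded keys, so the order in which entries were written down cannot affect the output bytes.

```rust
/// A cut-down value model, just enough to show ordering (illustrative only).
#[derive(Debug, Clone)]
pub enum Value {
    String(String),
    Sequence(Vec<Value>),
    Dictionary(Vec<(Value, Value)>),
}

/// Toy encoding: a tag byte, a 4-byte big-endian length or count, then the
/// payload. This is NOT the real Preserves binary format.
pub fn encode(v: &Value) -> Vec<u8> {
    let mut out = Vec::new();
    match v {
        Value::String(s) => {
            out.push(b'S');
            out.extend_from_slice(&(s.len() as u32).to_be_bytes());
            out.extend_from_slice(s.as_bytes());
        }
        Value::Sequence(items) => {
            // Sequences are ordered data, so element order is preserved.
            out.push(b'L');
            out.extend_from_slice(&(items.len() as u32).to_be_bytes());
            for item in items {
                out.extend(encode(item));
            }
        }
        Value::Dictionary(entries) => {
            // Encode every entry first, then sort the (key, value) byte
            // pairs; keys compare first, so order follows the encoded keys.
            let mut encoded: Vec<(Vec<u8>, Vec<u8>)> = entries
                .iter()
                .map(|(k, val)| (encode(k), encode(val)))
                .collect();
            encoded.sort();
            out.push(b'D');
            out.extend_from_slice(&(encoded.len() as u32).to_be_bytes());
            for (k, val) in encoded {
                out.extend(k);
                out.extend(val);
            }
        }
    }
    out
}
```

Two dictionaries holding the same entries in a different written order encode to identical bytes under this scheme, so any hash computed over those bytes is identical too. The Rust standard library has no SHA-512 implementation, which is why the sketch stops at the byte string.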

### Capabilities

Preserves values can include *embedded references*, written as values with a `#:` prefix. For
example, a command adding `<some-setting>` to the user settings database might look like this
as it travels over a Unix pipe connecting a program to the root dataspace:

```preserves
<user-settings-command <assert <some-setting>> #:[0 123]>
```

The `user-settings-command` structure includes the `assert` command itself, plus an embedded
capability reference, `#:[0 123]`, which encodes a transport-specific reference to an object.
(See the [Syndicate Protocol](../protocol.md#capabilities-on-the-wire) for a concrete example
of this.)

The syntax of values under `#:` differs depending on the medium carrying the message.
For example, point-to-point transports need to be able to refer to "my references" (`#:[0 `*n*`]`) and "your
references" (`#:[1 `*n*`]`), while multicast/broadcast media (like Ethernet) need to be able to name
references within specific, named conversational participants (`#:[<udp [192 168 1 10] 5999>
`*n*`]`), and in-memory representations need to use direct pointers (`#:140425190562944`).

In every case, the references themselves work like Unix file descriptors: an integer or similar
that unforgeably denotes, in a local context, some complex data structure on the other side of
a trust boundary.

When capability-bearing Preserves values are read off a transport, the capabilities are
[automatically rewritten](../protocol.md#inbound-rewriting) into references to in-memory proxy
objects. The [reverse process](../protocol.md#outbound-rewriting) of rewriting capability
references happens when an in-memory value is serialized for transmission.
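
The rewriting step for a point-to-point transport can be sketched as a pair of table lookups. Everything in the following block is a hypothetical simplification: `WireRef`, `LocalRef`, and `Membrane` are names invented for this example, object handles are plain integers, and concerns that a real implementation must handle (such as releasing table entries that are no longer in use) are left out entirely.

```rust
use std::collections::HashMap;

/// A reference as it appears on a point-to-point transport.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WireRef {
    /// `#:[0 n]`: an object on the sender's side ("my reference").
    Mine(u64),
    /// `#:[1 n]`: an object on the receiver's side ("your reference").
    Yours(u64),
}

/// A reference as seen by in-memory code (hypothetical representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LocalRef {
    /// An object living in this process, named by an integer handle.
    Object(u64),
    /// A proxy standing in for the peer's object with this wire number.
    Proxy(u64),
}

/// Tracks which local objects have been exposed to the peer.
#[derive(Default)]
pub struct Membrane {
    exported: HashMap<u64, u64>,  // wire number -> local handle
    by_handle: HashMap<u64, u64>, // local handle -> wire number
    next_wire_id: u64,
}

impl Membrane {
    /// Outbound rewriting: an in-memory reference becomes a wire reference.
    pub fn rewrite_out(&mut self, r: LocalRef) -> WireRef {
        match r {
            // A proxy for the peer's object goes back as "yours".
            LocalRef::Proxy(n) => WireRef::Yours(n),
            // A local object is exported once and then reused.
            LocalRef::Object(h) => {
                if let Some(&n) = self.by_handle.get(&h) {
                    return WireRef::Mine(n);
                }
                let n = self.next_wire_id;
                self.next_wire_id += 1;
                self.exported.insert(n, h);
                self.by_handle.insert(h, n);
                WireRef::Mine(n)
            }
        }
    }

    /// Inbound rewriting: a wire reference becomes an in-memory reference.
    /// Returns `None` for a "yours" number that was never exported.
    pub fn rewrite_in(&self, r: WireRef) -> Option<LocalRef> {
        match r {
            // The peer's "mine" lives on their side: hand out a proxy.
            WireRef::Mine(n) => Some(LocalRef::Proxy(n)),
            // The peer's "yours" must name something exported earlier.
            WireRef::Yours(n) => self.exported.get(&n).map(|&h| LocalRef::Object(h)),
        }
    }
}
```

The `None` case in `rewrite_in` is where the file-descriptor analogy shows up: a number the peer was never given does not resolve to anything, so guessing integers grants no access.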

## Schemas

Preserves comes with a schema language suitable for defining protocols among actors/programs in
Synit. Because the Preserves text syntax is a superset of JSON, schemas can be used for parsing JSON just
as well as for native Preserves values.[^you-have-to-use-a-preserves-reader] From the [schema
specification](https://preserves.dev/preserves-schema.html):

> A Preserves schema connects Preserves Values to host-language data
> structures. Each definition within a schema can be processed by a
> compiler to produce
> 
>  - a host-language *type definition*;
>  - a partial *parsing* function from Values to instances of the
>    produced type; and
>  - a total *serialization* function from instances of the type to
>    Values.
> 
> Every parsed Value retains enough information to always be able to
> be serialized again, and every instance of a host-language data
> structure contains, by construction, enough information to be
> successfully serialized.

Instead of taking host-language data structure definitions as primary, in the way that systems
like [Serde](https://serde.rs/) do, Preserves schemas take *the shape of the serialized data*
as primary.

To see the difference, let's look at an example.

### Example: Book Outline

Systems like [Serde](https://serde.rs/) concentrate on defining (de)serializers for
host-language type definitions.

Serde starts from definitions like the following.[^this-example-from-mdbook] It generates
(de)serialization code for various different *data* languages (such as JSON, XML, CBOR, etc.)
in a single *programming* language: Rust.

```rust
pub struct BookOutline {
    pub sections: Vec<BookItem>,
}
pub enum BookItem {
    Chapter(Chapter),
    Separator,
    PartTitle(String),
}
pub struct Chapter {
    pub name: String,
    pub sub_items: Vec<BookItem>,
}
```

The (de)serializers are able to convert between in-memory and serialized representations such
as the following JSON document. The focus is on Rust: interpreting the produced documents from
other languages is out of scope for Serde.

```json
{
  "sections": [
    { "PartTitle": "Part I" },
    "Separator",
    {
      "Chapter": {
        "name": "Chapter One",
        "sub_items": []
      }
    },
    {
      "Chapter": {
        "name": "Chapter Two",
        "sub_items": []
      }
    }
  ]
}
```

By contrast, Preserves schemas map a single *data* language to and from multiple *programming*
languages. Each specific programming language has its own schema compiler, which generates type
definitions and (de)serialization code for that language from a language-independent grammar.

For example, a schema able to parse values compatible with those produced by Serde for the type
definitions above is the following:

```preserves
version 1 .

BookOutline = {
  "sections": @sections [BookItem ...],
} .

BookItem = @chapter { "Chapter": @value Chapter }
         / @separator "Separator"
         / @partTitle { "PartTitle": @value string } .

Chapter = {
  "name": @name string,
  "sub_items": @sub_items [BookItem ...],
} .
```

Using the Rust schema compiler, we see types such as the following, which are similar to but
not the same as the original Rust types above:

```rust
pub struct BookOutline {
    pub sections: std::vec::Vec<BookItem>
}
pub enum BookItem {
    Chapter { value: std::boxed::Box<Chapter> },
    Separator,
    PartTitle { value: std::string::String }
}
pub struct Chapter {
    pub name: std::string::String,
    pub sub_items: std::vec::Vec<BookItem>
}
```

Using the TypeScript schema compiler, we see

```typescript
export type BookOutline = {"sections": Array<BookItem>};

export type BookItem = (
    {"_variant": "chapter", "value": Chapter} |
    {"_variant": "separator"} |
    {"_variant": "partTitle", "value": string}
);

export type Chapter = {"name": string, "sub_items": Array<BookItem>};
```

Using the Racket schema compiler, we see

```racket
(struct BookOutline (sections))
(define (BookItem? p)
    (or (BookItem-chapter? p)
        (BookItem-separator? p)
        (BookItem-partTitle? p)))
(struct BookItem-chapter (value))
(struct BookItem-separator ())
(struct BookItem-partTitle (value))
(struct Chapter (name sub_items))
```

and so on.

### Example: Book Outline redux, using Records

The schema for book outlines above accepts Preserves (JSON) documents compatible with the
(de)serializers produced by Serde for a Rust-native type.

Instead, we might choose to define a Preserves-native data definition, and to work from
that:[^lose-compatibility]

```preserves
version 1 .
BookOutline = <book-outline @sections [BookItem ...]> .
BookItem = Chapter / =separator / @partTitle string .
Chapter = <chapter @name string @sub_items [BookItem ...]> .
```

The schema compilers produce **exactly the same type definitions**[^well-almost-exactly] for
this variation. The differences are in the (de)serialization code only.

Here's the Preserves value equivalent to the example above, expressed using the Preserves-native schema:

```preserves
<book-outline [
  "Part I"
  separator
  <chapter "Chapter One" []>
  <chapter "Chapter Two" []>
]>
```
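
To make the "partial parsing, total serialization" split concrete, here is a hand-written sketch of the kind of code a schema compiler might generate for `BookItem` under the record-based schema. It is not actual compiler output: the cut-down `Value` type and the function names `parse_book_item` and `unparse_book_item` are assumptions made for this example, and the tuple-style variants follow the description in the footnote above.

```rust
/// A cut-down value model covering only what this schema needs.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    String(String),
    Symbol(String),
    Sequence(Vec<Value>),
    /// Simplification: the label is restricted to a symbol name.
    Record { label: String, fields: Vec<Value> },
}

#[derive(Debug, Clone, PartialEq)]
pub struct Chapter {
    pub name: String,
    pub sub_items: Vec<BookItem>,
}

#[derive(Debug, Clone, PartialEq)]
pub enum BookItem {
    Chapter(Box<Chapter>),
    Separator,
    PartTitle(String),
}

/// Partial: yields `None` when the value does not match the schema.
pub fn parse_book_item(v: &Value) -> Option<BookItem> {
    match v {
        // Chapter = <chapter @name string @sub_items [BookItem ...]>
        Value::Record { label, fields } if label == "chapter" => match fields.as_slice() {
            [Value::String(name), Value::Sequence(items)] => {
                // Any sub-item that fails to parse fails the whole chapter.
                let sub_items = items
                    .iter()
                    .map(parse_book_item)
                    .collect::<Option<Vec<_>>>()?;
                Some(BookItem::Chapter(Box::new(Chapter {
                    name: name.clone(),
                    sub_items,
                })))
            }
            _ => None,
        },
        // =separator
        Value::Symbol(s) if s == "separator" => Some(BookItem::Separator),
        // @partTitle string
        Value::String(s) => Some(BookItem::PartTitle(s.clone())),
        _ => None,
    }
}

/// Total: every `BookItem` has a serialized form.
pub fn unparse_book_item(item: &BookItem) -> Value {
    match item {
        BookItem::Chapter(c) => Value::Record {
            label: "chapter".to_string(),
            fields: vec![
                Value::String(c.name.clone()),
                Value::Sequence(c.sub_items.iter().map(unparse_book_item).collect()),
            ],
        },
        BookItem::Separator => Value::Symbol("separator".to_string()),
        BookItem::PartTitle(t) => Value::String(t.clone()),
    }
}
```

Parsing `<chapter "Chapter One" []>` and then unparsing the result gives back an equal value, while a value such as the symbol `bogus` is rejected by the parser because no alternative of `BookItem` matches it.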

---

#### Notes

[^preserves-ordering-exists-too]: The specification defines a total order relation over
    Preserves values as well.

[^dataspaces-need-data-with-semantics]: In particular, *dataspaces* need the assertion data
    they contain to have a sensible equivalence predicate in order to be useful at all. If you
    can't reliably tell whether two values are the same or different, how are you supposed to
    use them to look things up in anything database-like?
    Languages like JSON, which [don't have a well-defined equivalence
    relation](https://preserves.dev/why-not-json.html#json-syntax-doesnt-mean-anything),
    aren't good enough. When programs communicate with each other, they need to be sure that
    their peers will understand the information they receive exactly as it was sent.

[^syrup]: Besides the two core syntaxes, other serialization syntaxes are in use in other
    systems. For example, the [Spritely](https://gitlab.com/spritely)
    [Goblins](https://gitlab.com/spritely/goblins) actor library uses a serialization syntax
    called [Syrup](https://github.com/ocapn/syrup#pseudo-specification), reminiscent of
    [`bencode`](https://en.wikipedia.org/wiki/Bencode).

[^you-have-to-use-a-preserves-reader]: You have to use a Preserves text-syntax reader on JSON
    terms to do this, though: JSON values like `null`, `true`, and `false` naively read as
    Preserves *symbols*. Preserves doesn't have the concept of `null`.

[^this-example-from-mdbook]: This example is a simplified form of the preprocessor type
    definitions for
    [mdBook](https://rust-lang.github.io/mdBook/for_developers/preprocessors.html), the system
    used to render these pages. I use a real [Preserves schema
    definition](https://git.syndicate-lang.org/synit/synit-manual/src/branch/main/book.prs) for
    parsing and producing Serde's JSON representation of mdBook `Book` structures in order to
    [preprocess the text](https://git.syndicate-lang.org/synit/synit-manual/src/branch/main/mdbook-ditaa).

[^lose-compatibility]: By doing so, we lose compatibility with the Serde structures, but the
    point is to show the kinds of schemas available to us once we move away from strict
    compatibility with existing data formats.

[^well-almost-exactly]: Well, almost exactly the same. The only difference is in the Rust
    types, which use tuple-style instead of record-style structs for chapters and part titles.