Begin sketch of Preserves Schema semantics

This commit is contained in:
Tony Garnock-Jones 2022-06-09 21:30:56 +02:00
parent 5b01beb2f6
commit b332f2668e
3 changed files with 142 additions and 3 deletions

View File

@ -4,7 +4,7 @@ title: "Preserves Schema"
---
Tony Garnock-Jones <tonyg@leastfixedpoint.com>
June 2022. Version 0.2.0.
June 2022. Version 0.3.0.
[abnf]: https://tools.ietf.org/html/rfc7405
@ -423,13 +423,119 @@ the given name in the overall record type for a definition. The type
of value contained in the field will correspond to the `Pattern` or
`SimplePattern` given.
## Semantics
Having covered concrete syntax, we now give semantics for the schema
language in terms of the [abstract syntax][schema.prs] and of the
language of Preserves `Value`s.
[schema.prs]: https://gitlab.com/preserves/preserves/-/blob/main/schema/schema.prs
### Metaschema interpreter
(TODO: this subsection is to define an interpreter for metaschema values
applied to Preserves `Value`s.)
### Host-language types
The host-language types corresponding to a metaschema instance can
themselves be described according to a grammar.
The definitions in this section should be understood as being part of a
module named `host`, in a bundle alongside a module named `schema`
corresponding to the metaschema in the appendix below.
#### Abstract host language types
Definition = <union @variants [Variant ...]> / Simple .
Variant = [@label symbol @type Simple] .
The host-language type corresponding to a definition will either be a
tagged union (side condition: at least two `Variant`s are present in a
`union`) or a *simple* type.
Simple = Field / Record .
Record = <rec @fields [NamedField ...]> .
NamedField = [@name symbol @type Field] .
A *simple* type may be either a single, simple value of *field* type, or
a record of multiple named fields, each having a specific *field* type.
Field = =unit
/ =any
/ =embedded
/ <array @element Field>
/ <set @element Field>
/ <map @key Field @value Field>
/ <ref @name schema.Ref>
/ schema.AtomKind .
A *field* type is either
- the language's unit type (the empty tuple, the "void" value),
- the universal type of all Preserves `Value`s,
- the type of some host-language [embedded value](./preserves.html#embeddeds) in some context,
- the type of a uniform array having elements of a specific *field* type,
- the type of a set having elements of a specific *field* type,
- the type of a dictionary connecting keys of specific type to values of specific type,
- the type associated with some other named definition in scope in the current Schema bundle, or
- the type of a specific kind of Preserves [`Atom`](./preserves#values).
#### Computing abstract types from a metaschema instance
Given a metaschema definition *d* : `schema.Definition`, the function
**typeof**{:.pseudocode} yields a `host.Definition`.
{:.pseudocode #def:typeof}
> **typeof** : `schema.Definition``host.Definition`
> **typeof** `<or [[`*n`1`* *p`1`*`]` ... `[`*n`n`* *p`n`*`]]>` = `<union [[`*n`1`* (**pat** *p`1`*)`]` ... `[`*n`n`* (**pat** *p`n`*)`]]>`
> **typeof** `<and [`*f`1` ... f`n`*`]>` = **product** `[`*f`1` ... f`n`*`]`
> **typeof** *p* = **pat** *p*, when *p*`schema.Pattern`
{:.pseudocode #def:pat}
> **pat** : `schema.Pattern``host.Simple`
> **pat** *s* = **field** *s*, when *s*`schema.SimplePattern`
> **pat** *c* = **product** `[`*c*`]`, when *c*`schema.CompoundPattern`
{:.pseudocode #def:field}
> **field** : `schema.SimplePattern``host.Field`
> **field** `any` = `any`
> **field** `<atom` *k*`>` = *k*
> **field** `<embedded` *s*`>` = `embedded`
> **field** `<lit` *v*`>` = `unit`
> **field** `<seqof` *s*`>` = `<array` (**field** s)`>`
> **field** `<setof` *s*`>` = `<set` (**field** s)`>`
> **field** `<dictof` *s`k`* *s`v`*`>` = `<map` (**field** *s`k`*) (**field** *s`v`*)`>`
> **field** *r* = *r*, when *r*`schema.Ref`
The helper function **product**{:.pseudocode} is where `unit`-valued
fields are omitted from the computed host-language type. If all fields
are so omitted, or if there were (recursively) no bindings in the input
patterns, **product**{:.pseudocode} yields `unit` type itself.
{:.pseudocode #def:product}
> **product** : `[schema.NamedPattern` ...`]` ⟶ `host.Simple`
> **product** `[`*f`1` ... f`n`*`]` = `unit`, if *t* = `[]`;
> `<rec` *t*`>`, otherwise
> where *t* = **gather** *f`1`* ⧺ ⋯ ⧺ **gather** *f`n`*
{:.pseudocode #def:gather}
> **gather** : `schema.NamedPattern``[host.NamedField ...]`
> **gather** `<named` *n* *p*`>` = `[]`, if (**field** *p*) = `unit`;
> `[[`*n* (**field** *p*)`]]`, otherwise
> **gather** `<rec` *f`label`* *f`fields`*`>` = **gather** *f`label`***gather** *f`fields`*
> **gather** `<tuple [`*f`1` .. f`n`*`]>` = **gather** *f`1`* ⧺ ⋯ ⧺ **gather** *f`n`*
> **gather** `<tuplePrefix [`*f`1` ... f`n`*`]` *f`repeated`*`>` = **gather** *f`1`* ⧺ ⋯ ⧺ **gather** *f`n`***gather** *f`repeated`*
> **gather** `<dict {`*v`1`*`:`*f`1` ... v`n`*`:`*f`n`*`}>` = **gather** *f`1`* ⧺ ⋯ ⧺ **gather** *f`n`*,
> where (*f`1` ⋯ f`n`*) are (*f`1` ⋯ f`n`*) sorted according to [Preserves term order](./preserves.html#total-order).
## Appendix: Metaschema
The metaschema defines the structure of the abstract syntax (AST) of
schemas, using the concrete DSL syntax described above.
The text below is taken from
[`schema/schema.prs`](https://gitlab.com/preserves/preserves/-/blob/main/schema/schema.prs)
[`schema/schema.prs`][schema.prs]
in the source code repository.
A `Bundle` collects a number of `Schema`s, each named by a

View File

@ -1,5 +1,9 @@
:root {
--sans-font: -apple-system, BlinkMacSystemFont, avenir next, avenir, segoe ui, helvetica neue, helvetica, Cantarell, Ubuntu, roboto, noto, arial, sans-serif;
--serif-font: palatino, "Palatino Linotype", "Palatino LT STD", "URW Palladio L", "TeX Gyre Pagella", serif;
}
body {
font-family: palatino, "Palatino Linotype", "Palatino LT STD", "URW Palladio L", "TeX Gyre Pagella", serif;
font-family: var(--serif-font);
box-sizing: border-box;
line-height: 1.414;
}
@ -103,6 +107,34 @@ blockquote {
background-color: #e9f0f9;
}
blockquote.pseudocode {
border-left: none;
padding: 0;
}
.pseudocode strong, strong.pseudocode {
font-family: var(--sans-font);
font-weight: inherit;
font-size: 90%;
}
.pseudocode :first-child {
margin-top: 0;
}
.pseudocode :last-child {
margin-bottom: 0;
}
.pseudocode pre, .pseudocode code { background-color: inherit; }
.pseudocode p {
white-space: pre;
}
.pseudocode em > code, .pseudocode strong > code {
vertical-align: sub;
font-size: 75%;
font-family: inherit;
}
.pseudocode code {
font-size: 80%;
}
/*---------------------------------------------------------------------------*/
/* Rouge syntax classes */

View File

@ -35,6 +35,7 @@ Taking inspiration from functional programming, we start with a
definition of the *values* that we want to work with and give them
meaning independent of their syntax.
<a id="values"></a>
Our `Value`s fall into two broad categories: *atomic* and *compound*
data. Every `Value` is finite and non-cyclic. Embedded values, called
`Embedded`s, are a third, special-case category.