Begin sketch of Preserves Schema semantics
This commit is contained in:
parent
5b01beb2f6
commit
b332f2668e
|
@ -4,7 +4,7 @@ title: "Preserves Schema"
|
|||
---
|
||||
|
||||
Tony Garnock-Jones <tonyg@leastfixedpoint.com>
|
||||
June 2022. Version 0.2.0.
|
||||
June 2022. Version 0.3.0.
|
||||
|
||||
[abnf]: https://tools.ietf.org/html/rfc7405
|
||||
|
||||
|
@ -423,13 +423,119 @@ the given name in the overall record type for a definition. The type
|
|||
of value contained in the field will correspond to the `Pattern` or
|
||||
`SimplePattern` given.
|
||||
|
||||
## Semantics
|
||||
|
||||
Having covered concrete syntax, we now give semantics for the schema
|
||||
language in terms of the [abstract syntax][schema.prs] and of the
|
||||
language of Preserves `Value`s.
|
||||
|
||||
[schema.prs]: https://gitlab.com/preserves/preserves/-/blob/main/schema/schema.prs
|
||||
|
||||
### Metaschema interpreter
|
||||
|
||||
(TODO: this subsection is to define an interpreter for metaschema values
|
||||
applied to Preserves `Value`s.)
|
||||
|
||||
### Host-language types
|
||||
|
||||
The host-language types corresponding to a metaschema instance can
|
||||
themselves be described according to a grammar.
|
||||
|
||||
The definitions in this section should be understood as being part of a
|
||||
module named `host`, in a bundle alongside a module named `schema`
|
||||
corresponding to the metaschema in the appendix below.
|
||||
|
||||
#### Abstract host language types
|
||||
|
||||
Definition = <union @variants [Variant ...]> / Simple .
|
||||
Variant = [@label symbol @type Simple] .
|
||||
|
||||
The host-language type corresponding to a definition will either be a
|
||||
tagged union (side condition: at least two `Variant`s are present in a
|
||||
`union`) or a *simple* type.
|
||||
|
||||
Simple = Field / Record .
|
||||
Record = <rec @fields [NamedField ...]> .
|
||||
NamedField = [@name symbol @type Field] .
|
||||
|
||||
A *simple* type may be either a single, simple value of *field* type, or
|
||||
a record of multiple named fields, each having a specific *field* type.
|
||||
|
||||
Field = =unit
|
||||
/ =any
|
||||
/ =embedded
|
||||
/ <array @element Field>
|
||||
/ <set @element Field>
|
||||
/ <map @key Field @value Field>
|
||||
/ <ref @name schema.Ref>
|
||||
/ schema.AtomKind .
|
||||
|
||||
A *field* type is either
|
||||
|
||||
- the language's unit type (the empty tuple, the "void" value),
|
||||
- the universal type of all Preserves `Value`s,
|
||||
- the type of some host-language [embedded value](./preserves.html#embeddeds) in some context,
|
||||
- the type of a uniform array having elements of a specific *field* type,
|
||||
- the type of a set having elements of a specific *field* type,
|
||||
- the type of a dictionary connecting keys of specific type to values of specific type,
|
||||
- the type associated with some other named definition in scope in the current Schema bundle, or
|
||||
- the type of a specific kind of Preserves [`Atom`](./preserves#values).
|
||||
|
||||
#### Computing abstract types from a metaschema instance
|
||||
|
||||
Given a metaschema definition *d* : `schema.Definition`, the function
|
||||
**typeof**{:.pseudocode} yields a `host.Definition`.
|
||||
|
||||
{:.pseudocode #def:typeof}
|
||||
> **typeof** : `schema.Definition` ⟶ `host.Definition`
|
||||
> **typeof** `<or [[`*n`1`* *p`1`*`]` ... `[`*n`n`* *p`n`*`]]>` = `<union [[`*n`1`* (**pat** *p`1`*)`]` ... `[`*n`n`* (**pat** *p`n`*)`]]>`
|
||||
> **typeof** `<and [`*f`1` ... f`n`*`]>` = **product** `[`*f`1` ... f`n`*`]`
|
||||
> **typeof** *p* = **pat** *p*, when *p* ∈ `schema.Pattern`
|
||||
|
||||
{:.pseudocode #def:pat}
|
||||
> **pat** : `schema.Pattern` ⟶ `host.Simple`
|
||||
> **pat** *s* = **field** *s*, when *s* ∈ `schema.SimplePattern`
|
||||
> **pat** *c* = **product** `[`*c*`]`, when *c* ∈ `schema.CompoundPattern`
|
||||
|
||||
{:.pseudocode #def:field}
|
||||
> **field** : `schema.SimplePattern` ⟶ `host.Field`
|
||||
> **field** `any` = `any`
|
||||
> **field** `<atom` *k*`>` = *k*
|
||||
> **field** `<embedded` *s*`>` = `embedded`
|
||||
> **field** `<lit` *v*`>` = `unit`
|
||||
> **field** `<seqof` *s*`>` = `<array` (**field** s)`>`
|
||||
> **field** `<setof` *s*`>` = `<set` (**field** s)`>`
|
||||
> **field** `<dictof` *s`k`* *s`v`*`>` = `<map` (**field** *s`k`*) (**field** *s`v`*)`>`
|
||||
> **field** *r* = *r*, when *r* ∈ `schema.Ref`
|
||||
|
||||
The helper function **product**{:.pseudocode} is where `unit`-valued
|
||||
fields are omitted from the computed host-language type. If all fields
|
||||
are so omitted, or if there were (recursively) no bindings in the input
|
||||
patterns, **product**{:.pseudocode} yields `unit` type itself.
|
||||
|
||||
{:.pseudocode #def:product}
|
||||
> **product** : `[schema.NamedPattern` ...`]` ⟶ `host.Simple`
|
||||
> **product** `[`*f`1` ... f`n`*`]` = `unit`, if *t* = `[]`;
|
||||
> `<rec` *t*`>`, otherwise
|
||||
> where *t* = **gather** *f`1`* ⧺ ⋯ ⧺ **gather** *f`n`*
|
||||
|
||||
{:.pseudocode #def:gather}
|
||||
> **gather** : `schema.NamedPattern` ⟶ `[host.NamedField ...]`
|
||||
> **gather** `<named` *n* *p*`>` = `[]`, if (**field** *p*) = `unit`;
|
||||
> `[[`*n* (**field** *p*)`]]`, otherwise
|
||||
> **gather** `<rec` *f`label`* *f`fields`*`>` = **gather** *f`label`* ⧺ **gather** *f`fields`*
|
||||
> **gather** `<tuple [`*f`1` .. f`n`*`]>` = **gather** *f`1`* ⧺ ⋯ ⧺ **gather** *f`n`*
|
||||
> **gather** `<tuplePrefix [`*f`1` ... f`n`*`]` *f`repeated`*`>` = **gather** *f`1`* ⧺ ⋯ ⧺ **gather** *f`n`* ⧺ **gather** *f`repeated`*
|
||||
> **gather** `<dict {`*v`1`*`:`*f`1` ... v`n`*`:`*f`n`*`}>` = **gather** *f`1`′* ⧺ ⋯ ⧺ **gather** *f`n`′*,
|
||||
> where (*f`1`′ ⋯ f`n`′*) are (*f`1` ⋯ f`n`*) sorted according to [Preserves term order](./preserves.html#total-order).
|
||||
|
||||
## Appendix: Metaschema
|
||||
|
||||
The metaschema defines the structure of the abstract syntax (AST) of
|
||||
schemas, using the concrete DSL syntax described above.
|
||||
|
||||
The text below is taken from
|
||||
[`schema/schema.prs`](https://gitlab.com/preserves/preserves/-/blob/main/schema/schema.prs)
|
||||
[`schema/schema.prs`][schema.prs]
|
||||
in the source code repository.
|
||||
|
||||
A `Bundle` collects a number of `Schema`s, each named by a
|
||||
|
|
|
@ -1,5 +1,9 @@
|
|||
:root {
|
||||
--sans-font: -apple-system, BlinkMacSystemFont, avenir next, avenir, segoe ui, helvetica neue, helvetica, Cantarell, Ubuntu, roboto, noto, arial, sans-serif;
|
||||
--serif-font: palatino, "Palatino Linotype", "Palatino LT STD", "URW Palladio L", "TeX Gyre Pagella", serif;
|
||||
}
|
||||
body {
|
||||
font-family: palatino, "Palatino Linotype", "Palatino LT STD", "URW Palladio L", "TeX Gyre Pagella", serif;
|
||||
font-family: var(--serif-font);
|
||||
box-sizing: border-box;
|
||||
line-height: 1.414;
|
||||
}
|
||||
|
@ -103,6 +107,34 @@ blockquote {
|
|||
background-color: #e9f0f9;
|
||||
}
|
||||
|
||||
blockquote.pseudocode {
|
||||
border-left: none;
|
||||
padding: 0;
|
||||
}
|
||||
.pseudocode strong, strong.pseudocode {
|
||||
font-family: var(--sans-font);
|
||||
font-weight: inherit;
|
||||
font-size: 90%;
|
||||
}
|
||||
.pseudocode :first-child {
|
||||
margin-top: 0;
|
||||
}
|
||||
.pseudocode :last-child {
|
||||
margin-bottom: 0;
|
||||
}
|
||||
.pseudocode pre, .pseudocode code { background-color: inherit; }
|
||||
.pseudocode p {
|
||||
white-space: pre;
|
||||
}
|
||||
.pseudocode em > code, .pseudocode strong > code {
|
||||
vertical-align: sub;
|
||||
font-size: 75%;
|
||||
font-family: inherit;
|
||||
}
|
||||
.pseudocode code {
|
||||
font-size: 80%;
|
||||
}
|
||||
|
||||
/*---------------------------------------------------------------------------*/
|
||||
/* Rouge syntax classes */
|
||||
|
||||
|
|
|
@ -35,6 +35,7 @@ Taking inspiration from functional programming, we start with a
|
|||
definition of the *values* that we want to work with and give them
|
||||
meaning independent of their syntax.
|
||||
|
||||
<a id="values"></a>
|
||||
Our `Value`s fall into two broad categories: *atomic* and *compound*
|
||||
data. Every `Value` is finite and non-cyclic. Embedded values, called
|
||||
`Embedded`s, are a third, special-case category.
|
||||
|
|
Loading…
Reference in New Issue