---
no_site_title: true
title: "Preserves Schema"
---

Tony Garnock-Jones <tonyg@leastfixedpoint.com>
October 2023. Version 0.3.4.

[abnf]: https://tools.ietf.org/html/rfc7405
[identifierlike]: #sufficiently-identifierlike-values
[valid identifier]: #identifiers-and-capitalization-conventions

This document proposes a Schema language for the
[Preserves data model](./preserves.html).

## Introduction

{% include what-is-preserves-schema.md %}

**Portability.** Preserves Schema is broadly portable. Any host-language
type system that can represent [algebraic
types](https://en.wikipedia.org/wiki/Algebraic_data_type) in some way
should be suitable as a compilation target.

This includes ML-family languages like [Rust][rust-impl] and Haskell,
object-oriented languages like Java, [Python][python-impl] and
[Smalltalk][smalltalk-impl], and multiparadigm languages like
[JavaScript][ts-impl], [TypeScript][ts-impl], [Racket][racket-impl],
[Nim][nim-impl] and Erlang.

[nim-impl]: https://git.syndicate-lang.org/ehmry/preserves-nim
[python-impl]: https://gitlab.com/preserves/preserves/-/blob/main/implementations/python/preserves/schema.py
[racket-impl]: https://gitlab.com/preserves/preserves/-/tree/main/implementations/racket/preserves/preserves-schema
[rust-impl]: https://gitlab.com/preserves/preserves/-/tree/main/implementations/rust/preserves-schema
[smalltalk-impl]: https://squeaksource.com/Preserves.html
[ts-impl]: https://gitlab.com/preserves/preserves/-/tree/main/implementations/javascript/packages/schema

**Example.** Sending the schema

    version 1 .
    Date = <date @year int @month int @day int>.
    Person = <person @name string @birthday Date>.

to the TypeScript schema compiler produces types,

    type Date = {"year": number, "month": number, "day": number};
    type Person = {"name": string, "birthday": Date};

constructors,

    function Date({year, month, day}: {year: number, month: number, day: number}): Date;
    function Person({name, birthday}: {name: string, birthday: Date}): Person;

partial parsing functions which throw on parse failure,

    function asDate(v: _val): Date;
    function asPerson(v: _val): Person;

total parsing functions which yield `undefined` on parse failure,

    function toDate(v: _val): undefined | Date;
    function toPerson(v: _val): undefined | Person;

and total serialization functions,

    function fromDate(_v: Date): _val;
    function fromPerson(_v: Person): _val;
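
To make the relationship between the three kinds of function concrete, here is a hand-written sketch of what the `Date` functions might do. It is not the compiler's actual output: `Val` is a hypothetical, much-simplified stand-in for the library's `_val` type, and the record type is called `DateRec` here only to avoid shadowing JavaScript's built-in `Date`.

```typescript
// Hypothetical, simplified stand-in for the library's Preserves value type (`_val`).
type Val = number | string | { label: string; fields: Val[] };

// Renamed from `Date` in this sketch to avoid clashing with the built-in Date.
type DateRec = { year: number; month: number; day: number };

// Total parser: yields undefined unless the input looks like <date int int int>.
function toDate(v: Val): DateRec | undefined {
  if (typeof v !== "object" || v.label !== "date" || v.fields.length !== 3) return undefined;
  const [year, month, day] = v.fields;
  if (typeof year !== "number" || typeof month !== "number" || typeof day !== "number") return undefined;
  if (![year, month, day].every((n) => Number.isInteger(n))) return undefined;
  return { year, month, day };
}

// Partial parser: same check, but throws on parse failure.
function asDate(v: Val): DateRec {
  const d = toDate(v);
  if (d === undefined) throw new TypeError("Invalid Date");
  return d;
}

// Total serializer: always succeeds for a well-typed DateRec.
function fromDate(d: DateRec): Val {
  return { label: "date", fields: [d.year, d.month, d.day] };
}
```

In this sketch, `fromDate(asDate(v))` reproduces `v` whenever `v` matches the pattern, which is the round-trip property one would expect of a parser/serializer pair.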

## Concepts

**Bundle.** A collection of schemas, each named by a module path.

**Definition.** A named pattern within a schema. When compiled, a
definition will usually produce a type (plus associated constructors
and predicates), a parser function, and a serializer function.

**Metaschema.** The Preserves metaschema is a schema describing the
abstract syntax of all schema instances (including itself).

**Module path.** A sequence of symbols, denoting a leaf in a tree with
symbol-labelled edges.

**Pattern.** A pattern describes a collection of `Value`s as well as
providing names for the portions of matching `Value`s that should be
captured in a host-language data type.

**Schema abstract syntax tree (AST).** Schema-manipulating tools will
usually work with schema AST; that is, with `Value`s conforming to the
metaschema or instances of the corresponding host-language
datastructures.

**Schema domain-specific language (DSL).** While human beings *can*
work directly with Preserves documents matching the metaschema, the
schema DSL provides an easier-to-read and -write language for working
with schemas that can be translated into instances of the metaschema.

**Schema.** A collection of definitions, plus an optional schema-wide
reference to a schema describing embedded values.

## Identifiers and Capitalization Conventions

Throughout, `id` is used in the grammar to denote an *identifier*,
which is a symbol that matches the regular expression
`^[a-zA-Z][a-zA-Z_0-9]*$`. This is a lowest-common-denominator
constraint that allows for a reasonable mapping to the identifiers of
many programming languages.

Identifiers are case-sensitive. Schemas should be written with an
awareness of the fact that some programming languages cannot preserve
case differences. Avoid using two identifiers in the same context that
differ only in case.
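
As an illustration, the identifier rule and the case-collision advice can be checked mechanically. The sketch below uses the regular expression given above; the function names `isIdentifier` and `caseCollisions` are made up for this example and do not come from any Preserves implementation.

```typescript
// The identifier pattern quoted in the text above.
const ID_PATTERN = /^[a-zA-Z][a-zA-Z_0-9]*$/;

// True when `name` is an identifier in the sense of this document.
function isIdentifier(name: string): boolean {
  return ID_PATTERN.test(name);
}

// Groups of distinct names that differ only in case (hypothetical helper).
function caseCollisions(names: string[]): string[][] {
  const groups = new Map<string, string[]>();
  for (const name of names) {
    const key = name.toLowerCase();
    const group = groups.get(key) || [];
    if (group.indexOf(name) === -1) group.push(name);
    groups.set(key, group);
  }
  return Array.from(groups.values()).filter((group) => group.length > 1);
}
```

A tool might run `caseCollisions` over the definition names of one schema, or over the field names of one definition, and warn when the result is non-empty.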

Schemas should be written using the following capitalization
conventions:

 - `UpperCamelCase` for *definition* names.

 - Either `lowerCamelCase` or `UpperCamelCase` for definition-unique
   names for alternatives within an alternation definition.

 - `lowerCamelCase` for *module* names (schema names, package names)
   and *field* or *variable* names.

## The Preserves Schema Language

In this section, we use an [ABNF][abnf]-like notation to define a
textual syntax that is easy for people to read and write. Most of the
examples in this document are written using this syntax. An appendix
defines the abstract syntax that this surface syntax translates into.

### Schema files and bundles.

Each schema should be placed in a single file. Schema files usually
end with extension `.prs`, and consist of a sequence of Preserves
`Value`s[^like-sexps] separated into *clauses* by the Preserves
`Symbol` "`.`".

[^like-sexps]: That is, schema files use Preserves as a kind of
    S-expression!

A bundle of schema files is a directory tree containing `.prs` files.

### Clauses.

    Clause = (Version / EmbeddedTypeName / Include / Definition) "."

    Version = "version" "1"
    EmbeddedTypeName = "embeddedType" ("#f" / Ref)
    Include = "include" string

**Version specification.** Mandatory. Names the version of the schema
language used in the file. This version of the specification is
referred to in schema files as `version 1`.

**Embedded type name.** Optional. If given as `#f` (the default), it
declares that values parsed by the schema do not contain embedded
`Value`s of any particular type. If given as a `Ref`, a reference to a
definition in this or a neighbouring schema, it declares that embedded
`Value`s must themselves conform to the named definition.

**Include.** *Experimental.* Includes the contents of a neighbouring
file as if it were textually inserted in place of this clause. The
file path may be relative to the current file, or absolute.

### Definitions.

    Definition = id "=" (OrPattern / AndPattern / Pattern)

Each definition clause connects a pattern over `Value`s with a
host-language type name (derived from the supplied `id`) and set of
associated functions.

A definition may be

 - an *alternation* of patterns, allowing for biased choice among alternatives;
 - an *intersection* of patterns, allowing for composition and reuse of patterns; or
 - the base case, an ordinary pattern.

**Host-language types.** Each definition includes *bindings* that
capture information from a parsed `Value` and expose it to programs in
the host language. When more than one binding is present in a
definition, a host-language record (product, structure, tuple) will be
the result of a parse; otherwise, a simple value will result. When a
definition involves *alternation*, a host-language representation of a
sum over the types of each branch of the alternation will result. For
example, a compiler targeting an object-oriented host language would
produce a base class for each definition, with a field for each binding
and a subclass for each variant alternative. A functional host language
with algebraic data types would produce a labelled-sum-of-products type.

### Alternation definitions.

    OrPattern = [orsep] AltPattern 1*(orsep AltPattern) [orsep]
    orsep = 1*"/"

The right-hand-side of a definition may supply two or more *alternatives*.
Alternatives are separated by any number of slashes `/`, and leading or
trailing slashes are ignored. When parsing, the alternatives are tried in
order; the result of the first successful alternative is the result of the
entire parse.
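
The ordered, first-match-wins behaviour can be sketched as a small parser combinator. This is only an illustration of biased choice, not code from a Preserves implementation: `Parser`, `Tagged` and `firstMatch` are hypothetical names, and failure is modelled as `undefined` (so a parser in this sketch cannot yield `undefined` as a successful result).

```typescript
// A parser either yields a result or signals failure with undefined.
type Parser<T> = (v: unknown) => T | undefined;

// Result of an alternation: which variant matched, and what it produced.
type Tagged<T> = { variant: string; value: T };

// Try each named alternative in order; the first success decides the result.
function firstMatch<T>(alternatives: Array<[string, Parser<T>]>): Parser<Tagged<T>> {
  return (v) => {
    for (const [variant, parse] of alternatives) {
      const value = parse(v);
      if (value !== undefined) return { variant, value };
    }
    return undefined; // every alternative failed
  };
}

// Two overlapping alternatives: every integer is also a number,
// so the order in which they are listed matters.
const intOrNumber = firstMatch<number>([
  ["int", (v) => (typeof v === "number" && Number.isInteger(v) ? v : undefined)],
  ["number", (v) => (typeof v === "number" ? v : undefined)],
]);
```

Here `intOrNumber(3)` is labelled `int` while `intOrNumber(3.5)` is labelled `number`. If the two alternatives were listed in the opposite order, the `int` branch would never be chosen.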

**Host-language types.** The type corresponding to an `OrPattern` is an
algebraic sum type, a union type, a variant type, or a concrete subclass
of an abstract superclass, depending on the host language.

**Variant names.** Each alternative within an `OrPattern` must have a
definition-unique *name*. The name is used to uniquely label the
alternative's host-language representation (for example, a subclass, or
a member of a tagged union type).

A variant name can either be given explicitly as `@name` or
inferred.[^variant-names-unlike-binding-names] It can only be inferred
from the label of a record pattern, from the name of a reference to
another definition, or from the text of a "[sufficiently
identifierlike][identifierlike]" literal pattern - one that matches a
string, symbol or boolean:

    AltPattern = "@" id Pattern
               / "<" id PatternSequence ">"
               / Ref
               / LiteralPattern -- with a side condition

[^variant-names-unlike-binding-names]: Note that explicitly-given
    *variant* names are unlike *binding* names in that binding names give
    rise to a field in the record type for a definition, while variant
    names are used as labels for alternatives in a sum type for a
    definition.

A host language will likely use the same ordering of variants in a sum
type as specified by the schema. It is therefore recommended to specify
first the alternative best suited as a default initialization value (if
there is any).

### Intersection definitions.

    AndPattern = [andsep] NamedPattern 1*(andsep NamedPattern) [andsep]
    andsep = 1*"&"

The right-hand-side of a definition may supply two or more patterns, the
*intersection* of whose denotations is the denotation of the overall
definition. The patterns are separated by any number of ampersands `&`,
and leading or trailing ampersands are ignored. When parsing, every
pattern is tried: if all succeed, the resulting information is combined
into a single type; otherwise, the overall parse fails.

When serializing, the terms resulting from serializing at each pattern
are *merged* together.

**Host-language types.** Compiling an intersection definition produces a
host-language type that is effectively the algebraic product of the
types of the parts of the intersection. Practically, this usually means
a record (product, structure, tuple) type.
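
A rough sketch of the parse and serialize behaviour just described, restricted to dictionary-shaped values. Everything here is an assumption made for illustration: `Dict`, `FieldParser`, `intersect` and `mergeSerialized` are hypothetical names, and the text above does not define how *merging* works in general, so the sketch simply combines dictionary entries.

```typescript
// Simplified stand-in for a Preserves dictionary with string keys.
type Dict = { [key: string]: unknown };

// A parser that extracts some fields from a dictionary, or fails with undefined.
type FieldParser = (v: Dict) => Dict | undefined;

// Every pattern is tried; all must succeed, and their fields are combined.
function intersect(parsers: FieldParser[]): FieldParser {
  return (v) => {
    const combined: Dict = {};
    for (const parse of parsers) {
      const fields = parse(v);
      if (fields === undefined) return undefined; // one failure fails the whole parse
      Object.assign(combined, fields);
    }
    return combined;
  };
}

// Serialization: the terms produced for each pattern are merged together.
// (Assumption: for dictionaries, merging means taking the union of entries.)
function mergeSerialized(parts: Dict[]): Dict {
  return Object.assign({}, ...parts);
}

// Two illustrative component patterns, roughly {a: int} and {b: string}.
const hasA: FieldParser = (v) => (typeof v["a"] === "number" ? { a: v["a"] } : undefined);
const hasB: FieldParser = (v) => (typeof v["b"] === "string" ? { b: v["b"] } : undefined);
const both = intersect([hasA, hasB]);
```

With these definitions, `both({a: 1, b: "x"})` yields a record with both fields, while `both({a: 1})` fails because the second pattern does not match.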

{:.rationale}
> #### Experimental.
>
> Intersections are an experimental feature. They can be used to express
> *optional dictionary entries*:
>
>     MyDict = {a: int, b: string} & @c MaybeC .
>     MaybeC = @present {c: symbol} / @invalid {c: any} / @absent {} .
>
> They can also be used to express something reminiscent of *inheritance*:
>
>     Type = @base BaseFields & @detail SubType .
>     BaseFields = {a: int, b: string} .
>     SubType = @base {}
>             / @variantA { x: int }
>             / @mid Mid .
>     Mid = { y: symbol } & @detail SubSubType .
>     SubSubType = @variantB { z: "type-b" }
>                / @variantC { z: "type-c" } .
>
> It is not yet clear whether they pull their weight.
>
> From the point of view of the user of the schema language, using
> intersections to express optional values is cumbersome. Not only is it
> verbose, requiring auxiliary definitions, but it leaves responsibility
> for checking for invalid inputs up to the user, rather than handling
> it completely at the Schema layer. A future Schema version will likely
> include first-class support for optionality.

### Patterns.

    Pattern = SimplePattern / CompoundPattern

Patterns come in two kinds:

 - The parsers for *simple patterns* yield a single host-language
   value—for example, a string, an array, a number, or a pointer—or
   even, in the case of `LiteralPattern`s, no host-language values at
   all.[^no-values-at-all]

 - The parsers for *compound patterns* yield zero or more *fields*
   which combine into an overall record type associated with a
   definition.

[^no-values-at-all]: The case of a `LiteralPattern` yielding no
    host-language values is interesting. All the information required to
    reversibly store the result of a parse is already in the schema, so
    nothing need be stored at runtime in host-language data type
    instances. Concretely, a definition consisting only of a
    `LiteralPattern` might correspond to a host-language unit type (the
    empty tuple, the "void" value). Definitions consisting of
    `CompoundPattern`s involving `LiteralPattern`s do not even need to
    store this much: fields of unit type in a host-language record type
    can simply be omitted without loss.

#### Simple patterns

    SimplePattern = AnyPattern
                  / AtomKindPattern
                  / EmbeddedPattern
                  / LiteralPattern
                  / SequenceOfPattern
                  / SetOfPattern
                  / DictOfPattern
                  / Ref

The `any` pattern matches any input `Value`:

    AnyPattern = "any"

Specifying the name of a kind of `Atom` matches that kind of atom:

    AtomKindPattern = "bool" / "float" / "double" / "int" / "string" / "bytes" / "symbol"

Embedded input `Value`s are matched with embedded patterns. The
portion under the `#!` prefix is the *interface* schema for the
embedded value.[^interface-schema] The result of a match is an
instance of the schema-wide `embeddedType`, if one is supplied.

    EmbeddedPattern = "#!" SimplePattern

A literal pattern may be expressed in any of three ways: non-symbol
atoms stand for themselves directly; symbols, prefixed with an equal
sign, are matched literally; and any `Value` at all may be quoted by
placing it in a `<<lit> ... >` record:

    LiteralPattern = "="symbol / "<<lit>" value ">" / non-symbol-atom

Brackets containing an item pattern and a literal ellipsis match a
sequence of items, each matching the nested item pattern. Sets and
uniform dictionaries are similar.

    SequenceOfPattern = "[" SimplePattern "..." "]"
    SetOfPattern = "#{" SimplePattern "}"
    DictOfPattern = "{" SimplePattern ":" SimplePattern "...:..." "}"

Finally, a reference to some other definition, in this schema or a
neighbouring schema within this bundle, is made by mentioning the
possibly-qualified name of the definition as a bare symbol:

    Ref = symbol

Periods "`.`" in such symbols are special:

 - `Name` refers to the definition named `Name` in the current schema.
 - `Mod.Submod.Name` refers to definition `Name` in `Mod.Submod`, some other schema in the bundle.

Each period-separated portion of a reference name must be an `id`, an
identifier.
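
The splitting rule can be sketched as follows. The result shape mirrors the `Ref = <ref @module ModulePath @name symbol>` definition in the metaschema appendix. The function name `parseRef` is hypothetical, and representing "the current schema" as an empty module path is an assumption of this sketch rather than something stated above.

```typescript
// Mirrors the metaschema's Ref record: a module path plus a definition name.
interface Ref {
  module: string[];
  name: string;
}

// The identifier pattern from "Identifiers and Capitalization Conventions".
const ID_PATTERN = /^[a-zA-Z][a-zA-Z_0-9]*$/;

// Split a possibly-qualified reference symbol at its periods.
// Yields undefined if any period-separated portion is not an identifier.
function parseRef(symbol: string): Ref | undefined {
  const parts = symbol.split(".");
  if (!parts.every((part) => ID_PATTERN.test(part))) return undefined;
  return {
    module: parts.slice(0, -1),     // assumption: [] stands for the current schema
    name: parts[parts.length - 1],  // the final portion names the definition
  };
}
```

For example, `parseRef("Mod.Submod.Name")` gives module path `["Mod", "Submod"]` and name `"Name"`, while `parseRef("Mod..Name")` is rejected because the empty portion is not an identifier.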

[^interface-schema]: Embedded patterns are experimental. One
    interpretation is that an embedded value denotes a reference to
    some stateful actor in a potentially-distributed system, and that
    the interface schema associated with an embedded value describes
    the messages that may be sent to that actor.

    **Examples.** `#!any` may denote a reference to an Actor able to
    receive any value as a message; `#!#t`, a reference to an Actor
    expecting *only* the "true" message; `#!Session`, a reference to
    an Actor expecting any message matching a schema defined as
    `Session` in this file.

#### Compound patterns

    CompoundPattern = RecordPattern
                    / TuplePattern
                    / VariableTuplePattern
                    / DictionaryPattern

A record pattern matches an input record. It may be specified as a
record with a literal in the label position, or as a quoted `<<rec>
... >` record with a pattern for each of the label and field-sequence
positions:[^record-shorthand]

    RecordPattern = "<<rec>" NamedPattern NamedPattern ">"
                  / "<" value PatternSequence ">"

    PatternSequence = *(NamedPattern) [NamedSimplePattern "..."]

[^record-shorthand]: Note that `<label `*ps*`>` can be thought of as
    roughly equivalent to `<<rec> <<lit> label> [`*ps*`]>`. The
    following two definitions are equivalent:

        D1 = <foo @a string @b string @extra any ... >.
        D2 = <<rec> <<lit> foo> [@a string @b string @extra any ...]>.

A tuple pattern matches a fixed-length sequence with specific patterns
in each position. A variable tuple pattern is the same, but with an
additional pattern for matching additional elements following the
fixed-position patterns.

    TuplePattern = "[" *(NamedPattern) "]"
    VariableTuplePattern = "[" *(NamedPattern) NamedSimplePattern "..." "]"

A dictionary pattern matches specific literal keys in an input
dictionary. If no explicit name is given for a particular
`NamedSimplePattern`, but the key for the pattern is "[sufficiently
identifierlike][identifierlike]" (a string, symbol or boolean), then a
symbol formed from that key is used as the name for that dictionary
entry.

    DictionaryPattern = "{" *(value ":" NamedSimplePattern) "}"

### Identifiers and Bindings: NamedPattern and NamedSimplePattern

Compound pattern specifications contain `NamedPattern`s or
`NamedSimplePattern`s rather than ordinary `Pattern`s:

    NamedPattern = "@" id SimplePattern / Pattern
    NamedSimplePattern = "@" id SimplePattern / SimplePattern

Use of an `@name` prefix generally results in creation of a field with
the given name in the overall record type for a definition. The type
of value contained in the field will correspond to the `Pattern` or
`SimplePattern` given.

### "Sufficiently Identifierlike" Values

In some places in a schema, names can be inferred from some nearby
literal pattern element. In an `OrPattern`, variant names can be
inferred; in a `DictionaryPattern`, names for dictionary entries can be
inferred.

The rules are simple: if the literal pattern would match a specific
symbol or string, then that specific value is converted to a symbol and
used as the name. If the pattern would match `#t`, the name will be
`true`; if it would match `#f`, the name will be `false`.

For example, in the following grammar, the names for the variants of
`Example1` are the symbols `foo` and `bar` and `false`, and the names
for the two fields in `Example2` are `example` and `|testing strings|`.
Note that `|testing strings|` is a symbol whose name contains a space,
which will be rejected because it is not a [valid identifier][].

```preserves-schema
Example1 = =foo / "bar" / #f .
Example2 = { "testing strings": int, example: string } .
```
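
The inference rule, followed by the identifier check that rejects `|testing strings|`, might look like this in code. It is a sketch only: `Literal` is a hypothetical, simplified description of what a literal pattern matches, and neither function name comes from a real implementation.

```typescript
// Hypothetical description of the value a literal pattern matches.
type Literal =
  | { kind: "symbol"; text: string }
  | { kind: "string"; text: string }
  | { kind: "boolean"; value: boolean }
  | { kind: "other" }; // numbers, byte strings, compound values, ...

// The identifier pattern from "Identifiers and Capitalization Conventions".
const ID_PATTERN = /^[a-zA-Z][a-zA-Z_0-9]*$/;

// Step 1: infer a name. Symbols and strings give their text;
// booleans give "true" or "false"; nothing else is identifierlike.
function inferName(lit: Literal): string | undefined {
  switch (lit.kind) {
    case "symbol":
    case "string":
      return lit.text;
    case "boolean":
      return lit.value ? "true" : "false";
    case "other":
      return undefined;
  }
}

// Step 2: an inferred name is only usable if it is also a valid identifier.
function inferredNameIsUsable(lit: Literal): boolean {
  const name = inferName(lit);
  return name !== undefined && ID_PATTERN.test(name);
}
```

Applied to the example, `=foo`, `"bar"` and `#f` yield the usable names `foo`, `bar` and `false`, whereas `"testing strings"` yields a name that fails the identifier check.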

## Semantics

Having covered concrete syntax, we now give semantics for the schema
language in terms of the [abstract syntax][schema.prs] and of the
language of Preserves `Value`s.

[schema.prs]: https://gitlab.com/preserves/preserves/-/blob/main/schema/schema.prs

### Metaschema interpreter

(TODO: this subsection is to define an interpreter for metaschema values
applied to Preserves `Value`s.)

### Host-language types

The host-language types corresponding to a metaschema instance can
themselves be described according to a grammar.

The definitions in this section should be understood as being part of a
module named `host`, in a bundle alongside a module named `schema`
corresponding to the metaschema in the appendix below.

#### Abstract host language types

    Definition = <union @variants [Variant ...]> / Simple .
    Variant = [@label symbol @type Simple] .

The host-language type corresponding to a definition will either be a
tagged union (side condition: at least two `Variant`s are present in a
`union`) or a *simple* type.

    Simple = Field / Record .
    Record = <rec @fields [NamedField ...]> .
    NamedField = [@name symbol @type Field] .

A *simple* type may be either a single, simple value of *field* type, or
a record of multiple named fields, each having a specific *field* type.

    Field = =unit
          / =any
          / =embedded
          / <array @element Field>
          / <set @element Field>
          / <map @key Field @value Field>
          / <ref @name schema.Ref>
          / schema.AtomKind .

A *field* type is either

 - the language's unit type (the empty tuple, the "void" value),
 - the universal type of all Preserves `Value`s,
 - the type of some host-language [embedded value](./preserves.html#embeddeds) in some context,
 - the type of a uniform array having elements of a specific *field* type,
 - the type of a set having elements of a specific *field* type,
 - the type of a dictionary connecting keys of specific type to values of specific type,
 - the type associated with some other named definition in scope in the current Schema bundle, or
 - the type of a specific kind of Preserves [`Atom`](./preserves#values).

#### Computing abstract types from a metaschema instance

Given a metaschema definition *d* : `schema.Definition`, the function
**typeof**{:.pseudocode} yields a `host.Definition`.

{:.pseudocode #def:typeof}
> **typeof** : `schema.Definition` ⟶ `host.Definition`
> **typeof** `<or [[`*n`1`* *p`1`*`]` ... `[`*n`n`* *p`n`*`]]>` = `<union [[`*n`1`* (**pat** *p`1`*)`]` ... `[`*n`n`* (**pat** *p`n`*)`]]>`
> **typeof** `<and [`*f`1` ... f`n`*`]>` = **product** `[`*f`1` ... f`n`*`]`
> **typeof** *p* = **pat** *p*, when *p* ∈ `schema.Pattern`

{:.pseudocode #def:pat}
> **pat** : `schema.Pattern` ⟶ `host.Simple`
> **pat** *s* = **field** *s*, when *s* ∈ `schema.SimplePattern`
> **pat** *c* = **product** `[`*c*`]`, when *c* ∈ `schema.CompoundPattern`

{:.pseudocode #def:field}
> **field** : `schema.SimplePattern` ⟶ `host.Field`
> **field** `any` = `any`
> **field** `<atom` *k*`>` = *k*
> **field** `<embedded` *s*`>` = `embedded`
> **field** `<lit` *v*`>` = `unit`
> **field** `<seqof` *s*`>` = `<array` (**field** *s*)`>`
> **field** `<setof` *s*`>` = `<set` (**field** *s*)`>`
> **field** `<dictof` *s`k`* *s`v`*`>` = `<map` (**field** *s`k`*) (**field** *s`v`*)`>`
> **field** *r* = `<ref` *r*`>`, when *r* ∈ `schema.Ref`

The helper function **product**{:.pseudocode} is where `unit`-valued
fields are omitted from the computed host-language type. If all fields
are so omitted, or if there were (recursively) no bindings in the input
patterns, **product**{:.pseudocode} yields `unit` type itself.

{:.pseudocode #def:product}
> **product** : `[schema.NamedPattern` ...`]` ⟶ `host.Simple`
> **product** `[`*f`1` ... f`n`*`]` = `unit`, if *t* = `[]`;
> `<rec` *t*`>`, otherwise
> where *t* = **gather** *f`1`* ⧺ ⋯ ⧺ **gather** *f`n`*

{:.pseudocode #def:gather}
> **gather** : `schema.NamedPattern` ⟶ `[host.NamedField ...]`
> **gather** `<named` *n* *p*`>` = `[]`, if (**field** *p*) = `unit`;
> `[[`*n* (**field** *p*)`]]`, otherwise
> **gather** `<rec` *f`label`* *f`fields`*`>` = **gather** *f`label`* ⧺ **gather** *f`fields`*
> **gather** `<tuple [`*f`1` ... f`n`*`]>` = **gather** *f`1`* ⧺ ⋯ ⧺ **gather** *f`n`*
> **gather** `<tuplePrefix [`*f`1` ... f`n`*`]` *f`repeated`*`>` = **gather** *f`1`* ⧺ ⋯ ⧺ **gather** *f`n`* ⧺ **gather** *f`repeated`*
> **gather** `<dict {`*v`1`*`:`*f`1` ... v`n`*`:`*f`n`*`}>` = **gather** *f`1`′* ⧺ ⋯ ⧺ **gather** *f`n`′*,
> where (*f`1`′ ⋯ f`n`′*) are (*f`1` ⋯ f`n`*) sorted according to [Preserves term order](./preserves.html#total-order).
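
The **field**, **gather** and **product** functions can be transcribed into TypeScript roughly as follows. This is a sketch over a hypothetical, simplified AST, not the types generated by any real implementation. It rests on three assumptions: dictionary keys are plain strings, ordinary string comparison stands in for Preserves term order, and an unnamed simple pattern contributes no fields (a case the pseudocode leaves implicit).

```typescript
// Hypothetical, simplified schema AST.
type SimplePattern =
  | { t: "any" } | { t: "atom"; kind: string } | { t: "embedded"; iface: SimplePattern }
  | { t: "lit"; value: unknown } | { t: "seqof"; item: SimplePattern }
  | { t: "setof"; item: SimplePattern }
  | { t: "dictof"; key: SimplePattern; value: SimplePattern }
  | { t: "ref"; name: string };
type CompoundPattern =
  | { t: "rec"; label: NamedPattern; fields: NamedPattern }
  | { t: "tuple"; items: NamedPattern[] }
  | { t: "tuplePrefix"; fixed: NamedPattern[]; repeated: NamedPattern }
  | { t: "dict"; entries: Array<[string, NamedPattern]> }; // keys simplified to strings
type NamedPattern =
  | { t: "named"; name: string; pattern: SimplePattern } | SimplePattern | CompoundPattern;

// Hypothetical encoding of the abstract host-language types.
type Field =
  | "unit" | "any" | "embedded" | { atom: string } | { ref: string }
  | { array: Field } | { set: Field } | { map: [Field, Field] };
type NamedField = [string, Field];
type Simple = Field | { rec: NamedField[] };

function field(p: SimplePattern): Field {
  switch (p.t) {
    case "any": return "any";
    case "atom": return { atom: p.kind };
    case "embedded": return "embedded";
    case "lit": return "unit"; // literals carry no runtime information
    case "seqof": return { array: field(p.item) };
    case "setof": return { set: field(p.item) };
    case "dictof": return { map: [field(p.key), field(p.value)] };
    case "ref": return { ref: p.name };
  }
}

function gatherAll(ps: NamedPattern[]): NamedField[] {
  return ([] as NamedField[]).concat(...ps.map((p) => gather(p)));
}

function gather(p: NamedPattern): NamedField[] {
  switch (p.t) {
    case "named": {
      const f = field(p.pattern);
      return f === "unit" ? [] : [[p.name, f]]; // unit-valued fields are omitted
    }
    case "rec": return gatherAll([p.label, p.fields]);
    case "tuple": return gatherAll(p.items);
    case "tuplePrefix": return gatherAll([...p.fixed, p.repeated]);
    case "dict": {
      // Assumption: string comparison stands in for Preserves term order.
      const sorted = [...p.entries].sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
      return gatherAll(sorted.map(([, f]) => f));
    }
    default: return []; // assumption: unnamed simple patterns bind nothing
  }
}

function product(ps: NamedPattern[]): Simple {
  const t = gatherAll(ps);
  return t.length === 0 ? "unit" : { rec: t };
}
```

Applied to a pattern standing for `<person @name string @birthday Date>`, `product` yields a record with the two fields `name` and `birthday`; the literal label contributes nothing.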

## Appendix: Metaschema

The metaschema defines the structure of the abstract syntax (AST) of
schemas, using the concrete DSL syntax described above.

The text below is taken from
[`schema/schema.prs`][schema.prs]
in the source code repository.

A `Bundle` collects a number of `Schema`s, each named by a
`ModulePath`:[^todo-semantics-of-bundles]

    Bundle = <bundle @modules Modules>.
    Modules = { ModulePath: Schema ...:... }.
    ModulePath = [symbol ...].

    Schema = <schema {
      version: Version
      embeddedType: EmbeddedTypeName
      definitions: Definitions
    }>.

A `Version` names the version of the schema language in use. At
present, it must be `1`.

    # version 1 .
    Version = 1 .

An `EmbeddedTypeName` specifies the type of embedded values within
values parsed by a given schema:

    EmbeddedTypeName = #f / Ref .
    Ref = <ref @module ModulePath @name symbol>.

The `Definitions` are a named collection of definitions within a
schema. Note the special mention of `pattern0` and `pattern1`: these
ensure that each `or` or `and` record has at least two members.

    Definitions = { symbol: Definition ...:... }.

    Definition =
      # Pattern / Pattern / ...
      / <or [@pattern0 NamedAlternative
             @pattern1 NamedAlternative
             @patternN NamedAlternative ...]>

      # Pattern & Pattern & ...
      / <and [@pattern0 NamedPattern
              @pattern1 NamedPattern
              @patternN NamedPattern ...]>

      # Pattern
      / Pattern
    .

    NamedAlternative = [@variantLabel string @pattern Pattern].
|
|
|
|
|
|
|
|
|
|
Each `Pattern` is either a simple or compound pattern:
|
|
|
|
|
|
|
|
|
|
Pattern = SimplePattern / CompoundPattern .
|
|
|
|
|
|
|
|
|
|
Simple patterns are as described above:
|
|
|
|
|
|
|
|
|
|
SimplePattern =
|
2023-10-15 13:11:27 +00:00
|
|
|
|
# any
|
2021-05-25 12:11:33 +00:00
|
|
|
|
/ =any
|
|
|
|
|
|
2023-10-15 13:11:27 +00:00
|
|
|
|
# special builtins: bool, float, double, int, string, bytes, symbol
|
2021-05-25 12:11:33 +00:00
|
|
|
|
/ <atom @atomKind AtomKind>
|
|
|
|
|
|
2023-10-15 13:11:27 +00:00
|
|
|
|
# matches an embedded value in the input: #!p
|
2021-06-01 14:10:04 +00:00
|
|
|
|
/ <embedded @interface SimplePattern>
|
2021-05-25 12:11:33 +00:00
|
|
|
|
|
2023-10-15 13:11:27 +00:00
|
|
|
|
# =symbol, <<lit> any>, or plain non-symbol atom
|
2021-05-25 12:11:33 +00:00
|
|
|
|
/ <lit @value any>
|
|
|
|
|
|
2023-10-15 13:11:27 +00:00
|
|
|
|
# [p ...] ----> <seqof <ref p>>; see also tuplePrefix below.
|
2021-05-25 12:11:33 +00:00
|
|
|
|
/ <seqof @pattern SimplePattern>
|
|
|
|
|
|
2023-10-15 13:11:27 +00:00
|
|
|
|
# #{p} ----> <setof <ref p>>
|
2021-05-25 12:11:33 +00:00
|
|
|
|
/ <setof @pattern SimplePattern>
|
|
|
|
|
|
2023-10-15 13:11:27 +00:00
|
|
|
|
# {k: v, ...:...} ----> <dictof <ref k> <ref v>>
|
2021-05-25 12:11:33 +00:00
|
|
|
|
/ <dictof @key SimplePattern @value SimplePattern>
|
|
|
|
|
|
2023-10-15 13:11:27 +00:00
|
|
|
|
# symbol, symbol.symbol, symbol.symbol.symbol, ...
|
2021-05-25 12:11:33 +00:00
|
|
|
|
/ Ref
|
|
|
|
|
.
|
|
|
|
|
|
|
|
|
|
AtomKind = =Boolean
|
|
|
|
|
/ =Float
|
|
|
|
|
/ =Double
|
|
|
|
|
/ =SignedInteger
|
|
|
|
|
/ =String
|
|
|
|
|
/ =ByteString
|
|
|
|
|
/ =Symbol .
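
The builtin names listed in the comment on the `atom` alternative correspond, in order, to the seven `AtomKind` symbols. A small TypeScript sketch of that correspondence follows. The table `atomKindOf` and the helper `atomPattern` are hypothetical names introduced for illustration, and the AST node is rendered as text rather than as a real Preserves value.

```typescript
// The seven AtomKind symbols from the metaschema, modelled as string literals.
type AtomKind =
  | "Boolean" | "Float" | "Double" | "SignedInteger"
  | "String" | "ByteString" | "Symbol";

// Builtin name -> AtomKind, pairing the two lists in this section in order.
const atomKindOf: Record<string, AtomKind> = {
  bool: "Boolean",
  float: "Float",
  double: "Double",
  int: "SignedInteger",
  string: "String",
  bytes: "ByteString",
  symbol: "Symbol",
};

// Hypothetical helper: textual form of the `<atom ...>` node for a builtin
// name, or undefined if the name is not one of the seven builtins.
function atomPattern(keyword: string): string | undefined {
  // Own-property check so inherited names such as "toString" are rejected.
  if (!Object.prototype.hasOwnProperty.call(atomKindOf, keyword)) {
    return undefined;
  }
  return `<atom ${atomKindOf[keyword]}>`;
}
```

Any name outside this table is not an atom pattern and would be handled by some other alternative of `SimplePattern`, such as `Ref`.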
|
|
|
|
|
|
|
|
|
|
Compound patterns involve optionally-named subpatterns:
|
|
|
|
|
|
|
|
|
|
CompoundPattern =
|
2023-10-15 13:11:27 +00:00
|
|
|
|
# <label a b c> ----> <rec <lit label> <tuple [<ref a> <ref b> <ref c>]>>
|
|
|
|
|
# except for record labels
|
|
|
|
|
# <<rec> x y> ----> <rec <ref x> <ref y>>
|
2021-05-25 12:11:33 +00:00
|
|
|
|
/ <rec @label NamedPattern @fields NamedPattern>
|
|
|
|
|
|
2023-10-15 13:11:27 +00:00
|
|
|
|
# [a b c] ----> <tuple [<ref a> <ref b> <ref c>]>
|
2021-05-25 12:11:33 +00:00
|
|
|
|
/ <tuple @patterns [NamedPattern ...]>
|
|
|
|
|
|
2023-10-15 13:11:27 +00:00
|
|
|
|
# [a b c ...] ----> <tuplePrefix [<ref a> <ref b>] <seqof <ref c>>>
|
2021-06-25 07:45:07 +00:00
|
|
|
|
/ <tuplePrefix @fixed [NamedPattern ...] @variable NamedSimplePattern>
|
2021-05-25 12:11:33 +00:00
|
|
|
|
|
2023-10-15 13:11:27 +00:00
|
|
|
|
# {a: b, c: d} ----> <dict {a: <ref b>, c: <ref d>}>
|
2021-05-25 12:11:33 +00:00
|
|
|
|
/ <dict @entries DictionaryEntries>
|
|
|
|
|
.
|
|
|
|
|
|
|
|
|
|
DictionaryEntries = { any: NamedSimplePattern ...:... }.
|
|
|
|
|
|
|
|
|
|
Explicitly-named subpatterns are always `SimplePattern`s. However,
|
2021-05-25 21:01:16 +00:00
|
|
|
|
depending on context, if a name is omitted, the pattern may be a
|
2021-05-25 12:11:33 +00:00
|
|
|
|
`Pattern` or may be restricted to `SimplePattern` as well:
|
|
|
|
|
|
2021-06-25 08:25:26 +00:00
|
|
|
|
NamedSimplePattern = @named Binding / @anonymous SimplePattern .
|
|
|
|
|
NamedPattern = @named Binding / @anonymous Pattern .
|
|
|
|
|
Binding = <named @name symbol @pattern SimplePattern>.
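
To make the named/anonymous distinction concrete, the following TypeScript sketch extracts the binding names from a list of `NamedPattern`s, skipping anonymous entries. The types are simplified local mirrors of the generated ones in the appendix (symbols are modelled as strings, and the pattern types are opaque). `bindingName` and `fieldNames` are hypothetical helpers.

```typescript
// Simplified local mirrors of the generated types.
type SimplePattern = unknown;
type Pattern = unknown;
type Binding = { name: string; pattern: SimplePattern };
type NamedPattern =
  | { _variant: "named"; value: Binding }
  | { _variant: "anonymous"; value: Pattern };

// Hypothetical helper: the binding name, or undefined for anonymous patterns.
function bindingName(p: NamedPattern): string | undefined {
  return p._variant === "named" ? p.value.name : undefined;
}

// Hypothetical helper: names of all named subpatterns, in order.
function fieldNames(ps: NamedPattern[]): string[] {
  const names: string[] = [];
  ps.forEach((p) => {
    const n = bindingName(p);
    if (n !== undefined) names.push(n);
  });
  return names;
}

// Two named fields with one anonymous subpattern between them.
const fields: NamedPattern[] = [
  { _variant: "named", value: { name: "name", pattern: null } },
  { _variant: "anonymous", value: null },
  { _variant: "named", value: { name: "pattern", pattern: null } },
];
```

A code generator could use a helper like this to decide which subpatterns become fields of a host-language record.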
|
2021-05-25 12:11:33 +00:00
|
|
|
|
|
|
|
|
|
[^todo-semantics-of-bundles]: The semantics of module path references
|
|
|
|
|
remain to be specified!
|
|
|
|
|
|
|
|
|
|
## Appendix: Metaschema instance
|
|
|
|
|
|
|
|
|
|
The following is a (lightly-reformatted) Preserves document which is
|
|
|
|
|
the output of DSL-to-AST compilation of the DSL source text of the
|
|
|
|
|
metaschema.
|
|
|
|
|
|
|
|
|
|
<schema {
|
|
|
|
|
version: 1,
|
|
|
|
|
embeddedType: #f,
|
|
|
|
|
definitions: {
|
|
|
|
|
|
|
|
|
|
Pattern: <or [
|
|
|
|
|
["SimplePattern", <ref [] SimplePattern>],
|
|
|
|
|
["CompoundPattern", <ref [] CompoundPattern>]
|
|
|
|
|
]>,
|
|
|
|
|
|
|
|
|
|
CompoundPattern: <or [
|
|
|
|
|
["rec", <rec <lit rec> <tuple [
|
|
|
|
|
<named label <ref [] NamedPattern>>,
|
|
|
|
|
<named fields <ref [] NamedPattern>>
|
|
|
|
|
]>>],
|
|
|
|
|
["tuple", <rec <lit tuple> <tuple [<named patterns <seqof <ref [] NamedPattern>>>]>>],
|
2021-06-25 07:45:07 +00:00
|
|
|
|
["tuplePrefix", <rec <lit tuplePrefix> <tuple [
|
2021-05-25 12:11:33 +00:00
|
|
|
|
<named fixed <seqof <ref [] NamedPattern>>>,
|
|
|
|
|
<named variable <ref [] NamedSimplePattern>>
|
|
|
|
|
]>>],
|
|
|
|
|
["dict", <rec <lit dict> <tuple [<named entries <ref [] DictionaryEntries>>]>>]
|
|
|
|
|
]>,
|
|
|
|
|
|
|
|
|
|
Modules: <dictof <ref [] ModulePath> <ref [] Schema>>,
|
|
|
|
|
|
|
|
|
|
Ref: <rec <lit ref> <tuple [
|
|
|
|
|
<named module <ref [] ModulePath>>,
|
|
|
|
|
<named name <atom Symbol>>
|
|
|
|
|
]>>,
|
|
|
|
|
|
|
|
|
|
Bundle: <rec <lit bundle> <tuple [<named modules <ref [] Modules>>]>>,
|
|
|
|
|
|
2021-06-25 08:25:26 +00:00
|
|
|
|
Binding: <rec <lit named> <tuple [
|
2021-05-25 12:11:33 +00:00
|
|
|
|
<named name <atom Symbol>>,
|
|
|
|
|
<named pattern <ref [] SimplePattern>>
|
|
|
|
|
]>>,
|
|
|
|
|
|
|
|
|
|
Definition: <or [
|
2021-06-25 07:45:07 +00:00
|
|
|
|
["or", <rec <lit or> <tuple [<tuplePrefix [
|
2021-05-25 12:11:33 +00:00
|
|
|
|
<named pattern0 <ref [] NamedAlternative>>,
|
|
|
|
|
<named pattern1 <ref [] NamedAlternative>>
|
|
|
|
|
] <named patternN <seqof <ref [] NamedAlternative>>>>]>>],
|
2021-06-25 07:45:07 +00:00
|
|
|
|
["and", <rec <lit and> <tuple [<tuplePrefix [
|
2021-05-25 12:11:33 +00:00
|
|
|
|
<named pattern0 <ref [] NamedPattern>>,
|
|
|
|
|
<named pattern1 <ref [] NamedPattern>>
|
|
|
|
|
] <named patternN <seqof <ref [] NamedPattern>>>>]>>],
|
|
|
|
|
["Pattern", <ref [] Pattern>]
|
|
|
|
|
]>,
|
|
|
|
|
|
|
|
|
|
NamedSimplePattern: <or [
|
2021-06-25 08:25:26 +00:00
|
|
|
|
["named", <ref [] Binding>],
|
2021-05-25 12:11:33 +00:00
|
|
|
|
["anonymous", <ref [] SimplePattern>]
|
|
|
|
|
]>,
|
|
|
|
|
|
|
|
|
|
EmbeddedTypeName: <or [
|
2023-02-07 09:35:03 +00:00
|
|
|
|
["false", <lit #f>],
|
|
|
|
|
["Ref", <ref [] Ref>]
|
2021-05-25 12:11:33 +00:00
|
|
|
|
]>,
|
|
|
|
|
|
|
|
|
|
ModulePath: <seqof <atom Symbol>>,
|
|
|
|
|
|
|
|
|
|
AtomKind: <or [
|
|
|
|
|
["Boolean", <lit Boolean>],
|
|
|
|
|
["Float", <lit Float>],
|
|
|
|
|
["Double", <lit Double>],
|
|
|
|
|
["SignedInteger", <lit SignedInteger>],
|
|
|
|
|
["String", <lit String>],
|
|
|
|
|
["ByteString", <lit ByteString>],
|
|
|
|
|
["Symbol", <lit Symbol>]
|
|
|
|
|
]>,
|
|
|
|
|
|
|
|
|
|
DictionaryEntries: <dictof any <ref [] NamedSimplePattern>>,
|
|
|
|
|
|
|
|
|
|
Version: <lit 1>,
|
|
|
|
|
|
|
|
|
|
NamedPattern: <or [
|
2021-06-25 08:25:26 +00:00
|
|
|
|
["named", <ref [] Binding>],
|
2021-05-25 12:11:33 +00:00
|
|
|
|
["anonymous", <ref [] Pattern>]
|
|
|
|
|
]>,
|
|
|
|
|
|
|
|
|
|
SimplePattern: <or [
|
|
|
|
|
["any", <lit any>],
|
|
|
|
|
["atom", <rec <lit atom> <tuple [<named atomKind <ref [] AtomKind>>]>>],
|
2021-06-01 14:10:04 +00:00
|
|
|
|
["embedded", <rec <lit embedded> <tuple [<named interface <ref [] SimplePattern>>]>>],
|
2021-05-25 12:11:33 +00:00
|
|
|
|
["lit", <rec <lit lit> <tuple [<named value any>]>>],
|
|
|
|
|
["seqof", <rec <lit seqof> <tuple [<named pattern <ref [] SimplePattern>>]>>],
|
|
|
|
|
["setof", <rec <lit setof> <tuple [<named pattern <ref [] SimplePattern>>]>>],
|
|
|
|
|
["dictof", <rec <lit dictof> <tuple [
|
|
|
|
|
<named key <ref [] SimplePattern>>,
|
|
|
|
|
<named value <ref [] SimplePattern>>
|
|
|
|
|
]>>],
|
|
|
|
|
["Ref", <ref [] Ref>]
|
|
|
|
|
]>,
|
|
|
|
|
|
|
|
|
|
NamedAlternative: <tuple [
|
|
|
|
|
<named variantLabel <atom String>>,
|
|
|
|
|
<named pattern <ref [] Pattern>>
|
|
|
|
|
]>,
|
|
|
|
|
|
|
|
|
|
Definitions: <dictof <atom Symbol> <ref [] Definition>>,
|
|
|
|
|
|
|
|
|
|
Schema: <rec <lit schema> <tuple [<dict {
|
|
|
|
|
version: <named version <ref [] Version>>,
|
|
|
|
|
embeddedType: <named embeddedType <ref [] EmbeddedTypeName>>,
|
|
|
|
|
definitions: <named definitions <ref [] Definitions>>
|
|
|
|
|
}>]>>
|
|
|
|
|
}
|
|
|
|
|
}>
|
|
|
|
|
|
|
|
|
|
## Appendix: Example generated types
|
|
|
|
|
|
|
|
|
|
The following are the (abridged) TypeScript and Racket generated type
|
|
|
|
|
definitions for the metaschema.
|
|
|
|
|
|
|
|
|
|
### TypeScript.
|
|
|
|
|
|
|
|
|
|
import * as _ from "@preserves/core";
|
|
|
|
|
|
|
|
|
|
// ...
|
|
|
|
|
export type _embedded = any;
|
|
|
|
|
export type _val = _.Value<_embedded>;
|
|
|
|
|
// ...
|
|
|
|
|
|
|
|
|
|
export type Bundle = {"modules": Modules};
|
|
|
|
|
|
|
|
|
|
export type Modules = _.KeyedDictionary<ModulePath, Schema, _embedded>;
|
|
|
|
|
|
|
|
|
|
export type Schema = {
|
|
|
|
|
"version": Version,
|
|
|
|
|
"embeddedType": EmbeddedTypeName,
|
|
|
|
|
"definitions": Definitions
|
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
export type Version = null;
|
|
|
|
|
|
2023-02-07 09:35:03 +00:00
|
|
|
|
export type EmbeddedTypeName = ({"_variant": "false"} | {"_variant": "Ref", "value": Ref});
|
2021-05-25 12:11:33 +00:00
|
|
|
|
|
|
|
|
|
export type Definitions = _.KeyedDictionary<symbol, Definition, _embedded>;
|
|
|
|
|
|
|
|
|
|
export type Definition = (
|
|
|
|
|
{
|
|
|
|
|
"_variant": "or",
|
|
|
|
|
"pattern0": NamedAlternative,
|
|
|
|
|
"pattern1": NamedAlternative,
|
|
|
|
|
"patternN": Array<NamedAlternative>
|
|
|
|
|
} |
|
|
|
|
|
{
|
|
|
|
|
"_variant": "and",
|
|
|
|
|
"pattern0": NamedPattern,
|
|
|
|
|
"pattern1": NamedPattern,
|
|
|
|
|
"patternN": Array<NamedPattern>
|
|
|
|
|
} |
|
|
|
|
|
{"_variant": "Pattern", "value": Pattern}
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
export type Pattern = (
|
|
|
|
|
{"_variant": "SimplePattern", "value": SimplePattern} |
|
|
|
|
|
{"_variant": "CompoundPattern", "value": CompoundPattern}
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
export type SimplePattern = (
|
|
|
|
|
{"_variant": "any"} |
|
|
|
|
|
{"_variant": "atom", "atomKind": AtomKind} |
|
2021-06-01 14:10:04 +00:00
|
|
|
|
{"_variant": "embedded", "interface": SimplePattern} |
|
2021-05-25 12:11:33 +00:00
|
|
|
|
{"_variant": "lit", "value": _val} |
|
|
|
|
|
{"_variant": "seqof", "pattern": SimplePattern} |
|
|
|
|
|
{"_variant": "setof", "pattern": SimplePattern} |
|
|
|
|
|
{"_variant": "dictof", "key": SimplePattern, "value": SimplePattern} |
|
|
|
|
|
{"_variant": "Ref", "value": Ref}
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
export type CompoundPattern = (
|
|
|
|
|
{"_variant": "rec", "label": NamedPattern, "fields": NamedPattern} |
|
|
|
|
|
{"_variant": "tuple", "patterns": Array<NamedPattern>} |
|
|
|
|
|
{
|
2021-06-25 07:45:07 +00:00
|
|
|
|
"_variant": "tuplePrefix",
|
2021-05-25 12:11:33 +00:00
|
|
|
|
"fixed": Array<NamedPattern>,
|
|
|
|
|
"variable": NamedSimplePattern
|
|
|
|
|
} |
|
|
|
|
|
{"_variant": "dict", "entries": DictionaryEntries}
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
export type DictionaryEntries = _.KeyedDictionary<_val, NamedSimplePattern, _embedded>;
|
|
|
|
|
|
|
|
|
|
export type AtomKind = (
|
|
|
|
|
{"_variant": "Boolean"} |
|
|
|
|
|
{"_variant": "Float"} |
|
|
|
|
|
{"_variant": "Double"} |
|
|
|
|
|
{"_variant": "SignedInteger"} |
|
|
|
|
|
{"_variant": "String"} |
|
|
|
|
|
{"_variant": "ByteString"} |
|
|
|
|
|
{"_variant": "Symbol"}
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
export type NamedAlternative = {"variantLabel": string, "pattern": Pattern};
|
|
|
|
|
|
|
|
|
|
export type NamedSimplePattern = (
|
2021-06-25 08:25:26 +00:00
|
|
|
|
{"_variant": "named", "value": Binding} |
|
2021-05-25 12:11:33 +00:00
|
|
|
|
{"_variant": "anonymous", "value": SimplePattern}
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
export type NamedPattern = (
|
2021-06-25 08:25:26 +00:00
|
|
|
|
{"_variant": "named", "value": Binding} |
|
2021-05-25 12:11:33 +00:00
|
|
|
|
{"_variant": "anonymous", "value": Pattern}
|
|
|
|
|
);
|
|
|
|
|
|
2021-06-25 08:25:26 +00:00
|
|
|
|
export type Binding = {"name": symbol, "pattern": SimplePattern};
|
2021-05-25 12:11:33 +00:00
|
|
|
|
|
|
|
|
|
export type Ref = {"module": ModulePath, "name": symbol};
|
|
|
|
|
|
|
|
|
|
export type ModulePath = Array<symbol>;
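
As a usage illustration, the `_variant` discriminator lets TypeScript check that a traversal handles every case. The sketch below collects the `Ref`s mentioned inside a `SimplePattern`. It redeclares simplified versions of the types above so that it stands alone (symbols and literal values are not modelled faithfully), and `refsIn` is a hypothetical helper rather than part of the generated module.

```typescript
// Simplified, self-contained mirrors of the generated types above.
type Ref = { module: string[]; name: string };
type SimplePattern =
  | { "_variant": "any" }
  | { "_variant": "atom"; "atomKind": string }
  | { "_variant": "embedded"; "interface": SimplePattern }
  | { "_variant": "lit"; "value": unknown }
  | { "_variant": "seqof"; "pattern": SimplePattern }
  | { "_variant": "setof"; "pattern": SimplePattern }
  | { "_variant": "dictof"; "key": SimplePattern; "value": SimplePattern }
  | { "_variant": "Ref"; "value": Ref };

// Hypothetical helper: every Ref reachable from a SimplePattern.
// The switch is exhaustive, so the compiler flags any unhandled variant.
function refsIn(p: SimplePattern): Ref[] {
  switch (p._variant) {
    case "any":
    case "atom":
    case "lit":
      return [];
    case "embedded":
      return refsIn(p.interface);
    case "seqof":
    case "setof":
      return refsIn(p.pattern);
    case "dictof":
      return refsIn(p.key).concat(refsIn(p.value));
    case "Ref":
      return [p.value];
  }
}

// The metaschema's `Definitions`: <dictof <atom Symbol> <ref [] Definition>>
const definitions: SimplePattern = {
  "_variant": "dictof",
  "key": { "_variant": "atom", "atomKind": "Symbol" },
  "value": { "_variant": "Ref", "value": { module: [], name: "Definition" } },
};
```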
|
|
|
|
|
|
|
|
|
|
### Racket.
|
|
|
|
|
|
|
|
|
|
(struct AtomKind-Symbol () #:prefab)
|
|
|
|
|
(struct AtomKind-ByteString () #:prefab)
|
|
|
|
|
(struct AtomKind-String () #:prefab)
|
|
|
|
|
(struct AtomKind-SignedInteger () #:prefab)
|
|
|
|
|
(struct AtomKind-Double () #:prefab)
|
|
|
|
|
(struct AtomKind-Float () #:prefab)
|
|
|
|
|
(struct AtomKind-Boolean () #:prefab)
|
|
|
|
|
|
|
|
|
|
(struct Bundle (modules) #:prefab)
|
|
|
|
|
|
|
|
|
|
(struct CompoundPattern-dict (entries) #:prefab)
|
2021-06-25 07:45:07 +00:00
|
|
|
|
(struct CompoundPattern-tuplePrefix (fixed variable) #:prefab)
|
2021-05-25 12:11:33 +00:00
|
|
|
|
(struct CompoundPattern-tuple (patterns) #:prefab)
|
|
|
|
|
(struct CompoundPattern-rec (label fields) #:prefab)
|
|
|
|
|
|
|
|
|
|
(struct Definition-Pattern (value) #:prefab)
|
|
|
|
|
(struct Definition-and (pattern0 pattern1 patternN) #:prefab)
|
|
|
|
|
(struct Definition-or (pattern0 pattern1 patternN) #:prefab)
|
|
|
|
|
|
|
|
|
|
(struct EmbeddedTypeName-false () #:prefab)
|
|
|
|
|
(struct EmbeddedTypeName-Ref (value) #:prefab)
|
|
|
|
|
|
|
|
|
|
(struct NamedAlternative (variantLabel pattern) #:prefab)
|
|
|
|
|
|
|
|
|
|
(struct NamedPattern-anonymous (value) #:prefab)
|
|
|
|
|
(struct NamedPattern-named (value) #:prefab)
|
|
|
|
|
|
|
|
|
|
(struct NamedSimplePattern-anonymous (value) #:prefab)
|
|
|
|
|
(struct NamedSimplePattern-named (value) #:prefab)
|
|
|
|
|
|
2021-06-25 08:25:26 +00:00
|
|
|
|
(struct Binding (name pattern) #:prefab)
|
2021-05-25 12:11:33 +00:00
|
|
|
|
|
|
|
|
|
(struct Pattern-CompoundPattern (value) #:prefab)
|
|
|
|
|
(struct Pattern-SimplePattern (value) #:prefab)
|
|
|
|
|
|
|
|
|
|
(struct Ref (module name) #:prefab)
|
|
|
|
|
|
|
|
|
|
(struct Schema (definitions embeddedType version) #:prefab)
|
|
|
|
|
|
|
|
|
|
(struct SimplePattern-Ref (value) #:prefab)
|
|
|
|
|
(struct SimplePattern-dictof (key value) #:prefab)
|
|
|
|
|
(struct SimplePattern-setof (pattern) #:prefab)
|
|
|
|
|
(struct SimplePattern-seqof (pattern) #:prefab)
|
|
|
|
|
(struct SimplePattern-lit (value) #:prefab)
|
2021-06-01 14:10:04 +00:00
|
|
|
|
(struct SimplePattern-embedded (interface) #:prefab)
|
2021-05-25 12:11:33 +00:00
|
|
|
|
(struct SimplePattern-atom (atomKind) #:prefab)
|
|
|
|
|
(struct SimplePattern-any () #:prefab)
|
|
|
|
|
|
|
|
|
|
## Appendix: Future work
|
|
|
|
|
|
|
|
|
|
- There are side conditions on AST instances. It would be nice to
|
|
|
|
|
eventually be able to express these within the metaschema.
|
|
|
|
|
|
2021-05-25 12:37:44 +00:00
|
|
|
|
- It'd be interesting to,
|
|
|
|
|
[OMeta](https://en.wikipedia.org/wiki/OMeta)-like, be able to
|
|
|
|
|
specify the DSL-to-AST translation process as a schema. One
|
|
|
|
|
challenge in doing so is the way schemas are required to be
|
|
|
|
|
*reversible* at present.
|
|
|
|
|
|
2021-05-25 18:13:18 +00:00
|
|
|
|
- Should `include` accept URLs, to be able to retrieve schemas from
|
|
|
|
|
the web?
|
|
|
|
|
|
2021-06-01 14:10:04 +00:00
|
|
|
|
- It'd be nice to firm up the interpretation of embedded interface
|
|
|
|
|
schemas. I have in mind something like the
|
|
|
|
|
[higher-order contracts of Dimoulas](https://www2.ccs.neu.edu/racket/pubs/dissertation-dimoulas.pdf).
|
|
|
|
|
Essentially, a schema *is* a contract, and embedded
|
|
|
|
|
pointers-to-behaviour are like closures/channels/objects/etc, which
|
|
|
|
|
demand higher-order contracts. Future work could pin this down
|
|
|
|
|
further; also, consideration of *dependent* schemas (analogous to
|
|
|
|
|
dependent contracts) could be of interest.
|
|
|
|
|
|
|
|
|
|
**Example.** In the following fragment, `#!Session` is the handle a
|
|
|
|
|
connected user uses to interact with a chatroom. In the
|
|
|
|
|
implementation, `Says` messages are dropped if their `who` doesn't
|
|
|
|
|
match the `uid` supplied in the `Join` assertion. It'd be nice to
|
|
|
|
|
capture that using a dependent schema, passing in the specific
|
|
|
|
|
`uid` value to the `Session` constructor, something like
|
|
|
|
|
`#!(Session uid)`.
|
|
|
|
|
|
|
|
|
|
Join = <joinedUser @uid UserId @handle #!Session>.
|
|
|
|
|
Session = @observeSpeech <Observe =says @observer #!Says> / Says .
|
|
|
|
|
Says = <says @who UserId @what string>.
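
The runtime check that a dependent schema would replace can be sketched in TypeScript as below. This is a minimal illustration under stated assumptions, not the chatroom implementation referred to above: `UserId` is assumed to be a string, and `makeSession` is a hypothetical factory that closes over the `uid` from the `Join` assertion.

```typescript
// Assumption: the schema fragment does not define UserId; a string is used here.
type UserId = string;
type Says = { who: UserId; what: string };

// Hypothetical session factory. The returned handle forwards a `Says` message
// only when its `who` matches the uid the session was created for.
function makeSession(
  uid: UserId,
  deliver: (msg: Says) => void
): (msg: Says) => void {
  return (msg) => {
    if (msg.who === uid) {
      deliver(msg);
    }
    // Otherwise the message is dropped without notifying the sender.
  };
}

const heard: string[] = [];
const session = makeSession("alice", (m) => { heard.push(m.what); });
session({ who: "alice", what: "hello" });
session({ who: "mallory", what: "spoofed" });
```

With a dependent schema along the lines of `#!(Session uid)`, the constraint enforced by the `if` would be part of the session's type instead of being a check buried in the handler.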
|
|
|
|
|
|
|
|
|
|
|
2021-05-25 12:11:33 +00:00
|
|
|
|
<!-- Heading to visually offset the footnotes from the main document: -->
|
|
|
|
|
## Notes
|