Merge branch 'main' into comment-syntax-hash-space
commit
2445ab4a5a
37
README.md
|
@ -36,23 +36,30 @@ automatic, perfect-fidelity conversion between syntaxes.
|
|||
|
||||
## Implementations
|
||||
|
||||
Implementations of the data model, plus the textual and/or binary transfer syntaxes:
|
||||
#### Implementations of the data model, plus Preserves textual and binary transfer syntax
|
||||
|
||||
- [Preserves for Nim](https://git.syndicate-lang.org/ehmry/preserves-nim)
|
||||
- [Preserves for Python]({{page.projecttree}}/implementations/python/) ([`pip install preserves`](https://pypi.org/project/preserves/); [documentation available online](python/latest/))
|
||||
- [Preserves for Racket]({{page.projecttree}}/implementations/racket/preserves/) ([`raco pkg install preserves`](https://pkgs.racket-lang.org/package/preserves))
|
||||
- [Preserves for Rust]({{page.projecttree}}/implementations/rust/) ([crates.io package](https://crates.io/crates/preserves))
|
||||
- [Preserves for Squeak Smalltalk](https://squeaksource.com/Preserves.html) (`Installer ss project: 'Preserves'; install: 'Preserves'`)
|
||||
- [Preserves for TypeScript and JavaScript]({{page.projecttree}}/implementations/javascript/) ([`yarn add @preserves/core`](https://www.npmjs.com/package/@preserves/core))
|
||||
- (Pre-alpha) Preserves for [C]({{page.projecttree}}/implementations/c/) and [C++]({{page.projecttree}}/implementations/cpp/)
|
||||
| Language[^pre-alpha-implementations] | Code | Package | Docs |
|-----------------------|------------------------------------------------------------------------------|--------------------------------------------------------------------------------|-------------------------------------------|
| Nim | [git.syndicate-lang.org](https://git.syndicate-lang.org/ehmry/preserves-nim) | | |
| Python | [preserves.dev]({{page.projecttree}}/implementations/python/) | [`pip install preserves`](https://pypi.org/project/preserves/) | [docs](python/latest/) |
| Racket | [preserves.dev]({{page.projecttree}}/implementations/racket/preserves/) | [`raco pkg install preserves`](https://pkgs.racket-lang.org/package/preserves) | |
| Rust | [preserves.dev]({{page.projecttree}}/implementations/rust/) | [`cargo add preserves`](https://crates.io/crates/preserves) | [docs](https://docs.rs/preserves/latest/) |
| Squeak Smalltalk | [SqueakSource](https://squeaksource.com/Preserves.html) | `Installer ss project: 'Preserves';`<br>` install: 'Preserves'` | |
| TypeScript/JavaScript | [preserves.dev]({{page.projecttree}}/implementations/javascript/) | [`yarn add @preserves/core`](https://www.npmjs.com/package/@preserves/core) | |
|
||||
|
||||
Implementations of the data model, plus Syrup transfer syntax:
|
||||
[^pre-alpha-implementations]: Pre-alpha implementations also exist for
|
||||
[C]({{page.projecttree}}/implementations/c/) and
|
||||
[C++]({{page.projecttree}}/implementations/cpp/).
|
||||
|
||||
- [Syrup for Racket](https://github.com/ocapn/syrup/blob/master/impls/racket/syrup/syrup.rkt)
|
||||
- [Syrup for Guile](https://github.com/ocapn/syrup/blob/master/impls/guile/syrup.scm)
|
||||
- [Syrup for Python](https://github.com/ocapn/syrup/blob/master/impls/python/syrup.py)
|
||||
- [Syrup for JavaScript](https://github.com/zarutian/agoric-sdk/blob/zarutian/captp_variant/packages/captp/lib/syrup.js)
|
||||
- [Syrup for Haskell](https://github.com/zenhack/haskell-preserves)
|
||||
#### Implementations of the data model, plus Syrup transfer syntax
|
||||
|
||||
| Language | Code |
|------------|----------------------------------------------------------------------------------------------------------------------------------|
| Guile | [github.com/ocapn/syrup](https://github.com/ocapn/syrup/blob/master/impls/guile/syrup.scm) |
| Haskell | [github.com/zenhack/haskell-preserves](https://github.com/zenhack/haskell-preserves) |
| JavaScript | [github.com/zarutian/agoric-sdk](https://github.com/zarutian/agoric-sdk/blob/zarutian/captp_variant/packages/captp/lib/syrup.js) |
| Python | [github.com/ocapn/syrup](https://github.com/ocapn/syrup/blob/master/impls/python/syrup.py) |
| Racket | [github.com/ocapn/syrup](https://github.com/ocapn/syrup/blob/master/impls/racket/syrup/syrup.rkt) |
|
||||
|
||||
## Tools
|
||||
|
||||
|
@ -81,3 +88,5 @@ The contents of this repository are made available to you under the
|
|||
[Apache License, version 2.0](LICENSE)
|
||||
(<http://www.apache.org/licenses/LICENSE-2.0>), and are Copyright
|
||||
2018-2022 Tony Garnock-Jones.
|
||||
|
||||
## Notes
|
||||
|
|
|
@ -14,4 +14,4 @@ defaults:
|
|||
|
||||
title: "Preserves"
|
||||
version_date: "October 2023"
|
||||
version: "0.990.0"
|
||||
version: "0.990.1"
|
||||
|
|
|
@ -0,0 +1,33 @@
|
|||
For a value `V`, we write `«V»` for the binary encoding of `V`.

```text
                      «#f» = [0x80]
                      «#t» = [0x81]

                    «@W V» = [0x85] ++ «W» ++ «V»
                     «#!V» = [0x86] ++ «V»

          «V» if V ∈ Float = [0x87, 0x04] ++ binary32(V)
         «V» if V ∈ Double = [0x87, 0x08] ++ binary64(V)

  «V» if V ∈ SignedInteger = [0xB0] ++ varint(|intbytes(V)|) ++ intbytes(V)
         «V» if V ∈ String = [0xB1] ++ varint(|utf8(V)|) ++ utf8(V)
     «V» if V ∈ ByteString = [0xB2] ++ varint(|V|) ++ V
         «V» if V ∈ Symbol = [0xB3] ++ varint(|utf8(V)|) ++ utf8(V)

           «<L F_1...F_m>» = [0xB4] ++ «L» ++ «F_1» ++...++ «F_m» ++ [0x84]
             «[X_1...X_m]» = [0xB5] ++ «X_1» ++...++ «X_m» ++ [0x84]
            «#{E_1...E_m}» = [0xB6] ++ «E_1» ++...++ «E_m» ++ [0x84]
     «{K_1:V_1...K_m:V_m}» = [0xB7] ++ «K_1» ++ «V_1» ++...++ «K_m» ++ «V_m» ++ [0x84]

                 varint(n) = [n] if n < 128
                             [(n & 127) | 128] ++ varint(n >> 7) if n ≥ 128

               intbytes(n) = the empty sequence if n = 0, otherwise signedBigEndian(n)

        signedBigEndian(n) = [n & 255] if -128 ≤ n ≤ 127
                             signedBigEndian(n >> 8) ++ [n & 255] otherwise
```

The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
8-byte IEEE 754 binary representations of `F` and `D`, respectively.
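
The `varint`, `intbytes`, and `signedBigEndian` rules can be sketched in Rust roughly as follows. This is an illustration only, not the `preserves` crate's implementation: the function names are hypothetical, and `i64` stands in for the arbitrary-precision `SignedInteger` of the data model.

```rust
/// varint(n): base-128 digits, least significant first, with the high bit
/// set on every byte except the last.
pub fn varint(mut n: usize) -> Vec<u8> {
    let mut out = Vec::new();
    while n >= 128 {
        out.push(((n & 127) | 128) as u8);
        n >>= 7;
    }
    out.push(n as u8);
    out
}

/// signedBigEndian(n): shortest big-endian two's-complement form of `n`.
fn signed_big_endian(n: i64) -> Vec<u8> {
    if (-128..=127).contains(&n) {
        vec![(n & 255) as u8]
    } else {
        // `>>` on i64 is an arithmetic shift, so the sign is preserved.
        let mut out = signed_big_endian(n >> 8);
        out.push((n & 255) as u8);
        out
    }
}

/// intbytes(n): the empty sequence for zero, otherwise signedBigEndian(n).
/// Assumption: i64 is used here; the data model itself is unbounded.
pub fn intbytes(n: i64) -> Vec<u8> {
    if n == 0 {
        Vec::new()
    } else {
        signed_big_endian(n)
    }
}

/// «V» for V ∈ SignedInteger: tag 0xB0, length prefix, then the bytes.
pub fn encode_signed_integer(n: i64) -> Vec<u8> {
    let body = intbytes(n);
    let mut out = vec![0xB0];
    out.extend(varint(body.len()));
    out.extend(body);
    out
}

/// «V» for V ∈ String: tag 0xB1, length prefix, then the UTF-8 bytes.
pub fn encode_string(s: &str) -> Vec<u8> {
    let mut out = vec![0xB1];
    out.extend(varint(s.len()));
    out.extend_from_slice(s.as_bytes());
    out
}
```

Under these definitions, `0` encodes as the two bytes `B0 00` and `-1` as `B0 01 FF`.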
|
|
@ -51,36 +51,3 @@ division](https://en.wikipedia.org/wiki/Euclidean_division); that is, if
|
|||
<span class="postcard-grammar binarysyntax">*n* = *dq* + *r*</span> and
|
||||
<span class="postcard-grammar binarysyntax">0 ≤ *r* < |d|</span>.
|
||||
-->
|
||||
|
||||
<!--
|
||||
For a value `V`, we write `«V»` for the binary encoding of `V`.
|
||||
|
||||
«#f» = [0x80]
|
||||
«#t» = [0x81]
|
||||
|
||||
«@W V» = [0x85] ++ «W» ++ «V»
|
||||
«#!V» = [0x86] ++ «V»
|
||||
|
||||
«V» if V ∈ Float = [0x87, 0x04] ++ binary32(V)
|
||||
«V» if V ∈ Double = [0x87, 0x08] ++ binary64(V)
|
||||
|
||||
«V» if V ∈ SignedInteger = [0xB0] ++ varint(|intbytes(V)|) ++ intbytes(V)
|
||||
«V» if V ∈ String = [0xB1] ++ varint(|utf8(V)|) ++ utf8(V)
|
||||
«V» if V ∈ ByteString = [0xB2] ++ varint(|V|) ++ V
|
||||
«V» if V ∈ Symbol = [0xB3] ++ varint(|utf8(V)|) ++ utf8(V)
|
||||
|
||||
«<L F_1...F_m>» = [0xB4] ++ «L» ++ «F_1» ++...++ «F_m» ++ [0x84]
|
||||
«[X_1...X_m]» = [0xB5] ++ «X_1» ++...++ «X_m» ++ [0x84]
|
||||
«#{E_1...E_m}» = [0xB6] ++ «E_1» ++...++ «E_m» ++ [0x84]
|
||||
«{K_1:V_1...K_m:V_m}» = [0xB7] ++ «K_1» ++ «V_1» ++...++ «K_m» ++ «V_m» ++ [0x84]
|
||||
|
||||
varint(v) = [v] if v < 128
|
||||
[(v & 0x7F) + 128] ++ varint(v >> 7) if v ≥ 128
|
||||
|
||||
The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
|
||||
8-byte IEEE 754 binary representations of `F` and `D`, respectively.
|
||||
|
||||
The function `intbytes(x)` is a big-endian two's-complement signed binary representation of
|
||||
`x`, taking exactly as many whole bytes as needed to unambiguously identify the value and its
|
||||
sign. In particular, `intbytes(0)` is the empty byte sequence.
|
||||
-->
|
||||
|
|
|
@ -0,0 +1,44 @@
|
|||
```text
     Document := Value ws
        Value := ws (Record | Collection | Atom | Embedded | Annotated)
   Collection := Sequence | Dictionary | Set
         Atom := Boolean | ByteString | String | QuotedSymbol | Symbol | Number
           ws := (space | tab | cr | lf | `,`)*

       Record := `<` Value+ ws `>`
     Sequence := `[` Value* ws `]`
   Dictionary := `{` (Value ws `:` Value)* ws `}`
          Set := `#{` Value* ws `}`

      Boolean := `#t` | `#f`
   ByteString := `#"` binchar* `"`
               | `#x"` (ws hex hex)* ws `"`
               | `#[` (ws base64char)* ws `]`
       String := `"` («any unicode scalar except `\` or `"`» | escaped | `\"`)* `"`
 QuotedSymbol := `|` («any unicode scalar except `\` or `|`» | escaped | `\|`)* `|`
       Symbol := (`A`..`Z` | `a`..`z` | `0`..`9` | sympunct | symuchar)+
       Number := Float | Double | SignedInteger
        Float := flt (`f`|`F`) | `#xf"` (ws hex hex)4 ws `"`
       Double := flt | `#xd"` (ws hex hex)8 ws `"`
SignedInteger := int

     Embedded := `#!` Value
    Annotated := Annotation Value
   Annotation := `@` Value | `;` «any unicode scalar except cr or lf»* (cr | lf)

      escaped := `\\` | `\/` | `\b` | `\f` | `\n` | `\r` | `\t` | `\u` hex hex hex hex
   binescaped := `\\` | `\/` | `\b` | `\f` | `\n` | `\r` | `\t` | `\x` hex hex
      binchar := «any scalar ≥32 and ≤126, except `\` or `"`» | binescaped | `\"`
   base64char := `A`..`Z` | `a`..`z` | `0`..`9` | `+` | `/` | `-` | `_` | `=`
     sympunct := `~` | `!` | `$` | `%` | `^` | `&` | `*` | `?`
               | `_` | `=` | `+` | `-` | `/` | `.`
     symuchar := «any scalar value ≥128 whose Unicode category is
                  Lu, Ll, Lt, Lm, Lo, Mn, Mc, Me, Nd, Nl, No, Pc,
                  Pd, Po, Sc, Sm, Sk, So, or Co»

          flt := int ( frac exp | frac | exp )
          int := (`-`|`+`)? (`0`..`9`)+
         frac := `.` (`0`..`9`)+
          exp := (`e`|`E`) (`-`|`+`)? (`0`..`9`)+
          hex := `A`..`F` | `a`..`f` | `0`..`9`
```
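
The `Number` productions can be illustrated with a small hand-written recognizer. The sketch below assumes that the sign in `int` and `exp` is optional (unsigned literals such as `1` appear elsewhere in the Preserves documentation) and ignores the `#xf"…"` and `#xd"…"` hexadecimal forms. `NumberKind` and `classify_number` are hypothetical names and are not part of any Preserves implementation.

```rust
#[derive(Debug, PartialEq)]
pub enum NumberKind {
    SignedInteger,
    Double,
    Float,
}

/// Consume one or more ASCII digits starting at `i`; None if there are none.
fn digits(b: &[u8], mut i: usize) -> Option<usize> {
    let start = i;
    while i < b.len() && b[i].is_ascii_digit() {
        i += 1;
    }
    if i > start {
        Some(i)
    } else {
        None
    }
}

/// Skip a single optional `-` or `+` at position `i`.
fn skip_sign(b: &[u8], i: usize) -> usize {
    if i < b.len() && (b[i] == b'-' || b[i] == b'+') {
        i + 1
    } else {
        i
    }
}

/// Classify a complete token as SignedInteger, Double or Float, following
/// `int`, `flt := int (frac exp | frac | exp)` and `Float := flt (f|F)`.
pub fn classify_number(token: &str) -> Option<NumberKind> {
    let b = token.as_bytes();
    let mut i = digits(b, skip_sign(b, 0))?; // int
    let mut is_flt = false;
    if i < b.len() && b[i] == b'.' {
        i = digits(b, i + 1)?; // frac
        is_flt = true;
    }
    if i < b.len() && (b[i] == b'e' || b[i] == b'E') {
        i = digits(b, skip_sign(b, i + 1))?; // exp
        is_flt = true;
    }
    match (is_flt, &b[i..]) {
        (false, []) => Some(NumberKind::SignedInteger),
        (true, []) => Some(NumberKind::Double),
        (true, [b'f']) | (true, [b'F']) => Some(NumberKind::Float),
        _ => None, // trailing characters: not a number under this grammar
    }
}
```

Note that a token such as `1f` is rejected here: the `f` suffix only applies to a `flt`, which requires a fraction or an exponent.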
|
|
@ -1,17 +1,18 @@
|
|||
Value = Atom
|
||||
| Compound
|
||||
| Embedded
|
||||
```text
|
||||
Value = Atom
|
||||
| Compound
|
||||
| Embedded
|
||||
|
||||
Atom = Boolean
|
||||
| Float
|
||||
| Double
|
||||
| SignedInteger
|
||||
| String
|
||||
| ByteString
|
||||
| Symbol
|
||||
|
||||
Compound = Record
|
||||
| Sequence
|
||||
| Set
|
||||
| Dictionary
|
||||
Atom = Boolean
|
||||
| Float
|
||||
| Double
|
||||
| SignedInteger
|
||||
| String
|
||||
| ByteString
|
||||
| Symbol
|
||||
|
||||
Compound = Record
|
||||
| Sequence
|
||||
| Set
|
||||
| Dictionary
|
||||
```
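
One way to picture this grammar in a host language is as a single sum type. The Rust enum below is a hypothetical sketch for illustration and is not the representation used by the `preserves` crate; `i64` and `Vec`-backed sets and dictionaries are simplifying assumptions.

```rust
/// Hypothetical in-memory model of a Preserves Value; `E` is the type of
/// embedded values.
#[derive(Debug, Clone, PartialEq)]
pub enum Value<E> {
    // Atom
    Boolean(bool),
    Float(f32),
    Double(f64),
    SignedInteger(i64), // simplification: the data model is unbounded
    String(String),
    ByteString(Vec<u8>),
    Symbol(String),
    // Compound
    Record { label: Box<Value<E>>, fields: Vec<Value<E>> },
    Sequence(Vec<Value<E>>),
    Set(Vec<Value<E>>), // simplification: uniqueness is not enforced
    Dictionary(Vec<(Value<E>, Value<E>)>),
    // Embedded
    Embedded(E),
}

/// The three top-level alternatives of the grammar.
#[derive(Debug, PartialEq)]
pub enum Class {
    Atom,
    Compound,
    Embedded,
}

impl<E> Value<E> {
    /// Report which top-level alternative this value belongs to.
    pub fn class(&self) -> Class {
        match self {
            Value::Boolean(_)
            | Value::Float(_)
            | Value::Double(_)
            | Value::SignedInteger(_)
            | Value::String(_)
            | Value::ByteString(_)
            | Value::Symbol(_) => Class::Atom,
            Value::Record { .. }
            | Value::Sequence(_)
            | Value::Set(_)
            | Value::Dictionary(_) => Class::Compound,
            Value::Embedded(_) => Class::Embedded,
        }
    }
}
```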
|
||||
|
|
|
@ -0,0 +1,11 @@
|
|||
---
|
||||
no_site_title: true
|
||||
title: "Preserves Quick Reference (Plaintext)"
|
||||
---
|
||||
|
||||
Tony Garnock-Jones <tonyg@leastfixedpoint.com>
|
||||
{{ site.version_date }}. Version {{ site.version }}.
|
||||
|
||||
{% include cheatsheet-binary-plaintext.md %}
|
||||
|
||||
{% include cheatsheet-text-plaintext.md %}
|
|
@ -14,7 +14,7 @@ inputs.
|
|||
You will usually not need to use the `preserves-schema-rs`
|
||||
command-line program. Instead, access the preserves-schema compiler
|
||||
API from your `build.rs`. The following example is taken from
|
||||
[`build.rs` for the `preserves-path` crate](https://gitlab.com/preserves/preserves/-/blob/18ac9168996026073ee16164fce108054b2a0ed7/implementations/rust/preserves-path/build.rs):
|
||||
[`build.rs` for the `preserves-path` crate](https://gitlab.com/preserves/preserves/-/blob/af5de5b836ffc51999db93797d1995ff677cf6f8/implementations/rust/preserves-path/build.rs):
|
||||
|
||||
use preserves_schema::compiler::*;
|
||||
|
||||
|
@ -30,14 +30,14 @@ API from your `build.rs`. The following example is taken from
|
|||
let mut c = CompilerConfig::new(gen_dir, "crate::schemas".to_owned());
|
||||
|
||||
let inputs = expand_inputs(&vec!["path.bin".to_owned()])?;
|
||||
c.load_schemas_and_bundles(&inputs)?;
|
||||
c.load_schemas_and_bundles(&inputs, &vec![])?;
|
||||
|
||||
compile(&c)
|
||||
}
|
||||
|
||||
This approach also requires an `include!` from your main, hand-written
|
||||
source tree. The following is a snippet from
|
||||
[`preserves-path/src/lib.rs`](https://gitlab.com/preserves/preserves/-/blob/18ac9168996026073ee16164fce108054b2a0ed7/implementations/rust/preserves-path/src/lib.rs):
|
||||
[`preserves-path/src/lib.rs`](https://gitlab.com/preserves/preserves/-/blob/af5de5b836ffc51999db93797d1995ff677cf6f8/implementations/rust/preserves-path/src/lib.rs):
|
||||
|
||||
pub mod schemas {
|
||||
include!(concat!(env!("OUT_DIR"), "/src/schemas/mod.rs"));
|
||||
|
@ -52,20 +52,23 @@ Then, `cargo install preserves-schema`.
|
|||
|
||||
## Usage
|
||||
|
||||
preserves-schema 1.0.0
|
||||
preserves-schema 3.990.2
|
||||
|
||||
USAGE:
|
||||
preserves-schema-rs [OPTIONS] --output-dir <output-dir> --prefix <prefix> [--] [input-glob]...
|
||||
preserves-schema-rs [FLAGS] [OPTIONS] --output-dir <output-dir> --prefix <prefix>
|
||||
[--] [input-glob]...
|
||||
|
||||
FLAGS:
|
||||
-h, --help Prints help information
|
||||
-V, --version Prints version information
|
||||
-h, --help Prints help information
|
||||
--rustfmt-skip
|
||||
-V, --version Prints version information
|
||||
|
||||
OPTIONS:
|
||||
--module <module>...
|
||||
-o, --output-dir <output-dir>
|
||||
-p, --prefix <prefix>
|
||||
--support-crate <support-crate>
|
||||
--xref <xref>...
|
||||
|
||||
ARGS:
|
||||
<input-glob>...
|
||||
|
|
|
@ -3,6 +3,12 @@
|
|||
set -e
|
||||
exec 1>&2
|
||||
|
||||
COMMAND=cmp
|
||||
if [ "$1" = "--fix" ];
|
||||
then
|
||||
COMMAND=cp
|
||||
fi
|
||||
|
||||
# https://gitlab.com/preserves/preserves/-/issues/30
|
||||
#
|
||||
# So it turns out that Racket's git-checkout mechanism pays attention
|
||||
|
@ -16,10 +22,19 @@ exec 1>&2
|
|||
|
||||
# Ensure that various copies of schema.prs, schema.bin, path.bin,
|
||||
# samples.pr and samples.bin are in fact identical.
|
||||
cmp path/path.bin implementations/python/preserves/path.prb
|
||||
cmp path/path.bin implementations/rust/preserves-path/path.bin
|
||||
cmp schema/schema.bin implementations/python/preserves/schema.prb
|
||||
cmp schema/schema.prs implementations/racket/preserves/preserves-schema/schema.prs
|
||||
cmp tests/samples.bin implementations/python/tests/samples.bin
|
||||
cmp tests/samples.pr implementations/python/tests/samples.pr
|
||||
cmp tests/samples.pr implementations/racket/preserves/preserves/tests/samples.pr
|
||||
${COMMAND} path/path.bin implementations/python/preserves/path.prb
|
||||
${COMMAND} path/path.bin implementations/rust/preserves-path/path.bin
|
||||
|
||||
${COMMAND} schema/schema.bin implementations/python/preserves/schema.prb
|
||||
${COMMAND} schema/schema.prs implementations/racket/preserves/preserves-schema/schema.prs
|
||||
|
||||
${COMMAND} tests/samples.bin implementations/python/tests/samples.bin
|
||||
${COMMAND} tests/samples.pr implementations/python/tests/samples.pr
|
||||
${COMMAND} tests/samples.pr implementations/racket/preserves/preserves/tests/samples.pr
|
||||
|
||||
${COMMAND} _includes/what-is-preserves.md implementations/rust/preserves/doc/what-is-preserves.md
|
||||
${COMMAND} _includes/cheatsheet-binary-plaintext.md implementations/rust/preserves/doc/cheatsheet-binary-plaintext.md
|
||||
${COMMAND} _includes/cheatsheet-text-plaintext.md implementations/rust/preserves/doc/cheatsheet-text-plaintext.md
|
||||
${COMMAND} _includes/value-grammar.md implementations/rust/preserves/doc/value-grammar.md
|
||||
|
||||
${COMMAND} _includes/what-is-preserves-schema.md implementations/rust/preserves-schema/doc/what-is-preserves-schema.md
|
||||
|
|
|
@ -0,0 +1,2 @@
|
|||
dist/
|
||||
lib/
|
|
@ -0,0 +1 @@
|
|||
version-tag-prefix javascript-@preserves/schema-cli@
|
|
@ -0,0 +1 @@
|
|||
# Preserves Schema for TypeScript/JavaScript: Command-line tools
|
|
@ -0,0 +1,39 @@
|
|||
{
|
||||
"name": "@preserves/schema-cli",
|
||||
"version": "0.990.1",
|
||||
"description": "Command-line tools for Preserves Schema",
|
||||
"homepage": "https://gitlab.com/preserves/preserves",
|
||||
"license": "Apache-2.0",
|
||||
"publishConfig": {
|
||||
"access": "public"
|
||||
},
|
||||
"repository": "gitlab:preserves/preserves",
|
||||
"author": "Tony Garnock-Jones <tonyg@leastfixedpoint.com>",
|
||||
"scripts": {
|
||||
"clean": "rm -rf lib dist",
|
||||
"prepare": "yarn compile && yarn rollup",
|
||||
"compile": "tsc",
|
||||
"compile:watch": "yarn compile -w",
|
||||
"rollup": "rollup -c",
|
||||
"rollup:watch": "yarn rollup -w",
|
||||
"test": "true",
|
||||
"veryclean": "yarn run clean && rm -rf node_modules"
|
||||
},
|
||||
"bin": {
|
||||
"preserves-schema-ts": "./bin/preserves-schema-ts.js",
|
||||
"preserves-schemac": "./bin/preserves-schemac.js"
|
||||
},
|
||||
"devDependencies": {
|
||||
"@types/glob": "^7.1",
|
||||
"@types/minimatch": "^3.0"
|
||||
},
|
||||
"dependencies": {
|
||||
"@preserves/core": "^0.990.0",
|
||||
"@preserves/schema": "^0.990.1",
|
||||
"chalk": "^4.1",
|
||||
"chokidar": "^3.5",
|
||||
"commander": "^7.2",
|
||||
"glob": "^7.1",
|
||||
"minimatch": "^3.0"
|
||||
}
|
||||
}
|
|
@ -0,0 +1,17 @@
|
|||
import terser from '@rollup/plugin-terser';
|
||||
|
||||
function cli(name) {
|
||||
return {
|
||||
input: `lib/bin/${name}.js`,
|
||||
output: [{file: `dist/bin/${name}.js`, format: 'commonjs'}],
|
||||
external: [
|
||||
'@preserves/core',
|
||||
'@preserves/schema',
|
||||
],
|
||||
};
|
||||
}
|
||||
|
||||
export default [
|
||||
cli('preserves-schema-ts'),
|
||||
cli('preserves-schemac'),
|
||||
];
|
|
@ -1,10 +1,9 @@
|
|||
import fs from 'fs';
|
||||
import path from 'path';
|
||||
import { glob } from 'glob';
|
||||
import { IdentitySet, formatPosition, Position } from '@preserves/core';
|
||||
import { readSchema } from '../reader';
|
||||
import chalk from 'chalk';
|
||||
import * as M from '../meta';
|
||||
import { IdentitySet, formatPosition, Position } from '@preserves/core';
|
||||
import { readSchema, Meta as M } from '@preserves/schema';
|
||||
|
||||
export interface Diagnostic {
|
||||
type: 'warn' | 'error';
|
|
@ -1,9 +1,8 @@
|
|||
import { compile } from '../index';
|
||||
import fs from 'fs';
|
||||
import path from 'path';
|
||||
import minimatch from 'minimatch';
|
||||
import { Command } from 'commander';
|
||||
import * as M from '../meta';
|
||||
import { compile, Meta as M } from '@preserves/schema';
|
||||
import chalk from 'chalk';
|
||||
import { is, Position } from '@preserves/core';
|
||||
import chokidar from 'chokidar';
|
|
@ -2,7 +2,7 @@ import { Command } from 'commander';
|
|||
import { canonicalEncode, KeyedDictionary, underlying } from '@preserves/core';
|
||||
import fs from 'fs';
|
||||
import path from 'path';
|
||||
import * as M from '../meta';
|
||||
import { Meta as M } from '@preserves/schema';
|
||||
import { expandInputGlob, formatFailures } from './cli-utils';
|
||||
|
||||
export type CommandLineArguments = {
|
|
@ -0,0 +1,16 @@
|
|||
{
|
||||
"compilerOptions": {
|
||||
"target": "ES2017",
|
||||
"lib": ["es2019", "DOM"],
|
||||
"declaration": true,
|
||||
"baseUrl": "./src",
|
||||
"rootDir": "./src",
|
||||
"outDir": "./lib",
|
||||
"declarationDir": "./lib",
|
||||
"esModuleInterop": true,
|
||||
"moduleResolution": "node",
|
||||
"sourceMap": true,
|
||||
"strict": true
|
||||
},
|
||||
"include": ["src/**/*"]
|
||||
}
|
|
@ -2,3 +2,7 @@
|
|||
|
||||
This is an implementation of [Preserves Schema](https://preserves.dev/preserves-schema.html)
|
||||
for TypeScript and JavaScript.
|
||||
|
||||
This package implements a Schema runtime and a Schema-to-TypeScript compiler, but offers no
|
||||
command-line interfaces. See `@preserves/schema-cli` for command-line tools for working with
|
||||
Schema and compiling from Schema to TypeScript.
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
{
|
||||
"name": "@preserves/schema",
|
||||
"version": "0.990.0",
|
||||
"version": "0.990.1",
|
||||
"description": "Schema support for Preserves data serialization format",
|
||||
"homepage": "https://gitlab.com/preserves/preserves",
|
||||
"license": "Apache-2.0",
|
||||
|
@ -13,7 +13,7 @@
|
|||
"types": "lib/index.d.ts",
|
||||
"author": "Tony Garnock-Jones <tonyg@leastfixedpoint.com>",
|
||||
"scripts": {
|
||||
"regenerate": "rm -rf ./src/gen && yarn copy-schema && ./bin/preserves-schema-ts.js --output ./src/gen ./dist:schema.prs",
|
||||
"regenerate": "rm -rf ./src/gen && yarn copy-schema && ../schema-cli/bin/preserves-schema-ts.js --output ./src/gen ./dist:schema.prs",
|
||||
"clean": "rm -rf lib dist",
|
||||
"prepare": "yarn compile && yarn rollup && yarn copy-schema",
|
||||
"compile": "tsc",
|
||||
|
@ -25,18 +25,7 @@
|
|||
"test:watch": "jest --watch",
|
||||
"veryclean": "yarn run clean && rm -rf node_modules"
|
||||
},
|
||||
"bin": {
|
||||
"preserves-schema-ts": "./bin/preserves-schema-ts.js",
|
||||
"preserves-schemac": "./bin/preserves-schemac.js"
|
||||
},
|
||||
"dependencies": {
|
||||
"@preserves/core": "^0.990.0",
|
||||
"@types/glob": "^7.1",
|
||||
"@types/minimatch": "^3.0",
|
||||
"chalk": "^4.1",
|
||||
"chokidar": "^3.5",
|
||||
"commander": "^7.2",
|
||||
"glob": "^7.1",
|
||||
"minimatch": "^3.0"
|
||||
"@preserves/core": "^0.990.0"
|
||||
}
|
||||
}
|
||||
|
|
|
@ -31,13 +31,6 @@ function cli(name) {
|
|||
output: [{file: `dist/bin/${name}.js`, format: 'commonjs'}],
|
||||
external: [
|
||||
'@preserves/core',
|
||||
'chalk',
|
||||
'chokidar',
|
||||
'fs',
|
||||
'glob',
|
||||
'minimatch',
|
||||
'path',
|
||||
'commander',
|
||||
],
|
||||
};
|
||||
}
|
||||
|
@ -53,6 +46,4 @@ export default [
|
|||
],
|
||||
external: ['@preserves/core'],
|
||||
},
|
||||
cli('preserves-schema-ts'),
|
||||
cli('preserves-schemac'),
|
||||
];
|
||||
|
|
|
@ -20,5 +20,7 @@ open "cd packages/core; yarn run test:watch"
|
|||
open "cd packages/schema; yarn run compile:watch"
|
||||
open "cd packages/schema; yarn run rollup:watch"
|
||||
open "cd packages/schema; yarn run test:watch"
|
||||
open "cd packages/schema-cli; yarn run compile:watch"
|
||||
open "cd packages/schema-cli; yarn run rollup:watch"
|
||||
|
||||
tmux select-layout even-vertical
|
||||
|
|
|
@ -5,6 +5,9 @@
|
|||
all:
|
||||
cargo build --all-targets
|
||||
|
||||
doc:
|
||||
cargo doc --workspace
|
||||
|
||||
x86_64-binary: x86_64-binary-release
|
||||
|
||||
x86_64-binary-release:
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
[package]
|
||||
name = "preserves-schema"
|
||||
version = "3.990.0"
|
||||
version = "3.990.3"
|
||||
authors = ["Tony Garnock-Jones <tonyg@leastfixedpoint.com>"]
|
||||
edition = "2018"
|
||||
description = "Implementation of Preserves Schema code generation and support for Rust."
|
||||
|
|
|
@ -1,4 +1,6 @@
|
|||
# Preserves Schema for Rust
|
||||
```shell
|
||||
cargo add preserves preserves-schema
|
||||
```
|
||||
|
||||
This is an implementation of [Preserves Schema](https://preserves.dev/preserves-schema.html)
|
||||
for Rust.
|
||||
This crate ([`preserves-schema` on crates.io](https://crates.io/crates/preserves-schema)) is an
|
||||
implementation of [Preserves Schema](https://preserves.dev/preserves-schema.html) for Rust.
|
||||
|
|
|
@ -0,0 +1,112 @@
|
|||
# Example
|
||||
|
||||
[preserves-schemac]: https://preserves.dev/doc/preserves-schemac.html
|
||||
[preserves-schema-rs]: https://preserves.dev/doc/preserves-schema-rs.html
|
||||
|
||||
Preserves schemas are written in a syntax that (ab)uses [Preserves text
|
||||
syntax][preserves::value::text] as a kind of S-expression. Schema source code looks like this:
|
||||
|
||||
```preserves-schema
|
||||
version 1 .
|
||||
Present = <Present @username string> .
|
||||
Says = <Says @who string @what string> .
|
||||
UserStatus = <Status @username string @status Status> .
|
||||
Status = =here / <away @since TimeStamp> .
|
||||
TimeStamp = string .
|
||||
```
|
||||
|
||||
Conventionally, schema source code is stored in `*.prs` files. In this example, the source code
|
||||
above is placed in `simpleChatProtocol.prs`.
|
||||
|
||||
The Rust code generator for schemas requires not source code, but instances of the [Preserves
|
||||
metaschema](https://preserves.dev/preserves-schema.html#appendix-metaschema). To compile schema
|
||||
source code to metaschema instances, use [preserves-schemac][]:
|
||||
|
||||
```shell
|
||||
yarn global add @preserves/schema-cli
|
||||
preserves-schemac .:simpleChatProtocol.prs > simpleChatProtocol.prb
|
||||
```
|
||||
|
||||
Binary-syntax metaschema instances are conventionally stored in `*.prb` files. If you have a
|
||||
whole directory tree of `*.prs` files, you can supply just "`.`" without the "`:`"-prefixed
|
||||
fileglob part.[^converting-metaschema-to-text] See the [preserves-schemac documentation][preserves-schemac].
|
||||
|
||||
[^converting-metaschema-to-text]:
|
||||
Converting the `simpleChatProtocol.prb` file to Preserves text syntax lets us read the
|
||||
metaschema instance corresponding to the source code:
|
||||
```shell
|
||||
cat simpleChatProtocol.prb | preserves-tool convert
|
||||
```
|
||||
The result:
|
||||
```preserves
|
||||
<bundle {
|
||||
[
|
||||
simpleChatProtocol
|
||||
]: <schema {
|
||||
definitions: {
|
||||
Present: <rec <lit Present> <tuple [
|
||||
<named username <atom String>>
|
||||
]>>
|
||||
Says: <rec <lit Says> <tuple [
|
||||
<named who <atom String>>
|
||||
<named what <atom String>>
|
||||
]>>
|
||||
Status: <or [
|
||||
[
|
||||
"here"
|
||||
<lit here>
|
||||
]
|
||||
[
|
||||
"away"
|
||||
<rec <lit away> <tuple [
|
||||
<named since <ref [] TimeStamp>>
|
||||
]>>
|
||||
]
|
||||
]>
|
||||
TimeStamp: <atom String>
|
||||
UserStatus: <rec <lit Status> <tuple [
|
||||
<named username <atom String>>
|
||||
<named status <ref [] Status>>
|
||||
]>>
|
||||
}
|
||||
embeddedType: #f
|
||||
version: 1
|
||||
}>
|
||||
}>
|
||||
```
|
||||
|
||||
#### Generating Rust code from a schema
|
||||
|
||||
Generate Rust definitions corresponding to a metaschema instance with [preserves-schema-rs][].
|
||||
The best way to use it is to integrate it into your `build.rs` (see [the
|
||||
docs][preserves-schema-rs]), but you can also use it as a standalone command-line tool.
|
||||
|
||||
The following command generates a directory `./rs/chat` containing Rust sources for a module
|
||||
that expects to be called `chat` in Rust code:
|
||||
|
||||
```shell
|
||||
preserves-schema-rs --output-dir rs/chat --prefix chat simpleChatProtocol.prb
|
||||
```
|
||||
|
||||
Representative excerpts from one of the generated files, `./rs/chat/simple_chat_protocol.rs`:
|
||||
|
||||
```rust,noplayground
|
||||
pub struct Present {
|
||||
pub username: std::string::String
|
||||
}
|
||||
pub struct Says {
|
||||
pub who: std::string::String,
|
||||
pub what: std::string::String
|
||||
}
|
||||
pub struct UserStatus {
|
||||
pub username: std::string::String,
|
||||
pub status: Status
|
||||
}
|
||||
pub enum Status {
|
||||
Here,
|
||||
Away {
|
||||
since: std::boxed::Box<TimeStamp>
|
||||
}
|
||||
}
|
||||
pub struct TimeStamp(pub std::string::String);
|
||||
```
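
Client code might consume types of this shape as sketched below. To keep the example self-contained, the definitions are repeated from the excerpt above; the `describe` function is hypothetical and is not emitted by the code generator.

```rust
// Definitions repeated from the generated-code excerpt.
pub struct UserStatus {
    pub username: std::string::String,
    pub status: Status,
}
pub enum Status {
    Here,
    Away { since: std::boxed::Box<TimeStamp> },
}
pub struct TimeStamp(pub std::string::String);

/// Hypothetical helper: render a `UserStatus` as a human-readable line.
pub fn describe(user: &UserStatus) -> String {
    match &user.status {
        Status::Here => format!("{} is here", user.username),
        // `since` is a boxed newtype; `.0` reaches the wrapped string.
        Status::Away { since } => {
            format!("{} has been away since {}", user.username, since.0)
        }
    }
}
```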
|
|
@ -0,0 +1,16 @@
|
|||
A Preserves schema connects Preserves `Value`s to host-language data
|
||||
structures. Each definition within a schema can be processed by a
|
||||
compiler to produce
|
||||
|
||||
- a simple host-language *type definition*;
|
||||
|
||||
- a partial *parsing* function from `Value`s to instances of the
|
||||
produced type; and
|
||||
|
||||
- a total *serialization* function from instances of the type to
|
||||
`Value`s.
|
||||
|
||||
Every parsed `Value` retains enough information to always be able to
|
||||
be serialized again, and every instance of a host-language data
|
||||
structure contains, by construction, enough information to be
|
||||
successfully serialized.
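
The three artifacts can be sketched by hand for a single definition, `Present = <Present @username string> .`. Everything below is an assumption made for illustration: `Value` is a drastically reduced stand-in for a real Preserves value type, and `parse_present` and `unparse_present` are hypothetical names rather than what the schema compiler emits.

```rust
/// Hypothetical, minimal stand-in for a Preserves `Value`.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Symbol(String),
    String(String),
    Record(Box<Value>, Vec<Value>),
}

/// The host-language type definition.
#[derive(Debug, Clone, PartialEq)]
pub struct Present {
    pub username: String,
}

/// The partial parsing function: `None` for any `Value` that does not
/// have the shape `<Present "...">`.
pub fn parse_present(v: &Value) -> Option<Present> {
    match v {
        Value::Record(label, fields) => match (&**label, fields.as_slice()) {
            (Value::Symbol(l), [Value::String(username)]) if l == "Present" => {
                Some(Present { username: username.clone() })
            }
            _ => None,
        },
        _ => None,
    }
}

/// The total serialization function: every `Present` maps to a `Value`.
pub fn unparse_present(p: &Present) -> Value {
    Value::Record(
        Box::new(Value::Symbol("Present".to_owned())),
        vec![Value::String(p.username.clone())],
    )
}
```

Parsing the output of `unparse_present` yields the original `Present` again, while a `Value` of any other shape is rejected with `None`.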
|
|
@ -1,3 +1,6 @@
|
|||
//! Command-line Rust code generator for Preserves Schema. See the documentation at
|
||||
//! <https://preserves.dev/doc/preserves-schema-rs.html>.
|
||||
|
||||
use std::io::Error;
|
||||
use std::io::ErrorKind;
|
||||
use std::path::PathBuf;
|
||||
|
|
|
@ -1,3 +1,39 @@
|
|||
//! Implementation of the Schema-to-Rust compiler; this is the core of the
|
||||
//! [preserves-schema-rs][] program.
|
||||
//!
|
||||
//! See the [documentation for preserves-schema-rs][preserves-schema-rs] for examples of how to
|
||||
//! use the compiler programmatically from a `build.rs` script, but very briefly, use
|
||||
//! [preserves-schemac](https://preserves.dev/doc/preserves-schemac.html) to generate a
|
||||
//! metaschema instance `*.prb` file, and then put something like this in `build.rs`:
|
||||
//!
|
||||
//! ```rust,ignore
|
||||
//! use preserves_schema::compiler::*;
|
||||
//!
|
||||
//! const PATH_TO_PRB_FILE: &'static str = "your-metaschema-instance-file.prb";
|
||||
//!
|
||||
//! fn main() -> Result<(), std::io::Error> {
|
||||
//! let buildroot = std::path::PathBuf::from(std::env::var_os("OUT_DIR").unwrap());
|
||||
//!
|
||||
//! let mut gen_dir = buildroot.clone();
|
||||
//! gen_dir.push("src/schemas");
|
||||
//! let mut c = CompilerConfig::new(gen_dir, "crate::schemas".to_owned());
|
||||
//!
|
||||
//! let inputs = expand_inputs(&vec![PATH_TO_PRB_FILE.to_owned()])?;
|
||||
//! c.load_schemas_and_bundles(&inputs, &vec![])?;
|
||||
//! compile(&c)
|
||||
//! }
|
||||
//! ```
|
||||
//!
|
||||
//! plus something like this in your `lib.rs` or main program:
|
||||
//!
|
||||
//! ```rust,ignore
|
||||
//! pub mod schemas {
|
||||
//! include!(concat!(env!("OUT_DIR"), "/src/schemas/mod.rs"));
|
||||
//! }
|
||||
//! ```
|
||||
//!
|
||||
//! [preserves-schema-rs]: https://preserves.dev/doc/preserves-schema-rs.html
|
||||
|
||||
pub mod context;
|
||||
pub mod cycles;
|
||||
pub mod names;
|
||||
|
@ -29,11 +65,18 @@ use std::io::Read;
|
|||
use std::io::Write;
|
||||
use std::path::PathBuf;
|
||||
|
||||
/// Names a Schema module within a (collection of) Schema bundle(s).
|
||||
pub type ModulePath = Vec<String>;
|
||||
|
||||
/// Implement this trait to extend the compiler with custom code generation support. The main
|
||||
/// code generators are also implemented as plugins.
|
||||
///
|
||||
/// For an example of its use outside the core compiler, see [`build.rs` for the `syndicate-rs` project](https://git.syndicate-lang.org/syndicate-lang/syndicate-rs/src/commit/60e6c6badfcbcbccc902994f4f32db6048f60d1f/syndicate/build.rs).
|
||||
pub trait Plugin: std::fmt::Debug {
|
||||
/// Use `_module_ctxt` to emit code at a per-module level.
|
||||
fn generate_module(&self, _module_ctxt: &mut ModuleContext) {}
|
||||
|
||||
/// Use `module_ctxt` to emit code at a per-Schema-[Definition] level.
|
||||
fn generate_definition(
|
||||
&self,
|
||||
module_ctxt: &mut ModuleContext,
|
||||
|
@ -110,17 +153,30 @@ impl ExternalModule {
|
|||
}
|
||||
}
|
||||
|
||||
/// Main entry point to the compiler.
|
||||
#[derive(Debug)]
|
||||
pub struct CompilerConfig {
|
||||
/// All known Schema modules, indexed by [ModulePath] and annotated with a [Purpose].
|
||||
pub bundle: Map<ModulePath, (Schema, Purpose)>,
|
||||
/// Where output Rust code files will be placed.
|
||||
pub output_dir: PathBuf,
|
||||
/// Fully-qualified Rust module prefix to use for each generated module.
|
||||
pub fully_qualified_module_prefix: String,
|
||||
/// Rust module path to the [preserves_schema::support][crate::support] module.
|
||||
pub support_crate: String,
|
||||
/// External modules for cross-referencing.
|
||||
pub external_modules: Map<ModulePath, ExternalModule>,
|
||||
/// Plugins active in this compiler instance.
|
||||
pub plugins: Vec<Box<dyn Plugin>>,
|
||||
/// If true, a directive is emitted in each module instructing
|
||||
/// [rustfmt](https://github.com/rust-lang/rustfmt) to ignore it.
|
||||
pub rustfmt_skip: bool,
|
||||
}
|
||||
|
||||
/// Loads a [Schema] or [Bundle] from path `i` into `bundle` for the given `purpose`.
|
||||
///
|
||||
/// If `i` holds a [Schema], then the file stem of `i` is used as the module name when placing
|
||||
/// the schema in `bundle`.
|
||||
pub fn load_schema_or_bundle_with_purpose(
|
||||
bundle: &mut Map<ModulePath, (Schema, Purpose)>,
|
||||
i: &PathBuf,
|
||||
|
@ -134,6 +190,11 @@ pub fn load_schema_or_bundle_with_purpose(
|
|||
Ok(())
|
||||
}
|
||||
|
||||
/// Loads a [Schema] or [Bundle] from raw binary encoded value `input` into `bundle` for the
|
||||
/// given `purpose`.
|
||||
///
|
||||
/// If `input` corresponds to a [Schema], then `prefix` is used as its module name; otherwise,
|
||||
/// it's a [Bundle], and `prefix` is ignored.
|
||||
pub fn load_schema_or_bundle_bin_with_purpose(
|
||||
bundle: &mut Map<ModulePath, (Schema, Purpose)>,
|
||||
prefix: &str,
|
||||
|
@ -165,6 +226,10 @@ fn bundle_prefix(i: &PathBuf) -> io::Result<&str> {
|
|||
})
|
||||
}
|
||||
|
||||
/// Loads a [Schema] or [Bundle] from path `i` into `bundle`.
|
||||
///
|
||||
/// If `i` holds a [Schema], then the file stem of `i` is used as the module name when placing
|
||||
/// the schema in `bundle`.
|
||||
pub fn load_schema_or_bundle(bundle: &mut Map<ModulePath, Schema>, i: &PathBuf) -> io::Result<()> {
|
||||
let mut f = File::open(&i)?;
|
||||
let mut bs = vec![];
|
||||
|
@ -172,6 +237,10 @@ pub fn load_schema_or_bundle(bundle: &mut Map<ModulePath, Schema>, i: &PathBuf)
|
|||
load_schema_or_bundle_bin(bundle, bundle_prefix(i)?, &bs[..])
|
||||
}
|
||||
|
||||
/// Loads a [Schema] or [Bundle] from raw binary encoded value `input` into `bundle`.
|
||||
///
|
||||
/// If `input` corresponds to a [Schema], then `prefix` is used as its module name; otherwise,
|
||||
/// it's a [Bundle], and `prefix` is ignored.
|
||||
pub fn load_schema_or_bundle_bin(
|
||||
bundle: &mut Map<ModulePath, Schema>,
|
||||
prefix: &str,
|
||||
|
@ -199,6 +268,8 @@ pub fn load_schema_or_bundle_bin(
|
|||
}
|
||||
|
||||
impl CompilerConfig {
|
||||
/// Construct a [CompilerConfig] configured to send output files to `output_dir`, and to
|
||||
/// use `fully_qualified_module_prefix` as the Rust module prefix for generated code.
|
||||
pub fn new(output_dir: PathBuf, fully_qualified_module_prefix: String) -> Self {
|
||||
CompilerConfig {
|
||||
bundle: Map::new(),
|
||||
|
@ -277,6 +348,7 @@ impl CompilerConfig {
|
|||
}
|
||||
}
|
||||
|
||||
/// Expands a vector of [mod@glob]s to a vector of actual paths.
|
||||
pub fn expand_inputs(globs: &Vec<String>) -> io::Result<Vec<PathBuf>> {
|
||||
let mut result = Vec::new();
|
||||
for g in globs.iter() {
|
||||
|
@ -322,6 +394,7 @@ impl Schema {
|
|||
}
|
||||
}
|
||||
|
||||
/// Main entry point: runs the compilation process.
|
||||
pub fn compile(config: &CompilerConfig) -> io::Result<()> {
|
||||
let mut b = BundleContext::new(config);
|
||||
|
||||
|
|
|
@ -1,4 +1,12 @@
|
|||
#![doc = concat!(
|
||||
include_str!("../README.md"),
|
||||
"# What is Preserves Schema?\n\n",
|
||||
include_str!("../doc/what-is-preserves-schema.md"),
|
||||
include_str!("../doc/example.md"),
|
||||
)]
|
||||
|
||||
pub mod compiler;
|
||||
/// Auto-generated Preserves Schema Metaschema types, parsers, and unparsers.
|
||||
pub mod gen;
|
||||
pub mod support;
|
||||
pub mod syntax;
|
||||
|
|
|
@ -1,3 +1,6 @@
|
|||
//! Interpreter for instances of Preserves Schema Metaschema, for schema-directed dynamic
|
||||
//! parsing and unparsing of terms.
|
||||
|
||||
use crate::gen::schema::*;
|
||||
|
||||
use preserves::value::merge::merge2;
|
||||
|
@ -5,8 +8,10 @@ use preserves::value::Map;
|
|||
use preserves::value::NestedValue;
|
||||
use preserves::value::Value;
|
||||
|
||||
/// Represents an environment mapping schema module names to [Schema] instances.
|
||||
pub type Env<V> = Map<Vec<String>, Schema<V>>;
|
||||
|
||||
/// Context for a given interpretation of a [Schema].
|
||||
#[derive(Debug)]
|
||||
pub struct Context<'a, V: NestedValue> {
|
||||
pub env: &'a Env<V>,
|
||||
|
@ -20,6 +25,7 @@ enum DynField<V: NestedValue> {
|
|||
}
|
||||
|
||||
impl<'a, V: NestedValue> Context<'a, V> {
|
||||
/// Construct a new [Context] with the given [Env].
|
||||
pub fn new(env: &'a Env<V>) -> Self {
|
||||
Context {
|
||||
env,
|
||||
|
@ -27,6 +33,8 @@ impl<'a, V: NestedValue> Context<'a, V> {
|
|||
}
|
||||
}
|
||||
|
||||
/// Parse `v` using the rule named `name` from the module at path `module` in `self.env`.
|
||||
/// Yields `Some(...)` if the parse succeeds, and `None` otherwise.
|
||||
pub fn dynamic_parse(&mut self, module: &Vec<String>, name: &str, v: &V) -> Option<V> {
|
||||
let old_module =
|
||||
(module.len() > 0).then(|| std::mem::replace(&mut self.module, module.clone()));
|
||||
|
@ -39,6 +47,7 @@ impl<'a, V: NestedValue> Context<'a, V> {
|
|||
result
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn dynamic_unparse(&mut self, _module: &Vec<String>, _name: &str, _w: &V) -> Option<V> {
|
||||
panic!("Not yet implemented");
|
||||
}
|
||||
|
|
|
@ -1,3 +1,7 @@
|
|||
//! The runtime support library for compiled Schemas.
|
||||
|
||||
#[doc(hidden)]
|
||||
/// Reexport lazy_static for generated code to use.
|
||||
pub use lazy_static::lazy_static;
|
||||
|
||||
pub use preserves;
|
||||
|
@ -21,10 +25,16 @@ use std::sync::Arc;
|
|||
|
||||
use thiserror::Error;
|
||||
|
||||
/// Every [language][crate::define_language] implements [NestedValueCodec] as a marker trait.
|
||||
pub trait NestedValueCodec {} // marker trait
|
||||
impl NestedValueCodec for () {}
|
||||
|
||||
/// Implementors of [Parse] can produce instances of themselves from a [Value], given a
|
||||
/// supporting [language][crate::define_language]. All Schema-compiler-produced types implement
|
||||
/// [Parse].
|
||||
pub trait Parse<L, Value: NestedValue>: Sized {
|
||||
/// Decode the given `value` (using auxiliary structure from the `language` instance) to
|
||||
/// produce an instance of [Self].
|
||||
fn parse(language: L, value: &Value) -> Result<Self, ParseError>;
|
||||
}
|
||||
|
||||
|
@ -34,7 +44,10 @@ impl<'a, T: NestedValueCodec, Value: NestedValue> Parse<&'a T, Value> for Value
|
|||
}
|
||||
}
|
||||
|
||||
/// Implementors of [Unparse] can convert themselves into a [Value], given a supporting
|
||||
/// [language][crate::define_language]. All Schema-compiler-produced types implement [Unparse].
|
||||
pub trait Unparse<L, Value: NestedValue> {
|
||||
/// Encode `self` into a [Value] (using auxiliary structure from the `language` instance).
|
||||
fn unparse(&self, language: L) -> Value;
|
||||
}
|
||||
|
||||
|
@ -44,8 +57,13 @@ impl<'a, T: NestedValueCodec, Value: NestedValue> Unparse<&'a T, Value> for Valu
|
|||
}
|
||||
}
|
||||
|
||||
/// Every [language][crate::define_language] implements [Codec], which supplies convenient
|
||||
/// shorthand for invoking [Parse::parse] and [Unparse::unparse].
|
||||
pub trait Codec<N: NestedValue> {
|
||||
/// Delegates to [`T::parse`][Parse::parse], using `self` as language and the given `value`
|
||||
/// as input.
|
||||
fn parse<'a, T: Parse<&'a Self, N>>(&'a self, value: &N) -> Result<T, ParseError>;
|
||||
/// Delegates to [`value.unparse`][Unparse::unparse], using `self` as language.
|
||||
fn unparse<'a, T: Unparse<&'a Self, N>>(&'a self, value: &T) -> N;
|
||||
}
|
||||
|
||||
|
@ -59,6 +77,11 @@ impl<L, N: NestedValue> Codec<N> for L {
|
|||
}
|
||||
}
|
||||
|
||||
/// Implementors of [Deserialize] can produce instances of themselves from a [Value]. All
|
||||
/// Schema-compiler-produced types implement [Deserialize].
|
||||
///
|
||||
/// The difference between [Deserialize] and [Parse] is that implementors of [Deserialize] know
|
||||
/// which [language][crate::define_language] to use.
|
||||
pub trait Deserialize<N: NestedValue>
|
||||
where
|
||||
Self: Sized,
|
||||
|
@ -66,10 +89,14 @@ where
|
|||
fn deserialize<'de, R: Reader<'de, N>>(r: &mut R) -> Result<Self, ParseError>;
|
||||
}
|
||||
|
||||
/// Extracts a simple literal term from a byte array using
|
||||
/// [PackedReader][preserves::value::packed::PackedReader]. No embedded values are permitted.
|
||||
pub fn decode_lit<N: NestedValue>(bs: &[u8]) -> io::Result<N> {
|
||||
preserves::value::packed::from_bytes(bs, NoEmbeddedDomainCodec)
|
||||
}
|
||||
|
||||
/// When `D` can parse itself from an [IOValue], this function parses all embedded [IOValue]s
|
||||
/// into `D`s.
|
||||
pub fn decode_embedded<D: Domain>(v: &IOValue) -> Result<ArcValue<Arc<D>>, ParseError>
|
||||
where
|
||||
for<'a> D: TryFrom<&'a IOValue, Error = ParseError>,
|
||||
|
@ -77,6 +104,8 @@ where
|
|||
v.copy_via(&mut |d| Ok(Value::Embedded(Arc::new(D::try_from(d)?))))
|
||||
}
|
||||
|
||||
/// When `D` can unparse itself into an [IOValue], this function converts all embedded `D`s
|
||||
/// into [IOValue]s.
|
||||
pub fn encode_embedded<D: Domain>(v: &ArcValue<Arc<D>>) -> IOValue
|
||||
where
|
||||
for<'a> IOValue: From<&'a D>,
|
||||
|
@ -85,10 +114,13 @@ where
|
|||
.unwrap()
|
||||
}
|
||||
|
||||
/// Error value yielded when parsing of an [IOValue] into a Schema-compiler-produced type fails.
|
||||
#[derive(Error, Debug)]
|
||||
pub enum ParseError {
|
||||
/// Signalled when the input does not match the Preserves Schema associated with the type.
|
||||
#[error("Input not conformant with Schema: {0}")]
|
||||
ConformanceError(&'static str),
|
||||
/// Signalled when the underlying Preserves library signals an error.
|
||||
#[error(transparent)]
|
||||
Preserves(preserves::error::Error),
|
||||
}
|
||||
|
@ -120,10 +152,12 @@ impl From<ParseError> for io::Error {
|
|||
}
|
||||
|
||||
impl ParseError {
|
||||
/// Constructs a [ParseError::ConformanceError].
|
||||
pub fn conformance_error(context: &'static str) -> Self {
|
||||
ParseError::ConformanceError(context)
|
||||
}
|
||||
|
||||
/// True iff `self` is a [ParseError::ConformanceError].
|
||||
pub fn is_conformance_error(&self) -> bool {
|
||||
return if let ParseError::ConformanceError(_) = self {
|
||||
true
|
||||
|
|
|
@ -1,12 +1,21 @@
|
|||
//! A library for emitting pretty-formatted structured source code.
|
||||
//!
|
||||
//! The main entry points are [Formatter::to_string] and [Formatter::write], plus the utilities
|
||||
//! in the [macros] submodule.
|
||||
|
||||
use std::fmt::Write;
|
||||
use std::str;
|
||||
|
||||
/// Default width for pretty-formatting, in columns.
|
||||
pub const DEFAULT_WIDTH: usize = 80;
|
||||
|
||||
/// All pretty-formattable items must implement this trait.
|
||||
pub trait Emittable: std::fmt::Debug {
|
||||
/// Serializes `self`, as pretty-printed code, on `f`.
|
||||
fn write_on(&self, f: &mut Formatter);
|
||||
}
|
||||
|
||||
/// Tailoring of behaviour for [Vertical] groupings.
|
||||
#[derive(Clone, PartialEq, Eq)]
|
||||
pub enum VerticalMode {
|
||||
Variable,
|
||||
|
@ -14,13 +23,16 @@ pub enum VerticalMode {
|
|||
ExtraNewline,
|
||||
}
|
||||
|
||||
/// Vertical formatting for [Emittable]s.
|
||||
pub trait Vertical {
|
||||
fn set_vertical_mode(&mut self, mode: VerticalMode);
|
||||
fn write_vertically_on(&self, f: &mut Formatter);
|
||||
}
|
||||
|
||||
/// Polymorphic [Emittable], used consistently in the API.
|
||||
pub type Item = std::rc::Rc<dyn Emittable>;
|
||||
|
||||
/// A possibly-vertical sequence of items with item-separating and -terminating text.
|
||||
#[derive(Clone)]
|
||||
pub struct Sequence {
|
||||
pub items: Vec<Item>,
|
||||
|
@ -29,6 +41,8 @@ pub struct Sequence {
|
|||
pub terminator: &'static str,
|
||||
}
|
||||
|
||||
/// A sequence of items, indented when formatted vertically, surrounded by opening and closing
|
||||
/// text.
|
||||
#[derive(Clone)]
|
||||
pub struct Grouping {
|
||||
pub sequence: Sequence,
|
||||
|
@ -36,14 +50,18 @@ pub struct Grouping {
|
|||
pub close: &'static str,
|
||||
}
|
||||
|
||||
/// State needed for pretty-formatting of [Emittable]s.
|
||||
pub struct Formatter {
|
||||
/// Number of available columns. Used to decide between horizontal and vertical layouts.
|
||||
pub width: usize,
|
||||
indent_delta: String,
|
||||
current_indent: String,
|
||||
/// Mutable output buffer. Accumulates emitted text during writing.
|
||||
pub buffer: String,
|
||||
}
|
||||
|
||||
impl Formatter {
|
||||
/// Construct a Formatter using [DEFAULT_WIDTH] and a four-space indent.
|
||||
pub fn new() -> Self {
|
||||
Formatter {
|
||||
width: DEFAULT_WIDTH,
|
||||
|
@ -53,6 +71,7 @@ impl Formatter {
|
|||
}
|
||||
}
|
||||
|
||||
/// Construct a Formatter just like `self` but with an empty `buffer`.
|
||||
pub fn copy_empty(&self) -> Formatter {
|
||||
Formatter {
|
||||
width: self.width,
|
||||
|
@ -62,28 +81,37 @@ impl Formatter {
|
|||
}
|
||||
}
|
||||
|
||||
/// Yields the indent size.
|
||||
pub fn indent_size(self) -> usize {
|
||||
self.indent_delta.len()
|
||||
}
|
||||
|
||||
/// Updates the indent size.
|
||||
pub fn set_indent_size(&mut self, n: usize) {
|
||||
self.indent_delta = str::repeat(" ", n)
|
||||
}
|
||||
|
||||
/// Accumulates a text serialization of `e` in `buffer`.
|
||||
pub fn write<E: Emittable>(&mut self, e: E) {
|
||||
e.write_on(self)
|
||||
}
|
||||
|
||||
/// Emits a newline followed by indentation into `buffer`.
|
||||
pub fn newline(&mut self) {
|
||||
self.buffer.push_str(&self.current_indent)
|
||||
}
|
||||
|
||||
/// Creates a default Formatter, uses it to [write][Formatter::write] `e`, and yields the
|
||||
/// contents of its `buffer`.
|
||||
pub fn to_string<E: Emittable>(e: E) -> String {
|
||||
let mut f = Formatter::new();
|
||||
f.write(e);
|
||||
f.buffer
|
||||
}
|
||||
|
||||
/// Calls `f` in a context where the indentation has been increased by
|
||||
/// [Formatter::indent_size] spaces. Restores the indentation level after `f` returns.
|
||||
/// Yields the result of the call to `f`.
|
||||
pub fn with_indent<R, F: FnOnce(&mut Self) -> R>(&mut self, f: F) -> R {
|
||||
let old_indent = self.current_indent.clone();
|
||||
self.current_indent += &self.indent_delta;
|
||||
|
@ -93,6 +121,12 @@ impl Formatter {
|
|||
}
|
||||
}
|
||||
|
||||
impl Default for Formatter {
|
||||
fn default() -> Self {
|
||||
Self::new()
|
||||
}
|
||||
}
|
||||
|
||||
impl Default for VerticalMode {
|
||||
fn default() -> Self {
|
||||
Self::Variable
|
||||
|
@ -238,6 +272,12 @@ impl std::fmt::Debug for Grouping {
|
|||
|
||||
//---------------------------------------------------------------------------
|
||||
|
||||
/// Escapes `s` by substituting `\\` for `\`, `\"` for `"`, and `\u{...}` for characters
|
||||
/// outside the range 32..126, inclusive.
|
||||
///
|
||||
/// This process is intended to generate literals compatible with `rustc`; see [the language
|
||||
/// reference on "Character and string
|
||||
/// literals"](https://doc.rust-lang.org/reference/tokens.html#character-and-string-literals).
|
||||
pub fn escape_string(s: &str) -> String {
|
||||
let mut buf = String::new();
|
||||
buf.push('"');
|
||||
|
@ -253,6 +293,13 @@ pub fn escape_string(s: &str) -> String {
|
|||
buf
|
||||
}
|
||||
|
||||
/// Escapes `bs` into a Rust byte string literal, treating each byte as its ASCII equivalent
|
||||
/// except producing `\\` for 0x5c, `\"` for 0x22, and `\x..` for bytes outside the range
|
||||
/// 0x20..0x7e, inclusive.
|
||||
///
|
||||
/// This process is intended to generate literals compatible with `rustc`; see [the language
|
||||
/// reference on "Byte string
|
||||
/// literals"](https://doc.rust-lang.org/reference/tokens.html#byte-string-literals).
|
||||
pub fn escape_bytes(bs: &[u8]) -> String {
|
||||
let mut buf = String::new();
|
||||
buf.push_str("b\"");
|
||||
|
@ -262,7 +309,7 @@ pub fn escape_bytes(bs: &[u8]) -> String {
|
|||
'\\' => buf.push_str("\\\\"),
|
||||
'"' => buf.push_str("\\\""),
|
||||
_ if c >= ' ' && c <= '~' => buf.push(c),
|
||||
_ => write!(&mut buf, "\\x{{{:02x}}}", b).expect("no IO errors building a string"),
|
||||
_ => write!(&mut buf, "\\x{:02x}", b).expect("no IO errors building a string"),
|
||||
}
|
||||
}
|
||||
buf.push('"');
|
||||
|
@ -271,6 +318,7 @@ pub fn escape_bytes(bs: &[u8]) -> String {
|
|||
|
||||
//---------------------------------------------------------------------------
|
||||
|
||||
/// Utilities for constructing many useful kinds of [Sequence] and [Grouping].
|
||||
pub mod constructors {
|
||||
use super::Emittable;
|
||||
use super::Grouping;
|
||||
|
@ -279,10 +327,12 @@ pub mod constructors {
|
|||
use super::Vertical;
|
||||
use super::VerticalMode;
|
||||
|
||||
/// Produces a polymorphic, reference-counted [Item] from some generic [Emittable].
|
||||
pub fn item<E: 'static + Emittable>(i: E) -> Item {
|
||||
std::rc::Rc::new(i)
|
||||
}
|
||||
|
||||
/// *a*`::`*b*`::`*...*`::`*z*
|
||||
pub fn name(pieces: Vec<Item>) -> Sequence {
|
||||
Sequence {
|
||||
items: pieces,
|
||||
|
@ -292,6 +342,7 @@ pub mod constructors {
|
|||
}
|
||||
}
|
||||
|
||||
/// *ab...z* (directly adjacent, no separators or terminators)
|
||||
pub fn seq(items: Vec<Item>) -> Sequence {
|
||||
Sequence {
|
||||
items: items,
|
||||
|
@ -301,6 +352,7 @@ pub mod constructors {
|
|||
}
|
||||
}
|
||||
|
||||
/// *a*`, `*b*`, `*...*`, `*z*
|
||||
pub fn commas(items: Vec<Item>) -> Sequence {
|
||||
Sequence {
|
||||
items: items,
|
||||
|
@ -310,6 +362,7 @@ pub mod constructors {
|
|||
}
|
||||
}
|
||||
|
||||
/// `(`*a*`, `*b*`, `*...*`, `*z*`)`
|
||||
pub fn parens(items: Vec<Item>) -> Grouping {
|
||||
Grouping {
|
||||
sequence: commas(items),
|
||||
|
@ -318,6 +371,7 @@ pub mod constructors {
|
|||
}
|
||||
}
|
||||
|
||||
/// `[`*a*`, `*b*`, `*...*`, `*z*`]`
|
||||
pub fn brackets(items: Vec<Item>) -> Grouping {
|
||||
Grouping {
|
||||
sequence: commas(items),
|
||||
|
@ -326,6 +380,7 @@ pub mod constructors {
|
|||
}
|
||||
}
|
||||
|
||||
/// `<`*a*`, `*b*`, `*...*`, `*z*`>`
|
||||
pub fn anglebrackets(items: Vec<Item>) -> Grouping {
|
||||
Grouping {
|
||||
sequence: commas(items),
|
||||
|
@ -334,6 +389,7 @@ pub mod constructors {
|
|||
}
|
||||
}
|
||||
|
||||
/// `{`*a*`, `*b*`, `*...*`, `*z*`}`
|
||||
pub fn braces(items: Vec<Item>) -> Grouping {
|
||||
Grouping {
|
||||
sequence: commas(items),
|
||||
|
@ -342,6 +398,7 @@ pub mod constructors {
|
|||
}
|
||||
}
|
||||
|
||||
/// `{`*a*` `*b*` `*...*` `*z*`}`
|
||||
pub fn block(items: Vec<Item>) -> Grouping {
|
||||
Grouping {
|
||||
sequence: Sequence {
|
||||
|
@ -355,10 +412,12 @@ pub mod constructors {
|
|||
}
|
||||
}
|
||||
|
||||
/// As [block], but always vertical
|
||||
pub fn codeblock(items: Vec<Item>) -> Grouping {
|
||||
vertical(false, block(items))
|
||||
}
|
||||
|
||||
/// `{`*a*`; `*b*`; `*...*`; `*z*`}`
|
||||
pub fn semiblock(items: Vec<Item>) -> Grouping {
|
||||
Grouping {
|
||||
sequence: Sequence {
|
||||
|
@ -372,6 +431,9 @@ pub mod constructors {
|
|||
}
|
||||
}
|
||||
|
||||
/// Overrides `v` to be always vertical.
|
||||
///
|
||||
/// If `spaced` is true, inserts an extra newline between items.
|
||||
pub fn vertical<V: Vertical>(spaced: bool, mut v: V) -> V {
|
||||
v.set_vertical_mode(if spaced {
|
||||
VerticalMode::ExtraNewline
|
||||
|
@ -381,6 +443,7 @@ pub mod constructors {
|
|||
v
|
||||
}
|
||||
|
||||
/// Adds a layer of indentation to the given [Sequence].
|
||||
pub fn indented(sequence: Sequence) -> Grouping {
|
||||
Grouping {
|
||||
sequence,
|
||||
|
@ -390,52 +453,84 @@ pub mod constructors {
|
|||
}
|
||||
}
|
||||
|
||||
/// Ergonomic syntax for using the constructors in submodule [constructors]; see the
|
||||
/// documentation for the macros, which appears on the [page for the crate
|
||||
/// itself][crate#macros].
|
||||
pub mod macros {
|
||||
/// `name!(`*a*`, `*b*`, `*...*`, `*z*`)` ⟶ *a*`::`*b*`::`*...*`::`*z*
|
||||
///
|
||||
/// See [super::constructors::name].
|
||||
#[macro_export]
|
||||
macro_rules! name {
|
||||
($($item:expr),*) => {$crate::syntax::block::constructors::name(vec![$(std::rc::Rc::new($item)),*])}
|
||||
}
|
||||
|
||||
/// `seq!(`*a*`, `*b*`, `*...*`, `*z*`)` ⟶ *ab...z*
|
||||
///
|
||||
/// See [super::constructors::seq].
|
||||
#[macro_export]
|
||||
macro_rules! seq {
|
||||
($($item:expr),*) => {$crate::syntax::block::constructors::seq(vec![$(std::rc::Rc::new($item)),*])}
|
||||
}
|
||||
|
||||
/// `commas!(`*a*`, `*b*`, `*...*`, `*z*`)` ⟶ *a*`, `*b*`, `*...*`, `*z*
|
||||
///
|
||||
/// See [super::constructors::commas].
|
||||
#[macro_export]
|
||||
macro_rules! commas {
|
||||
($($item:expr),*) => {$crate::syntax::block::constructors::commas(vec![$(std::rc::Rc::new($item)),*])}
|
||||
}
|
||||
|
||||
/// `parens!(`*a*`, `*b*`, `*...*`, `*z*`)` ⟶ `(`*a*`, `*b*`, `*...*`, `*z*`)`
|
||||
///
|
||||
/// See [super::constructors::parens].
|
||||
#[macro_export]
|
||||
macro_rules! parens {
|
||||
($($item:expr),*) => {$crate::syntax::block::constructors::parens(vec![$(std::rc::Rc::new($item)),*])}
|
||||
}
|
||||
|
||||
/// `brackets!(`*a*`, `*b*`, `*...*`, `*z*`)` ⟶ `[`*a*`, `*b*`, `*...*`, `*z*`]`
|
||||
///
|
||||
/// See [super::constructors::brackets].
|
||||
#[macro_export]
|
||||
macro_rules! brackets {
|
||||
($($item:expr),*) => {$crate::syntax::block::constructors::brackets(vec![$(std::rc::Rc::new($item)),*])}
|
||||
}
|
||||
|
||||
/// `anglebrackets!(`*a*`, `*b*`, `*...*`, `*z*`)` ⟶ `<`*a*`, `*b*`, `*...*`, `*z*`>`
|
||||
///
|
||||
/// See [super::constructors::anglebrackets].
|
||||
#[macro_export]
|
||||
macro_rules! anglebrackets {
|
||||
($($item:expr),*) => {$crate::syntax::block::constructors::anglebrackets(vec![$(std::rc::Rc::new($item)),*])}
|
||||
}
|
||||
|
||||
/// `braces!(`*a*`, `*b*`, `*...*`, `*z*`)` ⟶ `{`*a*`, `*b*`, `*...*`, `*z*`}`
|
||||
///
|
||||
/// See [super::constructors::braces].
|
||||
#[macro_export]
|
||||
macro_rules! braces {
|
||||
($($item:expr),*) => {$crate::syntax::block::constructors::braces(vec![$(std::rc::Rc::new($item)),*])}
|
||||
}
|
||||
|
||||
/// `block!(`*a*`, `*b*`, `*...*`, `*z*`)` ⟶ `{`*a*` `*b*` `*...*` `*z*`}`
|
||||
///
|
||||
/// See [super::constructors::block].
|
||||
#[macro_export]
|
||||
macro_rules! block {
|
||||
($($item:expr),*) => {$crate::syntax::block::constructors::block(vec![$(std::rc::Rc::new($item)),*])}
|
||||
}
|
||||
|
||||
/// As [`block`]`!`, but always vertical. See
|
||||
/// [constructors::codeblock][super::constructors::codeblock].
|
||||
#[macro_export]
|
||||
macro_rules! codeblock {
|
||||
($($item:expr),*) => {$crate::syntax::block::constructors::codeblock(vec![$(std::rc::Rc::new($item)),*])}
|
||||
}
|
||||
|
||||
/// `semiblock!(`*a*`, `*b*`, `*...*`, `*z*`)` ⟶ `{`*a*`; `*b*`; `*...*`; `*z*`}`
|
||||
///
|
||||
/// See [super::constructors::semiblock].
|
||||
#[macro_export]
|
||||
macro_rules! semiblock {
|
||||
($($item:expr),*) => {$crate::syntax::block::constructors::semiblock(vec![$(std::rc::Rc::new($item)),*])}
|
||||
|
|
|
@ -1 +1,3 @@
|
|||
//! A library for emitting pretty-formatted structured source code.
|
||||
|
||||
pub mod block;
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
[package]
|
||||
name = "preserves"
|
||||
version = "3.990.0"
|
||||
version = "3.990.2"
|
||||
authors = ["Tony Garnock-Jones <tonyg@leastfixedpoint.com>"]
|
||||
edition = "2018"
|
||||
description = "Implementation of the Preserves serialization format via serde."
|
||||
|
|
|
@ -0,0 +1,23 @@
|
|||
```shell
|
||||
cargo add preserves
|
||||
```
|
||||
|
||||
This crate ([`preserves` on crates.io](https://crates.io/crates/preserves)) implements
|
||||
[Preserves](https://preserves.dev/) for Rust. It provides the core
|
||||
[semantics](https://preserves.dev/preserves.html#semantics) as well as both the [human-readable
|
||||
text syntax][crate::value::text] (a superset of JSON) and [machine-oriented binary
|
||||
format][crate::value::packed] (including
|
||||
[canonicalization](https://preserves.dev/canonical-binary.html)) for Preserves.
|
||||
|
||||
This crate is the foundation for others such as
|
||||
|
||||
- [`preserves-schema`](https://docs.rs/preserves-schema/), which implements [Preserves
|
||||
Schema](https://preserves.dev/preserves-schema.html);
|
||||
- [`preserves-path`](https://docs.rs/preserves-path/), which implements [Preserves
|
||||
Path](https://preserves.dev/preserves-path.html); and
|
||||
- [`preserves-tools`](https://crates.io/crates/preserves-tools), which provides command-line
|
||||
utilities for working with Preserves, in particular
|
||||
[`preserves-tool`](https://preserves.dev/doc/preserves-tool.html), a kind of Preserves
|
||||
Swiss-army knife.
|
||||
|
||||
It also includes [Serde](https://serde.rs/) support (modules [de], [ser], [symbol], [set]).
|
|
@ -0,0 +1,33 @@
|
|||
For a value `V`, we write `«V»` for the binary encoding of `V`.
|
||||
|
||||
```text
|
||||
«#f» = [0x80]
|
||||
«#t» = [0x81]
|
||||
|
||||
«@W V» = [0x85] ++ «W» ++ «V»
|
||||
«#!V» = [0x86] ++ «V»
|
||||
|
||||
«V» if V ∈ Float = [0x87, 0x04] ++ binary32(V)
|
||||
«V» if V ∈ Double = [0x87, 0x08] ++ binary64(V)
|
||||
|
||||
«V» if V ∈ SignedInteger = [0xB0] ++ varint(|intbytes(V)|) ++ intbytes(V)
|
||||
«V» if V ∈ String = [0xB1] ++ varint(|utf8(V)|) ++ utf8(V)
|
||||
«V» if V ∈ ByteString = [0xB2] ++ varint(|V|) ++ V
|
||||
«V» if V ∈ Symbol = [0xB3] ++ varint(|utf8(V)|) ++ utf8(V)
|
||||
|
||||
«<L F_1...F_m>» = [0xB4] ++ «L» ++ «F_1» ++...++ «F_m» ++ [0x84]
|
||||
«[X_1...X_m]» = [0xB5] ++ «X_1» ++...++ «X_m» ++ [0x84]
|
||||
«#{E_1...E_m}» = [0xB6] ++ «E_1» ++...++ «E_m» ++ [0x84]
|
||||
«{K_1:V_1...K_m:V_m}» = [0xB7] ++ «K_1» ++ «V_1» ++...++ «K_m» ++ «V_m» ++ [0x84]
|
||||
|
||||
varint(n) = [n] if n < 128
|
||||
[(n & 127) | 128] ++ varint(n >> 7) if n ≥ 128
|
||||
|
||||
intbytes(n) = the empty sequence if n = 0, otherwise signedBigEndian(n)
|
||||
|
||||
signedBigEndian(n) = [n & 255] if -128 ≤ n ≤ 127
|
||||
signedBigEndian(n >> 8) ++ [n & 255] otherwise
|
||||
```
|
||||
|
||||
The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
|
||||
8-byte IEEE 754 binary representations of `F` and `D`, respectively.
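
The equations above translate fairly directly into code. The following is a minimal, standard-library-only sketch, not the crate's actual encoder: the function names (`varint`, `intbytes`, `encode_signed_integer`, and so on) are hypothetical, and `i64` stands in for the arbitrary-precision integers that Preserves permits.

```rust
/// varint(n): little-endian base-128, with the high bit set on every byte but the last.
fn varint(mut n: u64) -> Vec<u8> {
    let mut out = Vec::new();
    while n >= 128 {
        out.push(((n & 127) | 128) as u8);
        n >>= 7;
    }
    out.push(n as u8);
    out
}

/// signedBigEndian(n): shortest two's-complement big-endian representation.
fn signed_big_endian(n: i64) -> Vec<u8> {
    if (-128..=127).contains(&n) {
        vec![(n & 255) as u8]
    } else {
        // `>>` on i64 is an arithmetic shift, so the sign is preserved.
        let mut out = signed_big_endian(n >> 8);
        out.push((n & 255) as u8);
        out
    }
}

/// intbytes(n): the empty sequence for zero, otherwise signedBigEndian(n).
fn intbytes(n: i64) -> Vec<u8> {
    if n == 0 {
        Vec::new()
    } else {
        signed_big_endian(n)
    }
}

/// «V» for V ∈ SignedInteger (assumption: V fits in an i64).
fn encode_signed_integer(n: i64) -> Vec<u8> {
    let body = intbytes(n);
    let mut out = vec![0xB0];
    out.extend(varint(body.len() as u64));
    out.extend(body);
    out
}

/// «V» for V ∈ String.
fn encode_string(s: &str) -> Vec<u8> {
    let mut out = vec![0xB1];
    out.extend(varint(s.len() as u64));
    out.extend_from_slice(s.as_bytes());
    out
}

/// «V» for V ∈ Double: tag bytes followed by big-endian IEEE 754 binary64.
fn encode_double(d: f64) -> Vec<u8> {
    let mut out = vec![0x87, 0x08];
    out.extend_from_slice(&d.to_be_bytes());
    out
}
```

Under these definitions, `encode_signed_integer(128)` yields `[0xB0, 0x02, 0x00, 0x80]`: the leading zero byte keeps the value from being read as negative.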
|
|
@ -0,0 +1,44 @@
|
|||
```text
|
||||
Document := Value ws
|
||||
Value := ws (Record | Collection | Atom | Embedded | Annotated)
|
||||
Collection := Sequence | Dictionary | Set
|
||||
Atom := Boolean | ByteString | String | QuotedSymbol | Symbol | Number
|
||||
ws := (space | tab | cr | lf | `,`)*
|
||||
|
||||
Record := `<` Value+ ws `>`
|
||||
Sequence := `[` Value* ws `]`
|
||||
Dictionary := `{` (Value ws `:` Value)* ws `}`
|
||||
Set := `#{` Value* ws `}`
|
||||
|
||||
Boolean := `#t` | `#f`
|
||||
ByteString := `#"` binchar* `"`
|
||||
| `#x"` (ws hex hex)* ws `"`
|
||||
| `#[` (ws base64char)* ws `]`
|
||||
String := `"` («any unicode scalar except `\` or `"`» | escaped | `\"`)* `"`
|
||||
QuotedSymbol := `|` («any unicode scalar except `\` or `|`» | escaped | `\|`)* `|`
|
||||
Symbol := (`A`..`Z` | `a`..`z` | `0`..`9` | sympunct | symuchar)+
|
||||
Number := Float | Double | SignedInteger
|
||||
Float := flt (`f`|`F`) | `#xf"` (ws hex hex)⁴ ws `"`
|
||||
Double := flt | `#xd"` (ws hex hex)⁸ ws `"`
|
||||
SignedInteger := int
|
||||
|
||||
Embedded := `#!` Value
|
||||
Annotated := Annotation Value
|
||||
Annotation := `@` Value | `;` «any unicode scalar except cr or lf»* (cr | lf)
|
||||
|
||||
escaped := `\\` | `\/` | `\b` | `\f` | `\n` | `\r` | `\t` | `\u` hex hex hex hex
|
||||
binescaped := `\\` | `\/` | `\b` | `\f` | `\n` | `\r` | `\t` | `\x` hex hex
|
||||
binchar := «any scalar ≥32 and ≤126, except `\` or `"`» | binescaped | `\"`
|
||||
base64char := `A`..`Z` | `a`..`z` | `0`..`9` | `+` | `/` | `-` | `_` | `=`
|
||||
sympunct := `~` | `!` | `$` | `%` | `^` | `&` | `*` | `?`
|
||||
| `_` | `=` | `+` | `-` | `/` | `.`
|
||||
symuchar := «any scalar value ≥128 whose Unicode category is
|
||||
Lu, Ll, Lt, Lm, Lo, Mn, Mc, Me, Nd, Nl, No, Pc,
|
||||
Pd, Po, Sc, Sm, Sk, So, or Co»
|
||||
|
||||
flt := int ( frac exp | frac | exp )
|
||||
int := (`-`|`+`)? (`0`..`9`)+
|
||||
frac := `.` (`0`..`9`)+
|
||||
exp := (`e`|`E`) (`-`|`+`)? (`0`..`9`)+
|
||||
hex := `A`..`F` | `a`..`f` | `0`..`9`
|
||||
```
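
As an illustration of how one production in the grammar above might be read, here is a small sketch of a recognizer for the hexadecimal `ByteString` form, `#x"` (ws hex hex)* ws `"`. It uses only the standard library; the function names `is_ws` and `parse_hex_bytestring` are hypothetical and are not part of the `preserves` crate, whose real text reader is considerably more general.

```rust
/// ws := (space | tab | cr | lf | `,`)*
fn is_ws(c: char) -> bool {
    matches!(c, ' ' | '\t' | '\r' | '\n' | ',')
}

/// Parses `#x"` (ws hex hex)* ws `"` and returns the decoded bytes,
/// or `None` if the input does not match the production exactly.
fn parse_hex_bytestring(input: &str) -> Option<Vec<u8>> {
    let body = input.strip_prefix("#x\"")?;
    let mut chars = body.chars().peekable();
    let mut out = Vec::new();
    loop {
        // Whitespace (including commas) may precede each pair of digits
        // and the closing quote.
        while chars.peek().map_or(false, |c| is_ws(*c)) {
            chars.next();
        }
        match chars.next()? {
            // The closing quote must be the last character of the input.
            '"' => return if chars.next().is_none() { Some(out) } else { None },
            hi => {
                // The two hex digits of a byte must be directly adjacent.
                let lo = chars.next()?;
                let hi = hi.to_digit(16)?;
                let lo = lo.to_digit(16)?;
                out.push((hi * 16 + lo) as u8);
            }
        }
    }
}
```

Note that whitespace is accepted between bytes but not between the two digits of a single byte, so `#x"4 8"` is rejected while `#x"48 65"` is accepted.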
|
|
@ -0,0 +1,18 @@
|
|||
```text
|
||||
Value = Atom
|
||||
| Compound
|
||||
| Embedded
|
||||
|
||||
Atom = Boolean
|
||||
| Float
|
||||
| Double
|
||||
| SignedInteger
|
||||
| String
|
||||
| ByteString
|
||||
| Symbol
|
||||
|
||||
Compound = Record
|
||||
| Sequence
|
||||
| Set
|
||||
| Dictionary
|
||||
```
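
The grammar above maps naturally onto Rust enums. The following is only a sketch of that shape and is not the representation used by the `preserves` crate: all type and function names are hypothetical, `i128` stands in for arbitrary-precision integers, and `Vec` is used for sets and dictionaries to avoid requiring an ordering on values.

```rust
/// Atom = Boolean | Float | Double | SignedInteger | String | ByteString | Symbol
#[derive(Debug, Clone, PartialEq)]
enum Atom {
    Boolean(bool),
    Float(f32),
    Double(f64),
    SignedInteger(i128), // assumption: the real model is unbounded
    String(String),
    ByteString(Vec<u8>),
    Symbol(String),
}

/// Compound = Record | Sequence | Set | Dictionary
#[derive(Debug, Clone, PartialEq)]
enum Compound<E> {
    Record { label: Box<Value<E>>, fields: Vec<Value<E>> },
    Sequence(Vec<Value<E>>),
    Set(Vec<Value<E>>),
    Dictionary(Vec<(Value<E>, Value<E>)>),
}

/// Value = Atom | Compound | Embedded, parameterised over the embedded type `E`.
#[derive(Debug, Clone, PartialEq)]
enum Value<E> {
    Atom(Atom),
    Compound(Compound<E>),
    Embedded(E),
}

/// Names the grammar alternative that `v` belongs to.
fn kind<E>(v: &Value<E>) -> &'static str {
    match v {
        Value::Atom(Atom::Boolean(_)) => "Boolean",
        Value::Atom(Atom::Float(_)) => "Float",
        Value::Atom(Atom::Double(_)) => "Double",
        Value::Atom(Atom::SignedInteger(_)) => "SignedInteger",
        Value::Atom(Atom::String(_)) => "String",
        Value::Atom(Atom::ByteString(_)) => "ByteString",
        Value::Atom(Atom::Symbol(_)) => "Symbol",
        Value::Compound(Compound::Record { .. }) => "Record",
        Value::Compound(Compound::Sequence(_)) => "Sequence",
        Value::Compound(Compound::Set(_)) => "Set",
        Value::Compound(Compound::Dictionary(_)) => "Dictionary",
        Value::Embedded(_) => "Embedded",
    }
}
```

A record such as `<point 1 2>` would be built in this sketch as a `Compound::Record` whose `label` is the symbol `point` and whose `fields` are two `SignedInteger` atoms.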
|
|
@ -0,0 +1,12 @@
|
|||
*Preserves* is a data model, with associated serialization formats.
|
||||
|
||||
It supports *records* with user-defined *labels*, embedded
|
||||
*references*, and the usual suite of atomic and compound data types,
|
||||
including *binary* data as a distinct type from text strings. Its
|
||||
*annotations* allow separation of data from metadata such as comments,
|
||||
trace information, and provenance information.
|
||||
|
||||
Preserves departs from many other data languages in defining how to
|
||||
*compare* two values. Comparison is based on the data model, not on
|
||||
syntax or on data structures of any particular implementation
|
||||
language.
|
|
@ -1,3 +1,5 @@
|
|||
//! Support for Serde deserialization of Preserves terms described by Rust data types.
|
||||
|
||||
use serde::de::{DeserializeSeed, EnumAccess, MapAccess, SeqAccess, VariantAccess, Visitor};
|
||||
use serde::Deserialize;
|
||||
|
||||
|
@ -11,13 +13,21 @@ use super::value::{IOValue, IOValueDomainCodec, PackedReader, TextReader, ViaCod
|
|||
|
||||
pub use super::error::Error;
|
||||
|
||||
/// A [std::result::Result] type including [Error], the Preserves Serde deserialization error
|
||||
/// type, as its error.
|
||||
pub type Result<T> = std::result::Result<T, Error>;
|
||||
|
||||
/// Serde deserializer for Preserves-encoded Rust data. Use [Deserializer::from_reader] to
|
||||
/// construct instances, or [from_bytes]/[from_text]/[from_read]/[from_reader] etc. to
|
||||
/// deserialize single terms directly.
|
||||
pub struct Deserializer<'de, 'r, R: Reader<'de, IOValue>> {
|
||||
/// The underlying Preserves [reader][crate::value::reader::Reader].
|
||||
pub read: &'r mut R,
|
||||
phantom: PhantomData<&'de ()>,
|
||||
}
|
||||
|
||||
/// Deserialize a `T` from `bytes`, which must contain a Preserves [machine-oriented binary
|
||||
/// syntax][crate::value::packed] term corresponding to the Serde serialization of a `T`.
|
||||
pub fn from_bytes<'de, T>(bytes: &'de [u8]) -> Result<T>
|
||||
where
|
||||
T: Deserialize<'de>,
|
||||
|
@ -28,6 +38,8 @@ where
|
|||
))
|
||||
}
|
||||
|
||||
/// Deserialize a `T` from `text`, which must contain a Preserves [text
|
||||
/// syntax][crate::value::text] term corresponding to the Serde serialization of a `T`.
|
||||
pub fn from_text<'de, T>(text: &'de str) -> Result<T>
|
||||
where
|
||||
T: Deserialize<'de>,
|
||||
|
@ -38,6 +50,8 @@ where
|
|||
))
|
||||
}
|
||||
|
||||
/// Deserialize a `T` from `read`, which must yield a Preserves [machine-oriented binary
|
||||
/// syntax][crate::value::packed] term corresponding to the Serde serialization of a `T`.
|
||||
pub fn from_read<'de, 'r, IOR: io::Read + io::Seek, T>(read: &'r mut IOR) -> Result<T>
|
||||
where
|
||||
T: Deserialize<'de>,
|
||||
|
@ -48,6 +62,8 @@ where
|
|||
))
|
||||
}
|
||||
|
||||
/// Deserialize a `T` from `read`, which must yield a Preserves term corresponding to the Serde
|
||||
/// serialization of a `T`.
|
||||
pub fn from_reader<'r, 'de, R: Reader<'de, IOValue>, T>(read: &'r mut R) -> Result<T>
|
||||
where
|
||||
T: Deserialize<'de>,
|
||||
|
@ -58,6 +74,7 @@ where
|
|||
}
|
||||
|
||||
impl<'r, 'de, R: Reader<'de, IOValue>> Deserializer<'de, 'r, R> {
|
||||
/// Construct a Deserializer from `read`, a Preserves [reader][crate::value::Reader].
|
||||
pub fn from_reader(read: &'r mut R) -> Self {
|
||||
Deserializer {
|
||||
read,
|
||||
|
@ -344,6 +361,7 @@ impl<'r, 'de, 'a, R: Reader<'de, IOValue>> serde::de::Deserializer<'de>
|
|||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub struct Seq<'de, 'r, 'a, R: Reader<'de, IOValue>> {
|
||||
b: B::Type,
|
||||
i: B::Item,
|
||||
|
|
|
@ -1,27 +1,47 @@
|
|||
//! Serde and plain-Preserves codec errors.
|
||||
|
||||
use num::bigint::BigInt;
|
||||
use std::convert::From;
|
||||
use std::io;
|
||||
|
||||
/// Representation of parse, deserialization, and other conversion errors.
|
||||
#[derive(Debug)]
|
||||
pub enum Error {
|
||||
/// Generic IO error.
|
||||
Io(io::Error),
|
||||
/// Generic message for the user.
|
||||
Message(String),
|
||||
/// Invalid unicode scalar `n` found during interpretation of a `<UnicodeScalar n>` record
|
||||
/// as a Rust `char`.
|
||||
InvalidUnicodeScalar(u32),
|
||||
/// Preserves supports arbitrary integers; when these are converted to specific Rust
|
||||
/// machine word types, sometimes they exceed the available range.
|
||||
NumberOutOfRange(BigInt),
|
||||
/// Serde has limited support for deserializing free-form data; this error is signalled
|
||||
/// when one of the limits is hit.
|
||||
CannotDeserializeAny,
|
||||
/// Syntax error: missing closing delimiter (`)`, `]`, `}`, `>` in text syntax; `0x84` in binary syntax; etc.)
|
||||
MissingCloseDelimiter,
|
||||
/// Signalled when an expected term is not present.
|
||||
MissingItem,
|
||||
/// Signalled when what was received did not match expectations.
|
||||
Expected(ExpectedKind, Received),
|
||||
#[doc(hidden)] // TODO remove this enum variant? It isn't used
|
||||
StreamingSerializationUnsupported,
|
||||
}
|
||||
|
||||
/// Used in [Error::Expected] to indicate what was received.
|
||||
#[derive(Debug)]
|
||||
pub enum Received {
|
||||
#[doc(hidden)] // TODO remove this enum variant? It isn't used
|
||||
ReceivedSomethingElse,
|
||||
/// Received a record with the given label symbol text.
|
||||
ReceivedRecordWithLabel(String),
|
||||
/// Received some other value, described in the `String`
|
||||
ReceivedOtherValue(String),
|
||||
}
|
||||
|
||||
/// Used in [Error::Expected] to indicate what was expected.
|
||||
#[derive(Debug, PartialEq)]
|
||||
pub enum ExpectedKind {
|
||||
Boolean,
|
||||
|
@ -35,7 +55,9 @@ pub enum ExpectedKind {
|
|||
ByteString,
|
||||
Symbol,
|
||||
|
||||
/// Expected a record, either of a specific arity (length) or of no specific arity
|
||||
Record(Option<usize>),
|
||||
/// Expected a record with a symbol label with text `String`, perhaps of some specific arity
|
||||
SimpleRecord(String, Option<usize>),
|
||||
Sequence,
|
||||
Set,
|
||||
|
@ -87,14 +109,17 @@ impl std::fmt::Display for Error {
|
|||
|
||||
//---------------------------------------------------------------------------
|
||||
|
||||
/// True iff `e` is `Error::Io`
|
||||
pub fn is_io_error(e: &Error) -> bool {
|
||||
matches!(e, Error::Io(_))
|
||||
}
|
||||
|
||||
/// Produce the generic "end of file" error, `Error::Io(`[io_eof]`())`
|
||||
pub fn eof() -> Error {
|
||||
Error::Io(io_eof())
|
||||
}
|
||||
|
||||
/// True iff `e` is an "end of file" error; see [is_eof_io_error]
|
||||
pub fn is_eof_error(e: &Error) -> bool {
|
||||
if let Error::Io(ioe) = e {
|
||||
is_eof_io_error(ioe)
|
||||
|
@ -103,10 +128,12 @@ pub fn is_eof_error(e: &Error) -> bool {
|
|||
}
|
||||
}
|
||||
|
||||
/// Produce a syntax error bearing the message `s`
|
||||
pub fn syntax_error(s: &str) -> Error {
|
||||
Error::Io(io_syntax_error(s))
|
||||
}
|
||||
|
||||
/// True iff `e` is a syntax error; see [is_syntax_io_error]
|
||||
pub fn is_syntax_error(e: &Error) -> bool {
|
||||
if let Error::Io(ioe) = e {
|
||||
is_syntax_io_error(ioe)
|
||||
|
@ -117,18 +144,22 @@ pub fn is_syntax_error(e: &Error) -> bool {
|
|||
|
||||
//---------------------------------------------------------------------------
|
||||
|
||||
/// Produce an [io::Error] of [io::ErrorKind::UnexpectedEof].
|
||||
pub fn io_eof() -> io::Error {
|
||||
io::Error::new(io::ErrorKind::UnexpectedEof, "EOF")
|
||||
}
|
||||
|
||||
/// True iff `e` is [io::ErrorKind::UnexpectedEof]
|
||||
pub fn is_eof_io_error(e: &io::Error) -> bool {
|
||||
matches!(e.kind(), io::ErrorKind::UnexpectedEof)
|
||||
}
|
||||
|
||||
/// Produce a syntax error ([io::ErrorKind::InvalidData]) bearing the message `s`
|
||||
pub fn io_syntax_error(s: &str) -> io::Error {
|
||||
io::Error::new(io::ErrorKind::InvalidData, s)
|
||||
}
|
||||
|
||||
/// True iff `e` is an [io::ErrorKind::InvalidData] (a syntax error)
|
||||
pub fn is_syntax_io_error(e: &io::Error) -> bool {
|
||||
matches!(e.kind(), io::ErrorKind::InvalidData)
|
||||
}
|
||||
|
|
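The `io`-level helpers in this file classify errors purely by `io::ErrorKind`. The following is a minimal standalone sketch of that pattern using only `std::io`; the four helper bodies mirror the ones in the diff, while `should_wait_for_more_input` is a hypothetical caller added for illustration and is not part of the crate.

```rust
use std::io;

// Mirrors of the helpers shown in the diff, reproduced so the sketch
// compiles on its own without the `preserves` crate.
pub fn io_eof() -> io::Error {
    io::Error::new(io::ErrorKind::UnexpectedEof, "EOF")
}

pub fn io_syntax_error(s: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, s)
}

pub fn is_eof_io_error(e: &io::Error) -> bool {
    matches!(e.kind(), io::ErrorKind::UnexpectedEof)
}

pub fn is_syntax_io_error(e: &io::Error) -> bool {
    matches!(e.kind(), io::ErrorKind::InvalidData)
}

/// Hypothetical caller: an end-of-file error may be cured by feeding the
/// parser more bytes, whereas a syntax error will not go away.
pub fn should_wait_for_more_input(e: &io::Error) -> bool {
    is_eof_io_error(e) && !is_syntax_io_error(e)
}
```

Because the classification depends only on the error kind, any `io::Error` of kind `UnexpectedEof` or `InvalidData` is recognised, whichever code produced it.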
|
@ -1,19 +1,38 @@
|
|||
//! Utilities for producing and flexibly parsing strings containing hexadecimal binary data.
|
||||
|
||||
/// Utility for parsing hex binary data from strings.
|
||||
pub enum HexParser {
|
||||
/// "Liberal" parsing simply ignores characters that are not (case-insensitive) hex digits.
|
||||
Liberal,
|
||||
/// "Whitespace allowed" parsing ignores whitespace, but fails a parse on anything other
|
||||
/// than hex or whitespace.
|
||||
WhitespaceAllowed,
|
||||
/// "Strict" parsing accepts only (case-insensitive) hex digits; no whitespace, no other
|
||||
/// characters.
|
||||
Strict,
|
||||
}
|
||||
|
||||
/// Utility for formatting binary data as hex.
|
||||
pub enum HexFormatter {
|
||||
/// Produces LF-separated lines with a maximum of `usize` hex digits in each line.
|
||||
Lines(usize),
|
||||
/// Simply packs hex digits in as tightly as possible.
|
||||
Packed,
|
||||
}
|
||||
|
||||
/// Convert a number 0..15 to a hex digit [char].
|
||||
///
|
||||
/// # Panics
|
||||
///
|
||||
/// Panics if given `v` outside the range 0..15 inclusive.
|
||||
///
|
||||
pub fn hexdigit(v: u8) -> char {
|
||||
char::from_digit(v as u32, 16).expect("hexadecimal digit value")
|
||||
}
|
||||
|
||||
impl HexParser {
|
||||
/// Decode `s` according to the given rules for `self`; see [HexParser].
|
||||
/// If the parse fails, yield `None`.
|
||||
pub fn decode(&self, s: &str) -> Option<Vec<u8>> {
|
||||
let mut result = Vec::new();
|
||||
let mut buf: u8 = 0;
|
||||
|
@ -49,6 +68,7 @@ impl HexParser {
|
|||
}
|
||||
|
||||
impl HexFormatter {
|
||||
/// Encode `bs` according to the given rules for `self`; see [HexFormatter].
|
||||
pub fn encode(&self, bs: &[u8]) -> String {
|
||||
match self {
|
||||
HexFormatter::Lines(max_line_length) => {
|
||||
|
|
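The three `HexParser` modes differ only in how they treat characters that are not hex digits. Since the body of `decode` is elided in this diff, the following is a hedged sketch of how such a decoder could behave, not the crate's implementation: `HexMode` and `decode_hex` are hypothetical names, and rejecting an odd number of digits is an assumption.

```rust
/// Hypothetical stand-in for `HexParser`; variant names follow the diff.
pub enum HexMode {
    Liberal,
    WhitespaceAllowed,
    Strict,
}

/// Decode `s` into bytes, returning `None` if the input is rejected.
pub fn decode_hex(mode: &HexMode, s: &str) -> Option<Vec<u8>> {
    let mut result = Vec::new();
    // Holds the high nibble while waiting for the low nibble of a byte.
    let mut high: Option<u8> = None;
    for c in s.chars() {
        match c.to_digit(16) {
            Some(d) => {
                let d = d as u8;
                match high.take() {
                    None => high = Some(d),
                    Some(h) => result.push((h << 4) | d),
                }
            }
            None => match mode {
                // Liberal: silently skip anything that is not a hex digit.
                HexMode::Liberal => (),
                // WhitespaceAllowed: skip whitespace, reject everything else.
                HexMode::WhitespaceAllowed if c.is_whitespace() => (),
                // Strict, or a non-whitespace character: fail the parse.
                _ => return None,
            },
        }
    }
    // Assumption: a dangling half-byte counts as a failed parse.
    if high.is_some() {
        None
    } else {
        Some(result)
    }
}
```

Under these assumptions, `"48 65"` is rejected in `Strict` mode and accepted in the other two, while `"48:65"` is accepted only in `Liberal` mode.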
|
@ -1,3 +1,9 @@
|
|||
#![doc = concat!(
|
||||
include_str!("../README.md"),
|
||||
"# What is Preserves?\n\n",
|
||||
include_str!("../doc/what-is-preserves.md"),
|
||||
)]
|
||||
|
||||
pub mod de;
|
||||
pub mod error;
|
||||
pub mod hex;
|
||||
|
|
|
@ -1,3 +1,5 @@
|
|||
//! Support for Serde serialization of Rust data types into Preserves terms.
|
||||
|
||||
use super::value::boundary as B;
|
||||
use super::value::writer::{CompoundWriter, Writer};
|
||||
use super::value::IOValueDomainCodec;
|
||||
|
@ -7,11 +9,16 @@ pub use super::error::Error;
|
|||
type Result<T> = std::result::Result<T, Error>;
|
||||
|
||||
#[derive(Debug)]
|
||||
/// Serde serializer for Preserves-encoding Rust data. Construct via [Serializer::new], and use
|
||||
/// with [serde::Serialize::serialize] methods.
|
||||
pub struct Serializer<'w, W: Writer> {
|
||||
/// The underlying Preserves [writer][crate::value::writer::Writer].
|
||||
pub write: &'w mut W,
|
||||
}
|
||||
|
||||
impl<'w, W: Writer> Serializer<'w, W> {
|
||||
/// Construct a new [Serializer] targeting the given
|
||||
/// [writer][crate::value::writer::Writer].
|
||||
pub fn new(write: &'w mut W) -> Self {
|
||||
Serializer { write }
|
||||
}
|
||||
|
@ -22,6 +29,7 @@ enum SequenceVariant<W: Writer> {
|
|||
Record(W::RecWriter),
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub struct SerializeCompound<'a, 'w, W: Writer> {
|
||||
b: B::Type,
|
||||
i: B::Item,
|
||||
|
@ -29,6 +37,7 @@ pub struct SerializeCompound<'a, 'w, W: Writer> {
|
|||
c: SequenceVariant<W>,
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub struct SerializeDictionary<'a, 'w, W: Writer> {
|
||||
b: B::Type,
|
||||
ser: &'a mut Serializer<'w, W>,
|
||||
|
@ -442,6 +451,8 @@ impl<'a, 'w, W: Writer> serde::ser::SerializeSeq for SerializeCompound<'a, 'w, W
|
|||
}
|
||||
}
|
||||
|
||||
/// Convenience function for directly serializing a Serde-serializable `T` to the given
|
||||
/// `write`, a Preserves [writer][crate::value::writer::Writer].
|
||||
pub fn to_writer<W: Writer, T: Serialize + ?Sized>(write: &mut W, value: &T) -> Result<()> {
|
||||
Ok(value.serialize(&mut Serializer::new(write))?)
|
||||
}
|
||||
|
|
|
@ -1,7 +1,26 @@
|
|||
//! Serde support for serializing Rust collections as Preserves sets.
|
||||
//!
|
||||
//! Serde doesn't include sets in its data model, so we do some somewhat awful tricks to force
|
||||
//! things to come out the way we want them.
|
||||
//!
|
||||
//! # Example
|
||||
//!
|
||||
//! Annotate collection-valued fields that you want to (en|de)code as Preserves `Set`s with
|
||||
//! `#[serde(with = "preserves::set")]`:
|
||||
//!
|
||||
//! ```rust
|
||||
//! #[derive(serde::Serialize, serde::Deserialize)]
|
||||
//! struct Example {
|
||||
//! #[serde(with = "preserves::set")]
|
||||
//! items: preserves::value::Set<String>,
|
||||
//! }
|
||||
//! ```
|
||||
|
||||
use crate::value::{self, to_value, IOValue, UnwrappedIOValue};
|
||||
use serde::{Deserialize, Deserializer, Serialize, Serializer};
|
||||
use std::iter::IntoIterator;
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn serialize<S, T, Item>(s: T, serializer: S) -> Result<S::Ok, S::Error>
|
||||
where
|
||||
S: Serializer,
|
||||
|
@ -12,6 +31,7 @@ where
|
|||
UnwrappedIOValue::from(s).wrap().serialize(serializer)
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn deserialize<'de, D, T>(deserializer: D) -> Result<T, D::Error>
|
||||
where
|
||||
D: Deserializer<'de>,
|
||||
|
|
|
@ -1,5 +1,25 @@
|
|||
//! Serde support for serializing Rust data as Preserves symbols.
|
||||
//!
|
||||
//! Serde doesn't include symbols in its data model, so we do some somewhat awful tricks to
|
||||
//! force things to come out the way we want them.
|
||||
//!
|
||||
//! # Example
|
||||
//!
|
||||
//! Either use [Symbol] directly in your data types, or annotate [String]-valued fields that
|
||||
//! you want to (en|de)code as Preserves `Symbol`s with `#[serde(with = "preserves::symbol")]`:
|
||||
//!
|
||||
//! ```rust
|
||||
//! #[derive(serde::Serialize, serde::Deserialize)]
|
||||
//! struct Example {
|
||||
//! sym1: preserves::symbol::Symbol,
|
||||
//! #[serde(with = "preserves::symbol")]
|
||||
//! sym2: String,
|
||||
//! }
|
||||
//! ```
|
||||
|
||||
use crate::value::{IOValue, NestedValue};
|
||||
|
||||
/// Wrapper for a string to coerce its Preserves-serialization to `Symbol`.
|
||||
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone)]
|
||||
pub struct Symbol(pub String);
|
||||
|
||||
|
@ -26,6 +46,7 @@ impl<'de> serde::Deserialize<'de> for Symbol {
|
|||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn serialize<S>(s: &str, serializer: S) -> Result<S::Ok, S::Error>
|
||||
where
|
||||
S: serde::Serializer,
|
||||
|
@ -34,6 +55,7 @@ where
|
|||
Symbol(s.to_string()).serialize(serializer)
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn deserialize<'de, D>(deserializer: D) -> Result<String, D::Error>
|
||||
where
|
||||
D: serde::Deserializer<'de>,
|
||||
|
|
|
@ -1,3 +1,5 @@
|
|||
#![doc(hidden)]
|
||||
|
||||
#[derive(Default, Clone, Debug)]
|
||||
pub struct Type {
|
||||
pub closing: Option<Item>,
|
||||
|
|
|
@ -1,3 +1,5 @@
|
|||
//! Support for Serde deserialization of Rust data types from Preserves *values* (not syntax).
|
||||
|
||||
use crate::error::{Error, ExpectedKind, Received};
|
||||
use crate::value::repr::{Double, Float};
|
||||
use crate::value::{IOValue, Map, NestedValue, UnwrappedIOValue, Value};
|
||||
|
@ -7,10 +9,14 @@ use std::iter::Iterator;
|
|||
|
||||
pub type Result<T> = std::result::Result<T, Error>;
|
||||
|
||||
/// Serde deserializer for constructing Rust data from an in-memory Preserves value. Use
|
||||
/// [Deserializer::from_value] to construct instances, or [from_value] to deserialize single
|
||||
/// values directly.
|
||||
pub struct Deserializer<'de> {
|
||||
input: &'de IOValue,
|
||||
}
|
||||
|
||||
/// Deserialize a `T` from `v`, a Preserves [IOValue].
|
||||
pub fn from_value<'a, T>(v: &'a IOValue) -> Result<T>
|
||||
where
|
||||
T: Deserialize<'a>,
|
||||
|
@ -21,6 +27,7 @@ where
|
|||
}
|
||||
|
||||
impl<'de> Deserializer<'de> {
|
||||
/// Construct a Deserializer from `v`, an [IOValue].
|
||||
pub fn from_value(v: &'de IOValue) -> Self {
|
||||
Deserializer { input: v }
|
||||
}
|
||||
|
@ -331,6 +338,7 @@ impl<'de, 'a> serde::de::Deserializer<'de> for &'a mut Deserializer<'de> {
|
|||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub struct VecSeq<'a, 'de: 'a, I: Iterator<Item = &'de IOValue>> {
|
||||
iter: I,
|
||||
de: &'a mut Deserializer<'de>,
|
||||
|
@ -359,6 +367,7 @@ impl<'de, 'a, I: Iterator<Item = &'de IOValue>> SeqAccess<'de> for VecSeq<'a, 'd
|
|||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub struct DictMap<'a, 'de: 'a> {
|
||||
pending: Option<&'de IOValue>,
|
||||
iter: Box<dyn Iterator<Item = (&'de IOValue, &'de IOValue)> + 'a>,
|
||||
|
|
|
@ -1,3 +1,6 @@
|
|||
//! Traits for working with Preserves [embedded
|
||||
//! values](https://preserves.dev/preserves.html#embeddeds).
|
||||
|
||||
use std::io;
|
||||
|
||||
use super::packed;
|
||||
|
@ -9,10 +12,12 @@ use super::NestedValue;
|
|||
use super::Reader;
|
||||
use super::Writer;
|
||||
|
||||
/// Implementations parse [IOValue]s to their own particular [Embeddable] values of type `D`.
|
||||
pub trait DomainParse<D: Embeddable> {
|
||||
fn parse_embedded(&mut self, v: &IOValue) -> io::Result<D>;
|
||||
}
|
||||
|
||||
/// Implementations read and parse from `src` to produce [Embeddable] values of type `D`.
|
||||
pub trait DomainDecode<D: Embeddable> {
|
||||
fn decode_embedded<'de, 'src, S: BinarySource<'de>>(
|
||||
&mut self,
|
||||
|
@ -21,6 +26,7 @@ pub trait DomainDecode<D: Embeddable> {
|
|||
) -> io::Result<D>;
|
||||
}
|
||||
|
||||
/// Implementations unparse and write `D`s to `w`, a [writer][crate::value::writer::Writer].
|
||||
pub trait DomainEncode<D: Embeddable> {
|
||||
fn encode_embedded<W: Writer>(&mut self, w: &mut W, d: &D) -> io::Result<()>;
|
||||
}
|
||||
|
@ -41,6 +47,9 @@ impl<'a, D: Embeddable, T: DomainDecode<D>> DomainDecode<D> for &'a mut T {
|
|||
}
|
||||
}
|
||||
|
||||
/// Convenience codec: use this as embedded codec for encoding (only) when embedded values
|
||||
/// should be serialized as Preserves `String`s holding their Rust [std::fmt::Debug]
|
||||
/// representation.
|
||||
pub struct DebugDomainEncode;
|
||||
|
||||
impl<D: Embeddable> DomainEncode<D> for DebugDomainEncode {
|
||||
|
@ -49,6 +58,8 @@ impl<D: Embeddable> DomainEncode<D> for DebugDomainEncode {
|
|||
}
|
||||
}
|
||||
|
||||
/// Convenience codec: use this as embedded codec for decoding (only) when embedded values are
|
||||
/// expected to conform to the syntax implicit in their [std::str::FromStr] implementation.
|
||||
pub struct FromStrDomainParse;
|
||||
|
||||
impl<Err: Into<io::Error>, D: Embeddable + std::str::FromStr<Err = Err>> DomainParse<D>
|
||||
|
@ -59,6 +70,8 @@ impl<Err: Into<io::Error>, D: Embeddable + std::str::FromStr<Err = Err>> DomainP
|
|||
}
|
||||
}
|
||||
|
||||
/// Use this as embedded codec when embedded data are already [IOValue]s that can be directly
|
||||
/// serialized and deserialized without further transformation.
|
||||
pub struct IOValueDomainCodec;
|
||||
|
||||
impl DomainDecode<IOValue> for IOValueDomainCodec {
|
||||
|
@ -77,6 +90,7 @@ impl DomainEncode<IOValue> for IOValueDomainCodec {
|
|||
}
|
||||
}
|
||||
|
||||
/// Use this as embedded codec to forbid use of embedded values; an [io::Error] is signalled.
|
||||
pub struct NoEmbeddedDomainCodec;
|
||||
|
||||
impl<D: Embeddable> DomainDecode<D> for NoEmbeddedDomainCodec {
|
||||
|
@ -101,9 +115,12 @@ impl<D: Embeddable> DomainEncode<D> for NoEmbeddedDomainCodec {
|
|||
}
|
||||
}
|
||||
|
||||
/// If some `C` implements [DomainDecode] but not [DomainParse], or vice versa, use `ViaCodec`
|
||||
/// to promote the one to the other. Construct instances with [ViaCodec::new].
|
||||
pub struct ViaCodec<C>(C);
|
||||
|
||||
impl<C> ViaCodec<C> {
|
||||
/// Constructs a `ViaCodec` wrapper around an underlying codec of type `C`.
|
||||
pub fn new(c: C) -> Self {
|
||||
ViaCodec(c)
|
||||
}
|
||||
|
|
|
@ -1,3 +1,12 @@
|
|||
#![doc(hidden)]
|
||||
|
||||
//! A horrifying hack to Serde-serialize [IOValue] instances to Preserves *as themselves*.
|
||||
//!
|
||||
//! Frankly I think this portion of the codebase might not survive for long. I can't think of a
|
||||
//! better way of achieving this, but the drawbacks of having this functionality are *severe*.
|
||||
//!
|
||||
//! See <https://gitlab.com/preserves/preserves/-/issues/42>.
|
||||
|
||||
use super::repr::IOValue;
|
||||
|
||||
pub static MAGIC: &str = "$____Preserves_Serde_Magic";
|
||||
|
|
|
@ -1,8 +1,13 @@
|
|||
//! Implements the Preserves
|
||||
//! [merge](https://preserves.dev/preserves.html#appendix-merging-values) of values.
|
||||
|
||||
use super::Map;
|
||||
use super::NestedValue;
|
||||
use super::Record;
|
||||
use super::Value;
|
||||
|
||||
/// Merge two sequences of values according to [the
|
||||
/// specification](https://preserves.dev/preserves.html#appendix-merging-values).
|
||||
pub fn merge_seqs<N: NestedValue>(mut a: Vec<N>, mut b: Vec<N>) -> Option<Vec<N>> {
|
||||
if a.len() > b.len() {
|
||||
std::mem::swap(&mut a, &mut b);
|
||||
|
@ -16,6 +21,8 @@ pub fn merge_seqs<N: NestedValue>(mut a: Vec<N>, mut b: Vec<N>) -> Option<Vec<N>
|
|||
Some(r)
|
||||
}
|
||||
|
||||
/// Merge two values according to [the
|
||||
/// specification](https://preserves.dev/preserves.html#appendix-merging-values).
|
||||
pub fn merge2<N: NestedValue>(v: N, w: N) -> Option<N> {
|
||||
let (mut v_anns, v_val) = v.pieces();
|
||||
let (w_anns, w_val) = w.pieces();
|
||||
|
@ -52,6 +59,8 @@ pub fn merge2<N: NestedValue>(v: N, w: N) -> Option<N> {
|
|||
}
|
||||
}
|
||||
|
||||
/// Merge several values into a single value according to [the
|
||||
/// specification](https://preserves.dev/preserves.html#appendix-merging-values).
|
||||
pub fn merge<N: NestedValue, I: IntoIterator<Item = N>>(vs: I) -> Option<N> {
|
||||
let mut vs = vs.into_iter();
|
||||
let mut v = vs.next().expect("at least one value in merge()");
|
||||
|
|
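The shape of `merge_seqs`, `merge2`, and `merge` can be illustrated on a toy value type. This is a simplified sketch, not the crate's code: `Toy` is a hypothetical type with atoms and sequences only, annotations are ignored, and keeping the extra elements of the longer sequence is an assumption inferred from the length swap visible in the diff. Unlike the crate's `merge`, which panics on empty input, this version returns `None`.

```rust
/// Toy stand-in for a Preserves value: integer atoms and sequences only.
#[derive(Debug, Clone, PartialEq)]
pub enum Toy {
    Atom(i64),
    Seq(Vec<Toy>),
}

/// Merge two sequences pairwise, keeping the tail of the longer one.
pub fn merge_seqs(mut a: Vec<Toy>, mut b: Vec<Toy>) -> Option<Vec<Toy>> {
    // Ensure `a` is the shorter sequence, as the diff does.
    if a.len() > b.len() {
        std::mem::swap(&mut a, &mut b);
    }
    let mut rest = b.into_iter();
    let mut merged = Vec::new();
    for item in a {
        // `rest` is at least as long as `a`, so `next()` yields a value here.
        merged.push(merge2(item, rest.next()?)?);
    }
    merged.extend(rest);
    Some(merged)
}

/// Merge two values: equal atoms merge to themselves, sequences merge
/// elementwise, and anything else is a conflict.
pub fn merge2(v: Toy, w: Toy) -> Option<Toy> {
    match (v, w) {
        (Toy::Atom(x), Toy::Atom(y)) if x == y => Some(Toy::Atom(x)),
        (Toy::Seq(xs), Toy::Seq(ys)) => merge_seqs(xs, ys).map(Toy::Seq),
        _ => None,
    }
}

/// Fold `merge2` over several values; `None` on conflict or empty input.
pub fn merge<I: IntoIterator<Item = Toy>>(vs: I) -> Option<Toy> {
    let mut vs = vs.into_iter();
    let mut acc = vs.next()?;
    for v in vs {
        acc = merge2(acc, v)?;
    }
    Some(acc)
}
```

Returning `Option` at every level lets a single conflicting pair of atoms abort the whole merge through `?`.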
|
@ -1,3 +1,53 @@
|
|||
//! # Representing, reading, and writing Preserves `Value`s as Rust data
|
||||
//!
|
||||
//! ```
|
||||
//! use preserves::value::{IOValue, text, packed};
|
||||
//! let v: IOValue = text::iovalue_from_str("<hi>")?;
|
||||
//! let w: IOValue = packed::iovalue_from_bytes(b"\xb4\xb3\x02hi\x84")?;
|
||||
//! assert_eq!(v, w);
|
||||
//! assert_eq!(text::TextWriter::encode_iovalue(&v)?, "<hi>");
|
||||
//! assert_eq!(packed::PackedWriter::encode_iovalue(&v)?, b"\xb4\xb3\x02hi\x84");
|
||||
//! # Ok::<(), std::io::Error>(())
|
||||
//! ```
|
||||
//!
|
||||
//! Preserves `Value`s are categorized in the following way. The core representation type,
|
||||
//! [crate::value::repr::Value], reflects this structure. However, most of the time you will
|
||||
//! work with [IOValue] or some other implementation of trait [NestedValue], which augments an
|
||||
//! underlying [Value] with [*annotations*][crate::value::repr::Annotations] (e.g. comments) and fixes a strategy
|
||||
//! for memory management.
|
||||
//!
|
||||
#![doc = include_str!("../../doc/value-grammar.md")]
|
||||
//!
|
||||
//! ## Memory management
|
||||
//!
|
||||
//! Each implementation of [NestedValue] chooses a different point in the space of possible
|
||||
//! approaches to memory management for `Value`s.
|
||||
//!
|
||||
//! ##### `IOValue`
|
||||
//!
|
||||
//! The most commonly-used and versatile implementation, [IOValue], uses [std::sync::Arc] for
|
||||
//! internal links in compound `Value`s. Unlike many of the other implementations of
|
||||
//! [NestedValue], [IOValue] doesn't offer flexibility in the Rust data type to be used for
|
||||
//! Preserves [embedded values](https://preserves.dev/preserves.html#embeddeds): instead,
|
||||
//! embedded values in an [IOValue] are themselves [IOValue]s.
|
||||
//!
|
||||
//! ##### `ArcValue<D>`, `RcValue<D>`, and `PlainValue<D>`
|
||||
//!
|
||||
//! For control over the Rust type to use for embedded values, choose [ArcValue], [RcValue], or
|
||||
//! [PlainValue]. Use [ArcValue] when you wish to transfer values among threads. [RcValue] is
|
||||
//! more niche; it may be useful for complex terms that do not need to cross thread boundaries.
|
||||
//! [PlainValue] is even more niche: it does not use a reference-counted pointer type, meaning
|
||||
//! it does not offer any kind of aliasing or sharing among subterms at all.
|
||||
//!
|
||||
//! # Parsing, pretty-printing, encoding and decoding `Value`s
|
||||
//!
|
||||
//! Modules [reader] and [writer] supply generic [Reader] and [Writer] traits for parsing and
|
||||
//! unparsing Preserves data. Implementations of [Reader] and [Writer] connect Preserves data
|
||||
//! to specific transfer syntaxes:
|
||||
//!
|
||||
//! - module [packed] supplies tools for working with the machine-oriented binary syntax
|
||||
//! - module [text] supplies tools for working with human-readable text syntax
|
||||
|
||||
pub mod boundary;
|
||||
pub mod de;
|
||||
pub mod domain;
|
||||
|
@ -56,6 +106,7 @@ pub use text::TextReader;
|
|||
pub use text::TextWriter;
|
||||
pub use writer::Writer;
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn invert_map<A, B>(m: &Map<A, B>) -> Map<B, A>
|
||||
where
|
||||
A: Clone,
|
||||
|
|
|
@ -1,6 +1,9 @@
|
|||
//! Definitions of the tags used in the binary encoding.
|
||||
|
||||
use std::convert::{From, TryFrom};
|
||||
use std::io;
|
||||
|
||||
/// Rust representation of tags used in the binary encoding.
|
||||
#[derive(Debug, PartialEq, Eq)]
|
||||
pub enum Tag {
|
||||
False,
|
||||
|
@ -19,8 +22,9 @@ pub enum Tag {
|
|||
Dictionary,
|
||||
}
|
||||
|
||||
/// Error value representing failure to decode a byte into a [Tag].
|
||||
#[derive(Debug, PartialEq, Eq)]
|
||||
pub struct InvalidTag(u8);
|
||||
pub struct InvalidTag(pub u8);
|
||||
|
||||
impl From<InvalidTag> for io::Error {
|
||||
fn from(v: InvalidTag) -> Self {
|
||||
|
|
|
@ -1,3 +1,15 @@
|
|||
//! Implements the Preserves [machine-oriented binary
|
||||
//! syntax](https://preserves.dev/preserves-binary.html).
|
||||
//!
|
||||
//! The main entry points for reading are functions [iovalue_from_bytes],
|
||||
//! [annotated_iovalue_from_bytes], [from_bytes], and [annotated_from_bytes].
|
||||
//!
|
||||
//! The main entry points for writing are [PackedWriter::encode_iovalue] and
|
||||
//! [PackedWriter::encode].
|
||||
//!
|
||||
//! # Summary of Binary Syntax
|
||||
#![doc = include_str!("../../../doc/cheatsheet-binary-plaintext.md")]
|
||||
|
||||
pub mod constants;
|
||||
pub mod reader;
|
||||
pub mod writer;
|
||||
|
@ -9,6 +21,8 @@ use std::io;
|
|||
|
||||
use super::{BinarySource, DomainDecode, IOValue, IOValueDomainCodec, NestedValue, Reader};
|
||||
|
||||
/// Reads a value from the given byte slice `bs` using the binary encoding, discarding
|
||||
/// annotations.
|
||||
pub fn from_bytes<N: NestedValue, Dec: DomainDecode<N::Embedded>>(
|
||||
bs: &[u8],
|
||||
decode_embedded: Dec,
|
||||
|
@ -18,10 +32,13 @@ pub fn from_bytes<N: NestedValue, Dec: DomainDecode<N::Embedded>>(
|
|||
.demand_next(false)
|
||||
}
|
||||
|
||||
/// Reads an [IOValue] from the given byte slice `bs` using the binary encoding, discarding
|
||||
/// annotations.
|
||||
pub fn iovalue_from_bytes(bs: &[u8]) -> io::Result<IOValue> {
|
||||
from_bytes(bs, IOValueDomainCodec)
|
||||
}
|
||||
|
||||
/// As [from_bytes], but includes annotations.
|
||||
pub fn annotated_from_bytes<N: NestedValue, Dec: DomainDecode<N::Embedded>>(
|
||||
bs: &[u8],
|
||||
decode_embedded: Dec,
|
||||
|
@ -31,6 +48,7 @@ pub fn annotated_from_bytes<N: NestedValue, Dec: DomainDecode<N::Embedded>>(
|
|||
.demand_next(true)
|
||||
}
|
||||
|
||||
/// As [iovalue_from_bytes], but includes annotations.
|
||||
pub fn annotated_iovalue_from_bytes(bs: &[u8]) -> io::Result<IOValue> {
|
||||
annotated_from_bytes(bs, IOValueDomainCodec)
|
||||
}
|
||||
|
|
|
@ -1,3 +1,5 @@
|
|||
//! Implementation of [Reader] for the binary encoding.
|
||||
|
||||
use crate::error::{self, io_syntax_error, is_eof_io_error, ExpectedKind, Received};
|
||||
|
||||
use num::bigint::BigInt;
|
||||
|
@ -18,6 +20,7 @@ use super::super::{
|
|||
};
|
||||
use super::constants::Tag;
|
||||
|
||||
/// The binary encoding Preserves reader.
|
||||
pub struct PackedReader<
|
||||
'de,
|
||||
'src,
|
||||
|
@ -25,7 +28,9 @@ pub struct PackedReader<
|
|||
Dec: DomainDecode<N::Embedded>,
|
||||
S: BinarySource<'de>,
|
||||
> {
|
||||
/// Underlying source of bytes.
|
||||
pub source: &'src mut S,
|
||||
/// Decoder for producing Rust values embedded in the binary data.
|
||||
pub decode_embedded: Dec,
|
||||
phantom: PhantomData<&'de N>,
|
||||
}
|
||||
|
@ -67,6 +72,7 @@ fn out_of_range<I: Into<BigInt>>(i: I) -> error::Error {
|
|||
impl<'de, 'src, N: NestedValue, Dec: DomainDecode<N::Embedded>, S: BinarySource<'de>>
|
||||
PackedReader<'de, 'src, N, Dec, S>
|
||||
{
|
||||
/// Construct a new reader from a byte source and embedded-value decoder.
|
||||
#[inline(always)]
|
||||
pub fn new(source: &'src mut S, decode_embedded: Dec) -> Self {
|
||||
PackedReader {
|
||||
|
|
|
@ -1,3 +1,5 @@
|
|||
//! Implementation of [Writer] for the binary encoding.
|
||||
|
||||
use super::super::boundary as B;
|
||||
use super::super::suspendable::Suspendable;
|
||||
use super::super::DomainEncode;
|
||||
|
@ -13,9 +15,11 @@ use std::ops::DerefMut;
|
|||
|
||||
use super::super::writer::{varint, CompoundWriter, Writer};
|
||||
|
||||
/// The binary encoding Preserves writer.
|
||||
pub struct PackedWriter<W: io::Write>(Suspendable<W>);
|
||||
|
||||
impl PackedWriter<&mut Vec<u8>> {
|
||||
/// Encodes `v` to a byte vector.
|
||||
#[inline(always)]
|
||||
pub fn encode<N: NestedValue, Enc: DomainEncode<N::Embedded>>(
|
||||
enc: &mut Enc,
|
||||
|
@ -26,6 +30,7 @@ impl PackedWriter<&mut Vec<u8>> {
|
|||
Ok(buf)
|
||||
}
|
||||
|
||||
/// Encodes `v` to a byte vector.
|
||||
#[inline(always)]
|
||||
pub fn encode_iovalue(v: &IOValue) -> io::Result<Vec<u8>> {
|
||||
Self::encode(&mut IOValueDomainCodec, v)
|
||||
|
@ -33,26 +38,31 @@ impl PackedWriter<&mut Vec<u8>> {
|
|||
}
|
||||
|
||||
impl<W: io::Write> PackedWriter<W> {
|
||||
/// Construct a writer from the given byte sink `write`.
|
||||
#[inline(always)]
|
||||
pub fn new(write: W) -> Self {
|
||||
PackedWriter(Suspendable::new(write))
|
||||
}
|
||||
|
||||
/// Retrieve a mutable reference to the underlying byte sink.
|
||||
#[inline(always)]
|
||||
pub fn w(&mut self) -> &mut W {
|
||||
self.0.deref_mut()
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
#[inline(always)]
|
||||
pub fn write_byte(&mut self, b: u8) -> io::Result<()> {
|
||||
self.w().write_all(&[b])
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
#[inline(always)]
|
||||
pub fn write_integer(&mut self, bs: &[u8]) -> io::Result<()> {
|
||||
self.write_atom(Tag::SignedInteger, bs)
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
#[inline(always)]
|
||||
pub fn write_atom(&mut self, tag: Tag, bs: &[u8]) -> io::Result<()> {
|
||||
self.write_byte(tag.into())?;
|
||||
|
@ -60,17 +70,20 @@ impl<W: io::Write> PackedWriter<W> {
|
|||
self.w().write_all(bs)
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
#[inline(always)]
|
||||
pub fn suspend(&mut self) -> Self {
|
||||
PackedWriter(self.0.suspend())
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
#[inline(always)]
|
||||
pub fn resume(&mut self, other: Self) {
|
||||
self.0.resume(other.0)
|
||||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub struct BinaryOrderWriter(Vec<Vec<u8>>);
|
||||
|
||||
impl BinaryOrderWriter {
|
||||
|
@ -119,6 +132,7 @@ impl BinaryOrderWriter {
|
|||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub trait WriteWriter: Writer {
|
||||
fn write_raw_bytes(&mut self, v: &[u8]) -> io::Result<()>;
|
||||
|
||||
|
|
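The `write_atom` helper above emits a tag byte, a length, and then the payload. As a rough illustration, the sketch below reproduces the bytes `b"\xb4\xb3\x02hi\x84"` quoted in the crate's own doc example for `<hi>`. The function names are hypothetical, the tag values (`0xb4` record, `0xb3` symbol, `0x84` end) are read off that example, and the base-128 little-endian varint length is an assumption about the encoding rather than something shown in this diff.

```rust
/// Assumed length encoding: 7 bits per byte, least significant group first,
/// with the high bit set on every byte except the last.
pub fn write_varint(out: &mut Vec<u8>, mut v: usize) {
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Tag byte, then payload length, then the payload itself.
pub fn write_atom(out: &mut Vec<u8>, tag: u8, bs: &[u8]) {
    out.push(tag);
    write_varint(out, bs.len());
    out.extend_from_slice(bs);
}

/// Encode a zero-field record whose label is the symbol `label`.
pub fn encode_label_only_record(label: &str) -> Vec<u8> {
    let mut out = vec![0xb4]; // open record
    write_atom(&mut out, 0xb3, label.as_bytes()); // symbol label
    out.push(0x84); // close delimiter
    out
}
```

Only compound values need the closing `0x84`; atoms are self-delimiting because their length is written up front.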
|
@ -1,3 +1,6 @@
|
|||
//! Generic [Reader] trait for parsing Preserves [Value][crate::value::repr::Value]s,
|
||||
//! implemented by code that provides each specific transfer syntax.
|
||||
|
||||
use crate::error::{self, io_eof, ExpectedKind, Received};
|
||||
|
||||
use std::borrow::Cow;
|
||||
|
@ -18,59 +21,104 @@ use super::ViaCodec;
|
|||
|
||||
pub type ReaderResult<T> = std::result::Result<T, error::Error>;
|
||||
|
||||
/// Tokens produced when performing
|
||||
/// [SAX](https://en.wikipedia.org/wiki/Simple_API_for_XML)-style reading of terms.
|
||||
pub enum Token<N: NestedValue> {
|
||||
/// An embedded value was seen and completely decoded.
|
||||
Embedded(N::Embedded),
|
||||
/// An atomic value was seen and completely decoded.
|
||||
Atom(N),
|
||||
/// A compound value has been opened; its contents follow, and it will be terminated by
|
||||
/// [Token::End].
|
||||
Compound(CompoundClass),
|
||||
/// Closes a previously-opened compound value.
|
||||
End,
|
||||
}
|
||||
|
||||
/// Generic parser for Preserves.
|
||||
pub trait Reader<'de, N: NestedValue> {
|
||||
/// Retrieve the next parseable value or an indication of end-of-input.
|
||||
///
|
||||
/// Yields `Ok(Some(...))` if a complete value is available, `Ok(None)` if the end of
|
||||
/// stream has been reached, or `Err(...)` for parse or IO errors, including
|
||||
/// incomplete/partial input. See also [Reader::demand_next].
|
||||
fn next(&mut self, read_annotations: bool) -> io::Result<Option<N>>;
|
||||
|
||||
// Hiding these from the documentation for the moment because I don't want to have to
|
||||
// document the whole Boundary thing.
|
||||
#[doc(hidden)]
|
||||
fn open_record(&mut self, arity: Option<usize>) -> ReaderResult<B::Type>;
|
||||
#[doc(hidden)]
|
||||
fn open_sequence_or_set(&mut self) -> ReaderResult<B::Item>;
|
||||
#[doc(hidden)]
|
||||
fn open_sequence(&mut self) -> ReaderResult<()>;
|
||||
#[doc(hidden)]
|
||||
fn open_set(&mut self) -> ReaderResult<()>;
|
||||
#[doc(hidden)]
|
||||
fn open_dictionary(&mut self) -> ReaderResult<()>;
|
||||
#[doc(hidden)]
|
||||
fn boundary(&mut self, b: &B::Type) -> ReaderResult<()>;
|
||||
|
||||
#[doc(hidden)]
|
||||
// close_compound implies a b.shift(...) and a self.boundary(b).
|
||||
fn close_compound(&mut self, b: &mut B::Type, i: &B::Item) -> ReaderResult<bool>;
|
||||
|
||||
#[doc(hidden)]
|
||||
fn open_embedded(&mut self) -> ReaderResult<()>;
|
||||
#[doc(hidden)]
|
||||
fn close_embedded(&mut self) -> ReaderResult<()>;
|
||||
|
||||
/// Allows structured backtracking to an earlier stage in a parse. Useful for layering
|
||||
/// parser combinators atop a Reader.
|
||||
type Mark;
|
||||
/// Retrieve a marker for the current position in the input.
|
||||
fn mark(&mut self) -> io::Result<Self::Mark>;
|
||||
/// Seek the input to a previously-saved position.
|
||||
fn restore(&mut self, mark: &Self::Mark) -> io::Result<()>;
|
||||
|
||||
/// Get the next [SAX](https://en.wikipedia.org/wiki/Simple_API_for_XML)-style event,
|
||||
/// discarding annotations.
|
||||
///
|
||||
/// The `read_embedded_annotations` flag controls whether annotations are also skipped on
|
||||
/// *embedded* values or not.
|
||||
fn next_token(&mut self, read_embedded_annotations: bool) -> io::Result<Token<N>>;
|
||||
/// Get the next [SAX](https://en.wikipedia.org/wiki/Simple_API_for_XML)-style event, plus
|
||||
/// a vector containing any annotations that preceded it.
|
||||
fn next_annotations_and_token(&mut self) -> io::Result<(Vec<N>, Token<N>)>;
|
||||
|
||||
//---------------------------------------------------------------------------
|
||||
|
||||
/// Skips the next available complete value. Yields an error if no such value exists.
|
||||
fn skip_value(&mut self) -> io::Result<()> {
|
||||
// TODO efficient skipping in specific impls of this trait
|
||||
let _ = self.demand_next(false)?;
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// Retrieve the next parseable value, treating end-of-input as an error.
|
||||
///
|
||||
/// Yields `Ok(...)` if a complete value is available or `Err(...)` for parse or IO errors,
|
||||
/// including incomplete/partial input or end of stream. See also [Reader::next].
|
||||
fn demand_next(&mut self, read_annotations: bool) -> io::Result<N> {
|
||||
self.next(read_annotations)?.ok_or_else(io_eof)
|
||||
}
|
||||
|
||||
/// Yields the next value, if it is a `Boolean`, or an error otherwise.
|
||||
fn next_boolean(&mut self) -> ReaderResult<bool> {
|
||||
self.demand_next(false)?.value().to_boolean()
|
||||
}
|
||||
|
||||
/// Yields the next value, if it is a `Float`, or an error otherwise.
|
||||
fn next_float(&mut self) -> ReaderResult<Float> {
|
||||
Ok(self.demand_next(false)?.value().to_float()?.to_owned())
|
||||
}
|
||||
|
||||
/// Yields the next value, if it is a `Double`, or an error otherwise.
|
||||
fn next_double(&mut self) -> ReaderResult<Double> {
|
||||
Ok(self.demand_next(false)?.value().to_double()?.to_owned())
|
||||
}
|
||||
|
||||
/// Yields the next value, if it is a `SignedInteger`, or an error otherwise.
|
||||
fn next_signedinteger(&mut self) -> ReaderResult<SignedInteger> {
|
||||
Ok(self
|
||||
.demand_next(false)?
|
||||
|
@ -79,64 +127,92 @@ pub trait Reader<'de, N: NestedValue> {
|
|||
.to_owned())
|
||||
}
|
||||
|
||||
/// Yields the next value, if it is a `SignedInteger` that fits in [i8], or an error
|
||||
/// otherwise.
|
||||
fn next_i8(&mut self) -> ReaderResult<i8> {
|
||||
self.demand_next(false)?.value().to_i8()
|
||||
}
|
||||
/// Yields the next value, if it is a `SignedInteger` that fits in [u8], or an error
|
||||
/// otherwise.
|
||||
fn next_u8(&mut self) -> ReaderResult<u8> {
|
||||
self.demand_next(false)?.value().to_u8()
|
||||
}
|
||||
/// Yields the next value, if it is a `SignedInteger` that fits in [i16], or an error
|
||||
/// otherwise.
|
||||
fn next_i16(&mut self) -> ReaderResult<i16> {
|
||||
self.demand_next(false)?.value().to_i16()
|
||||
}
|
||||
/// Yields the next value, if it is a `SignedInteger` that fits in [u16], or an error
|
||||
/// otherwise.
|
||||
fn next_u16(&mut self) -> ReaderResult<u16> {
|
||||
self.demand_next(false)?.value().to_u16()
|
||||
}
|
||||
/// Yields the next value, if it is a `SignedInteger` that fits in [i32], or an error
|
||||
/// otherwise.
|
||||
fn next_i32(&mut self) -> ReaderResult<i32> {
|
||||
self.demand_next(false)?.value().to_i32()
|
||||
}
|
||||
/// Yields the next value, if it is a `SignedInteger` that fits in [u32], or an error
|
||||
/// otherwise.
|
||||
fn next_u32(&mut self) -> ReaderResult<u32> {
|
||||
self.demand_next(false)?.value().to_u32()
|
||||
}
|
||||
/// Yields the next value, if it is a `SignedInteger` that fits in [i64], or an error
|
||||
/// otherwise.
|
||||
fn next_i64(&mut self) -> ReaderResult<i64> {
|
||||
self.demand_next(false)?.value().to_i64()
|
||||
}
|
||||
/// Yields the next value, if it is a `SignedInteger` that fits in [u64], or an error
|
||||
/// otherwise.
|
||||
fn next_u64(&mut self) -> ReaderResult<u64> {
|
||||
self.demand_next(false)?.value().to_u64()
|
||||
}
|
||||
/// Yields the next value, if it is a `SignedInteger` that fits in [i128], or an error
|
||||
/// otherwise.
|
||||
fn next_i128(&mut self) -> ReaderResult<i128> {
|
||||
self.demand_next(false)?.value().to_i128()
|
||||
}
|
||||
/// Yields the next value, if it is a `SignedInteger` that fits in [u128], or an error
|
||||
/// otherwise.
|
||||
fn next_u128(&mut self) -> ReaderResult<u128> {
|
||||
self.demand_next(false)?.value().to_u128()
|
||||
}
|
||||
/// Yields the next value as an [f32], if it is a `Float`, or an error otherwise.
|
||||
fn next_f32(&mut self) -> ReaderResult<f32> {
|
||||
self.demand_next(false)?.value().to_f32()
|
||||
}
|
||||
/// Yields the next value as an [f64], if it is a `Double`, or an error otherwise.
|
||||
fn next_f64(&mut self) -> ReaderResult<f64> {
|
||||
self.demand_next(false)?.value().to_f64()
|
||||
}
|
||||
/// Yields the next value as a [char], if it is parseable by
|
||||
/// [Value::to_char][crate::value::Value::to_char], or an error otherwise.
|
||||
fn next_char(&mut self) -> ReaderResult<char> {
|
||||
self.demand_next(false)?.value().to_char()
|
||||
}
|
||||
|
||||
/// Yields the next value, if it is a `String`, or an error otherwise.
|
||||
fn next_str(&mut self) -> ReaderResult<Cow<'de, str>> {
|
||||
Ok(Cow::Owned(
|
||||
self.demand_next(false)?.value().to_string()?.to_owned(),
|
||||
))
|
||||
}
|
||||
|
||||
/// Yields the next value, if it is a `ByteString`, or an error otherwise.
|
||||
fn next_bytestring(&mut self) -> ReaderResult<Cow<'de, [u8]>> {
|
||||
Ok(Cow::Owned(
|
||||
self.demand_next(false)?.value().to_bytestring()?.to_owned(),
|
||||
))
|
||||
}
|
||||
|
||||
/// Yields the next value, if it is a `Symbol`, or an error otherwise.
|
||||
fn next_symbol(&mut self) -> ReaderResult<Cow<'de, str>> {
|
||||
Ok(Cow::Owned(
|
||||
self.demand_next(false)?.value().to_symbol()?.to_owned(),
|
||||
))
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
fn open_option(&mut self) -> ReaderResult<Option<B::Type>> {
|
||||
let b = self.open_record(None)?;
|
||||
let label: &str = &self.next_symbol()?;
|
||||
|
@ -153,6 +229,7 @@ pub trait Reader<'de, N: NestedValue> {
|
|||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
fn open_simple_record(&mut self, name: &str, arity: Option<usize>) -> ReaderResult<B::Type> {
|
||||
let b = self.open_record(arity)?;
|
||||
let label: &str = &self.next_symbol()?;
|
||||
|
@ -166,6 +243,7 @@ pub trait Reader<'de, N: NestedValue> {
|
|||
}
|
||||
}
|
||||
|
||||
/// Constructs a [ConfiguredReader] set with the given value for `read_annotations`.
|
||||
fn configured(self, read_annotations: bool) -> ConfiguredReader<'de, N, Self>
|
||||
where
|
||||
Self: std::marker::Sized,
|
||||
|
@ -177,6 +255,7 @@ pub trait Reader<'de, N: NestedValue> {
|
|||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
fn ensure_more_expected(&mut self, b: &mut B::Type, i: &B::Item) -> ReaderResult<()> {
|
||||
if !self.close_compound(b, i)? {
|
||||
Ok(())
|
||||
|
@ -185,6 +264,7 @@ pub trait Reader<'de, N: NestedValue> {
|
|||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
fn ensure_complete(&mut self, mut b: B::Type, i: &B::Item) -> ReaderResult<()> {
|
||||
if !self.close_compound(&mut b, i)? {
|
||||
Err(error::Error::MissingCloseDelimiter)
|
||||
|
@ -254,16 +334,27 @@ impl<'r, 'de, N: NestedValue, R: Reader<'de, N>> Reader<'de, N> for &'r mut R {
|
|||
}
|
||||
}
|
||||
|
||||
/// Generic seekable stream of input bytes.
|
||||
pub trait BinarySource<'de>: Sized {
|
||||
/// Allows structured backtracking to an earlier position in an input.
|
||||
type Mark;
|
||||
/// Retrieve a marker for the current position in the input.
|
||||
fn mark(&mut self) -> io::Result<Self::Mark>;
|
||||
/// Seek the input to a previously-saved position.
|
||||
fn restore(&mut self, mark: &Self::Mark) -> io::Result<()>;
|
||||
|
||||
/// Skip the next byte.
|
||||
fn skip(&mut self) -> io::Result<()>;
|
||||
/// Returns the next byte without advancing over it.
|
||||
fn peek(&mut self) -> io::Result<u8>;
|
||||
/// Returns and consumes the next `count` bytes, which must all be available. Always yields
|
||||
/// exactly `count` bytes or an error.
|
||||
fn readbytes(&mut self, count: usize) -> io::Result<Cow<'de, [u8]>>;
|
||||
/// As [BinarySource::readbytes], but uses `bs` as destination for the read bytes as well
|
||||
/// as taking the size of `bs` as the count of bytes to read.
|
||||
fn readbytes_into(&mut self, bs: &mut [u8]) -> io::Result<()>;
|
||||
|
||||
/// Constructs a [PackedReader][super::PackedReader] that will read from `self`.
|
||||
fn packed<N: NestedValue, Dec: DomainDecode<N::Embedded>>(
|
||||
&mut self,
|
||||
decode_embedded: Dec,
|
||||
|
@ -271,12 +362,14 @@ pub trait BinarySource<'de>: Sized {
|
|||
super::PackedReader::new(self, decode_embedded)
|
||||
}
|
||||
|
||||
/// Constructs a [PackedReader][super::PackedReader] that will read [IOValue]s from `self`.
|
||||
fn packed_iovalues(
|
||||
&mut self,
|
||||
) -> super::PackedReader<'de, '_, IOValue, IOValueDomainCodec, Self> {
|
||||
self.packed(IOValueDomainCodec)
|
||||
}
|
||||
|
||||
/// Constructs a [TextReader][super::TextReader] that will read from `self`.
|
||||
fn text<N: NestedValue, Dec: DomainParse<N::Embedded>>(
|
||||
&mut self,
|
||||
decode_embedded: Dec,
|
||||
|
@ -284,6 +377,7 @@ pub trait BinarySource<'de>: Sized {
|
|||
super::TextReader::new(self, decode_embedded)
|
||||
}
|
||||
|
||||
/// Constructs a [TextReader][super::TextReader] that will read [IOValue]s from `self`.
|
||||
fn text_iovalues(
|
||||
&mut self,
|
||||
) -> super::TextReader<'de, '_, IOValue, ViaCodec<IOValueDomainCodec>, Self> {
|
||||
|
@ -291,12 +385,18 @@ pub trait BinarySource<'de>: Sized {
|
|||
}
|
||||
}
|
||||
|
||||
/// Implementation of [BinarySource] backed by an [`io::Read`]` + `[`io::Seek`] implementation.
|
||||
pub struct IOBinarySource<R: io::Read + io::Seek> {
|
||||
/// The underlying byte source.
|
||||
pub read: R,
|
||||
#[doc(hidden)]
|
||||
/// One-place buffer for peeked bytes.
|
||||
pub buf: Option<u8>,
|
||||
}
|
||||
|
||||
impl<R: io::Read + io::Seek> IOBinarySource<R> {
|
||||
/// Constructs an [IOBinarySource] from the given [`io::Read`]` + `[`io::Seek`]
|
||||
/// implementation.
|
||||
#[inline(always)]
|
||||
pub fn new(read: R) -> Self {
|
||||
IOBinarySource { read, buf: None }
|
||||
|
@ -364,12 +464,17 @@ impl<'de, R: io::Read + io::Seek> BinarySource<'de> for IOBinarySource<R> {
|
|||
}
|
||||
}
|
||||
|
||||
/// Implementation of [BinarySource] backed by a slice of [u8].
|
||||
pub struct BytesBinarySource<'de> {
|
||||
/// The underlying byte source.
|
||||
pub bytes: &'de [u8],
|
||||
#[doc(hidden)]
|
||||
/// Current position within `bytes`.
|
||||
pub index: usize,
|
||||
}
|
||||
|
||||
impl<'de> BytesBinarySource<'de> {
|
||||
/// Constructs a [BytesBinarySource] from the given `u8` slice.
|
||||
#[inline(always)]
|
||||
pub fn new(bytes: &'de [u8]) -> Self {
|
||||
BytesBinarySource { bytes, index: 0 }
|
||||
|
@ -432,21 +537,29 @@ impl<'de> BinarySource<'de> for BytesBinarySource<'de> {
|
|||
}
|
||||
}
|
||||
|
||||
/// A combination of a [Reader] with presets governing its operation.
|
||||
pub struct ConfiguredReader<'de, N: NestedValue, R: Reader<'de, N>> {
|
||||
/// The underlying [Reader].
|
||||
pub reader: R,
|
||||
/// Configuration as to whether to include or discard annotations while reading.
|
||||
pub read_annotations: bool,
|
||||
phantom: PhantomData<&'de N>,
|
||||
}
|
||||
|
||||
impl<'de, N: NestedValue, R: Reader<'de, N>> ConfiguredReader<'de, N, R> {
|
||||
/// Constructs a [ConfiguredReader] based on the given `reader`.
|
||||
pub fn new(reader: R) -> Self {
|
||||
reader.configured(true)
|
||||
}
|
||||
|
||||
/// Updates the `read_annotations` field of `self`.
|
||||
pub fn set_read_annotations(&mut self, read_annotations: bool) {
|
||||
self.read_annotations = read_annotations
|
||||
}
|
||||
|
||||
/// Retrieve the next parseable value, treating end-of-input as an error.
|
||||
///
|
||||
/// Delegates directly to [Reader::demand_next].
|
||||
pub fn demand_next(&mut self) -> io::Result<N> {
|
||||
self.reader.demand_next(self.read_annotations)
|
||||
}
|
||||
|
|
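The `mark`/`restore` pair documented on `Reader` and `BinarySource` above exists to support backtracking. The following is a minimal, standalone sketch of that idea and not code from the crate: `SliceSource`, `try_literal` and `position` are hypothetical names, and a plain `usize` stands in for the associated `Mark` type.

```rust
/// Hypothetical slice-backed source, loosely modelled on the documented
/// `bytes` and `index` fields of `BytesBinarySource`.
pub struct SliceSource<'a> {
    bytes: &'a [u8],
    index: usize,
}

impl<'a> SliceSource<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        SliceSource { bytes, index: 0 }
    }

    /// Retrieve a marker for the current position (here simply the index).
    pub fn mark(&self) -> usize {
        self.index
    }

    /// Seek back to a previously saved position.
    pub fn restore(&mut self, mark: usize) {
        self.index = mark;
    }

    /// Return the next byte without advancing over it.
    pub fn peek(&self) -> Option<u8> {
        self.bytes.get(self.index).copied()
    }

    /// Skip the next byte.
    pub fn skip(&mut self) {
        self.index += 1;
    }

    /// Current position, exposed only so the sketch can be inspected.
    pub fn position(&self) -> usize {
        self.index
    }

    /// Try to consume `literal`. On a mismatch, rewind to the saved mark so
    /// that the failed attempt consumes no input.
    pub fn try_literal(&mut self, literal: &[u8]) -> bool {
        let saved = self.mark();
        for &expected in literal {
            if self.peek() != Some(expected) {
                self.restore(saved);
                return false;
            }
            self.skip();
        }
        true
    }
}
```

A parser combinator layered on top can call `try_literal` for each candidate in turn, because a failed attempt leaves the position where it started.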
File diff suppressed because it is too large
|
@ -1,6 +1,10 @@
|
|||
//! Support for Serde serialization of Rust data types into Preserves *values* (not syntax).
|
||||
|
||||
use crate::value::{repr::Record, IOValue, Map, Value};
|
||||
use serde::Serialize;
|
||||
|
||||
/// Empty/placeholder type for representing serialization errors: serialization to values
|
||||
/// cannot fail.
|
||||
#[derive(Debug)]
|
||||
pub enum Error {}
|
||||
impl serde::ser::Error for Error {
|
||||
|
@ -20,17 +24,22 @@ impl std::fmt::Display for Error {
|
|||
|
||||
type Result<T> = std::result::Result<T, Error>;
|
||||
|
||||
/// Serde serializer for converting Rust data to in-memory Preserves values, which can then be
|
||||
/// serialized using text or binary syntax, analyzed further, etc.
|
||||
pub struct Serializer;
|
||||
|
||||
#[doc(hidden)]
|
||||
pub struct SerializeDictionary {
|
||||
next_key: Option<IOValue>,
|
||||
items: Map<IOValue, IOValue>,
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub struct SerializeRecord {
|
||||
r: Record<IOValue>,
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub struct SerializeSequence {
|
||||
vec: Vec<IOValue>,
|
||||
}
|
||||
|
@ -359,6 +368,7 @@ impl serde::ser::SerializeSeq for SerializeSequence {
|
|||
}
|
||||
}
|
||||
|
||||
/// Convenience function for directly converting a Serde-serializable `T` to an [IOValue].
|
||||
pub fn to_value<T>(value: T) -> IOValue
|
||||
where
|
||||
T: Serialize,
|
||||
|
|
|
@ -1,3 +1,6 @@
|
|||
//! Representation of Preserves `SignedInteger`s as [i128]/[u128] (if they fit) or [BigInt] (if
|
||||
//! they don't).
|
||||
|
||||
use num::bigint::BigInt;
|
||||
use num::traits::cast::ToPrimitive;
|
||||
use num::traits::sign::Signed;
|
||||
|
@ -7,8 +10,10 @@ use std::convert::TryFrom;
|
|||
use std::convert::TryInto;
|
||||
use std::fmt;
|
||||
|
||||
// Invariant: if I128 can be used, it will be; otherwise, if U128 can
|
||||
// be used, it will be; otherwise, Big will be used.
|
||||
/// Internal representation of Preserves `SignedInteger`s.
|
||||
///
|
||||
/// Invariant: if I128 can be used, it will be; otherwise, if U128 can be used, it will be;
|
||||
/// otherwise, Big will be used.
|
||||
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
|
||||
pub enum SignedIntegerRepr {
|
||||
I128(i128),
|
||||
|
@ -16,6 +21,7 @@ pub enum SignedIntegerRepr {
|
|||
Big(Box<BigInt>),
|
||||
}
|
||||
|
||||
/// Main representation of Preserves `SignedInteger`s.
|
||||
#[derive(Clone, PartialEq, Eq, Hash)]
|
||||
pub struct SignedInteger(SignedIntegerRepr);
|
||||
|
||||
|
@ -87,18 +93,25 @@ impl PartialOrd for SignedInteger {
|
|||
}
|
||||
|
||||
impl SignedInteger {
|
||||
/// Extract the internal representation.
|
||||
pub fn repr(&self) -> &SignedIntegerRepr {
|
||||
&self.0
|
||||
}
|
||||
|
||||
/// Does this `SignedInteger` fit in an [i128]? (See also [the TryFrom instance for
|
||||
/// i128](#impl-TryFrom<%26SignedInteger>-for-i128).)
|
||||
pub fn is_i(&self) -> bool {
|
||||
matches!(self.0, SignedIntegerRepr::I128(_))
|
||||
}
|
||||
|
||||
/// Does this `SignedInteger` fit in a [u128], but not an [i128]? (See also [the TryFrom
|
||||
/// instance for u128](#impl-TryFrom<%26SignedInteger>-for-u128).)
|
||||
pub fn is_u(&self) -> bool {
|
||||
matches!(self.0, SignedIntegerRepr::U128(_))
|
||||
}
|
||||
|
||||
/// Does this `SignedInteger` fit neither in a [u128] nor an [i128]? (See also [the From
|
||||
/// instance for BigInt](#impl-From<%26'a+SignedInteger>-for-BigInt).)
|
||||
pub fn is_big(&self) -> bool {
|
||||
matches!(self.0, SignedIntegerRepr::Big(_))
|
||||
}
|
||||
|
|
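The invariant stated for `SignedIntegerRepr` (prefer `I128`, then `U128`, then `Big`) can be illustrated with a small standalone sketch. It is not the crate's implementation: `Repr` and `classify` are hypothetical names, the input is assumed to be a decimal string, and a `String` stands in for `BigInt` so the example needs only the standard library.

```rust
/// Hypothetical mirror of the three-way representation described above.
#[derive(Debug, PartialEq, Eq)]
pub enum Repr {
    I128(i128),
    U128(u128),
    /// Stand-in for a boxed `BigInt`: keeps the decimal digits as text.
    Big(String),
}

/// Pick the narrowest representation for a decimal integer literal.
/// Returns `None` if the input is not an optional `-` followed by digits.
pub fn classify(decimal: &str) -> Option<Repr> {
    let digits = decimal.strip_prefix('-').unwrap_or(decimal);
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // If I128 can be used, it is used.
    if let Ok(i) = decimal.parse::<i128>() {
        return Some(Repr::I128(i));
    }
    // Otherwise, if U128 can be used, it is used.
    if let Ok(u) = decimal.parse::<u128>() {
        return Some(Repr::U128(u));
    }
    // Otherwise fall back to the arbitrary-precision case.
    Some(Repr::Big(decimal.to_owned()))
}
```

Because the checks run in this fixed order, each integer has exactly one representation, which is what lets predicates such as `is_i`, `is_u` and `is_big` be simple pattern matches.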
|
@ -1,3 +1,5 @@
|
|||
#![doc(hidden)]
|
||||
|
||||
use std::ops::{Deref, DerefMut};
|
||||
|
||||
pub enum Suspendable<T> {
|
||||
|
|
|
@ -1,3 +1,15 @@
|
|||
//! Implements the Preserves [human-oriented text
|
||||
//! syntax](https://preserves.dev/preserves-text.html).
|
||||
//!
|
||||
//! The main entry points for reading are functions [iovalue_from_str],
|
||||
//! [annotated_iovalue_from_str], [from_str], and [annotated_from_str].
|
||||
//!
|
||||
//! The main entry points for writing are [TextWriter::encode_iovalue] and
|
||||
//! [TextWriter::encode].
|
||||
//!
|
||||
//! # Summary of Text Syntax
|
||||
#![doc = include_str!("../../../doc/cheatsheet-text-plaintext.md")]
|
||||
|
||||
pub mod reader;
|
||||
pub mod writer;
|
||||
|
||||
|
@ -10,6 +22,7 @@ use std::io;
|
|||
|
||||
use super::{DomainParse, IOValue, IOValueDomainCodec, NestedValue, Reader, ViaCodec};
|
||||
|
||||
/// Reads a value from the given string using the text syntax, discarding annotations.
|
||||
pub fn from_str<N: NestedValue, Dec: DomainParse<N::Embedded>>(
|
||||
s: &str,
|
||||
decode_embedded: Dec,
|
||||
|
@ -17,10 +30,12 @@ pub fn from_str<N: NestedValue, Dec: DomainParse<N::Embedded>>(
|
|||
TextReader::new(&mut BytesBinarySource::new(s.as_bytes()), decode_embedded).demand_next(false)
|
||||
}
|
||||
|
||||
/// Reads an [IOValue] from the given string using the text syntax, discarding annotations.
|
||||
pub fn iovalue_from_str(s: &str) -> io::Result<IOValue> {
|
||||
from_str(s, ViaCodec::new(IOValueDomainCodec))
|
||||
}
|
||||
|
||||
/// As [from_str], but includes annotations.
|
||||
pub fn annotated_from_str<N: NestedValue, Dec: DomainParse<N::Embedded>>(
|
||||
s: &str,
|
||||
decode_embedded: Dec,
|
||||
|
@ -28,6 +43,7 @@ pub fn annotated_from_str<N: NestedValue, Dec: DomainParse<N::Embedded>>(
|
|||
TextReader::new(&mut BytesBinarySource::new(s.as_bytes()), decode_embedded).demand_next(true)
|
||||
}
|
||||
|
||||
/// As [iovalue_from_str], but includes annotations.
|
||||
pub fn annotated_iovalue_from_str(s: &str) -> io::Result<IOValue> {
|
||||
annotated_from_str(s, ViaCodec::new(IOValueDomainCodec))
|
||||
}
|
||||
|
|
|
@ -1,3 +1,5 @@
|
|||
//! Implementation of [Reader] for the text syntax.
|
||||
|
||||
use crate::error::io_syntax_error;
|
||||
use crate::error::is_eof_io_error;
|
||||
use crate::error::syntax_error;
|
||||
|
@ -35,8 +37,11 @@ use std::io;
|
|||
use std::iter::FromIterator;
|
||||
use std::marker::PhantomData;
|
||||
|
||||
/// The text syntax Preserves reader.
|
||||
pub struct TextReader<'de, 'src, D: Embeddable, Dec: DomainParse<D>, S: BinarySource<'de>> {
|
||||
/// Underlying source of (utf8) bytes.
|
||||
pub source: &'src mut S,
|
||||
/// Decoder for producing Rust values embedded in the text.
|
||||
pub dec: Dec,
|
||||
phantom: PhantomData<&'de D>,
|
||||
}
|
||||
|
@ -56,6 +61,7 @@ fn append_codepoint(bs: &mut Vec<u8>, n: u32) -> io::Result<()> {
|
|||
impl<'de, 'src, D: Embeddable, Dec: DomainParse<D>, S: BinarySource<'de>>
|
||||
TextReader<'de, 'src, D, Dec, S>
|
||||
{
|
||||
/// Construct a new reader from a byte (utf8) source and embedded-value decoder.
|
||||
pub fn new(source: &'src mut S, dec: Dec) -> Self {
|
||||
TextReader {
|
||||
source,
|
||||
|
@ -155,6 +161,7 @@ impl<'de, 'src, D: Embeddable, Dec: DomainParse<D>, S: BinarySource<'de>>
|
|||
}
|
||||
}
|
||||
|
||||
/// Retrieve the next [IOValue] in the input stream.
|
||||
pub fn next_iovalue(&mut self, read_annotations: bool) -> io::Result<IOValue> {
|
||||
let mut r = TextReader::new(self.source, ViaCodec::new(IOValueDomainCodec));
|
||||
let v = r.demand_next(read_annotations)?;
|
||||
|
|
|
@ -1,3 +1,5 @@
|
|||
//! Implementation of [Writer] for the text syntax.
|
||||
|
||||
use crate::hex::HexFormatter;
|
||||
use crate::value::suspendable::Suspendable;
|
||||
use crate::value::writer::CompoundWriter;
|
||||
|
@ -15,17 +17,26 @@ use std::io;
|
|||
|
||||
use super::super::boundary as B;
|
||||
|
||||
/// Specifies a comma style for printing using [TextWriter].
|
||||
#[derive(Clone, Copy, Debug)]
|
||||
pub enum CommaStyle {
|
||||
/// No commas will be printed. (Preserves text syntax treats commas as whitespace (!).)
|
||||
None,
|
||||
/// Commas will be used to separate subterms.
|
||||
Separating,
|
||||
/// Commas will be used to terminate subterms.
|
||||
Terminating,
|
||||
}
|
||||
|
||||
/// The (optionally pretty-printing) text syntax Preserves writer.
|
||||
pub struct TextWriter<W: io::Write> {
|
||||
w: Suspendable<W>,
|
||||
/// Selects a comma style to use when printing.
|
||||
pub comma_style: CommaStyle,
|
||||
/// Specifies indentation to use when pretty-printing; 0 disables pretty-printing.
|
||||
pub indentation: usize,
|
||||
/// An aid to use of printed terms in shell scripts: set `true` to escape spaces embedded
|
||||
/// in strings and symbols.
|
||||
pub escape_spaces: bool,
|
||||
indent: String,
|
||||
}
|
||||
|
@ -37,6 +48,8 @@ impl std::default::Default for CommaStyle {
|
|||
}
|
||||
|
||||
impl TextWriter<&mut Vec<u8>> {
|
||||
/// Writes `v` to `f` using text syntax. Selects indentation mode based on
|
||||
/// [`f.alternate()`][std::fmt::Formatter::alternate].
|
||||
pub fn fmt_value<N: NestedValue, Enc: DomainEncode<N::Embedded>>(
|
||||
f: &mut std::fmt::Formatter<'_>,
|
||||
enc: &mut Enc,
|
||||
|
@ -52,6 +65,7 @@ impl TextWriter<&mut Vec<u8>> {
|
|||
.map_err(|_| io::Error::new(io::ErrorKind::Other, "could not append to Formatter"))
|
||||
}
|
||||
|
||||
/// Encode `v` to a [String].
|
||||
pub fn encode<N: NestedValue, Enc: DomainEncode<N::Embedded>>(
|
||||
enc: &mut Enc,
|
||||
v: &N,
|
||||
|
@ -61,12 +75,14 @@ impl TextWriter<&mut Vec<u8>> {
|
|||
Ok(String::from_utf8(buf).expect("valid UTF-8 from TextWriter"))
|
||||
}
|
||||
|
||||
/// Encode [IOValue] `v` to a [String].
|
||||
pub fn encode_iovalue(v: &IOValue) -> io::Result<String> {
|
||||
Self::encode(&mut IOValueDomainCodec, v)
|
||||
}
|
||||
}
|
||||
|
||||
impl<W: io::Write> TextWriter<W> {
|
||||
/// Construct a writer from the given byte sink `w`.
|
||||
pub fn new(w: W) -> Self {
|
||||
TextWriter {
|
||||
w: Suspendable::new(w),
|
||||
|
@ -77,16 +93,19 @@ impl<W: io::Write> TextWriter<W> {
|
|||
}
|
||||
}
|
||||
|
||||
/// Update selected comma-printing style.
|
||||
pub fn set_comma_style(mut self, v: CommaStyle) -> Self {
|
||||
self.comma_style = v;
|
||||
self
|
||||
}
|
||||
|
||||
/// Update selected space-escaping style.
|
||||
pub fn set_escape_spaces(mut self, v: bool) -> Self {
|
||||
self.escape_spaces = v;
|
||||
self
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn suspend(&mut self) -> Self {
|
||||
TextWriter {
|
||||
w: self.w.suspend(),
|
||||
|
@ -95,10 +114,12 @@ impl<W: io::Write> TextWriter<W> {
|
|||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn resume(&mut self, other: Self) {
|
||||
self.w.resume(other.w)
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn write_stringlike_char_fallback<F>(&mut self, c: char, f: F) -> io::Result<()>
|
||||
where
|
||||
F: FnOnce(&mut W, char) -> io::Result<()>,
|
||||
|
@ -114,22 +135,26 @@ impl<W: io::Write> TextWriter<W> {
|
|||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn write_stringlike_char(&mut self, c: char) -> io::Result<()> {
|
||||
self.write_stringlike_char_fallback(c, |w, c| write!(w, "{}", c))
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn add_indent(&mut self) {
|
||||
for _ in 0..self.indentation {
|
||||
self.indent.push(' ')
|
||||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn del_indent(&mut self) {
|
||||
if self.indentation > 0 {
|
||||
self.indent.truncate(self.indent.len() - self.indentation)
|
||||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn indent(&mut self) -> io::Result<()> {
|
||||
if self.indentation > 0 {
|
||||
write!(self.w, "{}", &self.indent)
|
||||
|
@ -138,6 +163,7 @@ impl<W: io::Write> TextWriter<W> {
|
|||
}
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
pub fn indent_sp(&mut self) -> io::Result<()> {
|
||||
if self.indentation > 0 {
|
||||
write!(self.w, "{}", &self.indent)
|
||||
|
@ -146,6 +172,7 @@ impl<W: io::Write> TextWriter<W> {
|
|||
}
|
||||
}
|
||||
|
||||
/// Borrow the underlying byte sink.
|
||||
pub fn borrow_write(&mut self) -> &mut W {
|
||||
&mut self.w
|
||||
}
|
||||
|
|
|
@ -1,3 +1,6 @@
|
|||
//! Generic [Writer] trait for unparsing Preserves [Value]s, implemented by code that provides
|
||||
//! each specific transfer syntax.
|
||||
|
||||
use super::boundary as B;
|
||||
use super::repr::{Double, Float, NestedValue, Value};
|
||||
use super::signed_integer::SignedIntegerRepr;
|
||||
|
@ -5,61 +8,103 @@ use super::DomainEncode;
|
|||
use num::bigint::BigInt;
|
||||
use std::io;
|
||||
|
||||
#[doc(hidden)]
|
||||
/// Utility trait for tracking unparser state during production of compound `Value`s.
|
||||
pub trait CompoundWriter: Writer {
|
||||
fn boundary(&mut self, b: &B::Type) -> io::Result<()>;
|
||||
}
|
||||
|
||||
/// Generic unparser for Preserves.
|
||||
pub trait Writer: Sized {
|
||||
// Hiding these from the documentation for the moment because I don't want to have to
|
||||
// document the whole Boundary thing.
|
||||
#[doc(hidden)]
|
||||
type AnnWriter: CompoundWriter;
|
||||
#[doc(hidden)]
|
||||
type RecWriter: CompoundWriter;
|
||||
#[doc(hidden)]
|
||||
type SeqWriter: CompoundWriter;
|
||||
#[doc(hidden)]
|
||||
type SetWriter: CompoundWriter;
|
||||
#[doc(hidden)]
|
||||
type DictWriter: CompoundWriter;
|
||||
#[doc(hidden)]
|
||||
type EmbeddedWriter: Writer;
|
||||
|
||||
#[doc(hidden)]
|
||||
fn start_annotations(&mut self) -> io::Result<Self::AnnWriter>;
|
||||
#[doc(hidden)]
|
||||
fn end_annotations(&mut self, ann: Self::AnnWriter) -> io::Result<()>;
|
||||
|
||||
#[doc(hidden)]
|
||||
fn write_bool(&mut self, v: bool) -> io::Result<()>;
|
||||
|
||||
#[doc(hidden)]
|
||||
fn write_f32(&mut self, v: f32) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_f64(&mut self, v: f64) -> io::Result<()>;
|
||||
|
||||
#[doc(hidden)]
|
||||
fn write_i8(&mut self, v: i8) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_u8(&mut self, v: u8) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_i16(&mut self, v: i16) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_u16(&mut self, v: u16) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_i32(&mut self, v: i32) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_u32(&mut self, v: u32) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_i64(&mut self, v: i64) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_u64(&mut self, v: u64) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_i128(&mut self, v: i128) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_u128(&mut self, v: u128) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_int(&mut self, v: &BigInt) -> io::Result<()>;
|
||||
|
||||
#[doc(hidden)]
|
||||
fn write_string(&mut self, v: &str) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_bytes(&mut self, v: &[u8]) -> io::Result<()>;
|
||||
#[doc(hidden)]
|
||||
fn write_symbol(&mut self, v: &str) -> io::Result<()>;
|
||||
|
||||
#[doc(hidden)]
|
||||
fn start_record(&mut self, field_count: Option<usize>) -> io::Result<Self::RecWriter>;
|
||||
#[doc(hidden)]
|
||||
fn end_record(&mut self, rec: Self::RecWriter) -> io::Result<()>;
|
||||
|
||||
#[doc(hidden)]
|
||||
fn start_sequence(&mut self, item_count: Option<usize>) -> io::Result<Self::SeqWriter>;
|
||||
#[doc(hidden)]
|
||||
fn end_sequence(&mut self, seq: Self::SeqWriter) -> io::Result<()>;
|
||||
|
||||
#[doc(hidden)]
|
||||
fn start_set(&mut self, item_count: Option<usize>) -> io::Result<Self::SetWriter>;
|
||||
#[doc(hidden)]
|
||||
fn end_set(&mut self, set: Self::SetWriter) -> io::Result<()>;
|
||||
|
||||
#[doc(hidden)]
|
||||
fn start_dictionary(&mut self, entry_count: Option<usize>) -> io::Result<Self::DictWriter>;
|
||||
#[doc(hidden)]
|
||||
fn end_dictionary(&mut self, dict: Self::DictWriter) -> io::Result<()>;
|
||||
|
||||
#[doc(hidden)]
|
||||
fn start_embedded(&mut self) -> io::Result<Self::EmbeddedWriter>;
|
||||
#[doc(hidden)]
|
||||
fn end_embedded(&mut self, ptr: Self::EmbeddedWriter) -> io::Result<()>;
|
||||
|
||||
/// Flushes any buffered output.
|
||||
fn flush(&mut self) -> io::Result<()>;
|
||||
|
||||
//---------------------------------------------------------------------------
|
||||
|
||||
/// Writes [NestedValue] `v` to the output of this [Writer].
|
||||
fn write<N: NestedValue, Enc: DomainEncode<N::Embedded>>(
|
||||
&mut self,
|
||||
enc: &mut Enc,
|
||||
|
@ -88,6 +133,7 @@ pub trait Writer: Sized {
|
|||
Ok(())
|
||||
}
|
||||
|
||||
/// Writes [Value] `v` to the output of this [Writer].
|
||||
fn write_value<N: NestedValue, Enc: DomainEncode<N::Embedded>>(
|
||||
&mut self,
|
||||
enc: &mut Enc,
|
||||
|
@ -167,6 +213,13 @@ pub trait Writer: Sized {
|
|||
}
|
||||
}
|
||||
|
||||
/// Writes a [varint](https://protobuf.dev/programming-guides/encoding/#varints) to `w`.
|
||||
/// Returns the number of bytes written.
|
||||
///
|
||||
/// ```text
|
||||
/// varint(n) = [n] if n < 128
|
||||
/// [(n & 127) | 128] ++ varint(n >> 7) if n ≥ 128
|
||||
/// ```
|
||||
pub fn varint<W: io::Write>(w: &mut W, mut v: u64) -> io::Result<usize> {
|
||||
let mut byte_count = 0;
|
||||
loop {
|
||||
|
|
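The recurrence in the `varint` doc comment above can be written out as a standalone sketch. The body of the crate's function is cut off in this diff, so what follows is an illustration of the documented formula and not a copy of it; `write_varint` and `read_varint` are hypothetical names, and the decoder is added only to show the round trip.

```rust
use std::io::{self, Write};

/// Write `v` as a varint: seven payload bits per byte, least significant
/// group first, with the high bit set on every byte except the last.
/// Returns the number of bytes written.
pub fn write_varint<W: Write>(w: &mut W, mut v: u64) -> io::Result<usize> {
    let mut byte_count = 0;
    loop {
        byte_count += 1;
        if v < 128 {
            // Final byte: high bit clear.
            w.write_all(&[v as u8])?;
            return Ok(byte_count);
        }
        // Continuation byte: low seven bits with the high bit set.
        w.write_all(&[((v & 0x7f) | 0x80) as u8])?;
        v >>= 7;
    }
}

/// Decode a varint from the front of `bytes`, returning the value and the
/// number of bytes consumed. Returns `None` on truncated input or when the
/// encoding runs past ten bytes; excess bits in a tenth byte are not checked.
pub fn read_varint(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &b) in bytes.iter().enumerate() {
        let shift = 7 * i as u32;
        if shift >= 64 {
            return None;
        }
        value |= u64::from(b & 0x7f) << shift;
        if b & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

For example, 300 is `0b10_0101100`: the low seven bits (`0x2C`) go out first with the continuation bit set (`0xAC`), followed by the remaining bits (`0x02`).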
|
@ -4,7 +4,7 @@ title: "Preserves Schema"
|
|||
---
|
||||
|
||||
Tony Garnock-Jones <tonyg@leastfixedpoint.com>
|
||||
February 2023. Version 0.3.1.
|
||||
October 2023. Version 0.3.3.
|
||||
|
||||
[abnf]: https://tools.ietf.org/html/rfc7405
|
||||
|
||||
|
@ -189,12 +189,14 @@ with algebraic data types would produce a labelled-sum-of-products type.
|
|||
|
||||
### Alternation definitions.
|
||||
|
||||
OrPattern = AltPattern "/" AltPattern *("/" AltPattern)
|
||||
OrPattern = [orsep] AltPattern 1*(orsep AltPattern) [orsep]
|
||||
orsep = 1*"/"
|
||||
|
||||
The right-hand-side of a definition may supply two or more
|
||||
*alternatives*. When parsing, the alternatives are tried in order; the
|
||||
result of the first successful alternative is the result of the entire
|
||||
parse.
|
||||
The right-hand-side of a definition may supply two or more *alternatives*.
|
||||
Alternatives are separated by any number of slashes `/`, and leading or
|
||||
trailing slashes are ignored. When parsing, the alternatives are tried in
|
||||
order; the result of the first successful alternative is the result of the
|
||||
entire parse.
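
The separator rule in the revised `OrPattern` grammar can be sketched as a small standalone Rust function. This is an illustration under assumptions, not the schema compiler's code: `split_alternatives` and `is_separator` are hypothetical names, the input is assumed to be already tokenised, and any token made up solely of `/` characters is treated as one `orsep`.

```rust
/// A token counts as a separator if it is one or more slashes.
fn is_separator(token: &str) -> bool {
    !token.is_empty() && token.chars().all(|c| c == '/')
}

/// Split a definition's right-hand side into alternatives.
/// Runs of separators collapse, and leading or trailing separators are
/// ignored. Returns `None` unless at least two alternatives remain, since
/// an `OrPattern` requires two or more.
pub fn split_alternatives<'a>(tokens: &[&'a str]) -> Option<Vec<Vec<&'a str>>> {
    let alternatives: Vec<Vec<&'a str>> = tokens
        .split(|t| is_separator(t))
        // Empty runs come from leading, trailing or repeated separators.
        .filter(|alt| !alt.is_empty())
        .map(|alt| alt.to_vec())
        .collect();
    if alternatives.len() >= 2 {
        Some(alternatives)
    } else {
        None
    }
}
```

The order of the returned alternatives matches the order in the source, which matters because alternatives are tried in order when parsing.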
|
||||
|
||||
**Host-language types.** The type corresponding to an `OrPattern` is an
|
||||
algebraic sum type, a union type, a variant type, or a concrete subclass
|
||||
|
@ -205,31 +207,39 @@ definition-unique *name*. The name is used to uniquely label the
|
|||
alternative's host-language representation (for example, a subclass, or
|
||||
a member of a tagged union type).
|
||||
|
||||
A variant name can either be given explicitly as `@name` (see discussion
|
||||
of `NamedPattern` below) or inferred. It can only be inferred from the
|
||||
label of a record pattern, from the name of a reference to another
|
||||
definition, or from the text of a "sufficiently identifierlike" literal
|
||||
pattern - one that matches a string, symbol, number or boolean:
|
||||
A variant name can either be given explicitly as `@name` or
|
||||
inferred.[^variant-names-unlike-binding-names] It can only be inferred
|
||||
from the label of a record pattern, from the name of a reference to
another definition, or from the text of a "sufficiently identifierlike"
literal pattern - one that matches a string, symbol, number or boolean:

    AltPattern = "@" id SimplePattern
    AltPattern = "@" id Pattern
               / "<" id PatternSequence ">"
               / Ref
               / LiteralPattern -- with a side condition

A host language will likely use the same ordering of its types as
specified by the schema. It is therefore recommended to specify first
the alternative best suited as a default initialization value (if
[^variant-names-unlike-binding-names]: Note that explicitly-given
*variant* names are unlike *binding* names in that binding names give
rise to a field in the record type for a definition, while variant
names are used as labels for alternatives in a sum type for a
definition.

A host language will likely use the same ordering of variants in a sum
type as specified by the schema. It is therefore recommended to specify
first the alternative best suited as a default initialization value (if
there is any).

### Intersection definitions.

    AndPattern = NamedPattern "&" NamedPattern *("&" NamedPattern)
    AndPattern = [andsep] NamedPattern 1*(andsep NamedPattern) [andsep]
    andsep = 1*"&"

The right-hand-side of a definition may supply two or more patterns, the
*intersection* of whose denotations is the denotation of the overall
definition. When parsing, every pattern is tried: if all succeed, the
resulting information is combined into a single type; otherwise, the
overall parse fails.
definition. The patterns are separated by any number of ampersands `&`,
and leading or trailing ampersands are ignored. When parsing, every
pattern is tried: if all succeed, the resulting information is combined
into a single type; otherwise, the overall parse fails.

When serializing, the terms resulting from serializing at each pattern
are *merged* together.
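
The separator rule in the revised `AndPattern` grammar can be illustrated with a minimal Python sketch. It assumes the right-hand side has already been split into top-level tokens, with each `&` as its own token; the function name `split_and_pattern` and the token representation are hypothetical and not part of any Preserves implementation.

```python
def split_and_pattern(tokens):
    """Group top-level tokens into patterns, splitting on runs of "&".

    Mirrors: AndPattern = [andsep] NamedPattern 1*(andsep NamedPattern) [andsep]
             andsep     = 1*"&"
    `tokens` is a hypothetical, already-tokenized right-hand side.
    """
    groups, current = [], []
    for tok in tokens:
        if tok == "&":
            # Any run of ampersands acts as one separator; leading and
            # trailing runs produce no empty group.
            if current:
                groups.append(current)
                current = []
        else:
            current.append(tok)
    if current:
        groups.append(current)
    if len(groups) < 2:
        # The grammar requires at least two NamedPatterns.
        raise ValueError("an intersection needs at least two patterns")
    return groups
```

Under these assumptions, `["&", "A", "&", "&", "B", "&"]` and `["A", "&", "B"]` yield the same two groups, which is the effect of ignoring repeated, leading, and trailing ampersands.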
@ -23,14 +23,29 @@ ABNF allows easy definition of US-ASCII-based languages. However,
Preserves is a Unicode-based language. Therefore, we reinterpret ABNF as
a grammar for recognising sequences of Unicode scalar values.

<a id="encoding"></a>
**Encoding.** Textual syntax for a `Value` *SHOULD* be encoded using
UTF-8 where possible.

<a id="whitespace"></a>
**Whitespace.** Whitespace is defined as any number of spaces, tabs,
carriage returns, line feeds, or commas.

    ws = *(%x20 / %x09 / CR / LF / ",")

<a id="delimiters"></a>
**Delimiters.** Some tokens (`Boolean`, `SymbolOrNumber`) *MUST* be
followed by a `delimiter` or by the end of the input.[^delimiters-lookahead]

    delimiter = ws
              / "<" / ">" / "[" / "]" / "{" / "}"
              / "#" / ":" / DQUOTE / "|" / "@" / ";"

[^delimiters-lookahead]: The addition of this constraint means that
implementations must now use some kind of lookahead to make sure a
delimiter follows a `Boolean`; this should not be onerous, as
something similar is required to read `SymbolOrNumber`s correctly.
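
The lookahead that the footnote describes might look like the following Python sketch. The character sets are taken from the `ws` and `delimiter` rules above; the spelling of `Boolean` as `#t` / `#f` is an assumption about the rest of the text syntax (it is not shown in this hunk), and `is_delimited` and `read_boolean` are hypothetical helper names.

```python
# Characters from the `ws` and `delimiter` rules above.
WS = set(" \t\r\n,")
DELIMITERS = WS | set('<>[]{}#:"|@;')

def is_delimited(text, end):
    """True if index `end` is the end of input or holds a delimiter."""
    return end >= len(text) or text[end] in DELIMITERS

def read_boolean(text, pos=0):
    """Read a Boolean at `pos`, returning (value, next_pos).

    Assumes Booleans are written `#t` and `#f`; one character of
    lookahead enforces the MUST-be-followed-by-delimiter rule.
    """
    for literal, value in (("#t", True), ("#f", False)):
        if text.startswith(literal, pos):
            end = pos + len(literal)
            if not is_delimited(text, end):
                raise ValueError(f"Boolean at {pos} is not followed by a delimiter")
            return value, end
    raise ValueError(f"no Boolean at position {pos}")
```

With this check, `#t]` and `#f,#t` read cleanly, while an input such as `#true` is rejected rather than being read as `#t` followed by further text.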

## Grammar

Standalone documents may have trailing whitespace.
@ -109,7 +109,7 @@ label, then by field sequence.
labels as specially-formatted lists.

[^iri-labels]: It is occasionally (but seldom) necessary to
interpret such `Symbol` labels as UTF-8 encoded IRIs. Where a
interpret such `Symbol` labels as IRIs. Where a
label can be read as a relative IRI, it is notionally interpreted
with respect to the IRI
`urn:uuid:6bf094a6-20f1-4887-ada7-46834a9b5b34`; where a label can
12
questions.md
@ -5,10 +5,17 @@ title: "Open questions"
Q. Should "symbols" instead be URIs? Relative, usually; relative to
what? Some domain-specific base URI?

> No. They may be interpreted as URIs, of course; see
> [here](preserves.html#fn:iri-labels).

Q. Literal small integers: are they pulling their weight? They're not
absolutely necessary. A. No, they have been removed (as part of the changes
at version 0.990).

> No. They were removed in the simplification of the syntax that was the
> outcome of [issue
> 41](https://gitlab.com/preserves/preserves/-/issues/41).

Q. Should we go for trying to make the data ordering line up with the
encoding ordering? We'd have to only use streaming forms, and avoid
the small integer encoding, and not store record arities, and sort
@ -38,3 +45,8 @@ require any whitespace at all between elements of a list, making it
ambiguous: does `[123]` denote a single-element or a three-element
list? Compare JSON where `[1,2,3]` is unambiguously different from
`[123]`.

> With the addition of the notion of
> [delimiters](preserves-text.html#delimiters) to the text syntax, we at
> least answer the question of how `[123]` parses: it must yield a
> single-element list.
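
To make the answer concrete, here is a toy Python tokenizer built only on the whitespace and delimiter sets from the text syntax. It is a sketch under simplifying assumptions: every punctuation delimiter becomes its own token, and strings, comments, and `#`-prefixed forms are not handled. The name `tokenize` is hypothetical.

```python
# Whitespace (including comma) and punctuation delimiters from the text syntax.
WS = set(" \t\r\n,")
PUNCT = set('<>[]{}#:"|@;')

def tokenize(text):
    """Split `text` into punctuation tokens and maximal runs of other characters."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch in WS:
            i += 1                      # whitespace only separates tokens
        elif ch in PUNCT:
            tokens.append(ch)           # toy rule: one token per delimiter
            i += 1
        else:
            start = i
            # An atom extends until the next delimiter or the end of input.
            while i < len(text) and text[i] not in WS and text[i] not in PUNCT:
                i += 1
            tokens.append(text[start:i])
    return tokens
```

Because no delimiter occurs between the digits, `tokenize("[123]")` produces a single atom `123` between the brackets, whereas `[1 2 3]` and `[1,2,3]` each produce three atoms.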