Cosmetic.

This commit is contained in:
Tony Garnock-Jones 2019-07-03 19:35:56 -04:00
parent 4a70364eda
commit 7861341951
1 changed files with 31 additions and 31 deletions

@ -182,7 +182,7 @@ equivalent compact machine-readable syntax.
`null` are all read as `Symbol`s, and that `SignedInteger`s are `null` are all read as `Symbol`s, and that `SignedInteger`s are
never read as `Double`s. never read as `Double`s.
### Character set ### Character set.
[ABNF][abnf] allows easy definition of US-ASCII-based languages. [ABNF][abnf] allows easy definition of US-ASCII-based languages.
However, Preserves is a Unicode-based language. Therefore, we However, Preserves is a Unicode-based language. Therefore, we
@ -192,7 +192,7 @@ code points.
Textual syntax for a `Value` *SHOULD* be encoded using UTF-8 where Textual syntax for a `Value` *SHOULD* be encoded using UTF-8 where
possible. possible.
### Whitespace ### Whitespace.
Whitespace is defined as any number of spaces, tabs, carriage returns, Whitespace is defined as any number of spaces, tabs, carriage returns,
line feeds, or commas. line feeds, or commas.
@ -200,7 +200,7 @@ line feeds, or commas.
ws = *(%x20 / %x09 / newline / ",") ws = *(%x20 / %x09 / newline / ",")
newline = CR / LF newline = CR / LF
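
A minimal sketch of the `ws` rule above, assuming a reader that works on an in-memory string. The name `skip_ws` is hypothetical and not part of the specification.

```python
# Characters matched by the `ws` rule: space (%x20), tab (%x09),
# newline (CR or LF), and comma.
WS_CHARS = frozenset(" \t\r\n,")


def skip_ws(text: str, pos: int = 0) -> int:
    """Return the index of the first non-whitespace character at or after pos."""
    while pos < len(text) and text[pos] in WS_CHARS:
        pos += 1
    return pos
```

Because the rule lists commas alongside spaces, a reader built this way treats `1, 2` and `1 2` as the same pair of tokens.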
### Grammar ### Grammar.
Standalone documents may have trailing whitespace. Standalone documents may have trailing whitespace.
@ -427,7 +427,7 @@ encoded details of the `Value` itself.
For a value `v`, we write `[[v]]` for the `Repr` of v. For a value `v`, we write `[[v]]` for the `Repr` of v.
### Type and Length representation ### Type and Length representation.
Each `Repr` takes one of three possible forms: Each `Repr` takes one of three possible forms:
@ -448,7 +448,7 @@ Each `Repr` takes one of three possible forms:
Applications may choose between formats B and C depending on their Applications may choose between formats B and C depending on their
needs at serialization time. needs at serialization time.
#### The lead byte #### The lead byte.
Every `Repr` starts with a *lead byte*, constructed by Every `Repr` starts with a *lead byte*, constructed by
`leadbyte(t,n,m)`, where `t`,`n`∈{0,1,2,3} and 0≤`m`<16: `leadbyte(t,n,m)`, where `t`,`n`∈{0,1,2,3} and 0≤`m`<16:
@ -469,11 +469,11 @@ representation:[^some-encodings-unused]
- `t`=2 (format B) represents a `Record`. - `t`=2 (format B) represents a `Record`.
- `t`=3 (format B) represents a `Sequence`, `Set` or `Dictionary`. - `t`=3 (format B) represents a `Sequence`, `Set` or `Dictionary`.
#### Encoding data of fixed length (format A) #### Encoding data of fixed length (format A).
Each specific type of data defines its own rules for this format. Each specific type of data defines its own rules for this format.
#### Encoding data of known length (format B) #### Encoding data of known length (format B).
A `Repr` where the length of the `Value` to be encoded is variable but A `Repr` where the length of the `Value` to be encoded is variable but
known uses the value of `m` in `leadbyte` to encode its length. The known uses the value of `m` in `leadbyte` to encode its length. The
@ -509,7 +509,7 @@ definition,
- 300 (binary, grouped into 7-bit chunks, `10 0101100`) varint-encodes to the two bytes 172 and 2. - 300 (binary, grouped into 7-bit chunks, `10 0101100`) varint-encodes to the two bytes 172 and 2.
- 1000000000 (binary `11 1011100 1101011 0010100 0000000`) varint-encodes to bytes 128, 148, 235, 220, and 3. - 1000000000 (binary `11 1011100 1101011 0010100 0000000`) varint-encodes to bytes 128, 148, 235, 220, and 3.
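
The two worked examples above can be reproduced with a short sketch. It assumes the usual base-128 scheme (least-significant 7-bit group first, high bit set on every byte except the last), which is consistent with both examples; the function name `varint` is illustrative.

```python
def varint(n: int) -> bytes:
    """Encode a non-negative integer as 7-bit groups, least significant first."""
    if n < 0:
        raise ValueError("varint is only defined here for non-negative integers")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            # More groups follow: set the continuation bit.
            out.append(low7 | 0x80)
        else:
            # Final group: continuation bit clear.
            out.append(low7)
            return bytes(out)
```

With this definition, `list(varint(300))` gives `[172, 2]` and `list(varint(1000000000))` gives `[128, 148, 235, 220, 3]`, matching the text.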
#### Streaming data of unknown length (format C) #### Streaming data of unknown length (format C).
A `Repr` where the length of the `Value` to be encoded is variable and A `Repr` where the length of the `Value` to be encoded is variable and
not known at the time serialization of the `Value` starts is encoded not known at the time serialization of the `Value` starts is encoded
@ -526,7 +526,7 @@ a format B `Repr` of a `ByteString`, no matter the type of the overall
For a `Repr` of a `Value` containing other `Value`s, each chunk is to For a `Repr` of a `Value` containing other `Value`s, each chunk is to
be a single `Repr`. be a single `Repr`.
### Records ### Records.
Format B (known length): Format B (known length):
@ -542,7 +542,7 @@ Format C (streaming):
Applications *SHOULD* prefer the known-length format for encoding Applications *SHOULD* prefer the known-length format for encoding
`Record`s. `Record`s.
#### Application-specific short form for labels #### Application-specific short form for labels.
Any given protocol using Preserves may additionally define an Any given protocol using Preserves may additionally define an
interpretation for `n`∈{0,1,2}, mapping each *short form label interpretation for `n`∈{0,1,2}, mapping each *short form label
@ -574,7 +574,7 @@ for format B, or
for format C. for format C.
### Sequences, Sets and Dictionaries ### Sequences, Sets and Dictionaries.
Format B (known length): Format B (known length):
@ -618,7 +618,7 @@ order.
Note that `header(3,3,m)` and `open(3,3)`/`close(3,3)` are unused and reserved. Note that `header(3,3,m)` and `open(3,3)`/`close(3,3)` are unused and reserved.
### SignedIntegers ### SignedIntegers.
Format B/A (known length/fixed-size): Format B/A (known length/fixed-size):
@ -653,7 +653,7 @@ For example,
[[ -127 ]] = 41 81 [[ 13 ]] = 41 0D [[ 65536 ]] = 43 01 00 00 [[ -127 ]] = 41 81 [[ 13 ]] = 41 0D [[ 65536 ]] = 43 01 00 00
[[ -4 ]] = 41 FC [[ 127 ]] = 41 7F [[ 131072 ]] = 43 02 00 00 [[ -4 ]] = 41 FC [[ 127 ]] = 41 7F [[ 131072 ]] = 43 02 00 00
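
The six examples above can be checked with a sketch of the known-length case. Two things here are assumptions read off the examples rather than quoted from the specification: the lead byte is taken to be `0x40 | m`, where `m` is the payload length in bytes, and the payload is taken to be the shortest big-endian two's-complement form. The format A case and payloads of 15 bytes or more are not visible in this excerpt and are not handled. The helper names are hypothetical.

```python
def signed_bytes(x: int) -> bytes:
    """Shortest big-endian two's-complement representation of x."""
    # For negative x, ~x is non-negative and needs the same number of
    # magnitude bits; one extra bit is reserved for the sign.
    bits = (x if x >= 0 else ~x).bit_length()
    return x.to_bytes(bits // 8 + 1, "big", signed=True)


def encode_int_known_length(x: int) -> bytes:
    """Lead byte (assumed 0x40 | length) followed by the payload bytes."""
    payload = signed_bytes(x)
    if len(payload) >= 15:
        raise ValueError("longer payloads need a length encoding not sketched here")
    return bytes([0x40 | len(payload)]) + payload
```

For instance, `encode_int_known_length(-127).hex()` is `"4181"` and `encode_int_known_length(65536).hex()` is `"43010000"`.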
### Strings, ByteStrings and Symbols ### Strings, ByteStrings and Symbols.
Syntax for these three types varies only in the value of `n` supplied Syntax for these three types varies only in the value of `n` supplied
to `header`, `open`, and `close`. In each case, the payload following to `header`, `open`, and `close`. In each case, the payload following
@ -676,19 +676,19 @@ then a sequence of zero or more format B chunks, followed by
While the overall content of a streamed `String` or `Symbol` must be While the overall content of a streamed `String` or `Symbol` must be
valid UTF-8, individual chunks do not have to conform to UTF-8. valid UTF-8, individual chunks do not have to conform to UTF-8.
### Fixed-length Atoms ### Fixed-length Atoms.
Fixed-length atoms all use format A, and do not have a length Fixed-length atoms all use format A, and do not have a length
representation. They repurpose the bits that format B `Repr`s use to representation. They repurpose the bits that format B `Repr`s use to
specify lengths. Applications *MUST NOT* use format C with specify lengths. Applications *MUST NOT* use format C with
`open(0,n)` or `close(0,n)` for any `n`. `open(0,n)` or `close(0,n)` for any `n`.
#### Booleans #### Booleans.
[[ #false ]] = header(0,0,0) = [0x00] [[ #false ]] = header(0,0,0) = [0x00]
[[ #true ]] = header(0,0,1) = [0x01] [[ #true ]] = header(0,0,1) = [0x01]
#### Floats and Doubles #### Floats and Doubles.
[[ F ]] when F ∈ Float = header(0,0,2) ++ binary32(F) [[ F ]] when F ∈ Float = header(0,0,2) ++ binary32(F)
[[ D ]] when D ∈ Double = header(0,0,3) ++ binary64(D) [[ D ]] when D ∈ Double = header(0,0,3) ++ binary64(D)
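
A sketch of the fixed-length encodings above using Python's `struct` module. It assumes that `header(0,0,2)` and `header(0,0,3)` are the single bytes `0x02` and `0x03`, extending the pattern the Boolean lines show for `header(0,0,0)` and `header(0,0,1)`; the function names are illustrative.

```python
import struct


def encode_boolean(b: bool) -> bytes:
    # header(0,0,0) = 0x00 for #false, header(0,0,1) = 0x01 for #true.
    return b"\x01" if b else b"\x00"


def encode_float(f: float) -> bytes:
    # Assumed lead byte 0x02, then big-endian IEEE 754 binary32.
    return b"\x02" + struct.pack(">f", f)


def encode_double(d: float) -> bytes:
    # Assumed lead byte 0x03, then big-endian IEEE 754 binary64.
    return b"\x03" + struct.pack(">d", d)
```

Under these assumptions, `encode_float(1.0).hex()` is `"023f800000"` and `encode_double(1.0).hex()` is `"033ff0000000000000"`.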
@ -698,7 +698,7 @@ The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
## Examples ## Examples
### Simple examples ### Simple examples.
<!-- TODO: Give some examples of large and small Preserves, perhaps --> <!-- TODO: Give some examples of large and small Preserves, perhaps -->
<!-- translated from various JSON blobs floating around the internet. --> <!-- translated from various JSON blobs floating around the internet. -->
@ -763,7 +763,7 @@ encodes to
--- ---
### JSON examples ### JSON examples.
The examples from The examples from
[RFC 8259](https://tools.ietf.org/html/rfc8259#section-13) read as [RFC 8259](https://tools.ietf.org/html/rfc8259#section-13) read as
@ -899,7 +899,7 @@ treat them specially.
and one which enforces validity (i.e. side-conditions) when reading, and one which enforces validity (i.e. side-conditions) when reading,
writing, or constructing `Value`s. writing, or constructing `Value`s.
### MIME-type tagged binary data ### MIME-type tagged binary data.
Many internet protocols use Many internet protocols use
[media types](https://tools.ietf.org/html/rfc6838) (a.k.a MIME types) [media types](https://tools.ietf.org/html/rfc6838) (a.k.a MIME types)
@ -928,7 +928,7 @@ form label number 1 were chosen, the second example above,
`mime(text/plain "ABC")`, would be encoded with "92" in place of "B3 `mime(text/plain "ABC")`, would be encoded with "92" in place of "B3
74 6D 69 6D 65". 74 6D 69 6D 65".
### Unicode normalization forms ### Unicode normalization forms.
Unicode defines multiple Unicode defines multiple
[normalization forms](http://unicode.org/reports/tr15/) for text. [normalization forms](http://unicode.org/reports/tr15/) for text.
@ -941,13 +941,13 @@ normalization form. A `NormalizedString` is a `Record` labelled with
underlying code point representation *MUST* be normalized according to underlying code point representation *MUST* be normalized according to
the named normalization form. the named normalization form.
### IRIs (URIs, URLs, URNs, etc.) ### IRIs (URIs, URLs, URNs, etc.).
An `IRI` is a `Record` labelled with `iri` and having one field, a An `IRI` is a `Record` labelled with `iri` and having one field, a
`String` which is the IRI itself and which *MUST* be a valid absolute `String` which is the IRI itself and which *MUST* be a valid absolute
or relative IRI. or relative IRI.
### Machine words ### Machine words.
The definition of `SignedInteger` captures all integers. However, in The definition of `SignedInteger` captures all integers. However, in
certain circumstances it can be valuable to assert that a number certain circumstances it can be valuable to assert that a number
@ -962,7 +962,7 @@ which *MUST* fall within the appropriate range. That is, to be valid,
- in `i16(`*x*`)`, -32768 <= *x* <= 32767. - in `i16(`*x*`)`, -32768 <= *x* <= 32767.
- etc. - etc.
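
The range condition can be sketched as a check on the field of a signed machine-word record. Only the `i16` bounds appear above; generalising to other widths through a `bits` parameter, and the function names, are assumptions made for illustration.

```python
def signed_word_range(bits: int) -> tuple[int, int]:
    """Inclusive (lowest, highest) values of a two's-complement word of this width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def is_valid_signed_word(bits: int, x: int) -> bool:
    """True if x may appear as the field of a signed word of the given width."""
    lowest, highest = signed_word_range(bits)
    return lowest <= x <= highest
```

For `bits = 16` this gives the bounds -32768 and 32767 stated for `i16`.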
### Anonymous Tuples and Unit ### Anonymous Tuples and Unit.
A `Tuple` is a `Record` with label `tuple` and zero or more fields, A `Tuple` is a `Record` with label `tuple` and zero or more fields,
denoting an anonymous tuple of values. denoting an anonymous tuple of values.
@ -970,14 +970,14 @@ denoting an anonymous tuple of values.
The 0-ary tuple, `tuple()`, denotes the empty tuple, sometimes called The 0-ary tuple, `tuple()`, denotes the empty tuple, sometimes called
"unit" or "void" (but *not* e.g. JavaScript's "undefined" value). "unit" or "void" (but *not* e.g. JavaScript's "undefined" value).
### Null and Undefined ### Null and Undefined.
Tony Hoare's Tony Hoare's
"[billion-dollar mistake](https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retractions)" "[billion-dollar mistake](https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retractions)"
can be represented with the 0-ary `Record` `null()`. An "undefined" can be represented with the 0-ary `Record` `null()`. An "undefined"
value can be represented as `undefined()`. value can be represented as `undefined()`.
### Dates and Times ### Dates and Times.
Dates, times, moments, and timestamps can be represented with a Dates, times, moments, and timestamps can be represented with a
`Record` with label `rfc3339` having a single field, a `String`, which `Record` with label `rfc3339` having a single field, a `String`, which
@ -1078,7 +1078,7 @@ When designing a language mapping, an important consideration is
roundtripping: serialization after deserialization, and vice versa, roundtripping: serialization after deserialization, and vice versa,
should both be identities. should both be identities.
### JavaScript ### JavaScript.
- `Boolean` ↔ `Boolean` - `Boolean` ↔ `Boolean`
- `Float` and `Double` ↔ numbers - `Float` and `Double` ↔ numbers
@ -1093,7 +1093,7 @@ should both be identities.
- `Set` ↔ `{ "_set": M }` where `M` is a `Map` from the elements of the set to `true` - `Set` ↔ `{ "_set": M }` where `M` is a `Map` from the elements of the set to `true`
- `Dictionary` ↔ a `Map` - `Dictionary` ↔ a `Map`
### Scheme/Racket ### Scheme/Racket.
- `Boolean` ↔ booleans - `Boolean` ↔ booleans
- `Float` and `Double` ↔ inexact numbers (Racket: single- and double-precision floats) - `Float` and `Double` ↔ inexact numbers (Racket: single- and double-precision floats)
@ -1106,7 +1106,7 @@ should both be identities.
- `Set` ↔ Racket: sets - `Set` ↔ Racket: sets
- `Dictionary` ↔ Racket: hash-table - `Dictionary` ↔ Racket: hash-table
### Java ### Java.
- `Boolean` ↔ `Boolean` - `Boolean` ↔ `Boolean`
- `Float` and `Double` ↔ `Float` and `Double` - `Float` and `Double` ↔ `Float` and `Double`
@ -1120,7 +1120,7 @@ should both be identities.
- `Set` ↔ an implementation of `java.util.Set` - `Set` ↔ an implementation of `java.util.Set`
- `Dictionary` ↔ an implementation of `java.util.Map` - `Dictionary` ↔ an implementation of `java.util.Map`
### Erlang ### Erlang.
- `Boolean` ↔ `true` and `false` - `Boolean` ↔ `true` and `false`
- `Float` and `Double` ↔ floats (unsure how Erlang deals with single-precision) - `Float` and `Double` ↔ floats (unsure how Erlang deals with single-precision)
@ -1143,7 +1143,7 @@ Erlang has no distinct string type, making for a trilemma where
`String`s are in danger of clashing with `ByteString`s, `Sequence`s, `String`s are in danger of clashing with `ByteString`s, `Sequence`s,
or `Record`s. or `Record`s.
### Python ### Python.
- `Boolean` ↔ `True` and `False` - `Boolean` ↔ `True` and `False`
- `Float` ↔ a `Float` wrapper-class for a double-precision value - `Float` ↔ a `Float` wrapper-class for a double-precision value
@ -1157,7 +1157,7 @@ or `Record`s.
- `Set` ↔ `frozenset` (but accept `set` during encoding) - `Set` ↔ `frozenset` (but accept `set` during encoding)
- `Dictionary` ↔ a hashable (immutable) dictionary-like thing (but accept `dict` during encoding) - `Dictionary` ↔ a hashable (immutable) dictionary-like thing (but accept `dict` during encoding)
### Squeak Smalltalk ### Squeak Smalltalk.
- `Boolean` ↔ `true` and `false` - `Boolean` ↔ `true` and `false`
- `Float` ↔ perhaps a subclass of `Float`? - `Float` ↔ perhaps a subclass of `Float`?