Cosmetic.

Tony Garnock-Jones 2019-07-03 19:35:56 -04:00
parent 4a70364eda
commit 7861341951
1 changed file with 31 additions and 31 deletions


@ -182,7 +182,7 @@ equivalent compact machine-readable syntax.
`null` are all read as `Symbol`s, and that `SignedInteger`s are
never read as `Double`s.
-### Character set
+### Character set.
[ABNF][abnf] allows easy definition of US-ASCII-based languages.
However, Preserves is a Unicode-based language. Therefore, we
@ -192,7 +192,7 @@ code points.
Textual syntax for a `Value` *SHOULD* be encoded using UTF-8 where
possible.
-### Whitespace
+### Whitespace.
Whitespace is defined as any number of spaces, tabs, carriage returns,
line feeds, or commas.
@ -200,7 +200,7 @@ line feeds, or commas.
ws = *(%x20 / %x09 / newline / ",")
newline = CR / LF
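As an illustration of the `ws` rule above, here is a minimal Python sketch of a whitespace skipper. The names `WHITESPACE` and `skip_ws` are hypothetical and not part of the specification; the character set is taken directly from the rule (space, tab, CR, LF, comma).

```python
# Characters matched by the `ws` rule quoted above:
# space (%x20), tab (%x09), CR, LF, and comma.
WHITESPACE = " \t\r\n,"


def skip_ws(text: str, pos: int = 0) -> int:
    """Return the index of the first non-whitespace character at or after pos."""
    while pos < len(text) and text[pos] in WHITESPACE:
        pos += 1
    return pos
```

Because commas count as whitespace, a reader built this way treats `[1, 2, 3]` and `[1 2 3]` identically.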
-### Grammar
+### Grammar.
Standalone documents may have trailing whitespace.
@ -427,7 +427,7 @@ encoded details of the `Value` itself.
For a value `v`, we write `[[v]]` for the `Repr` of v.
-### Type and Length representation
+### Type and Length representation.
Each `Repr` takes one of three possible forms:
@ -448,7 +448,7 @@ Each `Repr` takes one of three possible forms:
Applications may choose between formats B and C depending on their
needs at serialization time.
-#### The lead byte
+#### The lead byte.
Every `Repr` starts with a *lead byte*, constructed by
`leadbyte(t,n,m)`, where `t`,`n`∈{0,1,2,3} and 0≤`m`<16:
@ -469,11 +469,11 @@ representation:[^some-encodings-unused]
- `t`=2 (format B) represents a `Record`.
- `t`=3 (format B) represents a `Sequence`, `Set` or `Dictionary`.
-#### Encoding data of fixed length (format A)
+#### Encoding data of fixed length (format A).
Each specific type of data defines its own rules for this format.
-#### Encoding data of known length (format B)
+#### Encoding data of known length (format B).
A `Repr` where the length of the `Value` to be encoded is variable but
known uses the value of `m` in `leadbyte` to encode its length. The
@ -509,7 +509,7 @@ definition,
- 300 (binary, grouped into 7-bit chunks, `10 0101100`) varint-encodes to the two bytes 172 and 2.
- 1000000000 (binary `11 1011100 1101011 0010100 0000000`) varint-encodes to bytes 128, 148, 235, 220, and 3.
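The two worked examples above can be reproduced with a short Python sketch. The `varint` definition itself is not visible in this excerpt, so the sketch assumes the usual base-128 scheme: 7-bit chunks, least significant chunk first, with the high bit set on every byte except the last. That assumption matches both examples.

```python
def varint(n: int) -> bytes:
    """Encode a non-negative integer as base-128 chunks, least significant first.

    Assumed scheme (not shown in this excerpt): the high bit of each byte is
    set when more chunks follow and clear on the final byte.
    """
    if n < 0:
        raise ValueError("varint is only defined here for non-negative integers")
    out = bytearray()
    while True:
        chunk = n & 0x7F          # low seven bits
        n >>= 7
        if n:
            out.append(chunk | 0x80)  # more chunks follow
        else:
            out.append(chunk)         # final chunk
            return bytes(out)
```

For instance, `list(varint(300))` gives `[172, 2]`, matching the first example.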
-#### Streaming data of unknown length (format C)
+#### Streaming data of unknown length (format C).
A `Repr` where the length of the `Value` to be encoded is variable and
not known at the time serialization of the `Value` starts is encoded
@ -526,7 +526,7 @@ a format B `Repr` of a `ByteString`, no matter the type of the overall
For a `Repr` of a `Value` containing other `Value`s, each chunk is to
be a single `Repr`.
-### Records
+### Records.
Format B (known length):
@ -542,7 +542,7 @@ Format C (streaming):
Applications *SHOULD* prefer the known-length format for encoding
`Record`s.
-#### Application-specific short form for labels
+#### Application-specific short form for labels.
Any given protocol using Preserves may additionally define an
interpretation for `n`∈{0,1,2}, mapping each *short form label
@ -574,7 +574,7 @@ for format B, or
for format C.
-### Sequences, Sets and Dictionaries
+### Sequences, Sets and Dictionaries.
Format B (known length):
@ -618,7 +618,7 @@ order.
Note that `header(3,3,m)` and `open(3,3)`/`close(3,3)` are unused and reserved.
-### SignedIntegers
+### SignedIntegers.
Format B/A (known length/fixed-size):
@ -653,7 +653,7 @@ For example,
[[ -127 ]] = 41 81 [[ 13 ]] = 41 0D [[ 65536 ]] = 43 01 00 00
[[ -4 ]] = 41 FC [[ 127 ]] = 41 7F [[ 131072 ]] = 43 02 00 00
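The six examples above are consistent with a lead byte of `0x40` plus the payload length, followed by the shortest big-endian two's-complement form of the integer. The Python sketch below reproduces only that form. The full encoding rule is not visible in this excerpt, so the function name `encode_signed_integer`, the lead-byte arithmetic, and the length cut-off are all assumptions.

```python
def encode_signed_integer(x: int) -> bytes:
    """Sketch: lead byte 0x40 + length, then minimal big-endian two's complement.

    Assumption: this mirrors only the examples shown; other cases defined by
    the full specification are not covered.
    """
    # Bits needed for the magnitude, plus one sign bit.
    bits = (x if x >= 0 else ~x).bit_length() + 1
    length = (bits + 7) // 8
    if length > 14:
        # Assumed cut-off: longer payloads would need a separately encoded length.
        raise ValueError("payload too long for this sketch")
    return bytes([0x40 | length]) + x.to_bytes(length, "big", signed=True)
```

For instance, `encode_signed_integer(65536).hex()` gives `43010000`, matching the example.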
-### Strings, ByteStrings and Symbols
+### Strings, ByteStrings and Symbols.
Syntax for these three types varies only in the value of `n` supplied
to `header`, `open`, and `close`. In each case, the payload following
@ -676,19 +676,19 @@ then a sequence of zero or more format B chunks, followed by
While the overall content of a streamed `String` or `Symbol` must be
valid UTF-8, individual chunks do not have to conform to UTF-8.
-### Fixed-length Atoms
+### Fixed-length Atoms.
Fixed-length atoms all use format A, and do not have a length
representation. They repurpose the bits that format B `Repr`s use to
specify lengths. Applications *MUST NOT* use format C with
`open(0,n)` or `close(0,n)` for any `n`.
-#### Booleans
+#### Booleans.
[[ #false ]] = header(0,0,0) = [0x00]
[[ #true ]] = header(0,0,1) = [0x01]
-#### Floats and Doubles
+#### Floats and Doubles.
[[ F ]] when F ∈ Float = header(0,0,2) ++ binary32(F)
[[ D ]] when D ∈ Double = header(0,0,3) ++ binary64(D)
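A minimal Python sketch of the two rules above, using the standard `struct` module for the big-endian IEEE 754 payloads. The function names are hypothetical, and the lead bytes `0x02` and `0x03` are an assumption made by analogy with `header(0,0,0) = [0x00]` and `header(0,0,1) = [0x01]` in the Booleans section.

```python
import struct


def encode_float(f: float) -> bytes:
    """header(0,0,2), assumed to be 0x02, followed by big-endian binary32."""
    return bytes([0x02]) + struct.pack(">f", f)


def encode_double(d: float) -> bytes:
    """header(0,0,3), assumed to be 0x03, followed by big-endian binary64."""
    return bytes([0x03]) + struct.pack(">d", d)
```

Under these assumptions, `encode_float(1.0)` is five bytes long and `encode_double(1.0)` is nine bytes long: one lead byte plus the fixed-size payload, with no length field.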
@ -698,7 +698,7 @@ The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
## Examples
-### Simple examples
+### Simple examples.
<!-- TODO: Give some examples of large and small Preserves, perhaps -->
<!-- translated from various JSON blobs floating around the internet. -->
@ -763,7 +763,7 @@ encodes to
---
-### JSON examples
+### JSON examples.
The examples from
[RFC 8259](https://tools.ietf.org/html/rfc8259#section-13) read as
@ -899,7 +899,7 @@ treat them specially.
and one which enforces validity (i.e. side-conditions) when reading,
writing, or constructing `Value`s.
-### MIME-type tagged binary data
+### MIME-type tagged binary data.
Many internet protocols use
[media types](https://tools.ietf.org/html/rfc6838) (a.k.a MIME types)
@ -928,7 +928,7 @@ form label number 1 were chosen, the second example above,
`mime(text/plain "ABC")`, would be encoded with "92" in place of "B3
74 6D 69 6D 65".
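The bytes quoted above can be checked against a sketch of `leadbyte(t,n,m)`. The definition is not visible in this excerpt, so the bit layout below is an assumption: `t` in the top two bits, `n` in the next two, and `m` in the low four. It is consistent with `header(0,0,1) = [0x01]` and with the bytes `B3`, `74`, and `92` in this example.

```python
def leadbyte(t: int, n: int, m: int) -> int:
    """Pack t, n (two bits each) and m (four bits) into one byte.

    Assumed layout: ttnnmmmm. Returned as an int rather than a one-byte list.
    """
    if not (0 <= t <= 3 and 0 <= n <= 3 and 0 <= m < 16):
        raise ValueError("t and n must be in 0..3 and m in 0..15")
    return (t << 6) | (n << 4) | m
```

Read this way, `B3` is a `Record` lead byte with `n`=3 and `m`=3, `74` introduces the four-byte symbol `mime`, and `92` is a `Record` lead byte with short form label 1 and `m`=2, so the label no longer needs to be spelled out.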
-### Unicode normalization forms
+### Unicode normalization forms.
Unicode defines multiple
[normalization forms](http://unicode.org/reports/tr15/) for text.
@ -941,13 +941,13 @@ normalization form. A `NormalizedString` is a `Record` labelled with
underlying code point representation *MUST* be normalized according to
the named normalization form.
-### IRIs (URIs, URLs, URNs, etc.)
+### IRIs (URIs, URLs, URNs, etc.).
An `IRI` is a `Record` labelled with `iri` and having one field, a
`String` which is the IRI itself and which *MUST* be a valid absolute
or relative IRI.
-### Machine words
+### Machine words.
The definition of `SignedInteger` captures all integers. However, in
certain circumstances it can be valuable to assert that a number
@ -962,7 +962,7 @@ which *MUST* fall within the appropriate range. That is, to be valid,
- in `i16(`*x*`)`, -32768 <= *x* <= 32767.
- etc.
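The range condition for signed machine words can be sketched as a small Python check. The helper name `fits_signed` is hypothetical; only the `i16` bounds are taken from the text above, and other widths follow the same two's-complement pattern.

```python
def fits_signed(bits: int, x: int) -> bool:
    """True if x lies in the two's-complement range for the given bit width."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= x <= high
```

For `bits=16` the bounds are -32768 and 32767, matching the `i16` condition listed above.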
-### Anonymous Tuples and Unit
+### Anonymous Tuples and Unit.
A `Tuple` is a `Record` with label `tuple` and zero or more fields,
denoting an anonymous tuple of values.
@ -970,14 +970,14 @@ denoting an anonymous tuple of values.
The 0-ary tuple, `tuple()`, denotes the empty tuple, sometimes called
"unit" or "void" (but *not* e.g. JavaScript's "undefined" value).
-### Null and Undefined
+### Null and Undefined.
Tony Hoare's
"[billion-dollar mistake](https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retractions)"
can be represented with the 0-ary `Record` `null()`. An "undefined"
value can be represented as `undefined()`.
-### Dates and Times
+### Dates and Times.
Dates, times, moments, and timestamps can be represented with a
`Record` with label `rfc3339` having a single field, a `String`, which
@ -1078,7 +1078,7 @@ When designing a language mapping, an important consideration is
roundtripping: serialization after deserialization, and vice versa,
should both be identities.
-### JavaScript
+### JavaScript.
- `Boolean` ↔ `Boolean`
- `Float` and `Double` ↔ numbers
@ -1093,7 +1093,7 @@ should both be identities.
- `Set` ↔ `{ "_set": M }` where `M` is a `Map` from the elements of the set to `true`
- `Dictionary` ↔ a `Map`
-### Scheme/Racket
+### Scheme/Racket.
- `Boolean` ↔ booleans
- `Float` and `Double` ↔ inexact numbers (Racket: single- and double-precision floats)
@ -1106,7 +1106,7 @@ should both be identities.
- `Set` ↔ Racket: sets
- `Dictionary` ↔ Racket: hash-table
-### Java
+### Java.
- `Boolean` ↔ `Boolean`
- `Float` and `Double` ↔ `Float` and `Double`
@ -1120,7 +1120,7 @@ should both be identities.
- `Set` ↔ an implementation of `java.util.Set`
- `Dictionary` ↔ an implementation of `java.util.Map`
-### Erlang
+### Erlang.
- `Boolean` ↔ `true` and `false`
- `Float` and `Double` ↔ floats (unsure how Erlang deals with single-precision)
@ -1143,7 +1143,7 @@ Erlang has no distinct string type, making for a trilemma where
`String`s are in danger of clashing with `ByteString`s, `Sequence`s,
or `Record`s.
-### Python
+### Python.
- `Boolean` ↔ `True` and `False`
- `Float` ↔ a `Float` wrapper-class for a double-precision value
@ -1157,7 +1157,7 @@ or `Record`s.
- `Set` ↔ `frozenset` (but accept `set` during encoding)
- `Dictionary` ↔ a hashable (immutable) dictionary-like thing (but accept `dict` during encoding)
-### Squeak Smalltalk
+### Squeak Smalltalk.
- `Boolean` ↔ `true` and `false`
- `Float` ↔ perhaps a subclass of `Float`?