Cosmetic.
parent
4a70364eda
commit
7861341951
62
preserves.md
@@ -182,7 +182,7 @@ equivalent compact machine-readable syntax.
 `null` are all read as `Symbol`s, and that `SignedInteger`s are
 never read as `Double`s.
 
-### Character set
+### Character set.
 
 [ABNF][abnf] allows easy definition of US-ASCII-based languages.
 However, Preserves is a Unicode-based language. Therefore, we
@@ -192,7 +192,7 @@ code points.
 Textual syntax for a `Value` *SHOULD* be encoded using UTF-8 where
 possible.
 
-### Whitespace
+### Whitespace.
 
 Whitespace is defined as any number of spaces, tabs, carriage returns,
 line feeds, or commas.
@@ -200,7 +200,7 @@ line feeds, or commas.
 ws = *(%x20 / %x09 / newline / ",")
 newline = CR / LF
 
-### Grammar
+### Grammar.
 
 Standalone documents may have trailing whitespace.
 
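
The `ws` rule quoted in the hunk above treats commas as whitespace alongside spaces, tabs, CR and LF. A minimal sketch of a reader helper that skips such a run is shown here; the name `skip_ws` and the regex-based approach are illustrative assumptions, not part of preserves.md.

```python
import re

# Whitespace per the quoted ABNF: any run of space, tab, CR, LF, or comma.
_WS = re.compile(r"[ \t\r\n,]*")


def skip_ws(text: str, pos: int = 0) -> int:
    """Return the index of the first non-whitespace character at or after pos.

    `skip_ws` is a hypothetical helper name used only for this sketch.
    """
    # The pattern can match the empty string, so match() never returns None.
    return _WS.match(text, pos).end()
```

A textual reader could call such a helper between tokens, which is what makes `[1, 2, 3]` and `[1 2 3]` read identically under this rule.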
@@ -427,7 +427,7 @@ encoded details of the `Value` itself.
 
 For a value `v`, we write `[[v]]` for the `Repr` of v.
 
-### Type and Length representation
+### Type and Length representation.
 
 Each `Repr` takes one of three possible forms:
 
@@ -448,7 +448,7 @@ Each `Repr` takes one of three possible forms:
 Applications may choose between formats B and C depending on their
 needs at serialization time.
 
-#### The lead byte
+#### The lead byte.
 
 Every `Repr` starts with a *lead byte*, constructed by
 `leadbyte(t,n,m)`, where `t`,`n`∈{0,1,2,3} and 0≤`m`<16:
@@ -469,11 +469,11 @@ representation:[^some-encodings-unused]
 - `t`=2 (format B) represents a `Record`.
 - `t`=3 (format B) represents a `Sequence`, `Set` or `Dictionary`.
 
-#### Encoding data of fixed length (format A)
+#### Encoding data of fixed length (format A).
 
 Each specific type of data defines its own rules for this format.
 
-#### Encoding data of known length (format B)
+#### Encoding data of known length (format B).
 
 A `Repr` where the length of the `Value` to be encoded is variable but
 known uses the value of `m` in `leadbyte` to encode its length. The
@@ -509,7 +509,7 @@ definition,
 - 300 (binary, grouped into 7-bit chunks, `10 0101100`) varint-encodes to the two bytes 172 and 2.
 - 1000000000 (binary `11 1011100 1101011 0010100 0000000`) varint-encodes to bytes 128, 148, 235, 220, and 3.
 
-#### Streaming data of unknown length (format C)
+#### Streaming data of unknown length (format C).
 
 A `Repr` where the length of the `Value` to be encoded is variable and
 not known at the time serialization of the `Value` starts is encoded
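
The two varint examples quoted at the top of the hunk above (300 and 1000000000) can be reproduced with a short sketch. The definition itself falls outside the quoted context, so the rule used here (7-bit groups, least significant group first, high bit set on every byte except the last) is an assumption inferred from those two examples, and the function name `varint` is illustrative.

```python
def varint(n: int) -> bytes:
    """Encode a non-negative integer as 7-bit groups, least significant first.

    Assumed rule (inferred from the quoted examples): every byte except the
    last has its high bit set to signal that more bytes follow.
    """
    if n < 0:
        raise ValueError("varint is only defined here for non-negative integers")
    out = bytearray()
    while True:
        low = n & 0x7F      # take the lowest 7 bits
        n >>= 7
        if n:
            out.append(low | 0x80)  # more groups follow: set continuation bit
        else:
            out.append(low)         # final group: high bit clear
            return bytes(out)
```

With this rule, `list(varint(300))` gives `[172, 2]`, matching the first quoted example.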
@@ -526,7 +526,7 @@ a format B `Repr` of a `ByteString`, no matter the type of the overall
 For a `Repr` of a `Value` containing other `Value`s, each chunk is to
 be a single `Repr`.
 
-### Records
+### Records.
 
 Format B (known length):
 
@@ -542,7 +542,7 @@ Format C (streaming):
 Applications *SHOULD* prefer the known-length format for encoding
 `Record`s.
 
-#### Application-specific short form for labels
+#### Application-specific short form for labels.
 
 Any given protocol using Preserves may additionally define an
 interpretation for `n`∈{0,1,2}, mapping each *short form label
@@ -574,7 +574,7 @@ for format B, or
 
 for format C.
 
-### Sequences, Sets and Dictionaries
+### Sequences, Sets and Dictionaries.
 
 Format B (known length):
 
@@ -618,7 +618,7 @@ order.
 
 Note that `header(3,3,m)` and `open(3,3)`/`close(3,3)` are unused and reserved.
 
-### SignedIntegers
+### SignedIntegers.
 
 Format B/A (known length/fixed-size):
 
@@ -653,7 +653,7 @@ For example,
 [[ -127 ]] = 41 81 [[ 13 ]] = 41 0D [[ 65536 ]] = 43 01 00 00
 [[ -4 ]] = 41 FC [[ 127 ]] = 41 7F [[ 131072 ]] = 43 02 00 00
 
-### Strings, ByteStrings and Symbols
+### Strings, ByteStrings and Symbols.
 
 Syntax for these three types varies only in the value of `n` supplied
 to `header`, `open`, and `close`. In each case, the payload following
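
The six `SignedInteger` examples quoted at the top of the hunk above are all consistent with a lead byte of `0x40 | length` followed by the shortest big-endian two's-complement representation. The sketch below reproduces them under that reading. The bit layout (`t` in the top two bits, `n` in the next two, `m` in the low four) and the name `encode_signed_integer` are assumptions inferred from the examples; the rules themselves are not in the quoted context. The "Format B/A" line suggests some values also have a fixed-size form, which this sketch does not attempt.

```python
def encode_signed_integer(x: int) -> bytes:
    """Known-length encoding of an integer, as inferred from the quoted examples.

    Assumptions (not taken from the quoted text): lead byte is
    (t << 6) | (n << 4) | m with t=1, n=0, m=byte count, and the payload is
    the minimal big-endian two's-complement form of x.
    """
    # Bits needed for the magnitude, plus one sign bit, rounded up to bytes.
    magnitude_bits = (x if x >= 0 else ~x).bit_length()
    length = magnitude_bits // 8 + 1
    if length >= 15:
        # Longer payloads need a length encoding this sketch does not cover.
        raise ValueError("sketch only handles payloads shorter than 15 bytes")
    lead = (1 << 6) | (0 << 4) | length
    return bytes([lead]) + x.to_bytes(length, "big", signed=True)
```

For instance, `encode_signed_integer(65536).hex()` gives `43010000`, matching `43 01 00 00` in the quoted examples.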
@@ -676,19 +676,19 @@ then a sequence of zero or more format B chunks, followed by
 While the overall content of a streamed `String` or `Symbol` must be
 valid UTF-8, individual chunks do not have to conform to UTF-8.
 
-### Fixed-length Atoms
+### Fixed-length Atoms.
 
 Fixed-length atoms all use format A, and do not have a length
 representation. They repurpose the bits that format B `Repr`s use to
 specify lengths. Applications *MUST NOT* use format C with
 `open(0,n)` or `close(0,n)` for any `n`.
 
-#### Booleans
+#### Booleans.
 
 [[ #false ]] = header(0,0,0) = [0x00]
 [[ #true ]] = header(0,0,1) = [0x01]
 
-#### Floats and Doubles
+#### Floats and Doubles.
 
 [[ F ]] when F ∈ Float = header(0,0,2) ++ binary32(F)
 [[ D ]] when D ∈ Double = header(0,0,3) ++ binary64(D)
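
The fixed-length atom rules quoted in the hunk above are small enough to sketch directly. The quoted text gives `header(0,0,0) = [0x00]` and `header(0,0,1) = [0x01]`; the sketch assumes by extension that `header(0,0,2)` and `header(0,0,3)` are the single bytes `0x02` and `0x03`, and uses Python's `struct` module for the big-endian IEEE 754 payloads. The function names are illustrative.

```python
import struct


def encode_boolean(b: bool) -> bytes:
    # header(0,0,0) = [0x00] for #false, header(0,0,1) = [0x01] for #true.
    return b"\x01" if b else b"\x00"


def encode_float(f: float) -> bytes:
    # Assumed: header(0,0,2) is the byte 0x02; payload is big-endian binary32.
    return b"\x02" + struct.pack(">f", f)


def encode_double(d: float) -> bytes:
    # Assumed: header(0,0,3) is the byte 0x03; payload is big-endian binary64.
    return b"\x03" + struct.pack(">d", d)
```

Under these assumptions a `Float` always occupies 5 bytes and a `Double` 9 bytes, with no length field, which is what distinguishes format A from format B.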
@@ -698,7 +698,7 @@ The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
 
 ## Examples
 
-### Simple examples
+### Simple examples.
 
 <!-- TODO: Give some examples of large and small Preserves, perhaps -->
 <!-- translated from various JSON blobs floating around the internet. -->
@@ -763,7 +763,7 @@ encodes to
 
 ---
 
-### JSON examples
+### JSON examples.
 
 The examples from
 [RFC 8259](https://tools.ietf.org/html/rfc8259#section-13) read as
@@ -899,7 +899,7 @@ treat them specially.
 and one which enforces validity (i.e. side-conditions) when reading,
 writing, or constructing `Value`s.
 
-### MIME-type tagged binary data
+### MIME-type tagged binary data.
 
 Many internet protocols use
 [media types](https://tools.ietf.org/html/rfc6838) (a.k.a MIME types)
@@ -928,7 +928,7 @@ form label number 1 were chosen, the second example above,
 `mime(text/plain "ABC")`, would be encoded with "92" in place of "B3
 74 6D 69 6D 65".
 
-### Unicode normalization forms
+### Unicode normalization forms.
 
 Unicode defines multiple
 [normalization forms](http://unicode.org/reports/tr15/) for text.
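
The byte strings "92" and "B3 74 6D 69 6D 65" quoted at the top of the hunk above can be picked apart with a small helper. The bit layout used here (`t` in the top two bits, `n` in the next two, `m` in the low four) is an assumption; it is not stated in the quoted context, though it agrees with `header(0,0,1) = [0x01]` elsewhere in the diff. The name `split_leadbyte` is illustrative.

```python
from typing import Tuple


def split_leadbyte(b: int) -> Tuple[int, int, int]:
    """Split a lead byte into (t, n, m) under the assumed 2/2/4-bit layout."""
    if not 0 <= b <= 0xFF:
        raise ValueError("lead byte must fit in 8 bits")
    return (b >> 6) & 0x3, (b >> 4) & 0x3, b & 0xF


# The long form: B3, then 74 followed by four bytes of label text.
long_form = bytes.fromhex("B3746D696D65")
record_head = split_leadbyte(long_form[0])   # t=2 marks a Record
label_head = split_leadbyte(long_form[1])    # m=4 matches the 4 bytes after it
label_text = long_form[2:].decode("ascii")

# The short form replaces all six bytes with the single byte 92.
short_head = split_leadbyte(0x92)
```

Under this reading, `B3` splits into `t`=2, `n`=3, `m`=3, and `92` into `t`=2, `n`=1, `m`=2, where the `n`=1 agrees with "short form label number 1" in the quoted sentence.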
@@ -941,13 +941,13 @@ normalization form. A `NormalizedString` is a `Record` labelled with
 underlying code point representation *MUST* be normalized according to
 the named normalization form.
 
-### IRIs (URIs, URLs, URNs, etc.)
+### IRIs (URIs, URLs, URNs, etc.).
 
 An `IRI` is a `Record` labelled with `iri` and having one field, a
 `String` which is the IRI itself and which *MUST* be a valid absolute
 or relative IRI.
 
-### Machine words
+### Machine words.
 
 The definition of `SignedInteger` captures all integers. However, in
 certain circumstances it can be valuable to assert that a number
@@ -962,7 +962,7 @@ which *MUST* fall within the appropriate range. That is, to be valid,
 - in `i16(`*x*`)`, -32768 <= *x* <= 32767.
 - etc.
 
-### Anonymous Tuples and Unit
+### Anonymous Tuples and Unit.
 
 A `Tuple` is a `Record` with label `tuple` and zero or more fields,
 denoting an anonymous tuple of values.
@@ -970,14 +970,14 @@ denoting an anonymous tuple of values.
 The 0-ary tuple, `tuple()`, denotes the empty tuple, sometimes called
 "unit" or "void" (but *not* e.g. JavaScript's "undefined" value).
 
-### Null and Undefined
+### Null and Undefined.
 
 Tony Hoare's
 "[billion-dollar mistake](https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retractions)"
 can be represented with the 0-ary `Record` `null()`. An "undefined"
 value can be represented as `undefined()`.
 
-### Dates and Times
+### Dates and Times.
 
 Dates, times, moments, and timestamps can be represented with a
 `Record` with label `rfc3339` having a single field, a `String`, which
@@ -1078,7 +1078,7 @@ When designing a language mapping, an important consideration is
 roundtripping: serialization after deserialization, and vice versa,
 should both be identities.
 
-### JavaScript
+### JavaScript.
 
 - `Boolean` ↔ `Boolean`
 - `Float` and `Double` ↔ numbers
@@ -1093,7 +1093,7 @@ should both be identities.
 - `Set` ↔ `{ "_set": M }` where `M` is a `Map` from the elements of the set to `true`
 - `Dictionary` ↔ a `Map`
 
-### Scheme/Racket
+### Scheme/Racket.
 
 - `Boolean` ↔ booleans
 - `Float` and `Double` ↔ inexact numbers (Racket: single- and double-precision floats)
@@ -1106,7 +1106,7 @@ should both be identities.
 - `Set` ↔ Racket: sets
 - `Dictionary` ↔ Racket: hash-table
 
-### Java
+### Java.
 
 - `Boolean` ↔ `Boolean`
 - `Float` and `Double` ↔ `Float` and `Double`
@@ -1120,7 +1120,7 @@ should both be identities.
 - `Set` ↔ an implementation of `java.util.Set`
 - `Dictionary` ↔ an implementation of `java.util.Map`
 
-### Erlang
+### Erlang.
 
 - `Boolean` ↔ `true` and `false`
 - `Float` and `Double` ↔ floats (unsure how Erlang deals with single-precision)
@@ -1143,7 +1143,7 @@ Erlang has no distinct string type, making for a trilemma where
 `String`s are in danger of clashing with `ByteString`s, `Sequence`s,
 or `Record`s.
 
-### Python
+### Python.
 
 - `Boolean` ↔ `True` and `False`
 - `Float` ↔ a `Float` wrapper-class for a double-precision value
@@ -1157,7 +1157,7 @@ or `Record`s.
 - `Set` ↔ `frozenset` (but accept `set` during encoding)
 - `Dictionary` ↔ a hashable (immutable) dictionary-like thing (but accept `dict` during encoding)
 
-### Squeak Smalltalk
+### Squeak Smalltalk.
 
 - `Boolean` ↔ `true` and `false`
 - `Float` ↔ perhaps a subclass of `Float`?