WIP from the early hours of this morning, adding textual syntax

2018-09-27 11:42:55 +01:00 · 6fa0dde8f4
parent 906f8a01b6
commit 6fa0dde8f4
1 changed file with 249 additions and 99 deletions
--- a/syndicate/mc/preserve.md
+++ b/syndicate/mc/preserve.md
@@ -6,12 +6,13 @@
 # Preserves: an Expressive Data Language
 Tony Garnock-Jones <tonyg@leastfixedpoint.com>  
-September 2018. Version 0.0.2.
+September 2018. Version 0.0.3.
  [sexp.txt]: http://people.csail.mit.edu/rivest/Sexp.txt
  [spki]: http://world.std.com/~cme/html/spki.html
  [varint]: https://developers.google.com/protocol-buffers/docs/encoding#varints
  [erlang-map]: http://erlang.org/doc/reference_manual/data_types.html#map
+ [abnf]: https://tools.ietf.org/html/rfc7405
 This document proposes a data model and serialization format called
 *Preserves*.
@@ -47,7 +48,8 @@ structures of any particular implementation language.
 Taking inspiration from functional programming, we start with a
 definition of the *values* that we want to work with and give them
-meaning independent of their syntax. We will treat syntax separately,
+meaning independent of their syntax. When we write examples of values,
+we will do so using the [textual syntax](#textual-syntax) defined
 later in this document.
 Our `Value`s fall into two broad categories: *atomic* and *compound*
@@ -94,8 +96,7 @@ neither is less than the other according to the total order.
 ### Signed integers.
 A `SignedInteger` is a signed integer of arbitrary width.
-`SignedInteger`s are compared as mathematical integers. We will write
+`SignedInteger`s are compared as mathematical integers.
-examples of `SignedInteger`s using standard mathematical notation.
 **Examples.** 10; -6; 0.
@@ -107,8 +108,7 @@ examples of `SignedInteger`s using standard mathematical notation.
 A `String` is a sequence of Unicode
 [code-point](http://www.unicode.org/glossary/#code_point)s. `String`s
 are compared lexicographically, code-point by
-code-point.[^utf8-is-awesome] We will write examples of `String`s as
+code-point.[^utf8-is-awesome]
-text surrounded by quotes “`"`”.
  [^utf8-is-awesome]: Happily, the design of UTF-8 is such that this
    gives the same result as a lexicographic byte-by-byte comparison
@@ -121,33 +121,27 @@ the string containing the three Unicode code-points `z` (0x7A), `水`
 ### Binary data.
 A `ByteString` is an ordered sequence of zero or more eight-bit bytes.
-`ByteString`s are compared lexicographically. We will only write
+`ByteString`s are compared lexicographically.
-examples of `ByteString`s that contain bytes denoting printable ASCII
-characters, using “`#"`” as an open-quote and “`"`” as a close-quote
-mark.
-**Examples.** The `ByteString` containing the integers 65, 66 and 67
-(corresponding to ASCII characters `A`, `B` and `C`) is written as
-`#"ABC"`. The empty `ByteString` is written as `#""`. **N.B.** Despite
-appearances, these are *binary* data.
+**Examples.** `#""`, the empty `ByteString`; `#"ABC"`, the
+`ByteString` containing the integers 65, 66 and 67 (corresponding to
+ASCII characters `A`, `B` and `C`). **N.B.** Despite appearances,
+these are *binary* data.
 ### Symbols.
 Programming languages like Lisp and Prolog frequently use string-like
 values called *symbols*. Here, a `Symbol` is, like a `String`, a
 sequence of Unicode code-points representing an identifier of some
-kind. `Symbol`s are also compared lexicographically by code-point. We
+kind. `Symbol`s are also compared lexicographically by code-point.
-will write examples including only non-empty sequences of
-non-whitespace characters, using a monospace font without quotation
-marks.
 **Examples.** `hello-world`; `utf8-string`; `exact-integer?`.
 ### Booleans.
 There are exactly two `Boolean` values, “false” and “true”. The
-“false” value compares less-than the “true” value. We write `#f` for
-“false”, and `#t` for “true”.
+“false” value compares less-than the “true” value. We write `#false`
+for “false”, and `#true` for “true”.
 ### IEEE floating-point values.
@@ -159,11 +153,11 @@ every `Double`, and every `SignedInteger` is greater than both. Two
 `Float`s or two `Double`s are to be ordered by the `totalOrder`
 predicate defined in section 5.10 of
 [IEEE Std 754-2008](https://dx.doi.org/10.1109/IEEESTD.2008.4610935).
-We write examples using standard mathematical notation, avoiding NaN
-and infinities, using a suffix `f` or `d` to indicate `Float` or
-`Double`, respectively.
+We write examples using a fractional part and/or an exponent to
+distinguish them from `SignedInteger`s. An additional suffix `f`
+distinguishes `Float`s from `Double`s.
-**Examples.** 10f; -6d; 0f; 0.5d; -1.202e300d.
+**Examples.** 10.0f; -6.0; 0.0f; 0.5; -1.202e300.
 **Non-examples.** 10, -6, and 0, because writing them this way
 indicates `SignedInteger`s, not `Float`s or `Double`s.
@@ -174,9 +168,7 @@ A `Record` is a *labelled* tuple of zero or more `Value`s, called the
 record's *fields*. A record's label is itself a `Value`, though it
 will usually be a `Symbol`.[^extensibility] [^iri-labels] `Record`s
 are compared lexicographically as if they were just tuples; that is,
-first by their labels, and then by the remainder of their fields. We
+first by their labels, and then by the remainder of their fields.
-will write examples of `Record`s as a parenthesised, space-separated
-sequence of their label `Value` followed by their field `Value`s.
  [^extensibility]: The [Racket](https://racket-lang.org/) programming
    language defines
@@ -194,17 +186,16 @@ sequence of their label `Value` followed by their field `Value`s.
    it cannot be read as an IRI at all, and so the label simply stands
    for itself—for its own `Value`.
-**Examples.** The `Record` with label `foo` and fields 1, 2 and 3 is
-written `(foo 1 2 3)`; the `Record` with label `void` and no fields is
-written `(void)`.
+**Examples.** `foo(1 2 3)`, a `Record` with label `foo` and fields 1,
+2 and 3; `void()`, a `Record` with label `void` and no fields.
-**Non-examples.** `()`, because it lacks a label.
+**Non-examples.** `()`, because it lacks a label; `void`, because it
+lacks even an empty tuple of fields.
 ### Sequences.
 A `Sequence` is a general-purpose, variable-length ordered sequence of
-zero or more `Value`s. `Sequence`s are compared lexicographically. We
+zero or more `Value`s. `Sequence`s are compared lexicographically.
-write examples space-separated, surrounded with square brackets.
 **Examples.** `[]`, the empty sequence; `[1 2 3]`, the sequence of
 `SignedInteger`s 1, 2 and 3.
@@ -215,25 +206,24 @@ A `Set` is an unordered finite set of `Value`s. It contains no
 duplicate values, following the [equivalence relation](#equivalence)
 induced by the total order on `Value`s. Two `Set`s are compared by
 sorting their elements ascending using the [total order](#total-order)
-and comparing the resulting `Sequence`s. We write examples
+and comparing the resulting `Sequence`s.
-space-separated, surrounded with curly braces, prefixed by `#set`.
 **Examples.** `#set{}`, the empty set; `#set{#set{}}`, the set
-containing only the empty set; `#set{4 "hello" (void) 9.0f}`, the set
+containing only the empty set; `{4 "hello" (void) 9.0f}`, the set
 containing 4, the string `"hello"`, the record with label `void` and
-no fields, and the `Float` denoting the number 9.0; `#set{1 1.0f}`,
-the set containing a `SignedInteger` and a `Float`; `#set{(mime
-application/xml #"<x/>") (mime application/xml #"<x />")}`, a set
-containing two different type-labelled byte
-arrays.[^mime-xml-difference]
+no fields, and the `Float` denoting the number 9.0; `{1 1.0f}`, the
+set containing a `SignedInteger` and a `Float`; `{mime(application/xml
+#"<x/>") mime(application/xml #"<x />")}`, a set containing two
+different `mime` records.[^mime-xml-difference]
  [^mime-xml-difference]: The two XML documents `<x/>` and `<x />`
    differ by bytewise comparison, and thus yield different record
    values, even though under the semantics of XML they denote
    identical XML infoset.
-**Non-examples.** `#set{1 1 1}`, because it contains multiple
-equivalent `Value`s.
+**Non-examples.** `{1 1}`, because it contains multiple equivalent
+`Value`s; `{}`, because without the `#set` marker, it denotes the
+empty dictionary.
 ### Dictionaries.
@@ -241,27 +231,189 @@ A `Dictionary` is an unordered finite collection of pairs of `Value`s.
 Each pair comprises a *key* and a *value*. Keys in a `Dictionary` must
 be pairwise distinct. Instances of `Dictionary` are compared by
 lexicographic comparison of the sequences resulting from ordering each
-`Dictionary`'s pairs in ascending order by key. Examples are written
+`Dictionary`'s pairs in ascending order by key.
-as a `#dict`-prefixed, curly-brace-surrounded sequence of
-space-separated key-value pairs, each written with a colon between the
-key and value.
-**Examples.** `#dict{}`, the empty dictionary; `#dict{a:1}`, the
-dictionary mapping the `Symbol` `a` to the `SignedInteger` 1;
-`#dict{[1 2 3]:a}`, mapping `[1 2 3]` to `a`; `#dict{"hi":0 hi:0
-there:[]}`, having a `String` and two `Symbol` keys, and
-`SignedInteger` and `Sequence` values.
+**Examples.** `{}`, the empty dictionary; `{a: 1}`, the dictionary
+mapping the `Symbol` `a` to the `SignedInteger` 1; `{[1 2 3]: a}`,
+mapping `[1 2 3]` to `a`; `{"hi": 0, hi: 0, there: []}`, having a
+`String` and two `Symbol` keys, and `SignedInteger` and `Sequence`
+values.
-**Non-examples.** `#dict{a:1 b:2 a:3}`, because it contains duplicate
-keys; `#dict{[7 8]:[] [7 8]:99}`, for the same reason.
+**Non-examples.** `{a:1 b:2 a:3}`, because it contains duplicate
+keys; `{[7 8]:[] [7 8]:99}`, for the same reason.
-## Syntax
+## Textual Syntax
 Now we have discussed `Value`s and their meanings, we may turn to
 techniques for *representing* `Value`s for communication or storage.
-For now, we limit our attention to an easily-parsed, easily-produced
-machine-readable syntax.
+In this section, we use [case-sensitive ABNF][abnf] to define a
+textual syntax that is easy for people to read and
+write.[^json-superset] Most of the examples in this document are
+written using this syntax. In the following section, we will define an
+equivalent compact machine-readable syntax.
+ [^json-superset]: The grammar of the textual syntax is a superset of
+   JSON, with the slightly unusual feature that `true`, `false`, and
+   `null` are all read as `Symbol`s, and that `SignedInteger`s are
+   never read as `Double`s.
+### Character set
+[ABNF][abnf] allows easy definition of US-ASCII-based languages.
+However, Preserves is a Unicode-based language. Therefore, we
+reinterpret ABNF as a grammar for recognising sequences of Unicode
+code points.
+Textual syntax for a `Value` *SHOULD* be encoded using UTF-8 where
+possible.
+### Whitespace
+Whitespace is defined as any number of spaces, tabs, carriage returns,
+line feeds, comments, or commas. A comment is a semicolon followed by
+the Unicode code points up to and including the next carriage return
+or line feed.
+               ws = *(%x20 / %x09 / newline / comment / ",")
+          newline = CR / LF
+          comment = ";" *(WSP / nonnl) newline
+            nonnl = <any Unicode code point except CR or LF>
+### Grammar
+Standalone documents containing textual representations of `Value`s may have trailing whitespace.
+         Document = Value ws
+Any `Value` may be preceded by whitespace.
+            Value = ws (Record / Collection / Atom / Compact)
+       Collection = Sequence / Dictionary / Set
+             Atom = Boolean / Float / Double / SignedInteger /
+                    String / ByteString / Symbol
+Each `Record` is its label-`Value` followed by a parenthesised
+grouping of its field-`Value`s.
+           Record = Value ws "(" *Value ws ")"
+`Sequence`s are enclosed in square brackets. `Dictionary` values are
+curly-brace-enclosed colon-separated pairs of values. `Set`s are
+written either as a simple curly-brace-enclosed non-empty sequence of
+values, or as a possibly-empty sequence of values enclosed by the
+tokens `#set{` and `}`.
+         Sequence = "[" *Value ws "]"
+       Dictionary = "{" *(Value ws ":" Value) ws "}"
+              Set = %s"#set{" *Value ws "}" / "{" 1*Value ws "}"
+Any `Value` may be represented using the
+[compact binary syntax](#compact-binary-syntax) by directly prefixing
+the binary form of the `Value` with ASCII `SOH` (`%x01`), or by
+enclosing a hexadecimal representation of the binary form of the
+`Value` in the tokens `#hexvalue{` and `}`.
+          Compact = %x01 <binary data> / %s"#hexvalue{" *(ws / HEXDIG) ws "}"
+`Boolean`s are the simple literal strings `#true` and `#false`.
+          Boolean = %s"#true" / %s"#false"
+Numeric data follow the
+[JSON grammar](https://tools.ietf.org/html/rfc8259#section-6), with
+the addition of a trailing "f" distinguishing `Float` from `Double`
+values. `Float`s and `Double`s always have either a fractional part or
+an exponent part, where `SignedInteger`s never have either.
+TODO: talk about precise reading of floats, and the need for arbitrary
+precision. Your language will often have a good floating-point reading
+library.
+            Float = flt %i"f"
+           Double = flt
+    SignedInteger = int
+         digit1-9 = %x31-39
+              nat = %x30 / ( digit1-9 *DIGIT )
+              int = ["-"] nat
+             frac = "." 1*DIGIT
+              exp = %i"e" ["-"/"+"] 1*DIGIT
+              flt = int (frac exp / frac / exp)
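The numeric rules above can be exercised with a small classifier. The following is a minimal sketch in Python, not part of the commit; the function name `classify_number` and the use of regular expressions are assumptions made for illustration only.

```python
import re

# Regular-expression transcription of the ABNF rules nat, int, frac, exp and flt.
NAT = r"(?:0|[1-9][0-9]*)"
INT = rf"-?{NAT}"
FRAC = r"\.[0-9]+"
EXP = r"[eE][-+]?[0-9]+"
FLT = rf"{INT}(?:{FRAC}{EXP}|{FRAC}|{EXP})"


def classify_number(text):
    """Return 'Float', 'Double', 'SignedInteger', or None if text is not numeric."""
    if re.fullmatch(FLT + r"[fF]", text):  # %i"f" is case-insensitive
        return "Float"
    if re.fullmatch(FLT, text):
        return "Double"
    if re.fullmatch(INT, text):
        return "SignedInteger"
    return None
```

Under this reading, `10.0f` is a `Float`, `-1.202e300` is a `Double`, `10` is a `SignedInteger`, and the old-style `10f` is not a number at all, because `flt` requires a fractional part or an exponent.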
+`String`s are,
+[as in JSON](https://tools.ietf.org/html/rfc8259#section-7), possibly
+escaped text surrounded by double quotes. The escaping rules are the
+same as for JSON.[^string-json-correspondence]
+TODO: discuss surrogate pairs in \uXXXX form
+           String = %x22 *char %x22
+             char = unescaped / %x7C / escape (escaped / %x22 / %s"u" 4HEXDIG)
+        unescaped = %x20-21 / %x23-5B / %x5D-7B / %x7D-10FFFF
+           escape = %x5C              ; \
+          escaped = ( %x5C /          ; \    reverse solidus U+005C
+                      %x2F /          ; /    solidus         U+002F
+                      %x62 /          ; b    backspace       U+0008
+                      %x66 /          ; f    form feed       U+000C
+                      %x6E /          ; n    line feed       U+000A
+                      %x72 /          ; r    carriage return U+000D
+                      %x74 )          ; t    tab             U+0009
+ [^string-json-correspondence]: The grammar for `String` has the same
+   effect as the
+   [JSON](https://tools.ietf.org/html/rfc8259#section-7) grammar for
+   `string`. Some auxiliary definitions (e.g. `escaped`) are lifted
+   largely unmodified from the text of RFC 8259.
+A `ByteString` may be written in any of three different forms.
+The first is similar to a `String`, but prepended with a hash sign
+`#`. In addition, only Unicode code points overlapping with printable
+7-bit ASCII are permitted unescaped inside such a `ByteString`; other
+byte values must be escaped by prepending a two-digit hexadecimal
+value with `\x`.
+       ByteString = "#" %x22 *binchar %x22
+          binchar = binunescaped / escape (escaped / %x22 / %s"x" 2HEXDIG)
+     binunescaped = %x20-21 / %x23-5B / %x5D-7E
+The second is as a sequence of pairs of hexadecimal digits interleaved
+with whitespace and surrounded by `#hex{` and `}`.
+      ByteString =/ %s"#hex{" *(ws / 2HEXDIG) ws "}"
+The third is as a sequence of
+[Base64](https://tools.ietf.org/html/rfc4648) characters, interleaved
+with whitespace and surrounded by `#base64{` and `}`. Plain and
+URL-safe Base64 characters are allowed.
+      ByteString =/ %s"#base64{" *(ws / base64char) ws "}"
+       base64char = %x41-5A / %x61-7A / %x30-39 / "+" / "/" / "-" / "_" / "="
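To illustrate that the `#hex{` and `#base64{` forms denote the same kind of value, here is a minimal Python sketch that decodes both into bytes. It is not part of the commit: the function names are hypothetical, whitespace handling is simplified (comments are not supported, and whitespace is tolerated between any two characters rather than only between hex pairs), and missing Base64 padding is assumed to be acceptable.

```python
import base64
import re

# Simplified `ws`: spaces, tabs, newlines and commas. Comments are NOT handled.
_WS = re.compile(r"[ \t\r\n,]+")


def _body(text, opener):
    """Strip the opening token and closing brace, then drop whitespace."""
    if not (text.startswith(opener) and text.endswith("}")):
        raise ValueError(f"expected {opener}...}}")
    return _WS.sub("", text[len(opener):-1])


def decode_hex_form(text):
    """Decode `#hex{...}` into bytes."""
    return bytes.fromhex(_body(text, "#hex{"))


def decode_base64_form(text):
    """Decode `#base64{...}`, accepting plain and URL-safe alphabets."""
    body = _body(text, "#base64{")
    body = body.replace("-", "+").replace("_", "/").rstrip("=")
    body += "=" * (-len(body) % 4)  # restore padding to a multiple of four
    return base64.b64decode(body)
```

Both `decode_hex_form("#hex{41 42 43}")` and `decode_base64_form("#base64{QUJD}")` yield the same three bytes that `#"ABC"` denotes.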
+A `Symbol` may be written in a "bare" form,[^cf-sexp-token] so long as
+it conforms to certain restrictions on the characters appearing in the
+symbol, or in a quoted form. The quoted form is much the same as the
+syntax for `String`s, including embedded escape syntax, except using a
+bar or pipe character (`|`) instead of a double quote mark.
+           Symbol = symstart *symcont / "|" *symchar "|"
+         symstart = ALPHA / sympunct
+          symcont = ALPHA / sympunct / DIGIT / "-" / "."
+         sympunct = "~" / "!" / "@" / "$" / "%" / "^" / "&" / "*" /
+                    "?" / "_" / "=" / "+" / "<" / ">" / "/"
+          symchar = unescaped / %x22 / escape (escaped / %x7C / %s"u" 4HEXDIG)
+ [^cf-sexp-token]: Compare with the [SPKI S-expression][sexp.txt]
+   definition of "token representation".
+TODO: More unicode in unescaped symbols?
+### Printing
+Recommend a JSON-compatible print mode. Recommend a submode with trailing commas.
+## Compact Binary Syntax
 A `Repr` is an encoding, or representation, of a specific `Value`.
 Each `Repr` comprises one or more bytes describing first the kind of
@@ -373,14 +525,14 @@ be a single `Repr`.
 Format B (known length):
-    [[ (L F_1...F_m) ]] = header(2,3,m+1) ++ [[L]] ++ [[F_1]] ++...++ [[F_m]]
+    [[ L(F_1...F_m) ]] = header(2,3,m+1) ++ [[L]] ++ [[F_1]] ++...++ [[F_m]]
 For `m` fields, `m+1` is supplied to `header`, to account for the
 encoding of the record label.
 Format C (streaming):
-    [[ (L F_1...F_m) ]] = open(2,3) ++ [[L]] ++ [[F_1]] ++...++ [[F_m]] ++ close(2,3)
+    [[ L(F_1...F_m) ]] = open(2,3) ++ [[L]] ++ [[F_1]] ++...++ [[F_m]] ++ close(2,3)
 Applications *SHOULD* prefer the known-length format for encoding
 `Record`s.
@@ -401,12 +553,12 @@ and format C becomes
 **Examples.** For example, a protocol may choose to map records
 labelled `void` to `n=0`, making
-    [[(void)]] = header(2,0,0) = [0x80]
+    [[void()]] = header(2,0,0) = [0x80]
 or it may map records labelled `person` to short form label number 1,
 making
-    [[(person "Dr" "Elizabeth" "Blackwell")]]
+    [[person("Dr", "Elizabeth", "Blackwell")]]
        = header(2,1,3) ++ [["Dr"]] ++ [["Elizabeth"]] ++ [["Blackwell"]]
        =        [0x93] ++ [["Dr"]] ++ [["Elizabeth"]] ++ [["Blackwell"]]
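The definition of `header` lies outside the hunks shown here, but the worked values above (`header(2,0,0) = [0x80]`, `header(2,1,3) = [0x93]`) are consistent with a lead byte made of two 2-bit fields followed by a 4-bit length. The Python sketch below encodes that inferred layout. Treat it as an assumption rather than the specification: in particular, the handling of lengths of 15 or more (a low nibble of 15 followed by a base-128 varint) is inferred from the `7F 18` prefix that the `mime` table later in this document shows for a 24-character `Symbol`.

```python
def varint(n):
    """Encode a non-negative integer as a base-128 varint (assumed format)."""
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)  # more bytes follow
        else:
            out.append(low)
            return bytes(out)


def header(major, minor, length):
    """Build a lead byte; lengths of 15 or more spill into a varint."""
    if not (0 <= major < 4 and 0 <= minor < 4 and length >= 0):
        raise ValueError("field out of range")
    if length < 15:
        return bytes([(major << 6) | (minor << 4) | length])
    return bytes([(major << 6) | (minor << 4) | 15]) + varint(length)
```

With this layout, `header(2, 3, 3)` gives `B3`, the lead byte of a three-item `Record` with an explicit label, matching the first byte of each row in the `mime` table.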
@@ -421,20 +573,20 @@ for format C.
 Format B (known length):
-                 [[ [X_1...X_m] ]] = header(3,0,m)   ++ [[X_1]] ++...++ [[X_m]]
+            [[ [X_1...X_m] ]] = header(3,0,m)   ++ [[X_1]] ++...++ [[X_m]]
-             [[ #set{X_1...X_m} ]] = header(3,1,m)   ++ [[X_1]] ++...++ [[X_m]]
+        [[ #set{X_1...X_m} ]] = header(3,1,m)   ++ [[X_1]] ++...++ [[X_m]]
-    [[ #dict{K_1:V_1...K_m:V_m} ]] = header(3,2,m*2) ++ [[K_1]] ++ [[V_1]] ++...
+    [[ {K_1:V_1...K_m:V_m} ]] = header(3,2,m*2) ++ [[K_1]] ++ [[V_1]] ++...
-                                                     ++ [[K_m]] ++ [[V_m]]
+                                                ++ [[K_m]] ++ [[V_m]]
 Note that `m*2` is given to `header` for a `Dictionary`, since there
 are two `Value`s in each key-value pair.
 Format C (streaming):
-                 [[ [X_1...X_m] ]] = open(3,0) ++ [[X_1]] ++...++ [[X_m]] ++ close(3,0)
+            [[ [X_1...X_m] ]] = open(3,0) ++ [[X_1]] ++...++ [[X_m]] ++ close(3,0)
-             [[ #set{X_1...X_m} ]] = open(3,1) ++ [[X_1]] ++...++ [[X_m]] ++ close(3,1)
+        [[ #set{X_1...X_m} ]] = open(3,1) ++ [[X_1]] ++...++ [[X_m]] ++ close(3,1)
-    [[ #dict{K_1:V_1...K_m:V_m} ]] = open(3,2) ++ [[K_1]] ++ [[V_1]] ++...
+    [[ {K_1:V_1...K_m:V_m} ]] = open(3,2) ++ [[K_1]] ++ [[V_1]] ++...
-                                               ++ [[K_m]] ++ [[V_m]] ++ close(3,2)
+                                          ++ [[K_m]] ++ [[V_m]] ++ close(3,2)
 Applications may use whichever format suits their needs on a
 case-by-case basis.
@@ -528,8 +680,8 @@ specify lengths. Applications *MUST NOT* use format C with
 #### Booleans
-    [[ #f ]] = header(0,0,0) = [0x00]
+    [[ #false ]] = header(0,0,0) = [0x00]
-    [[ #t ]] = header(0,0,1) = [0x01]
+    [[  #true ]] = header(0,0,1) = [0x01]
 #### Floats and Doubles
@@ -550,31 +702,27 @@ short form label number 0 to label `discard`, 1 to `capture`, and 2 to
 | Value                                             | Encoded hexadecimal byte sequence                                    |
 |---------------------------------------------------|----------------------------------------------------------------------|
-| `(capture (discard))`                             | 91 80                                                                |
+| `capture(discard())`                              | 91 80                                                                |
-| `(observe (speak (discard) (capture (discard))))` | A1 B3 75 73 70 65 61 6B 80 91 80                                     |
+| `observe(speak(discard(), capture(discard())))`   | A1 B3 75 73 70 65 61 6B 80 91 80                                     |
 | `[1 2 3 4]` (format B)                            | C4 11 12 13 14                                                       |
 | `[1 2 3 4]` (format C)                            | 2C 11 12 13 14 3C                                                    |
 | `[-2 -1 0 1]`                                     | C4 1E 1F 10 11                                                       |
 | `"hello"` (format B)                              | 55 68 65 6C 6C 6F                                                    |
 | `"hello"` (format C, 2 chunks)                    | 25 62 68 65 63 6C 6C 6F 35                                           |
 | `"hello"` (format C, 5 chunks)                    | 25 62 68 65 62 6C 6C 60 60 61 6F 35                                  |
-| `["hello" there #"world" [] #set{} #t #f]`        | C7 55 68 65 6C 6C 6F 75 74 68 65 72 65 65 77 6F 72 6C 64 C0 D0 01 00 |
+| `["hello" there #"world" [] #set{} #true #false]` | C7 55 68 65 6C 6C 6F 75 74 68 65 72 65 65 77 6F 72 6C 64 C0 D0 01 00 |
 | `-257`                                            | 42 FE FF                                                             |
 | `-1`                                              | 1F                                                                   |
 | `0`                                               | 10                                                                   |
 | `1`                                               | 11                                                                   |
 | `255`                                             | 42 00 FF                                                             |
-| `1f`                                              | 02 3F 80 00 00                                                       |
+| `1.0f`                                            | 02 3F 80 00 00                                                       |
-| `1d`                                              | 03 3F F0 00 00 00 00 00 00                                           |
+| `1.0`                                             | 03 3F F0 00 00 00 00 00 00                                           |
-| `-1.202e300d`                                     | 03 FE 3C B7 B7 59 BF 04 26                                           |
+| `-1.202e300`                                      | 03 FE 3C B7 B7 59 BF 04 26                                           |
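The last three rows of the table suggest that a `Float` is a lead byte `02` followed by a big-endian IEEE 754 single, and a `Double` is a lead byte `03` followed by a big-endian IEEE 754 double. The Python sketch below reproduces the `1.0f` and `1.0` rows under that reading; the function names are hypothetical and the layout is inferred from the table, not taken from the encoding rules themselves.

```python
import struct


def encode_float(x):
    """Lead byte 0x02 followed by a big-endian IEEE 754 single (inferred layout)."""
    return b"\x02" + struct.pack(">f", x)


def encode_double(x):
    """Lead byte 0x03 followed by a big-endian IEEE 754 double (inferred layout)."""
    return b"\x03" + struct.pack(">d", x)


if __name__ == "__main__":
    # Print the bytes in the same spaced-hex style as the table.
    print(encode_float(1.0).hex(" ").upper())
    print(encode_double(1.0).hex(" ").upper())
```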
 Finally, a larger example, using a non-`Symbol` label for a record.[^extensibility2] The `Record`
-    ([titled person 2 thing 1]
-      101
-      "Blackwell"
-      (date 1821 2 3)
-      "Dr")
+    [titled person 2 thing 1](101, "Blackwell", date(1821 2 3), "Dr")
 encodes to
@@ -671,16 +819,16 @@ such media types following the general rules for ordering of
 | Value                                      | Encoded hexadecimal byte sequence                                                                                 |
 |--------------------------------------------|-------------------------------------------------------------------------------------------------------------------|
-| `(mime application/octet-stream #"abcde")` | B3 74 6D 69 6D 65 7F 18 61 70 70 6C 69 63 61 74 69 6F 6E 2F 6F 63 74 65 74 2D 73 74 72 65 61 6D 65 61 62 63 64 65 |
+| `mime(application/octet-stream #"abcde")`  | B3 74 6D 69 6D 65 7F 18 61 70 70 6C 69 63 61 74 69 6F 6E 2F 6F 63 74 65 74 2D 73 74 72 65 61 6D 65 61 62 63 64 65 |
-| `(mime text/plain #"ABC")`                 | B3 74 6D 69 6D 65 7A 74 65 78 74 2F 70 6C 61 69 6E 63 41 42 43                                                    |
+| `mime(text/plain #"ABC")`                  | B3 74 6D 69 6D 65 7A 74 65 78 74 2F 70 6C 61 69 6E 63 41 42 43                                                    |
-| `(mime application/xml #"<xhtml/>")`       | B3 74 6D 69 6D 65 7F 0F 61 70 70 6C 69 63 61 74 69 6F 6E 2F 78 6D 6C 68 3C 78 68 74 6D 6C 2F 3E                   |
+| `mime(application/xml #"<xhtml/>")`        | B3 74 6D 69 6D 65 7F 0F 61 70 70 6C 69 63 61 74 69 6F 6E 2F 78 6D 6C 68 3C 78 68 74 6D 6C 2F 3E                   |
-| `(mime text/csv #"123,234,345")`           | B3 74 6D 69 6D 65 78 74 65 78 74 2F 63 73 76 6B 31 32 33 2C 32 33 34 2C 33 34 35                                  |
+| `mime(text/csv #"123,234,345")`            | B3 74 6D 69 6D 65 78 74 65 78 74 2F 63 73 76 6B 31 32 33 2C 32 33 34 2C 33 34 35                                  |
 Applications making heavy use of `mime` records may choose to use a
 short form label number for the record type. For example, if short
-form label number 1 were chosen, the second example above, `(mime
-text/plain "ABC")`, would be encoded with "92" in place of "B3 74 6D
-69 6D 65".
+form label number 1 were chosen, the second example above,
+`mime(text/plain "ABC")`, would be encoded with "92" in place of "B3
+74 6D 69 6D 65".
 ### Unicode normalization forms
@@ -707,13 +855,13 @@ The definition of `SignedInteger` captures all integers. However, in
 certain circumstances it can be valuable to assert that a number
 inhabits a particular range, such as a fixed-width machine word.
-A family of labels `i`*n* and `u`*n* for *n* ∈ {16,32,64} denote
+A family of labels `i`*n* and `u`*n* for *n* ∈ {8,16,32,64} denote
 *n*-bit-wide signed and unsigned range restrictions, respectively.
 Records with these labels *MUST* have one field, a `SignedInteger`,
 which *MUST* fall within the appropriate range. That is, to be valid,
- - in `(i16 `*x*`)`, -32768 <= *x* <= 32767.
- - in `(u16 `*x*`)`, 0 <= *x* <= 65535.
- - in `(i32 `*x*`)`, -2147483648 <= *x* <= 2147483647.
+ - in `i8(`*x*`)`, -128 <= *x* <= 127.
+ - in `u8(`*x*`)`, 0 <= *x* <= 255.
+ - in `i16(`*x*`)`, -32768 <= *x* <= 32767.
 - etc.
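The "etc." in the list above follows the usual two's-complement and unsigned bounds. A minimal Python sketch of the validity check, assuming a hypothetical helper `in_range` that takes the label as a plain string:

```python
def in_range(label, x):
    """Check x against the range implied by a label such as 'i8' or 'u16'."""
    kind, bits = label[:1], label[1:]
    if kind not in ("i", "u") or bits not in ("8", "16", "32", "64"):
        raise ValueError(f"not a range-restriction label: {label!r}")
    n = int(bits)
    if kind == "i":
        # Signed: -2^(n-1) <= x <= 2^(n-1) - 1
        return -(1 << (n - 1)) <= x < (1 << (n - 1))
    # Unsigned: 0 <= x <= 2^n - 1
    return 0 <= x < (1 << n)
```

For example, `in_range("i8", -128)` holds while `in_range("i8", 128)` does not, matching the bounds listed for `i8`.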
 ### Anonymous Tuples and Unit
@@ -721,15 +869,15 @@ which *MUST* fall within the appropriate range. That is, to be valid,
 A `Tuple` is a `Record` with label `tuple` and zero or more fields,
 denoting an anonymous tuple of values.
-The 0-ary tuple, `(tuple)`, denotes the empty tuple, sometimes called
+The 0-ary tuple, `tuple()`, denotes the empty tuple, sometimes called
 "unit" or "void" (but *not* e.g. JavaScript's "undefined" value).
 ### Null and Undefined
 Tony Hoare's
 "[billion-dollar mistake](https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retractions)"
-can be represented with the 0-ary `Record` `(null)`. An "undefined"
-value can be represented as `(undefined)`.
+can be represented with the 0-ary `Record` `null()`. An "undefined"
+value can be represented as `undefined()`.
 ### Dates and Times
@@ -741,6 +889,8 @@ or `date-time` productions of
 ## Security Considerations
+TODO: Lots of whitespace is just like lots of empty chunks
 **Empty chunks.** Streamed (format C) `String`s, `ByteString`s and
 `Symbol`s may include chunks of zero length. This opens up a
 possibility for denial-of-service: an attacker may begin streaming a
@@ -751,9 +901,9 @@ chunks that may appear in a stream, and may even supply an optional
 mode that rejects empty chunks entirely.
 **Canonical form for cryptographic hashing and signing.** As
-specified, the encoding rules for `Value`s do not force canonical
-serializations for `Set` or `Dictionary` values. Two serializations of
-the same `Value` may yield different binary `Repr`s.
+specified, neither the textual nor the compact binary encoding rules
+for `Value`s force canonical serializations. Two serializations of the
+same `Value` may yield different binary `Repr`s.
 ## Appendix. Table of lead byte values