Minor print layout tweaks, and minor content fixes

2018-09-24 16:08:48 +01:00 · f38aac1e19
parent 80fb72f782
commit f38aac1e19
1 changed file with 35 additions and 43 deletions
--- a/preserve.md
+++ b/preserve.md
@@ -5,12 +5,15 @@
 body { font-family: palatino, "Palatino Linotype", "Palatino LT STD", "URW Palladio L", "TeX Gyre Pagella", serif; }
 @media screen {
  body { padding-top: 2rem; max-width: 40em; margin: auto; font-size: 120%; }
+  hr { display: none; }
 }
 @media print {
-  @page { margin: 1.5cm; }
+  @page { margin: 4rem 4rem 4.333rem 3rem; }
  body { margin-left: 2rem; margin-right 2rem; }
-  h1, h2 { page-break-before: always }
+  h1, h2 { page-break-before: always; margin-top: 0; }
  h1:first-of-type, h2:first-of-type { page-break-before: auto; }
+  hr+* { page-break-before: always; margin-top: 0; }
+  hr { display: none; }
 }
 h1, h2, h3, h4, h5, h6 { margin-left: -1rem; color: #4f81bd; }
 h2 { border-bottom: solid #4f81bd 1px; }
@@ -41,9 +44,9 @@ Preserves also supports the usual suite of atomic and compound data
 types, in particular including *binary* data as a distinct type from
 text strings.

-Finally, Preserves defines precisely how to compare two values with
-each other in terms of the data model, not in terms of syntax or of
-the data structures of any particular implementation language.
+Finally, Preserves defines precisely how to *compare* two values.
+Comparison is based on the data model, not on syntax or on data
+structures of any particular implementation language.

  [^macro-expressiveness]: By "expressive" I mean *macro-expressive*
    in the sense of Felleisen's 1991 paper, "On the Expressive Power
@@ -66,6 +69,9 @@ definition of the *values* that we want to work with and give them
 meaning independent of their syntax. We will treat syntax separately,
 later in this document.

+Our `Value`s fall into two broad categories: *atomic* and *compound*
+data.
+
                          Value = Atom
                                | Compound

@@ -82,14 +88,6 @@ later in this document.
                                | Set
                                | Dictionary

-Our `Value`s fall into two broad categories: *atomic* and *compound*
-data.[^inspiration]
-
-  [^inspiration]: This design was loosely inspired by S-expressions,
-    as seen in Lisp, Scheme, [SPKI/SDSI][sexp.txt], and many others,
-    as well as by the ML type system, as seen in languages such as
-    SML, OCaml, Haskell, Rust, and many others.
-
 **Total order.**<a name="total-order"></a> As we go, we will
 incrementally specify a total order over `Value`s. Two values of the
 same kind are compared using kind-specific rules. The ordering among
@@ -126,10 +124,10 @@ examples of `SignedInteger`s using standard mathematical notation.
 ### Unicode strings.

 A `String` is a sequence of Unicode
-[code-point](http://www.unicode.org/glossary/#code_point)s. Two
-`String`s are compared lexicographically, code-point by
+[code-point](http://www.unicode.org/glossary/#code_point)s. `String`s
+are compared lexicographically, code-point by
 code-point.[^utf8-is-awesome] We will write examples of `String`s as
-text surrounded by double-quotes “`"`” using a monospace font.
+text surrounded by quotes “`"`”.

  [^utf8-is-awesome]: Happily, the design of UTF-8 is such that this
    gives the same result as a lexicographic byte-by-byte comparison
@@ -139,33 +137,27 @@ text surrounded by double-quotes “`"`” using a monospace font.
 the string containing the three Unicode code-points `z` (0x7A), `水`
 (0x6C34) and `𝄞` (0x1D11E); `""`, the empty string.

-**Normalization forms.** Unicode defines multiple
-[normalization forms](http://unicode.org/reports/tr15/) for text. No
-particular normalization form is required for `String`s;
-[see below](#normalization-forms).
-
 ### Binary data.

-A `ByteString` is an ordered sequence of zero or more integers in the
-inclusive range [0..255]. `ByteString`s are compared
-lexicographically, byte by byte. We will only write examples of
-`ByteString`s that contain bytes mapping to printable ASCII
-characters, using “`#"`” as an opening quote mark and “`"`” as a
-closing quote mark.
+A `ByteString` is an ordered sequence of zero or more eight-bit bytes.
+`ByteString`s are compared lexicographically. We will only write
+examples of `ByteString`s that contain bytes denoting printable ASCII
+characters, using “`#"`” as an open-quote and “`"`” as a close-quote
+mark.

 **Examples.** The `ByteString` containing the integers 65, 66 and 67
 (corresponding to ASCII characters `A`, `B` and `C`) is written as
 `#"ABC"`. The empty `ByteString` is written as `#""`. **N.B.** Despite
 appearances, these are *binary* data.

-### Symbols or identifiers.
+### Symbols.

 Programming languages like Lisp and Prolog frequently use string-like
 values called *symbols*. Here, a `Symbol` is, like a `String`, a
-sequence of Unicode code-points, intended to represent an identifier
-of some kind. `Symbol`s are also compared lexicographically by
-code-point. We will write examples including only non-empty sequences
-of non-whitespace characters, using a monospace font without quotation
+sequence of Unicode code-points representing an identifier of some
+kind. `Symbol`s are also compared lexicographically by code-point. We
+will write examples including only non-empty sequences of
+non-whitespace characters, using a monospace font without quotation
 marks.

 **Examples.** `hello-world`; `utf8-string`; `exact-integer?`.
@@ -176,8 +168,6 @@ There are exactly two `Boolean` values, “false” and “true”. The
 “false” value compares less-than the “true” value. We write `#f` for
 “false”, and `#t` for “true”.

-**Examples.** `#f`; `#t`.
-
 ### IEEE floating-point values.

 A `Float` is a single-precision IEEE 754 floating-point value; a
@@ -345,6 +335,8 @@ representation:[^some-encodings-unused]

 Each specific type of data defines its own rules for this format.

+---
+
 #### Encoding data of known length (format B)

 A `Repr` where the length of the `Value` to be encoded is variable but
@@ -416,7 +408,7 @@ Applications *SHOULD* prefer the known-length format for encoding
 #### Application-specific short form for labels

 Any given protocol using Preserves may additionally define an
-interpretation for `n ∈ {0,1,2}`, mapping each *short form label
+interpretation for `n`∈{0,1,2}, mapping each *short form label
 number* `n` to a specific record label. When encoding `m` fields with
 short form label number `n`, format B becomes

@@ -583,7 +575,7 @@ short form label number 0 to label `discard`, 1 to `capture`, and 2 to
 | `(observe (speak (discard) (capture (discard))))`                  | A1 B3 75 73 70 65 61 6B 80 91 80                   |
 | `[1 2 3 4]` (format B)                                             | C4 11 12 13 14                                     |
 | `[1 2 3 4]` (format C)                                             | 2C 11 12 13 14 3C                                  |
-| `[-2 -1 0 1]`                                                      | C4 1E 1F 40 11                                     |
+| `[-2 -1 0 1]`                                                      | C4 1E 1F 10 11                                     |
 | `"hello"` (format B)                                               | 55 68 65 6C 6C 6F                                  |
 | `"hello"` (format C, 2 chunks)                                     | 25 52 68 65 53 6C 6C 6F 35                         |
 | `"hello"` (format C, 5 chunks)                                     | 25 52 68 65 52 6C 6C 50 50 51 6F 35                |
@@ -708,20 +700,20 @@ form label number 1 were chosen, the second example above, `(mime
 text/plain "ABC")`, would be encoded with "92" in place of "B3 74 6D
 69 6D 65".

-### Text
+### Unicode normalization forms

-#### Normalization forms
-
-In order for users to unambiguously signal or require a particular
-[normalization form](http://unicode.org/reports/tr15/), we define a
-`NormalizedString`, which is a `Record` labelled with
+Unicode defines multiple
+[normalization forms](http://unicode.org/reports/tr15/) for text.
+While no particular normalization form is required for `String`s,
+users may need to unambiguously signal or require a particular
+normalization form. A `NormalizedString` is a `Record` labelled with
 `unicode-normalization` and having two fields, the first of which is a
 `Symbol` specifying the normalization form used (e.g. `nfc`, `nfd`,
 `nfkc`, `nfkd`), and the second of which is a `String` whose
 underlying code point representation *MUST* be normalized according to
 the named normalization form.

-#### IRIs (URIs, URLs, URNs, etc.)
+### IRIs (URIs, URLs, URNs, etc.)

 An `IRI` is a `Record` labelled with `iri` and having one field, a
 `String` which is the IRI itself and which *MUST* be a valid absolute