diff --git a/syndicate/mc/preserve.md b/syndicate/mc/preserve.md index 3b67b6f..75e0295 100644 --- a/syndicate/mc/preserve.md +++ b/syndicate/mc/preserve.md @@ -298,73 +298,122 @@ connections to other data languages can also be made. For now, we limit our attention to an easily-parsed, easily-produced machine-readable syntax. -Every `Value` is represented as one or more bytes describing first its -kind and its length, and then its specific contents. +A `Repr` is an encoding, or representation, of a specific `Value`. +Each `Repr` comprises one or more bytes describing first the kind of +represented `Value` and the length of the representation, and then the +encoded details of the `Value` itself. -For a value `v`, we write `[[v]]` for the encoding of v. +For a value `v`, we write `[[v]]` for the `Repr` of v. The following figure summarises the definitions below: tt nn mmmm varint(m) contents ------------------------------- - 00 00 mmmm ... application-specific Record - 00 01 mmmm ... application-specific Record - 00 10 mmmm ... application-specific Record - 00 11 mmmm ... Record + 00 00 0000 False + 00 00 0001 True + 00 00 0010 Float, 32 bits big-endian binary + 00 00 0011 Double, 64 bits big-endian binary + 00 00 x1xx RESERVED + 00 00 1xxx RESERVED + 00 01 xxxx RESERVED + 00 10 ttnn Start Stream + When tt = 00 --> error + 01 --> each chunk is a piece + 1x --> each chunk is a single encoded Value + 00 11 ttnn End Stream (must match preceding Start Stream) - 01 00 mmmm ... Sequence - 01 01 mmmm ... Set - 01 10 mmmm ... Dictionary + 01 00 mmmm ... SignedInteger, big-endian binary + 01 01 mmmm ... String, UTF-8 binary + 01 10 mmmm ... Bytes + 01 11 mmmm ... Symbol, UTF-8 binary - 10 00 mmmm ... SignedInteger, big-endian binary - 10 01 mmmm ... String, UTF-8 binary - 10 10 mmmm ... Bytes - 10 11 mmmm ... Symbol, UTF-8 binary + 10 00 mmmm ... application-specific Record + 10 01 mmmm ... application-specific Record + 10 10 mmmm ... 
application-specific Record + 10 11 mmmm ... Record - 11 00 0000 False - 11 00 0001 True - 11 00 0010 Float, 32 bits big-endian binary - 11 00 0011 Double, 64 bits big-endian binary + 11 00 mmmm ... Sequence + 11 01 mmmm ... Set + 11 10 mmmm ... Dictionary + 11 11 xxxx RESERVED If mmmm = 1111, varint(m) is present; otherwise, m is the length #### Type and Length representation -A `Value`'s type and length is represented by use of a function -`header(t,n,m)` that yields a sequence of bytes when `t`, `n` and `m` -are appropriate non-negative integers. +Each `Repr` takes one of three possible forms: - header(t,n,m) = leadbyte(t,n,m) when m < 15 - or leadbyte(t,n,15) ++ varint(m) otherwise + - (A) a fixed-length form, used for simple values such as `Boolean`s + or `Float`s. -The lead byte in a `Value`'s representation is constructed by a function + - (B) a variable-length form with length specified up-front, used for + almost all `Record`s as well as for most `Sequence`s and `String`s, + when their sizes are known at the time serialization begins. + + - (C) a variable-length streaming form with unknown or unpredictable + length, used only seldom for `Record`s, since the number of fields + in a `Record` is usually statically known, but sometimes used for + `Sequence`s, `String`s etc., such as in cases when serialization + begins before the number of elements or bytes in the corresponding + `Value` is known. + +Applications may choose between formats (B) and (C) depending on their +needs at serialization time. + +Every `Repr`, however, starts with a *lead byte* describing the +remainder of the representation. + +##### The lead byte + +The lead byte is constructed by a function `leadbyte`: leadbyte(t,n,m) = [t*64 + n*16 + m] +Both `t` and `n` are two-bit unsigned numbers; `m` is a four-bit +unsigned number. 
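As a rough illustration of the `leadbyte` definition, the following Python sketch packs the three fields into one byte and splits them back out. The helper `split_leadbyte` and the range checks are assumptions made for this example; the specification itself only defines `leadbyte`.

```python
def leadbyte(t, n, m):
    """Pack t (2 bits), n (2 bits) and m (4 bits) into a single lead byte."""
    if not (0 <= t < 4 and 0 <= n < 4 and 0 <= m < 16):
        raise ValueError("t and n must fit in 2 bits, m in 4 bits")
    return bytes([t * 64 + n * 16 + m])


def split_leadbyte(b):
    """Hypothetical inverse of leadbyte: recover (t, n, m) from a byte value."""
    return b >> 6, (b >> 4) & 0x03, b & 0x0F


if __name__ == "__main__":
    print(leadbyte(2, 1, 3).hex())   # fields t=2, n=1, m=3
    print(split_leadbyte(0xB5))
```

Because `t` occupies the top two bits, the first hexadecimal digit of a lead byte is enough to tell the four `t` groups apart.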
+ The lead byte describes the rest of the representation as follows:[^some-encodings-unused] - leadbyte(0,-,-) represents a Record - leadbyte(1,-,-) represents a Sequence, Set or Dictionary - leadbyte(2,-,-) represents an Atom with variable-length binary representation - leadbyte(3,0,-) represents an Atom with fixed-length binary representation - [^some-encodings-unused]: Some encodings are unused. All such encodings are reserved for future versions of this specification. -Variable-length representations use the value of `m` to encode their -lengths: + - `leadbyte(0,0,-)` (format A) represents an Atom with fixed-length binary representation. + - `leadbyte(0,1,-)` (format A) is RESERVED. + - `leadbyte(0,2,-)` (format C) is a Stream Start byte. + - `leadbyte(0,3,-)` (format C) is a Stream End byte. + - `leadbyte(1,-,-)` (format B) represents an Atom with variable-length binary representation. + - `leadbyte(2,-,-)` (format B) represents a Record. + - `leadbyte(3,-,-)` (format B) represents a Sequence, Set or Dictionary. - - Lengths between 0 and 14 are represented using `leadbyte` with `m` - values 0 through 14. - - Lengths of 15 or greater are represented by `m` value 15, and - additional "length bytes" describing the length then follow the - lead byte. +##### Encoding data of fixed length (format A) -These additional length bytes are formatted as -[base 128 varints][varint]. Quoting the -[Google Protocol Buffers][varint] definition, +Each specific type of data defines its own rules for this format. + +##### Encoding data of known length (format B) + +A `Repr` where the length of the `Value` to be encoded is variable but +known uses the value of `m` in `leadbyte` to encode its length. The +length counts *bytes* for atomic `Value`s, but counts *contained +values* for compound `Value`s. + + - A length `l` between 0 and 14 is represented using `leadbyte` with + `m=l`. 
+ - A length of 15 or greater is represented by `m=15` and additional + bytes describing the length following the lead byte. + +The function `header(t,n,m)` yields an appropriate sequence of bytes +describing a `Repr`'s type and length when `t`, `n` and `m` are +appropriate non-negative integers: + + header(t,n,m) = leadbyte(t,n,m) when m < 15 + or leadbyte(t,n,15) ++ varint(m) otherwise + +The additional length bytes are formatted as +[base 128 varints][varint]. We write `varint(m)` for the +varint-encoding of `m`. Quoting the [Google Protocol Buffers][varint] +definition, > Each byte in a varint, except the last byte, has the most > significant bit (msb) set – this indicates that there are further @@ -378,43 +427,93 @@ These additional length bytes are formatted as - 300 (binary, grouped into 7-bit chunks, `10 0101100`) varint-encodes to the two bytes 172 and 2. - 1000000000 (binary `11 1011100 1101011 0010100 0000000`) varint-encodes to bytes 128, 148, 235, 220, and 3. -We write `varint(m)` for the varint-encoding of `m`. +##### Streaming data of unknown length (format C) + +A `Repr` where the length of the `Value` to be encoded is variable and +not known at the time serialization of the `Value` starts is encoded +by a single Stream Start byte, followed by zero or more *chunks*, +followed by a matching Stream End byte: + + startbyte(t,n) = leadbyte(0,2, t*4 + n) + endbyte(t,n) = leadbyte(0,3, t*4 + n) + +For a `Repr` of a `Value` containing binary data, each chunk is to be +a format B `Repr` of the same type as the overall `Repr`. + +For a `Repr` of a `Value` containing other `Value`s, each chunk is to +be a single `Repr`. #### Records - [[ (L F_1 ... F_m) ]] = header(0,3,m+1) ++ [[L]] ++ [[F_1]] ++ ... ++ [[F_m]] +Format B (known length): + + [[ (L F_1 ... F_m) ]] = header(2,3,m+1) ++ [[L]] ++ [[F_1]] ++ ... ++ [[F_m]] For `m` fields, `m+1` is supplied to `header`, to account for the encoding of the record label. +Format C (streaming): + + [[ (L F_1 ... 
F_m) ]] + = startbyte(2,3) ++ [[L]] ++ [[F_1]] ++ ... ++ [[F_m]] ++ endbyte(2,3) + +Applications *SHOULD* prefer the known-length format for encoding +`Record`s. + ##### Application-specific short form for labels Any given protocol using Preserves may additionally define an interpretation for `n ∈ {0,1,2}`, mapping each *short form label number* `n` to a specific record label. When encoding `m` fields with -short form label number `n`, the header is `header(0,n,m)` (rather -than `m+1`) since the label is implicit. +short form label number `n`, format B becomes + + header(2,n,m) ++ [[F_1]] ++ ... ++ [[F_m]] + +and format C becomes + + startbyte(2,n) ++ [[F_1]] ++ ... ++ [[F_m]] ++ endbyte(2,n) **Examples.** For example, a protocol may choose to map records labelled `void` to `n=0`, making - [[(void)]] = header(0,0,0) = [0x00] + [[(void)]] = header(2,0,0) = [0x80] or it may map records labelled `person` to short form label number 1, making [[(person "Dr" "Elizabeth" "Blackwell")]] - = header(0,1,3) ++ [["Dr"]] ++ [["Elizabeth"]] ++ [["Blackwell"]]` - = [0x13] ++ [["Dr"]] ++ [["Elizabeth"]] ++ [["Blackwell"]]` + = header(2,1,3) ++ [["Dr"]] ++ [["Elizabeth"]] ++ [["Blackwell"]] + = [0x93] ++ [["Dr"]] ++ [["Elizabeth"]] ++ [["Blackwell"]] + +for format B, or + + = startbyte(2,1) ++ [["Dr"]] ++ [["Elizabeth"]] ++ [["Blackwell"]] ++ endbyte(2,1) + = [0x29] ++ [["Dr"]] ++ [["Elizabeth"]] ++ [["Blackwell"]] ++ [0x39] + +for format C. #### Sequences, Sets and Dictionaries - [[ [X_1 ... X_m] ]] = header(1,0,m) ++ [[X_1]] ++ ... ++ [[X_m]] +Format B (known length): - [[ #set{X_1 ... X_m} ]] = header(1,1,m) ++ [[X_1]] ++ ... ++ [[X_m]] + [[ [X_1 ... X_m] ]] = header(3,0,m) ++ [[X_1]] ++ ... ++ [[X_m]] + + [[ #set{X_1 ... X_m} ]] = header(3,1,m) ++ [[X_1]] ++ ... ++ [[X_m]] [[ #dict{K_1:V_1 ... K_m:V_m} ]] - = header(1,2,m) ++ [[K_1]] ++ [[V_1]] ++ ... ++ [[K_m]] ++ [[V_m]] + = header(3,2,m) ++ [[K_1]] ++ [[V_1]] ++ ... 
++ [[K_m]] ++ [[V_m]] + +Format C (streaming): + + [[ [X_1 ... X_m] ]] = startbyte(3,0) ++ [[X_1]] ++ ... ++ [[X_m]] ++ endbyte(3,0) + + [[ #set{X_1 ... X_m} ]] = startbyte(3,1) ++ [[X_1]] ++ ... ++ [[X_m]] ++ endbyte(3,1) + + [[ #dict{K_1:V_1 ... K_m:V_m} ]] + = startbyte(3,2) ++ [[K_1]] ++ [[V_1]] ++ ... ++ [[K_m]] ++ [[V_m]] ++ endbyte(3,2) + +Applications may use whichever format suits their needs on a +case-by-case basis. There is *no* ordering requirement on the `X_i` elements or `K_i`/`V_i` pairs.[^no-sorting-rationale] They may appear in any @@ -432,19 +531,23 @@ order. (b) sorting keys or elements makes no sense in streaming serialization formats. -Note that `n=3` is unused and reserved. +Note that `header(3,3,m)` and `startbyte(3,3)`/`endbyte(3,3)` are unused and reserved. #### Variable-length Atoms ##### SignedInteger - [[ x ]] when x ∈ SignedInteger = header(2,0,m) ++ intbytes(x) +Format B (known length): + + [[ x ]] when x ∈ SignedInteger = header(1,0,m) ++ intbytes(x) where m = |intbytes(x)| and intbytes(x) = a big-endian two's-complement representation of the signed integer x, taking exactly as many whole bytes as needed to unambiguously identify the value +Format C *MUST NOT* be used for `SignedInteger`s. + The value 0 needs zero bytes to identify the value, so `intbytes(0)` is the empty byte string. Non-zero values need at least one byte; the most-significant bit in the first byte in `intbytes(x)` for `x≠0` is @@ -452,55 +555,78 @@ the sign bit. 
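The format B `SignedInteger` rule can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reference implementation: `varint`, `header` and `intbytes` mirror the functions named in the text, while `encode_signed_integer` is a hypothetical wrapper name introduced only for this example.

```python
def varint(m):
    """Base 128 varint: 7 bits per byte, least-significant group first."""
    out = bytearray()
    while True:
        low7 = m & 0x7F
        m >>= 7
        if m:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)         # last byte, msb clear
            return bytes(out)


def header(t, n, m):
    """Lead byte, plus a varint length when m does not fit in four bits."""
    if m < 15:
        return bytes([t * 64 + n * 16 + m])
    return bytes([t * 64 + n * 16 + 15]) + varint(m)


def intbytes(x):
    """Shortest big-endian two's-complement form; empty for zero."""
    if x == 0:
        return b""
    # Magnitude bits needed, excluding the sign bit.
    bits = x.bit_length() if x > 0 else (~x).bit_length()
    length = bits // 8 + 1           # leaves room for the sign bit
    return x.to_bytes(length, "big", signed=True)


def encode_signed_integer(x):
    """Hypothetical helper: format B Repr of a SignedInteger."""
    body = intbytes(x)
    return header(1, 0, len(body)) + body


if __name__ == "__main__":
    for value in (-257, -128, 0, 127, 128, 131072):
        print(value, encode_signed_integer(value).hex(" "))
```

The sketch relies on Python's arbitrary-precision integers, so the same code covers values whose `intbytes` is 15 bytes or longer, where `header` switches to the varint length.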
For example, - [[ -257 ]] = [0x82, 0xFE, 0xFF] - [[ -256 ]] = [0x82, 0xFF, 0x00] - [[ -255 ]] = [0x82, 0xFF, 0x01] - [[ -254 ]] = [0x82, 0xFF, 0x02] - [[ -129 ]] = [0x82, 0xFF, 0x7F] - [[ -128 ]] = [0x81, 0x80] - [[ -127 ]] = [0x81, 0x81] - [[ -2 ]] = [0x81, 0xFE] - [[ -1 ]] = [0x81, 0xFF] - [[ 0 ]] = [0x80] - [[ 1 ]] = [0x81, 0x01] - [[ 127 ]] = [0x81, 0x7F] - [[ 128 ]] = [0x82, 0x00, 0x80] - [[ 255 ]] = [0x82, 0x00, 0xFF] - [[ 256 ]] = [0x82, 0x01, 0x00] - [[ 32767 ]] = [0x82, 0x7F, 0xFF] - [[ 32768 ]] = [0x83, 0x00, 0x80, 0x00] - [[ 65535 ]] = [0x83, 0x00, 0xFF, 0xFF] - [[ 65536 ]] = [0x83, 0x01, 0x00, 0x00] - [[ 131072 ]] = [0x83, 0x02, 0x00, 0x00] + [[ -257 ]] = [0x42, 0xFE, 0xFF] + [[ -256 ]] = [0x42, 0xFF, 0x00] + [[ -255 ]] = [0x42, 0xFF, 0x01] + [[ -254 ]] = [0x42, 0xFF, 0x02] + [[ -129 ]] = [0x42, 0xFF, 0x7F] + [[ -128 ]] = [0x41, 0x80] + [[ -127 ]] = [0x41, 0x81] + [[ -2 ]] = [0x41, 0xFE] + [[ -1 ]] = [0x41, 0xFF] + [[ 0 ]] = [0x40] + [[ 1 ]] = [0x41, 0x01] + [[ 127 ]] = [0x41, 0x7F] + [[ 128 ]] = [0x42, 0x00, 0x80] + [[ 255 ]] = [0x42, 0x00, 0xFF] + [[ 256 ]] = [0x42, 0x01, 0x00] + [[ 32767 ]] = [0x42, 0x7F, 0xFF] + [[ 32768 ]] = [0x43, 0x00, 0x80, 0x00] + [[ 65535 ]] = [0x43, 0x00, 0xFF, 0xFF] + [[ 65536 ]] = [0x43, 0x01, 0x00, 0x00] + [[ 131072 ]] = [0x43, 0x02, 0x00, 0x00] ##### String - [[ S ]] when S ∈ String = header(2,1,m) ++ utf8(S) +Format B (known length): + + [[ S ]] when S ∈ String = header(1,1,m) ++ utf8(S) where m = |utf8(x)| and utf8(x) = the UTF-8 encoding of S +To stream a `String`, emit `startbyte(1,1)` and then a sequence of +zero or more format B `String` chunks, followed by `endbyte(1,1)`. + +While the overall content of a streamed `String` must be valid UTF-8, +individual chunks do not have to conform to UTF-8. 
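To make the streamed `String` form concrete, here is a small Python sketch that wraps caller-supplied byte chunks in `startbyte(1,1)` and `endbyte(1,1)`. The function name `stream_string` and the choice to validate UTF-8 only on the concatenated content are assumptions of this example; how an application picks chunk boundaries is left open by the text.

```python
def varint(m):
    """Base 128 varint: 7 bits per byte, least-significant group first."""
    out = bytearray()
    while True:
        low7 = m & 0x7F
        m >>= 7
        if m:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


def header(t, n, m):
    """Format B header: lead byte, plus varint length when m >= 15."""
    if m < 15:
        return bytes([t * 64 + n * 16 + m])
    return bytes([t * 64 + n * 16 + 15]) + varint(m)


def startbyte(t, n):
    return bytes([0 * 64 + 2 * 16 + (t * 4 + n)])  # leadbyte(0,2,t*4+n)


def endbyte(t, n):
    return bytes([0 * 64 + 3 * 16 + (t * 4 + n)])  # leadbyte(0,3,t*4+n)


def stream_string(chunks):
    """Hypothetical helper: format C Repr of a String from byte chunks."""
    # Only the overall content must be valid UTF-8; this raises
    # UnicodeDecodeError if the concatenation is not.
    b"".join(chunks).decode("utf-8")
    out = bytearray(startbyte(1, 1))
    for piece in chunks:
        out += header(1, 1, len(piece)) + piece    # each chunk is format B
    out += endbyte(1, 1)
    return bytes(out)


if __name__ == "__main__":
    print(stream_string([b"he", b"llo"]).hex(" "))
    # A two-byte character split across chunks is still acceptable.
    print(stream_string(["é".encode("utf-8")[:1], "é".encode("utf-8")[1:]]).hex(" "))
```

Splitting a multi-byte character across two chunks, as in the second call, is exactly the case the text permits: neither chunk is valid UTF-8 on its own, but their concatenation is.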
+ ##### ByteString - [[ B ]] when B ∈ ByteString = header(2,2,m) ++ B +Format B (known length): + + [[ B ]] when B ∈ ByteString = header(1,2,m) ++ B where m = |B| +To stream a `ByteString`, emit `startbyte(1,2)` and then a sequence of +zero or more format B `ByteString` chunks, followed by `endbyte(1,2)`. + ##### Symbol - [[ S ]] when S ∈ Symbol = header(2,2,m) ++ utf8(S) +Format B (known length): + + [[ S ]] when S ∈ Symbol = header(1,3,m) ++ utf8(S) where m = |utf8(x)| and utf8(x) = the UTF-8 encoding of S +To stream a `Symbol`, emit `startbyte(1,3)` and then a sequence of +zero or more format B `Symbol` chunks, followed by `endbyte(1,3)`. + #### Fixed-length Atoms +Fixed-length atoms all use format A, and do not have a length +representation. They repurpose the bits that format B `Repr`s use to +specify lengths. Applications *MUST NOT* use format C with +`startbyte(0,n)` or `endbyte(0,n)` for any `n`. + ##### Booleans - [[ #f ]] = header(3,0,0) = [0xC0] - [[ #t ]] = header(3,0,1) = [0xC1] + [[ #f ]] = header(0,0,0) = [0x00] + [[ #t ]] = header(0,0,1) = [0x01] ##### Floats and Doubles - [[ F ]] when F ∈ Float = header(3,0,2) ++ binary32(F) - [[ D ]] when D ∈ Double = header(3,0,3) ++ binary64(D) + [[ F ]] when F ∈ Float = header(0,0,2) ++ binary32(F) + [[ D ]] when D ∈ Double = header(0,0,3) ++ binary64(D) where binary32(F) and binary64(D) are big-endian 4- and 8-byte IEEE 754 binary representations @@ -515,21 +641,25 @@ short form label number 0 to label `discard`, 1 to `capture`, and 2 to | Value | Encoded hexadecimal byte sequence | |--------------------------------------------------------------------|----------------------------------------------------| -| `(capture (discard))` | 11 00 | -| `(observe (speak (discard) (capture (discard))))` | 21 33 B5 73 70 65 61 6B 00 11 00 | -| `[1 2 3 4]` | 44 81 01 81 02 81 03 81 04 | -| `[-2 -1 0 1]` | 54 81 FE 81 FF 80 81 01 | -| `["hello" there #"world" [] #set{} #t #f]` | 47 95 68 65 6C 6C 6F A5 74 68 65 72 65 40 50 C1 
C0 | -| `-257` | 82 FE FF | -| `-1` | 81 FF | -| `0` | 80 | -| `1` | 81 01 | -| `255` | 82 00 FF | -| `1f` | C2 3F 80 00 00 | -| `1d` | C3 3F F0 00 00 00 00 00 00 | -| `-1.202e300d` | C3 FE 3C B7 B7 59 BF 04 26 | +| `(capture (discard))` | 91 80 | +| `(observe (speak (discard) (capture (discard))))` | A1 B3 75 73 70 65 61 6B 80 91 80 | +| `[1 2 3 4]` (format B) | C4 41 01 41 02 41 03 41 04 | +| `[1 2 3 4]` (format C) | 2C 41 01 41 02 41 03 41 04 3C | +| `[-2 -1 0 1]` | C4 41 FE 41 FF 40 41 01 | +| `"hello"` (format B) | 55 68 65 6C 6C 6F | +| `"hello"` (format C, 2 chunks) | 25 52 68 65 53 6C 6C 6F 35 | +| `"hello"` (format C, 5 chunks) | 25 52 68 65 52 6C 6C 50 50 51 6F 35 | +| `["hello" there #"world" [] #set{} #t #f]` | C7 55 68 65 6C 6C 6F 75 74 68 65 72 65 65 77 6F 72 6C 64 C0 D0 01 00 | +| `-257` | 42 FE FF | +| `-1` | 41 FF | +| `0` | 40 | +| `1` | 41 01 | +| `255` | 42 00 FF | +| `1f` | 02 3F 80 00 00 | +| `1d` | 03 3F F0 00 00 00 00 00 00 | +| `-1.202e300d` | 03 FE 3C B7 B7 59 BF 04 26 | -Finally, a larger example, using a non-`Symbol` label for a record.[^extensibility2] The `Value` +Finally, a larger example, using a non-`Symbol` label for a record.[^extensibility2] The `Record` ([titled person 2 thing 1] 101 @@ -539,21 +669,21 @@ Finally, a larger example, using a non-`Symbol` label for a record.[^extensibili encodes to - 35 ;; Record, generic, 4+1 - 45 ;; Sequence, 5 - B6 74 69 74 6C 65 64 ;; Symbol, "titled" - B6 70 65 72 73 6F 6E ;; Symbol, "person" - 81 02 ;; SignedInteger, "2" - B5 74 68 69 6E 67 ;; Symbol, "thing" - 81 01 ;; SignedInteger, "1" - 81 65 ;; SignedInteger, "101" - 99 42 6C 61 63 6B 77 65 6C 6C ;; String, "Blackwell" - 34 ;; Record, generic, 3+1 - B4 64 61 74 65 ;; Symbol, "date" - 82 07 1D ;; SignedInteger, "1821" - 81 02 ;; SignedInteger, "2" - 81 03 ;; SignedInteger, "3" - 92 44 72 ;; String, "Dr" + B5 ;; Record, generic, 4+1 + C5 ;; Sequence, 5 + 76 74 69 74 6C 65 64 ;; Symbol, "titled" + 76 70 65 72 73 6F 6E ;; Symbol, "person" + 41 02 ;; 
SignedInteger, "2" + 75 74 68 69 6E 67 ;; Symbol, "thing" + 41 01 ;; SignedInteger, "1" + 41 65 ;; SignedInteger, "101" + 59 42 6C 61 63 6B 77 65 6C 6C ;; String, "Blackwell" + B4 ;; Record, generic, 3+1 + 74 64 61 74 65 ;; Symbol, "date" + 42 07 1D ;; SignedInteger, "1821" + 41 02 ;; SignedInteger, "2" + 41 03 ;; SignedInteger, "3" + 52 44 72 ;; String, "Dr" [^extensibility2]: It happens to line up with Racket's representation of a record label for an inheritance hierarchy @@ -608,15 +738,15 @@ pair. **Examples.** -| `(mime application/octet-stream #"abcde")` | 33 B4 6D 69 6D 65 BF 18 61 70 70 6C 69 63 61 74 69 6F 6E 2F 6F 63 74 65 74 2D 73 74 72 65 61 6D A5 61 62 63 64 65 | -| `(mime text/plain "ABC")` | 33 B4 6D 69 6D 65 BA 74 65 78 74 2F 70 6C 61 69 6E 93 41 42 43 | -| `(mime application/xml "<xhtml/>")` | 33 B4 6D 69 6D 65 BF 0F 61 70 70 6C 69 63 61 74 69 6F 6E 2F 78 6D 6C 98 3C 78 68 74 6D 6C 2F 3E | -| `(mime text/csv "123,234,345")` | 33 B4 6D 69 6D 65 B8 74 65 78 74 2F 63 73 76 9B 31 32 33 2C 32 33 34 2C 33 34 35 | +| `(mime application/octet-stream #"abcde")` | B3 74 6D 69 6D 65 7F 18 61 70 70 6C 69 63 61 74 69 6F 6E 2F 6F 63 74 65 74 2D 73 74 72 65 61 6D 65 61 62 63 64 65 | +| `(mime text/plain #"ABC")` | B3 74 6D 69 6D 65 7A 74 65 78 74 2F 70 6C 61 69 6E 63 41 42 43 | +| `(mime application/xml #"<xhtml/>")` | B3 74 6D 69 6D 65 7F 0F 61 70 70 6C 69 63 61 74 69 6F 6E 2F 78 6D 6C 68 3C 78 68 74 6D 6C 2F 3E | +| `(mime text/csv #"123,234,345")` | B3 74 6D 69 6D 65 78 74 65 78 74 2F 63 73 76 6B 31 32 33 2C 32 33 34 2C 33 34 35 | Applications making heavy use of `mime` records may choose to use a short form label number for the record type. For example, if short form label number 1 were chosen, the second example above, `(mime -text/plain "ABC")`, would be encoded with "12" in place of "33 B4 6D +text/plain #"ABC")`, would be encoded with "92" in place of "B3 74 6D 69 6D 65". ### Text @@ -746,26 +876,29 @@ should both be identities. ## Appendix. 
Table of lead byte values - 0x - short form Record label index 0 - 1x - short form Record label index 1 - 2x - short form Record label index 2 - 3x - Record - 4x - Sequence - 5x - Set - 6x - Dictionary - (7x) RESERVED - 8x - SignedInteger - 9x - String - Ax - Bytes - Bx - Symbol - C0 - False - C1 - True - C2 - Float - C3 - Double - (Cx) RESERVED C4-CF - (Dx) RESERVED - (Ex) RESERVED - (Fx) RESERVED + 00 - False + 01 - True + 02 - Float + 03 - Double + (0x) RESERVED 04-0F + (1x) RESERVED 10-1F + 2x - Start Stream + 3x - End Stream + + 4x - SignedInteger + 5x - String + 6x - Bytes + 7x - Symbol + + 8x - short form Record label index 0 + 9x - short form Record label index 1 + Ax - short form Record label index 2 + Bx - Record + + Cx - Sequence + Dx - Set + Ex - Dictionary + (Fx) RESERVED F0-FF ## Appendix. Why not Just Use JSON? @@ -942,15 +1075,6 @@ Q. Should I map to SPKI SEXP or is that nonsense / for later?[^why-not-spki-sexp other kind of structure, and the "hint" itself can only be a binary blob. -Q. Should `MIMEData` be a special syntax for `Record`s with a single -`ByteString` field? - -A. Not even. It should probably just be moved to the "conventions" -section. Compare: - - D5 BA text/plain hello -- using special MIMEData encoding - 32 BA text/plain A5 hello -- using bog standard type-labelled Record - Q. Should `Symbol` be a special syntax for a `Record` with a `Symbol` label (recursive!?) and a single `String` field? @@ -970,16 +1094,6 @@ Q. Are the language mappings reasonable? How about one for Python? --- -Streaming: needed for variable-sized structures. Tricky to design -syntax for this that isn't gratuitously warty. End byte value. - -SIGH. Streaming for text/bytes too I SUPPOSE. Chunks, like CBOR - Literal small integers: could be nice? Not absolutely necessary. -Maybe reorder: fixed-length atoms first, then variable-length atoms, -then fixed-length compounds, then variable-length compounds? 
Reason -being that then maybe can put the streaming forms of the -variable-length ones very last. - ---