Further small improvements

Tony Garnock-Jones 2023-10-17 01:34:21 +02:00
parent 9f0217b22a
commit 71ead5abb7
2 changed files with 19 additions and 23 deletions


@@ -31,9 +31,10 @@ class="postcard-grammar binarysyntax">*V*</span>.
| | <span class="outputish">(*n* &amp; 127) &#124; 128</span> **varint**(*n* &gt;&gt; 7) | if *n* ≥ 128
{:.postcard-grammar.binarysyntax}
**intbytes**(*n*) | = | <span class="roman">(the empty sequence)</span> | if *n* = 0
| | **signedBigEndian**(*n*) | otherwise
**signedBigEndian**(*n*) | = | <span class="outputish">*n* &amp; 255</span> | if -128 ≤ *n* ≤ 127
**intbytes**(*n*) | = | <span class="roman">the empty sequence if</span> *n* = 0<span class="roman">, otherwise</span> **signedBigEndian**(*n*)
{:.postcard-grammar.binarysyntax}
**signedBigEndian**(*n*) | = | <span class="outputish">*n* &amp; 255</span> | if -128 ≤ *n* ≤ 127
| | **signedBigEndian**(*n* &gt;&gt; 8) <span class="outputish">*n* &amp; 255</span> | otherwise
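
As an illustration only, the grammar rules above can be transcribed almost literally into Python. This is a minimal sketch, not part of the specification: the function names are chosen here for readability, and the base case of `varint` (emit *n* as a single byte when *n* < 128) is an assumption, since only the *n* ≥ 128 rule is visible in this hunk.

```python
def varint(n: int) -> bytes:
    """Low 7 bits first; bit 7 marks that more bytes follow."""
    if n < 0:
        raise ValueError("varint is only defined for non-negative integers")
    if n < 128:
        # Assumed base case: a single byte with the continuation bit clear.
        return bytes([n])
    return bytes([(n & 127) | 128]) + varint(n >> 7)


def signed_big_endian(n: int) -> bytes:
    """Big-endian two's complement, using as few whole bytes as possible."""
    if -128 <= n <= 127:
        return bytes([n & 255])
    # Python's >> on int is an arithmetic shift, so the sign is preserved.
    return signed_big_endian(n >> 8) + bytes([n & 255])


def intbytes(n: int) -> bytes:
    """Empty sequence for zero, otherwise signed_big_endian(n)."""
    return b"" if n == 0 else signed_big_endian(n)
```

Under these definitions, `intbytes(128)` needs a leading zero byte to keep the sign bit clear, while `intbytes(-128)` fits in the single byte `0x80`.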
The functions <span class="postcard-grammar binarysyntax">**binary32**(*F*)</span> and <span


@@ -92,17 +92,23 @@ serializing in some other implementation-defined order.
but encoding and then sorting byte strings is much more likely to
be within easy reach.
### SignedIntegers.
### SignedIntegers, Strings, ByteStrings and Symbols.
«x» = [0xB0] ++ varint(|intbytes(x)|) ++ intbytes(x) if x ∈ SignedInteger
«S» = [0xB0] ++ varint(|intbytes(S)|) ++ intbytes(S) if S ∈ SignedInteger
[0xB1] ++ varint(|utf8(S)|) ++ utf8(S) if S ∈ String
[0xB2] ++ varint(|S|) ++ S if S ∈ ByteString
[0xB3] ++ varint(|utf8(S)|) ++ utf8(S) if S ∈ Symbol
The function `intbytes(x)` gives the big-endian two's-complement
binary representation of `x`, taking exactly as many whole bytes as
needed to unambiguously identify the value and its sign. `intbytes(0)`
is special-cased to be the empty byte
sequence;[^empty-intbytes-sequence] otherwise, the most-significant
bit of the first byte is the sign bit. See the [examples in the
appendix below](#signedinteger-examples).
For `String` and `Symbol`, the data following the tag and length is a
UTF-8 encoding of the `Value`. For `ByteString`, it is the raw data
contained within the `Value` unmodified. For `SignedInteger`, it is
the big-endian two's-complement binary representation of the number,
taking exactly as many whole bytes as needed to unambiguously identify
the value and its sign. `intbytes(0)` is special-cased to be the empty
byte sequence;[^empty-intbytes-sequence] otherwise, the
most-significant bit of the first byte is the sign bit. (See
[SignedInteger examples](#signedinteger-examples) in the appendix
below.)
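
The tag, length, and payload layout described here could be sketched in Python as follows. This is a hedged illustration rather than a reference implementation: `Symbol` is a hypothetical wrapper class introduced only to tell symbols apart from strings, `encode_atom` is a name invented for this example, and `varint` assumes that values below 128 are emitted as a single byte.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """Hypothetical stand-in for a Symbol value; not defined by the text."""
    name: str


def varint(n: int) -> bytes:
    """Base-128 length prefix: low 7 bits first, bit 7 set while more follow."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while n >= 128:
        out.append((n & 127) | 128)
        n >>= 7
    out.append(n)
    return bytes(out)


def intbytes(x: int) -> bytes:
    """Minimal big-endian two's complement; zero is the empty sequence."""
    if x == 0:
        return b""
    # Bits needed for the magnitude, plus one sign bit, rounded up to bytes.
    magnitude_bits = (x if x >= 0 else ~x).bit_length()
    return x.to_bytes(magnitude_bits // 8 + 1, "big", signed=True)


def encode_atom(v) -> bytes:
    """Encode a SignedInteger, String, ByteString or Symbol as tag ++ varint(len) ++ data."""
    if isinstance(v, bool):
        raise TypeError("Booleans use a different encoding")
    if isinstance(v, int):
        tag, body = 0xB0, intbytes(v)
    elif isinstance(v, str):
        tag, body = 0xB1, v.encode("utf-8")
    elif isinstance(v, bytes):
        tag, body = 0xB2, v
    elif isinstance(v, Symbol):
        tag, body = 0xB3, v.name.encode("utf-8")
    else:
        raise TypeError(f"unsupported value: {v!r}")
    return bytes([tag]) + varint(len(body)) + body
```

For instance, `encode_atom(0)` yields the two bytes `B0 00` because the payload is empty, and `encode_atom("é")` carries a length of 2 because the length counts UTF-8 bytes, not characters.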
[^empty-intbytes-sequence]: Without the special case of
`intbytes(0)` yielding the empty byte sequence, no input to
@@ -110,17 +116,6 @@ appendix below](#signedinteger-examples).
`0` or `-1` could be special-cased to the empty sequence; here, we
arbitrarily choose `0`.
### Strings, ByteStrings and Symbols.
«S» = [0xB1] ++ varint(|utf8(S)|) ++ utf8(S) if S ∈ String
[0xB2] ++ varint(|S|) ++ S if S ∈ ByteString
[0xB3] ++ varint(|utf8(S)|) ++ utf8(S) if S ∈ Symbol
Syntax for these three types varies only in the tag used. For `String`
and `Symbol`, the data following the tag is a UTF-8 encoding of the
`Value`, while for `ByteString` it is the raw data contained within the
`Value` unmodified.
### Booleans.
«#f» = [0x80]