Permit overlong SignedInteger encodings

Tony Garnock-Jones 2022-06-19 17:20:25 +02:00
parent 4c53dadc41
commit 0dd2a8d622
3 changed files with 24 additions and 14 deletions

@@ -28,6 +28,10 @@ lengths](./preserves-binary.html#varint) *MUST* appear in the unique shortest
 encoding for a given length. That is, canonical varint-encodings *MUST
 NOT* start with `0`.
 
+**SignedIntegers.** Each `SignedInteger` *MUST* be serialized using its
+shortest possible encoding. That is, the encoding *MUST NOT* have `A3
+FF FF` or `A3 00 00` as prefixes, and *MUST NOT* be `A3 00`.
+
 **Sets.**
 The elements of a `Set` *MUST* be serialized sorted in ascending order
 by comparing their canonical encoded binary representations.
@@ -42,7 +46,7 @@ representations of their keys.[^no-need-for-by-value]
 
 **Other kinds of `Value`.**
 There are no special canonicalization restrictions on
-`SignedInteger`s, `String`s, `ByteString`s, `Symbol`s, `Boolean`s,
+`String`s, `ByteString`s, `Symbol`s, `Boolean`s,
 `Float`s, `Double`s, `Record`s, `Sequence`s, or `Embedded`s. The
 constraints given for these `Value`s in the [specification][spec]
 suffice to ensure canonicity.

@@ -39,10 +39,10 @@ which could be a file, an HTTP message, a UDP packet, etc.
 The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
 8-byte IEEE 754 binary representations of `F` and `D`, respectively.
 
-The function `intbytes(x)` gives the big-endian two's-complement signed
-binary representation of `x`, taking exactly as many whole bytes as
+The function `intbytes(x)` is a big-endian two's-complement signed
+binary representation of `x`, taking at least as many whole bytes as
 needed to unambiguously identify the value and its sign. `intbytes(0)`
-is the empty byte sequence.
+may be the empty byte sequence.
 
 When reading, the length of the input is supplied externally. This means
 that, when reading a length/value pair in a `seq()`, each length should

@@ -117,17 +117,23 @@ to stop expecting more contained `Repr`s.
 
 «x» when x ∈ SignedInteger = [0xA3] ++ intbytes(x)
 
-The function `intbytes(x)` gives the big-endian two's-complement binary
-representation of `x`, taking exactly as many whole bytes as needed to
-unambiguously identify the value and its sign. As a special case,
-`intbytes(0)` is the empty byte sequence. The most-significant bit in
-the first byte in `intbytes(x)` (for `x`≠0) is the sign
-bit.[^zero-intbytes] Every `SignedInteger` *MUST* be represented with
-its shortest possible encoding.
+The function `intbytes(x)` gives a big-endian two's-complement binary
+representation of `x`, taking at least as many whole bytes as needed to
+unambiguously identify the value and its sign; `intbytes(0)` may be the
+empty byte sequence.[^zero-intbytes] The most-significant bit in the
+first byte in `intbytes(x)` is the sign bit. While every `SignedInteger`
+*SHOULD* be represented with its shortest possible encoding (which will
+often include a necessary leading `0xFF` or `0x00`), redundant leading
+`0xFF` or `0x00` bytes *MAY* be used.[^overlong-signedinteger]
 
-[^zero-intbytes]: The value 0 needs zero bytes to identify the
-value, so `intbytes(0)` is the empty byte string. Non-zero values
-need at least one byte.
+[^zero-intbytes]: The value 0 needs zero bytes to identify the value,
+so `intbytes(0)` can be the empty byte string. Non-zero values need
+at least one byte.
+
+[^overlong-signedinteger]: **Implementation note.** The spec permits
+overlong `SignedInteger` encodings to allow e.g. construction of
+`Repr`s by filling in partially-completed templates, which can be
+useful in resource-constrained situations.
 
 ### Strings, ByteStrings and Symbols.
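
To make the relaxed `intbytes(x)` definition concrete, the following is a minimal Python sketch, not code from the Preserves repository. It produces the shortest two's-complement form by default and accepts a hypothetical `min_length` parameter that pads with redundant sign bytes, which is one way the template-filling use case from the new footnote could work. The names `min_length`, `decode_intbytes` and `encode_signed_integer` are assumptions made for this illustration.

```python
def intbytes(x: int, min_length: int = 0) -> bytes:
    """Big-endian two's-complement bytes of x, at least `min_length` long."""
    # ~x maps a negative value onto the non-negative value with the same
    # number of significant bits; one extra bit is needed for the sign.
    magnitude = x if x >= 0 else ~x
    shortest = 0 if x == 0 else magnitude.bit_length() // 8 + 1
    length = max(shortest, min_length)
    if length == 0:
        return b""  # zero may be the empty byte sequence
    # Widening a signed value pads with 0x00 (non-negative) or 0xFF (negative).
    return x.to_bytes(length, "big", signed=True)


def decode_intbytes(data: bytes) -> int:
    """Decode shortest and overlong forms alike; empty input decodes to 0."""
    return int.from_bytes(data, "big", signed=True)


def encode_signed_integer(x: int, min_length: int = 0) -> bytes:
    """Tag byte 0xA3 followed by intbytes(x), as in the rule for «x»."""
    return bytes([0xA3]) + intbytes(x, min_length)
```

A fixed-width slot in a template could then be filled with `intbytes(value, 4)`, and a reader using `decode_intbytes` recovers the same value regardless of the padding.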