forked from syndicate-lang/preserves
Canonical Form for Binary Syntax
parent da2a6d44d1
commit fe8c766d1e
@@ -21,6 +21,7 @@ comparable to JSON, XML, S-expressions, CBOR, ASN.1 BER, and so on.
- [Preserves tutorial](TUTORIAL.html)
- [Preserves specification](preserves.html), including semantics,
  textual syntax, and compact binary syntax
- [Canonical Form for Binary Syntax](canonical-binary.html)

## Additional resources
@@ -0,0 +1,68 @@
---
title: "Canonical Form for Binary Syntax"
---

[spec]: preserves.html

When two `Value`s are written down in *canonical form*, comparing
their *syntax* for equivalence gives the same result as comparing them
*semantically* according to the equivalence defined in the
[Preserves specification][spec].[^equivalence-not-ordering]

[^equivalence-not-ordering]: However, canonical form does *not*
    induce a match between lexicographic ordering on syntax and
    semantic ordering [as specified][spec]. It *only* induces a
    connection between equivalences.

That is, canonical forms are equal if and only if the encoded `Value`s
are equal.

This document specifies canonical form for the Preserves compact
binary syntax.

**General rules.**
Streaming formats ("format C") MUST NOT be used.
Annotations MUST NOT be present.
Placeholders MUST NOT be used.
Where possible, fixed-length ("format A") MUST be used in preference
to variable-length ("format B") formats.

**Signed integers.**
When a `SignedInteger` *n* is greater than or equal to -3 and less
than 13 (i.e. -3≤*n*<13), it MUST be represented using the single-byte
encoding with initial nibble equal to 3.
Otherwise (i.e. when *n*<-3 or *n*≥13), it MUST be represented using
the multi-byte encoding with initial nibble equal to 4, and the
variable-length part must be as short as possible while remaining
unambiguous.[^signed-integer-examples]

[^signed-integer-examples]: The following examples from
    [the specification][spec] are all in canonical form:

        [[ -257 ]] = 42 FE FF    [[ -3 ]] = 3D        [[ 128 ]] = 42 00 80
        [[ -256 ]] = 42 FF 00    [[ -2 ]] = 3E        [[ 255 ]] = 42 00 FF
        [[ -255 ]] = 42 FF 01    [[ -1 ]] = 3F        [[ 256 ]] = 42 01 00
        [[ -254 ]] = 42 FF 02    [[ 0 ]] = 30         [[ 32767 ]] = 42 7F FF
        [[ -129 ]] = 42 FF 7F    [[ 1 ]] = 31         [[ 32768 ]] = 43 00 80 00
        [[ -128 ]] = 41 80       [[ 12 ]] = 3C        [[ 65535 ]] = 43 00 FF FF
        [[ -127 ]] = 41 81       [[ 13 ]] = 41 0D     [[ 65536 ]] = 43 01 00 00
        [[ -4 ]] = 41 FC         [[ 127 ]] = 41 7F    [[ 131072 ]] = 43 02 00 00

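As a rough illustration of the signed integer rules, the following Python sketch produces encodings that agree with the footnoted examples. The function name `encode_signed_integer` is hypothetical and does not come from any Preserves implementation. The sketch also assumes that the low nibble of the multi-byte form holds the byte count of a big-endian two's-complement body; that reading is consistent with the examples but should be checked against the specification. Bodies of 15 bytes or more are deliberately not handled.

```python
def encode_signed_integer(n: int) -> bytes:
    """Sketch of the canonical SignedInteger encoding (hypothetical helper)."""
    if -3 <= n < 13:
        # Single-byte form: high nibble 3, low nibble is n modulo 16.
        return bytes([0x30 | (n & 0x0F)])

    # Shortest two's-complement width that still leaves room for the sign bit.
    magnitude_bits = (n if n >= 0 else ~n).bit_length()
    length = magnitude_bits // 8 + 1

    if length >= 15:
        # Assumption: longer bodies need a different length encoding,
        # which this sketch does not attempt to reproduce.
        raise NotImplementedError("bodies of 15 or more bytes are out of scope")

    # Multi-byte form: high nibble 4, low nibble is the body length (assumed).
    return bytes([0x40 | length]) + n.to_bytes(length, "big", signed=True)


if __name__ == "__main__":
    for value in (-257, -128, -4, -3, 0, 12, 13, 127, 128, 32768, 131072):
        print(value, encode_signed_integer(value).hex(" ").upper())
```

Choosing the shortest body matters because a longer, sign-extended body would decode to the same number while being a different byte string, which would break the link between syntactic and semantic equality.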
**Sets.**
The elements of a `Set` MUST be serialized sorted in ascending order
following the total order relation defined in the
[Preserves specification][spec].

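The set rule can be sketched in Python as follows. All names here (`encode_small_int`, `canonical_set_elements`, `order_key`) are hypothetical. The example is limited to small integers and assumes that the specification's total order compares `SignedInteger`s numerically. It returns the encoded elements in order and does not attempt to reproduce the framing bytes of a `Set`.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def encode_small_int(n: int) -> bytes:
    """Single-byte SignedInteger form, valid only for -3 <= n < 13."""
    if not -3 <= n < 13:
        raise ValueError("this sketch only covers the single-byte range")
    return bytes([0x30 | (n & 0x0F)])


def canonical_set_elements(
    elements: Iterable[T],
    order_key: Callable[[T], object],
    encode: Callable[[T], bytes],
) -> List[bytes]:
    """Sort by the semantic order first, then encode each element."""
    return [encode(e) for e in sorted(elements, key=order_key)]


# Assumption: integers are ordered numerically by the specification.
ordered = canonical_set_elements({5, -1, 0}, order_key=lambda n: n,
                                 encode=encode_small_int)
print([chunk.hex().upper() for chunk in ordered])

# Sorting the encoded bytes instead would give a different sequence,
# because -1 encodes as 3F, which sorts after 30 (the encoding of 0).
print(sorted(ordered) != ordered)
```

Sorting on the values rather than on their encodings is the design point: canonical form does not make byte order agree with semantic order, so the comparison has to happen before serialization.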
**Dictionaries.**
The key-value pairs in a `Dictionary` MUST be serialized sorted in
ascending order by key, following the total order relation defined in
the [Preserves specification][spec].

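A comparable sketch for dictionaries, again in Python with hypothetical names (`canonical_dictionary_pairs`, `encode_small_int`, `order_key`), sorts the pairs by key only. It assumes small integer keys and values, ordered numerically, and returns encoded pairs without adding any `Dictionary` framing bytes.

```python
from typing import Callable, List, Mapping, Tuple, TypeVar

K = TypeVar("K")
V = TypeVar("V")


def encode_small_int(n: int) -> bytes:
    """Single-byte SignedInteger form, valid only for -3 <= n < 13."""
    if not -3 <= n < 13:
        raise ValueError("this sketch only covers the single-byte range")
    return bytes([0x30 | (n & 0x0F)])


def canonical_dictionary_pairs(
    mapping: Mapping[K, V],
    order_key: Callable[[K], object],
    encode_key: Callable[[K], bytes],
    encode_value: Callable[[V], bytes],
) -> List[Tuple[bytes, bytes]]:
    """Order pairs by key under the semantic order; values play no part."""
    items = sorted(mapping.items(), key=lambda pair: order_key(pair[0]))
    return [(encode_key(k), encode_value(v)) for k, v in items]


# Assumption: integer keys are ordered numerically by the specification.
pairs = canonical_dictionary_pairs(
    {2: 7, -3: 0, 0: 12},
    order_key=lambda k: k,
    encode_key=encode_small_int,
    encode_value=encode_small_int,
)
print([(k.hex().upper(), v.hex().upper()) for k, v in pairs])
```

Because keys in a dictionary are distinct, ordering by key alone is enough to fix a single serialization order.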
**Other kinds of `Value`.**
There are no special canonicalization restrictions on `String`s,
`ByteString`s, `Symbol`s, `Boolean`s, `Float`s, `Double`s, `Record`s,
or `Sequence`s.

<!-- Heading to visually offset the footnotes from the main document: -->
## Notes