Canonical Form for Binary Syntax

When two Values are written down in canonical form, comparing their encodings for syntactic equality gives the same result as comparing the Values semantically, according to the equivalence defined in the Preserves specification (see note 1).

That is, canonical forms are equal if and only if the encoded Values are equal.

This document specifies canonical form for the Preserves machine-oriented binary syntax.

Annotations. Annotations MUST NOT be present.

Length representations. Varint-encoded lengths MUST appear in the unique shortest encoding for a given length. That is, canonical varint-encodings MUST NOT start with 0.

SignedIntegers. Each SignedInteger MUST be serialized using its shortest possible encoding. That is, the encoding MUST NOT have A3 FF FF or A3 00 00 as prefixes, and MUST NOT be A3 00.
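As a rough illustration of the "shortest possible encoding" rule, the sketch below computes a minimal integer payload and checks whether a given payload is minimal. It rests on assumptions that this document does not itself state: that the integer payload is a big-endian two's-complement byte string, and that zero is represented by an empty payload. The tag byte and any length framing are deliberately left out, and the names `minimal_int_payload` and `is_minimal_int_payload` are hypothetical.

```python
def minimal_int_payload(n: int) -> bytes:
    """Shortest big-endian two's-complement bytes for n.

    Assumption (not taken from this document): zero has an empty payload.
    """
    if n == 0:
        return b""
    # Magnitude bits plus one sign bit, rounded up to whole bytes.
    magnitude = n if n >= 0 else ~n
    bits = magnitude.bit_length() + 1
    length = (bits + 7) // 8
    return n.to_bytes(length, "big", signed=True)


def is_minimal_int_payload(payload: bytes) -> bool:
    """True if payload is already the shortest encoding of its value."""
    value = int.from_bytes(payload, "big", signed=True)
    return payload == minimal_int_payload(value)


if __name__ == "__main__":
    for candidate in (b"", b"\x00", b"\x00\x00\x01", b"\xff\xff", b"\x00\x80", b"\xff\x7f"):
        print(candidate.hex() or "(empty)", is_minimal_int_payload(candidate))
```

Under these assumptions, a payload with a redundant leading `00` or `FF` byte is rejected, because decoding and re-encoding it produces a shorter byte string.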

Sets. The elements of a Set MUST be serialized sorted in ascending order by comparing their canonical encoded binary representations.

Dictionaries. The key-value pairs in a Dictionary MUST be serialized sorted in ascending order by comparing the canonical encoded binary representations of their keys (see note 2).
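The ordering rules for Sets and Dictionaries can be sketched as follows. This is a minimal illustration, assuming that "ascending order" means unsigned bytewise lexicographic comparison, with a proper prefix sorting before any longer string that extends it, and that every element, key, and value has already been encoded canonically. The function names and the placeholder byte strings are hypothetical and are not real Preserves encodings.

```python
from typing import Iterable, List, Tuple


def canonical_set_order(encoded_elements: Iterable[bytes]) -> List[bytes]:
    """Sort already-encoded Set elements by their encoded bytes."""
    # Python compares bytes objects lexicographically by unsigned byte value.
    return sorted(encoded_elements)


def canonical_dictionary_order(
    encoded_pairs: Iterable[Tuple[bytes, bytes]],
) -> List[Tuple[bytes, bytes]]:
    """Sort already-encoded (key, value) pairs by the encoded key only."""
    # Keys are unique within a Dictionary, so the value never breaks a tie.
    return sorted(encoded_pairs, key=lambda pair: pair[0])


if __name__ == "__main__":
    # Placeholder byte strings standing in for canonical encodings.
    print(canonical_set_order([b"\x02", b"\x01\xff", b"\x01"]))
    print(canonical_dictionary_order([(b"\x02", b"\xaa"), (b"\x01", b"\xbb")]))
```

Because the sort key is the encoded form rather than the decoded Value, an encoder following this sketch would serialize each element or key first and order the results afterwards.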

Other kinds of Value. There are no special canonicalization restrictions on Strings, ByteStrings, Symbols, Booleans, Floats, Doubles, Records, Sequences, or Embeddeds. The constraints given for these Values in the specification suffice to ensure canonicity.

Notes


  1. However, canonical form does not induce a match between lexicographic ordering on syntax and semantic ordering as specified. It only induces a connection between equivalences.

  2. There is no need to order by (key, value) pair, since a Dictionary has no duplicate keys.