Remove single-precision floats from the specs

2024-01-27 11:34:51 +01:00
parent d579a0d607
commit dc1b0ac54d
20 changed files with 49 additions and 78 deletions
--- a/TUTORIAL.md
+++ b/TUTORIAL.md
@@ -105,7 +105,7 @@ A few more interesting differences:
     {"dictionaries": "as keys???"}: "well, why not?"}
 ```

-Preserves technically provides a few types of numbers:
+Preserves technically provides various types of numbers:

 ```
    # Signed Integers
@@ -114,9 +114,6 @@ Preserves technically provides a few types of numbers:
    5907212309572059846509324862304968273468909473609826340
    -5907212309572059846509324862304968273468909473609826340

-    # Floats (Single-precision IEEE floats) (notice the trailing f)
-    3.1415927f
-
    # Doubles (Double-precision IEEE floats)
    3.141592653589793
 ```
--- a/_includes/cheatsheet-binary-plaintext.md
+++ b/_includes/cheatsheet-binary-plaintext.md
@@ -7,7 +7,6 @@ For a value `V`, we write `«V»` for the binary encoding of `V`.
                    «@W V» = [0x85] ++ «W» ++ «V»
                     «#!V» = [0x86] ++ «V»

-  «V» if V ∈ Float         = [0x87, 0x04] ++ binary32(V)
  «V» if V ∈ Double        = [0x87, 0x08] ++ binary64(V)

  «V» if V ∈ SignedInteger = [0xB0] ++ varint(|intbytes(V)|) ++ intbytes(V)
@@ -29,5 +28,4 @@ For a value `V`, we write `«V»` for the binary encoding of `V`.
                             signedBigEndian(n >> 8) ++ [n & 255] otherwise
 ```

-The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
-8-byte IEEE 754 binary representations of `F` and `D`, respectively.
+The function `binary64(D)` yields the big-endian 8-byte IEEE 754 binary representation of `D`.
--- a/_includes/cheatsheet-binary.md
+++ b/_includes/cheatsheet-binary.md
@@ -11,7 +11,6 @@ class="postcard-grammar binarysyntax">*V*</span>.
 «`#!`*V*» | = | `86` «*V*»

 {:.postcard-grammar.binarysyntax}
-«*V*» | = | `87``04` **binary32**(*V*) | if *V* ∈ Float
 «*V*» | = | `87``08` **binary64**(*V*) | if *V* ∈ Double

 {:.postcard-grammar.binarysyntax}
@@ -37,10 +36,9 @@ class="postcard-grammar binarysyntax">*V*</span>.
 **signedBigEndian**(*n*) | = | <span class="outputish">*n* &amp; 255</span> | if −128 ≤ *n* ≤ 127
 | | **signedBigEndian**(*n* &gt;&gt; 8) <span class="outputish">*n* &amp; 255</span> | otherwise

-The functions <span class="postcard-grammar binarysyntax">**binary32**(*F*)</span> and <span
-class="postcard-grammar binarysyntax">**binary64**(*D*)</span> yield big-endian 4- and 8-byte
-IEEE 754 binary representations of <span class="postcard-grammar binarysyntax">*F*</span> and
-<span class="postcard-grammar binarysyntax">*D*</span>, respectively.
+The function <span class="postcard-grammar binarysyntax">**binary64**(*D*)</span> yields the
+big-endian 8-byte IEEE 754 binary representation of <span class="postcard-grammar
+binarysyntax">*D*</span>.

 <!--
 Together, <span class="postcard-grammar binarysyntax">**div**</span> and <span
--- a/_includes/cheatsheet-text-plaintext.md
+++ b/_includes/cheatsheet-text-plaintext.md
@@ -21,8 +21,7 @@ ByteString    :=  `#"` binchar* `"`
 String        :=  `"` («any unicode scalar except `\` or `"`» | escaped | `\"`)* `"`
 QuotedSymbol  :=  `|` («any unicode scalar except `\` or `|`» | escaped | `\|`)* `|`
 Symbol        :=  (`A`..`Z` | `a`..`z` | `0`..`9` | sympunct | symuchar)+
-Number        :=  Float | Double | SignedInteger
-Float         :=  flt (`f`|`F`) | `#xf"` (ws hex hex)4 ws `"`
+Number        :=  Double | SignedInteger
 Double        :=  flt | `#xd"` (ws hex hex)8 ws `"`
 SignedInteger :=  int

--- a/_includes/cheatsheet-text.md
+++ b/_includes/cheatsheet-text.md
@@ -22,8 +22,7 @@
 | *String* | := | `"` (« any unicode scalar value except `\` or `"` » &#124; *escaped* &#124;`\"`)<sup>⋆</sup> `"` |
 | *QuotedSymbol* | := | `|` (« any unicode scalar value except `\` or `|` » &#124; *escaped* &#124;`\|`)<sup>⋆</sup> `|` |
 | *Symbol* | := | (`A`..`Z`&#124;`a`..`z`&#124;`0`..`9`&#124; *sympunct* &#124; *symuchar*)<sup>+</sup> |
-| *Number* | := | *Float* &#124; *Double* &#124; *SignedInteger* |
-| *Float* | := | *flt* (`f`&#124;`F`) &#124;`#xf"` (**ws** *hex* *hex*)<sup>4</sup> **ws**`"` |
+| *Number* | := | *Double* &#124; *SignedInteger* |
 | *Double* | := | *flt* &#124;`#xd"` (**ws** *hex* *hex*)<sup>8</sup> **ws**`"` |
 | *SignedInteger* | := | *int* |

--- a/_includes/python-representation.md
+++ b/_includes/python-representation.md
@@ -1,5 +1,5 @@
 Python's strings, byte strings, integers, booleans, and double-precision floats stand directly
-for their Preserves counterparts. Wrapper objects for [Float][preserves.values.Float] and
+for their Preserves counterparts. Wrapper objects for
 [Symbol][preserves.values.Symbol] complete the suite of atomic types.

 Python's lists and tuples correspond to Preserves `Sequence`s, and dicts and sets to
--- a/_includes/text-examples.md
+++ b/_includes/text-examples.md
@@ -2,7 +2,6 @@ Here are a few example values, written using the [text
 syntax](https://preserves.dev/preserves-text.html):

    Boolean    : #t #f
-    Float      : 1.0f 10.4e3f -100.6f
    Double     : 1.0 10.4e3 -100.6
    Integer    : 1 0 -100
    String     : "Hello, world!\n"
--- a/_includes/value-grammar.md
+++ b/_includes/value-grammar.md
@@ -4,7 +4,6 @@
                            | Embedded

                       Atom = Boolean
-                            | Float
                            | Double
                            | SignedInteger
                            | String
--- a/canonical-binary.md
+++ b/canonical-binary.md
@@ -38,7 +38,7 @@ representations of their keys.[^no-need-for-by-value]
 **Other kinds of `Value`.**
 There are no special canonicalization restrictions on
 `SignedInteger`s, `String`s, `ByteString`s, `Symbol`s, `Boolean`s,
-`Float`s, `Double`s, `Record`s, `Sequence`s, or `Embedded`s. The
+`Double`s, `Record`s, `Sequence`s, or `Embedded`s. The
 constraints given for these `Value`s in the [specification][spec]
 suffice to ensure canonicity.

--- a/conventions.md
+++ b/conventions.md
@@ -23,10 +23,10 @@ Appropriately-labelled `Record`s denote these domain-specific data
 types.[^why-dictionaries]

  [^why-dictionaries]: Given `Record`'s existence, it may seem odd
-    that `Dictionary`, `Set`, `Float`, etc. are given special
+    that `Dictionary`, `Set`, `Double`, etc. are given special
    treatment. Preserves aims to offer a useful basic equivalence
    predicate to programmers, and so if a data type demands a special
-    equivalence predicate, as `Dictionary`, `Set` and `Float` all do,
+    equivalence predicate, as `Dictionary`, `Set` and `Double` all do,
    then the type should be included in the base language. Otherwise,
    it can be represented as a `Record` and treated separately.
    `Boolean`, `String` and `Symbol` are seeming exceptions. The first
--- a/implementations/rust/preserves/doc/cheatsheet-binary-plaintext.md
+++ b/implementations/rust/preserves/doc/cheatsheet-binary-plaintext.md
@@ -7,7 +7,6 @@ For a value `V`, we write `«V»` for the binary encoding of `V`.
                    «@W V» = [0x85] ++ «W» ++ «V»
                     «#!V» = [0x86] ++ «V»

-  «V» if V ∈ Float         = [0x87, 0x04] ++ binary32(V)
  «V» if V ∈ Double        = [0x87, 0x08] ++ binary64(V)

  «V» if V ∈ SignedInteger = [0xB0] ++ varint(|intbytes(V)|) ++ intbytes(V)
@@ -29,5 +28,4 @@ For a value `V`, we write `«V»` for the binary encoding of `V`.
                             signedBigEndian(n >> 8) ++ [n & 255] otherwise
 ```

-The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
-8-byte IEEE 754 binary representations of `F` and `D`, respectively.
+The function `binary64(D)` yields the big-endian 8-byte IEEE 754 binary representation of `D`.
--- a/implementations/rust/preserves/doc/cheatsheet-text-plaintext.md
+++ b/implementations/rust/preserves/doc/cheatsheet-text-plaintext.md
@@ -21,8 +21,7 @@ ByteString    :=  `#"` binchar* `"`
 String        :=  `"` («any unicode scalar except `\` or `"`» | escaped | `\"`)* `"`
 QuotedSymbol  :=  `|` («any unicode scalar except `\` or `|`» | escaped | `\|`)* `|`
 Symbol        :=  (`A`..`Z` | `a`..`z` | `0`..`9` | sympunct | symuchar)+
-Number        :=  Float | Double | SignedInteger
-Float         :=  flt (`f`|`F`) | `#xf"` (ws hex hex)4 ws `"`
+Number        :=  Double | SignedInteger
 Double        :=  flt | `#xd"` (ws hex hex)8 ws `"`
 SignedInteger :=  int

--- a/implementations/rust/preserves/doc/value-grammar.md
+++ b/implementations/rust/preserves/doc/value-grammar.md
@@ -4,7 +4,6 @@
                            | Embedded

                       Atom = Boolean
-                            | Float
                            | Double
                            | SignedInteger
                            | String
--- a/preserves-binary.md
+++ b/preserves-binary.md
@@ -28,7 +28,7 @@ represented. Depending on the tag, a length indicator, further encoded
 information, and/or an ending tag may follow.

    tag                          (simple atomic data)
-    tag ++ length ++ binarydata  (floats, doubles, integers, strings, symbols, and binary)
+    tag ++ length ++ binarydata  (doubles, integers, strings, symbols, and binary)
    tag ++ repr ++ ... ++ endtag (compound data)

 The unique end tag is byte value `0x84`.
@@ -121,13 +121,12 @@ below.)
    «#f» = [0x80]
    «#t» = [0x81]

-### Floats and Doubles.
+### Doubles.

-    «F» = [0x87, 0x04] ++ binary32(F)  if F ∈ Float
    «D» = [0x87, 0x08] ++ binary64(D)  if D ∈ Double

-The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
-8-byte IEEE 754 binary representations of `F` and `D`, respectively.
+The function `binary64(D)` yields the big-endian 8-byte IEEE 754 binary
+representation of `D`.

 ### Embeddeds.

@@ -196,7 +195,7 @@ a binary-syntax document; otherwise, it should be interpreted as text.
     84 - End marker
     85 - Annotation
     86 - Embedded
-     87 - Float and Double
+     87 - Double

     B0 - Integer
     B1 - String
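
Reviewer sketch for the `preserves-binary.md` hunks above: with `Float` removed, tag `0x87` is only ever followed by the length byte `0x08` and eight big-endian IEEE 754 bytes. The helper names `encode_double` and `decode_double` are hypothetical and are not taken from any Preserves implementation; this is a minimal illustration using Python's standard `struct` module.

```python
import struct


def encode_double(value: float) -> bytes:
    """Hypothetical helper: tag 0x87, length 0x08, then binary64(value) big-endian."""
    return bytes([0x87, 0x08]) + struct.pack(">d", value)


def decode_double(data: bytes) -> float:
    """Hypothetical helper: inverse of encode_double for a complete 10-byte encoding."""
    if len(data) != 10 or data[0] != 0x87 or data[1] != 0x08:
        raise ValueError("not a Preserves Double encoding")
    return struct.unpack(">d", data[2:])[0]


if __name__ == "__main__":
    # Matches the `1.0` row of the examples table in preserves.md.
    print(encode_double(1.0).hex(" "))
```

A decoder written this way would reject the old `87 04 ...` single-precision form outright; whether implementations should instead accept and widen it is not something this commit states.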
--- a/preserves-path.md
+++ b/preserves-path.md
@@ -85,15 +85,14 @@ Filters: narrow down a selection without moving

        ^ literal         # Matches a record having a the literal as its label -- equivalent to [.^ = literal]

-        ~real             # Promotes int and float to double, passes on double unchanged, rejects others
+        ~real             # Promotes int to double, passes on double unchanged, rejects others
                          # Out-of-range ints (too big or too small) become various double infinities
                          # Converting high-magnitude ints causes loss of precision

-        ~int              # Converts float and double to closest integer, where possible
+        ~int              # Converts double to closest integer, where possible
                          # NaN and infinities are rejected

        bool              # Type filters
-        float
        double
        int
        string
@@ -116,9 +115,9 @@ engines](https://www.regular-expressions.info/engine.html)"; (2) it
 should be very widely implemented; (3) it should cover regular
 languages and no more; (4) it should be easy to implement.

-Design choice: How should comparison work? Should `lt 1.0f` accept not only `0.9f` but also
-`#t` and `#f` (since `Boolean` comes before `Float` in the Preserves total ordering)? Should
-`lt 1.0f` accept `0.9` and `0` as well as `0.9f`?
+Design choice: How should comparison work? Should `lt 1.0` accept not only `0.9` but also
+`#t` and `#f` (since `Boolean` comes before `Double` in the Preserves total ordering)? Should
+`lt 1.0` accept `0` as well as `0.0`?

 ## Functions
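
Reviewer sketch for the `~real` and `~int` filter comments in the `preserves-path.md` hunk above, as they read after this commit. The function names `to_real` and `to_int` are hypothetical, `None` stands in for "rejected", and two details are assumptions not stated in the hunk: `~int` passes integers through unchanged, and ties round to even (Python's `round`).

```python
import math


def to_real(value):
    """Sketch of `~real`: ints are promoted to double, doubles pass, others are rejected."""
    if isinstance(value, bool):  # bool is a subclass of int in Python; treat as non-numeric
        return None
    if isinstance(value, int):
        try:
            return float(value)  # may lose precision for high-magnitude ints
        except OverflowError:
            # Out-of-range ints become an infinity of the matching sign.
            return math.inf if value > 0 else -math.inf
    if isinstance(value, float):
        return value
    return None


def to_int(value):
    """Sketch of `~int`: finite doubles go to the closest integer; NaN and infinities are rejected."""
    if isinstance(value, bool):
        return None
    if isinstance(value, int):
        return value  # assumption: integers pass through unchanged
    if isinstance(value, float) and math.isfinite(value):
        return round(value)  # assumption: ties round to even
    return None
```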

--- a/preserves-schema.md
+++ b/preserves-schema.md
@@ -324,7 +324,7 @@ The `any` pattern matches any input `Value`:

 Specifying the name of a kind of `Atom` matches that kind of atom:

-    AtomKindPattern = "bool" / "float" / "double" / "int" / "string" / "bytes" / "symbol"
+    AtomKindPattern = "bool" / "double" / "int" / "string" / "bytes" / "symbol"

 Embedded input `Value`s are matched with embedded patterns. The
 portion under the `#!` prefix is the *interface* schema for the
@@ -625,7 +625,7 @@ Simple patterns are as described above:
      # any
      / =any

-      # special builtins: bool, float, double, int, string, bytes, symbol
+      # special builtins: bool, double, int, string, bytes, symbol
      / <atom @atomKind AtomKind>

      # matches an embedded value in the input: #!p
@@ -648,7 +648,6 @@ Simple patterns are as described above:
    .

    AtomKind = =Boolean
-             / =Float
             / =Double
             / =SignedInteger
             / =String
@@ -755,7 +754,6 @@ metaschema.

        AtomKind: <or [
          ["Boolean", <lit Boolean>],
-          ["Float", <lit Float>],
          ["Double", <lit Double>],
          ["SignedInteger", <lit SignedInteger>],
          ["String", <lit String>],
@@ -878,7 +876,6 @@ definitions for the metaschema.

    export type AtomKind = (
        {"_variant": "Boolean"} |
-        {"_variant": "Float"} |
        {"_variant": "Double"} |
        {"_variant": "SignedInteger"} |
        {"_variant": "String"} |
@@ -911,7 +908,6 @@ definitions for the metaschema.
    (struct AtomKind-String () #:prefab)
    (struct AtomKind-SignedInteger () #:prefab)
    (struct AtomKind-Double () #:prefab)
-    (struct AtomKind-Float () #:prefab)
    (struct AtomKind-Boolean () #:prefab)

    (struct Bundle (modules) #:prefab)
--- a/preserves-text.md
+++ b/preserves-text.md
@@ -180,8 +180,8 @@ including embedded escape syntax, except using a bar or pipe character

 Alternatively, a `Symbol` may be written in a “bare” form[^cf-sexp-token].
 The grammar for numeric data is a subset of the grammar for bare `Symbol`s,
-so if a `SymbolOrNumber` also matches the grammar for `Float`, `Double` or
-`SignedInteger`, then it must be interpreted as one of those, and otherwise
+so if a `SymbolOrNumber` also matches the grammar for `Double` or
+`SignedInteger` then it must be interpreted as one of those, and otherwise
 it must be interpreted as a bare `Symbol`.

    SymbolOrNumber = 1*(ALPHA / DIGIT / sympunct / symuchar)
@@ -197,14 +197,12 @@ it must be interpreted as a bare `Symbol`.

 Numeric data follow the [JSON
 grammar](https://tools.ietf.org/html/rfc8259#section-6) except that leading
-zeros are permitted and an optional leading `+` sign is allowed. The
-addition of a trailing “f” distinguishes a `Float` from a `Double` value.
-`Float`s and `Double`s always have either a fractional part or an exponent
+zeros are permitted and an optional leading `+` sign is allowed.
+`Double`s always have either a fractional part or an exponent
 part, where `SignedInteger`s never have
 either.[^reading-and-writing-floats-accurately]
 [^arbitrary-precision-signedinteger]

-             Float = flt %i"f"
            Double = flt
     SignedInteger = int

@@ -244,14 +242,13 @@ either.[^reading-and-writing-floats-accurately]
    values for equality or ordering will not yield results that match
    the expected semantics of the data model.

-Some valid IEEE 754 `Float`s and `Double`s are not covered by the grammar
+Some valid IEEE 754 `Double`s are not covered by the grammar
 above, namely, the several million NaNs and the two infinities. These are
 represented as raw hexadecimal strings similar to hexadecimal
 `ByteString`s. Implementations are free to use hexadecimal floating-point
 syntax whereever convenient, even for values representable using the
 grammar above.[^rationale-no-general-machine-syntax]

-            Float =/ "#xf" DQUOTE 4(ws 2HEXDIG) ws DQUOTE
           Double =/ "#xd" DQUOTE 8(ws 2HEXDIG) ws DQUOTE

  [^rationale-no-general-machine-syntax]: **Rationale.** Previous versions
@@ -332,12 +329,11 @@ syntax.
 ## Appendix. Regular expressions for bare symbols and numbers

 When parsing, if a token matches both `SymbolOrNumber` and `Number`, it's a
-number; use `Float`, `Double` and `SignedInteger` to disambiguate. If it
+number; use `Double` and `SignedInteger` to disambiguate. If it
 matches `SymbolOrNumber` but not `Number`, it's a "bare" `Symbol`.

    SymbolOrNumber: ^[-a-zA-Z0-9~!$%^&*?_=+/.]+$
-            Number: ^([-+]?\d+)(((\.\d+([eE][-+]?\d+)?)|([eE][-+]?\d+))([fF]?))?$
-             Float: ^([-+]?\d+)(((\.\d+([eE][-+]?\d+)?)|([eE][-+]?\d+))[fF])$
+            Number: ^([-+]?\d+)(((\.\d+([eE][-+]?\d+)?)|([eE][-+]?\d+)))?$
            Double: ^([-+]?\d+)(((\.\d+([eE][-+]?\d+)?)|([eE][-+]?\d+)))$
     SignedInteger: ^([-+]?\d+)$
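
Reviewer sketch for the appendix regexes above, as amended by this commit. The function name `classify` is hypothetical. The patterns are copied from the `+` and context lines of the hunk; note that the `SymbolOrNumber` pattern there is ASCII-only, so this sketch does not cover `symuchar`. One visible consequence, if the regexes are taken at face value, is that a token such as `1.0f` no longer matches `Number` and so falls through to a bare `Symbol`.

```python
import re

# Patterns copied from the appendix as it reads after this commit.
SYMBOL_OR_NUMBER = re.compile(r"^[-a-zA-Z0-9~!$%^&*?_=+/.]+$")
DOUBLE = re.compile(
    r"^([-+]?\d+)(((\.\d+([eE][-+]?\d+)?)|([eE][-+]?\d+)))$", re.ASCII
)
SIGNED_INTEGER = re.compile(r"^([-+]?\d+)$", re.ASCII)


def classify(token: str):
    """Return 'Double', 'SignedInteger', 'Symbol', or None if the token is not bare."""
    # fullmatch avoids `$` silently accepting a trailing newline.
    if not SYMBOL_OR_NUMBER.fullmatch(token):
        return None
    if DOUBLE.fullmatch(token):
        return "Double"
    if SIGNED_INTEGER.fullmatch(token):
        return "SignedInteger"
    return "Symbol"


if __name__ == "__main__":
    for tok in ["1.0", "10.4e3", "-100", "1.0f", "hello"]:
        print(tok, classify(tok))
```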

--- a/preserves.md
+++ b/preserves.md
@@ -39,7 +39,7 @@ follows:

            (Compounds)     Record < Sequence < Set < Dictionary

-            (Atoms)         Boolean < Float < Double < SignedInteger
+            (Atoms)         Boolean < Double < SignedInteger
                              < String < ByteString < Symbol

 **Equivalence.**<a name="equivalence"></a> Two `Value`s are equal if
@@ -92,14 +92,18 @@ less-than the “true” value.

 ### IEEE floating-point values.

-`Float`s and `Double`s are single- and double-precision IEEE 754
-floating-point values, respectively. `Float`s, `Double`s and
-`SignedInteger`s are disjoint; by the rules [above](#total-order), every
-`Float` is less than every `Double`, and every `SignedInteger` is
-greater than both. Two `Float`s or two `Double`s are to be ordered by
-the `totalOrder` predicate defined in section 5.10 of [IEEE Std
+`Double`s are double-precision IEEE 754 floating-point
+values.[^other-ieee754-precisions] `Double`s and `SignedInteger`s are
+disjoint; by the rules [above](#total-order), every `Double` is less than
+every `SignedInteger`. Two `Double`s are to be ordered by the `totalOrder`
+predicate defined in section 5.10 of [IEEE Std
 754-2008](https://dx.doi.org/10.1109/IEEESTD.2008.4610935).

+  [^other-ieee754-precisions]: Every value inhabiting a smaller IEEE 754
+    type (e.g. single- or half-precision) can be injected into and
+    projected from double-precision losslessly and in an order-preserving
+    way.
+
 ### Records.

 A `Record` is a *labelled* tuple of `Value`s, the record's *fields*. A
@@ -239,7 +243,6 @@ The total ordering specified [above](#total-order) means that the following stat
 | `1.0f`                                              | 87 04 3F 80 00 00                                                               |
 | `1.0`                                               | 87 08 3F F0 00 00 00 00 00 00                                                   |
 | `-1.202e300`                                        | 87 08 FE 3C B7 B7 59 BF 04 26                                                   |
-| `#xf"7f800000"`, positive `Float` infinity          | 87 04 7F 80 00 00                                                               |
 | `#xd"fff0000000000000"`, negative `Double` infinity | 87 08 FF F0 00 00 00 00 00 00                                                   |

 The next example uses a non-`Symbol` label for a record.[^extensibility2] The `Record`
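
Reviewer sketch for the new `[^other-ieee754-precisions]` footnote in the hunk above, which says smaller IEEE 754 types can be injected into and projected from double-precision losslessly and in an order-preserving way. The helper names `widen_single` and `narrow_to_single` are hypothetical. The sketch only exercises finite values and infinities; NaN payload handling is deliberately left out because Python's `struct` does not promise to preserve it.

```python
import struct


def widen_single(bits: bytes) -> float:
    """Read 4 big-endian bytes as binary32; Python then holds the value as a double."""
    return struct.unpack(">f", bits)[0]


def narrow_to_single(value: float) -> bytes:
    """Project a double that came from a binary32 value back to its 4 bytes."""
    return struct.pack(">f", value)


if __name__ == "__main__":
    # Non-negative binary32 bit patterns in ascending order, up to +infinity.
    samples = [
        bytes.fromhex(h)
        for h in ["00000001", "3f800000", "40490fdb", "7f7fffff", "7f800000"]
    ]
    widened = [widen_single(b) for b in samples]
    print(widened == sorted(widened))  # ordering is kept after widening
    print(all(narrow_to_single(w) == b for w, b in zip(widened, samples)))
```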
--- a/questions.md
+++ b/questions.md
@@ -19,10 +19,10 @@ at version 0.990).
 Q. Should we go for trying to make the data ordering line up with the
 encoding ordering? We'd have to only use streaming forms, and avoid
 the small integer encoding, and not store record arities, and sort
-sets and dictionaries, and mask floats and doubles (perhaps
+sets and dictionaries, and mask doubles (perhaps
 [like this](https://stackoverflow.com/questions/43299299/sorting-floating-point-values-using-their-byte-representation)),
 and perhaps pick a specific `NaN`, and I don't know what to do about
-SignedIntegers. Perhaps make them more like float formats, with the
+SignedIntegers. Perhaps make them more like floating-point formats, with the
 byte count acting as a kind of exponent underneath the sign bit.

 - Perhaps define separate additional canonicalization restrictions?
@@ -31,11 +31,6 @@ byte count acting as a kind of exponent underneath the sign bit.
 - Canonicalization and early-bailout-equivalence-checking are in
   tension with support for streaming values.

-Q. To remain compatible with JSON, portions of the text syntax have to
-remain case-insensitive (`%i"..."`). However, non-JSON extensions do
-not. There's only one (?) at the moment, the `%i"f"` in `Float`;
-should it be changed to case-sensitive?
-
 Q. Should `IOList`s be wrapped in an identifying unary record constructor?

 Q. Whitespace - is having `,` as whitespace sensible or not? I can
--- a/representations.md
+++ b/representations.md
@@ -19,7 +19,7 @@ affect comparisons of that `Value` to others in any way.
 ## JavaScript.

 - `Boolean` ↔ `Boolean`
- - `Float` and `Double` ↔ numbers
+ - `Double` ↔ numbers
 - `SignedInteger` ↔ numbers or `BigInt` (see [here](https://developers.google.com/web/updates/2018/05/bigint) and [here](https://github.com/tc39/proposal-bigint))
 - `String` ↔ strings
 - `ByteString` ↔ `Uint8Array`
@@ -34,7 +34,7 @@ affect comparisons of that `Value` to others in any way.
 ## Scheme/Racket.

 - `Boolean` ↔ booleans
- - `Float` and `Double` ↔ inexact numbers (Racket: single- and double-precision floats)
+ - `Double` ↔ inexact numbers
 - `SignedInteger` ↔ exact numbers
 - `String` ↔ strings
 - `ByteString` ↔ byte vector (Racket: "Bytes")
@@ -47,7 +47,7 @@ affect comparisons of that `Value` to others in any way.
 ## Java.

 - `Boolean` ↔ `Boolean`
- - `Float` and `Double` ↔ `Float` and `Double`
+ - `Double` ↔ `Double`
 - `SignedInteger` ↔ `Integer`, `Long`, `BigInteger`
 - `String` ↔ `String`
 - `ByteString` ↔ `byte[]`
@@ -61,7 +61,7 @@ affect comparisons of that `Value` to others in any way.
 ## Erlang.

 - `Boolean` ↔ `true` and `false`
- - `Float` and `Double` ↔ floats (unsure how Erlang deals with single-precision)
+ - `Double` ↔ floats
 - `SignedInteger` ↔ integers
 - `String` ↔ pair of `utf8` and a binary
 - `ByteString` ↔ a binary
@@ -84,7 +84,6 @@ or `Record`s.
 ## Python.

 - `Boolean` ↔ `True` and `False`
- - `Float` ↔ a `Float` wrapper-class for a double-precision value
 - `Double` ↔ float
 - `SignedInteger` ↔ int and long
 - `String` ↔ `unicode`
@@ -98,7 +97,6 @@ or `Record`s.
 ## Squeak Smalltalk.

 - `Boolean` ↔ `true` and `false`
- - `Float` ↔ perhaps a subclass of `Float`?
 - `Double` ↔ `Float`
 - `SignedInteger` ↔ `Integer`
 - `String` ↔ `WideString`