Remove single-precision floats from the specs

Tony Garnock-Jones 2024-01-27 11:34:51 +01:00
parent d579a0d607
commit dc1b0ac54d
20 changed files with 49 additions and 78 deletions

@ -105,7 +105,7 @@ A few more interesting differences:
{"dictionaries": "as keys???"}: "well, why not?"}
```
Preserves technically provides a few types of numbers:
Preserves technically provides various types of numbers:
```
# Signed Integers
@ -114,9 +114,6 @@ Preserves technically provides a few types of numbers:
5907212309572059846509324862304968273468909473609826340
-5907212309572059846509324862304968273468909473609826340
# Floats (Single-precision IEEE floats) (notice the trailing f)
3.1415927f
# Doubles (Double-precision IEEE floats)
3.141592653589793
```

@ -7,7 +7,6 @@ For a value `V`, we write `«V»` for the binary encoding of `V`.
«@W V» = [0x85] ++ «W» ++ «V»
«#!V» = [0x86] ++ «V»
«V» if V ∈ Float = [0x87, 0x04] ++ binary32(V)
«V» if V ∈ Double = [0x87, 0x08] ++ binary64(V)
«V» if V ∈ SignedInteger = [0xB0] ++ varint(|intbytes(V)|) ++ intbytes(V)
@ -29,5 +28,4 @@ For a value `V`, we write `«V»` for the binary encoding of `V`.
signedBigEndian(n >> 8) ++ [n & 255] otherwise
```
The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
8-byte IEEE 754 binary representations of `F` and `D`, respectively.
The function `binary64(D)` yields the big-endian 8-byte IEEE 754 binary representation of `D`.
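
The `Double` rule that remains after this change can be sketched in Python as follows. This is only an illustration of the rule `[0x87, 0x08] ++ binary64(V)`; the helper names `encode_double` and `decode_double` are hypothetical and are not part of any Preserves library.

```python
import struct


def encode_double(value: float) -> bytes:
    """Tag 0x87, length 0x08, then the big-endian IEEE 754 binary64 payload."""
    return bytes([0x87, 0x08]) + struct.pack(">d", value)


def decode_double(data: bytes) -> float:
    """Inverse of encode_double; rejects anything that is not a 10-byte Double."""
    if len(data) != 10 or data[0] != 0x87 or data[1] != 0x08:
        raise ValueError("not an encoded Double")
    return struct.unpack(">d", data[2:])[0]
```

With single-precision values gone, the length byte after tag `0x87` is always `0x08` in this sketch, so a decoder written this way treats `0x04` as an error.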

@ -11,7 +11,6 @@ class="postcard-grammar binarysyntax">*V*</span>.
«`#!`*V*» | = | `86` «*V*»
{:.postcard-grammar.binarysyntax}
«*V*» | = | `87` `04` **binary32**(*V*) | if *V* ∈ Float
«*V*» | = | `87` `08` **binary64**(*V*) | if *V* ∈ Double
{:.postcard-grammar.binarysyntax}
@ -37,10 +36,9 @@ class="postcard-grammar binarysyntax">*V*</span>.
**signedBigEndian**(*n*) | = | <span class="outputish">*n* &amp; 255</span> | if &minus;128 ≤ *n* ≤ 127
| | **signedBigEndian**(*n* &gt;&gt; 8) <span class="outputish">*n* &amp; 255</span> | otherwise
The functions <span class="postcard-grammar binarysyntax">**binary32**(*F*)</span> and <span
class="postcard-grammar binarysyntax">**binary64**(*D*)</span> yield big-endian 4- and 8-byte
IEEE 754 binary representations of <span class="postcard-grammar binarysyntax">*F*</span> and
<span class="postcard-grammar binarysyntax">*D*</span>, respectively.
The function <span class="postcard-grammar binarysyntax">**binary64**(*D*)</span> yields the
big-endian 8-byte IEEE 754 binary representation of <span class="postcard-grammar
binarysyntax">*D*</span>.
<!--
Together, <span class="postcard-grammar binarysyntax">**div**</span> and <span

@ -21,8 +21,7 @@ ByteString := `#"` binchar* `"`
String := `"` («any unicode scalar except `\` or `"`» | escaped | `\"`)* `"`
QuotedSymbol := `|` («any unicode scalar except `\` or `|`» | escaped | `\|`)* `|`
Symbol := (`A`..`Z` | `a`..`z` | `0`..`9` | sympunct | symuchar)+
Number := Float | Double | SignedInteger
Float := flt (`f`|`F`) | `#xf"` (ws hex hex)4 ws `"`
Number := Double | SignedInteger
Double := flt | `#xd"` (ws hex hex)8 ws `"`
SignedInteger := int

@ -22,8 +22,7 @@
| *String* | := | `"` (« any unicode scalar value except `\` or `"` » &#124; *escaped* &#124;`\"`)<sup>⋆</sup> `"` |
| *QuotedSymbol* | := | `|` (« any unicode scalar value except `\` or `|` » &#124; *escaped* &#124;`\|`)<sup>⋆</sup> `|` |
| *Symbol* | := | (`A`..`Z`&#124;`a`..`z`&#124;`0`..`9`&#124; *sympunct* &#124; *symuchar*)<sup>+</sup> |
| *Number* | := | *Float* &#124; *Double* &#124; *SignedInteger* |
| *Float* | := | *flt* (`f`&#124;`F`) &#124;`#xf"` (**ws** *hex* *hex*)<sup>4</sup> **ws**`"` |
| *Number* | := | *Double* &#124; *SignedInteger* |
| *Double* | := | *flt* &#124;`#xd"` (**ws** *hex* *hex*)<sup>8</sup> **ws**`"` |
| *SignedInteger* | := | *int* |

@ -1,5 +1,5 @@
Python's strings, byte strings, integers, booleans, and double-precision floats stand directly
for their Preserves counterparts. Wrapper objects for [Float][preserves.values.Float] and
for their Preserves counterparts. Wrapper objects for
[Symbol][preserves.values.Symbol] complete the suite of atomic types.
Python's lists and tuples correspond to Preserves `Sequence`s, and dicts and sets to

@ -2,7 +2,6 @@ Here are a few example values, written using the [text
syntax](https://preserves.dev/preserves-text.html):
Boolean : #t #f
Float : 1.0f 10.4e3f -100.6f
Double : 1.0 10.4e3 -100.6
Integer : 1 0 -100
String : "Hello, world!\n"

@ -4,7 +4,6 @@
| Embedded
Atom = Boolean
| Float
| Double
| SignedInteger
| String

@ -38,7 +38,7 @@ representations of their keys.[^no-need-for-by-value]
**Other kinds of `Value`.**
There are no special canonicalization restrictions on
`SignedInteger`s, `String`s, `ByteString`s, `Symbol`s, `Boolean`s,
`Float`s, `Double`s, `Record`s, `Sequence`s, or `Embedded`s. The
`Double`s, `Record`s, `Sequence`s, or `Embedded`s. The
constraints given for these `Value`s in the [specification][spec]
suffice to ensure canonicity.

@ -23,10 +23,10 @@ Appropriately-labelled `Record`s denote these domain-specific data
types.[^why-dictionaries]
[^why-dictionaries]: Given `Record`'s existence, it may seem odd
that `Dictionary`, `Set`, `Float`, etc. are given special
that `Dictionary`, `Set`, `Double`, etc. are given special
treatment. Preserves aims to offer a useful basic equivalence
predicate to programmers, and so if a data type demands a special
equivalence predicate, as `Dictionary`, `Set` and `Float` all do,
equivalence predicate, as `Dictionary`, `Set` and `Double` all do,
then the type should be included in the base language. Otherwise,
it can be represented as a `Record` and treated separately.
`Boolean`, `String` and `Symbol` are seeming exceptions. The first

@ -7,7 +7,6 @@ For a value `V`, we write `«V»` for the binary encoding of `V`.
«@W V» = [0x85] ++ «W» ++ «V»
«#!V» = [0x86] ++ «V»
«V» if V ∈ Float = [0x87, 0x04] ++ binary32(V)
«V» if V ∈ Double = [0x87, 0x08] ++ binary64(V)
«V» if V ∈ SignedInteger = [0xB0] ++ varint(|intbytes(V)|) ++ intbytes(V)
@ -29,5 +28,4 @@ For a value `V`, we write `«V»` for the binary encoding of `V`.
signedBigEndian(n >> 8) ++ [n & 255] otherwise
```
The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
8-byte IEEE 754 binary representations of `F` and `D`, respectively.
The function `binary64(D)` yields the big-endian 8-byte IEEE 754 binary representation of `D`.

@ -21,8 +21,7 @@ ByteString := `#"` binchar* `"`
String := `"` («any unicode scalar except `\` or `"`» | escaped | `\"`)* `"`
QuotedSymbol := `|` («any unicode scalar except `\` or `|`» | escaped | `\|`)* `|`
Symbol := (`A`..`Z` | `a`..`z` | `0`..`9` | sympunct | symuchar)+
Number := Float | Double | SignedInteger
Float := flt (`f`|`F`) | `#xf"` (ws hex hex)4 ws `"`
Number := Double | SignedInteger
Double := flt | `#xd"` (ws hex hex)8 ws `"`
SignedInteger := int

@ -4,7 +4,6 @@
| Embedded
Atom = Boolean
| Float
| Double
| SignedInteger
| String

@ -28,7 +28,7 @@ represented. Depending on the tag, a length indicator, further encoded
information, and/or an ending tag may follow.
tag (simple atomic data)
tag ++ length ++ binarydata (floats, doubles, integers, strings, symbols, and binary)
tag ++ length ++ binarydata (doubles, integers, strings, symbols, and binary)
tag ++ repr ++ ... ++ endtag (compound data)
The unique end tag is byte value `0x84`.
@ -121,13 +121,12 @@ below.)
«#f» = [0x80]
«#t» = [0x81]
### Floats and Doubles.
### Doubles.
«F» = [0x87, 0x04] ++ binary32(F) if F ∈ Float
«D» = [0x87, 0x08] ++ binary64(D) if D ∈ Double
The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
8-byte IEEE 754 binary representations of `F` and `D`, respectively.
The function `binary64(D)` yields the big-endian 8-byte IEEE 754 binary
representation of `D`.
### Embeddeds.
@ -196,7 +195,7 @@ a binary-syntax document; otherwise, it should be interpreted as text.
84 - End marker
85 - Annotation
86 - Embedded
87 - Float and Double
87 - Double
B0 - Integer
B1 - String

@ -85,15 +85,14 @@ Filters: narrow down a selection without moving
^ literal # Matches a record having the literal as its label -- equivalent to [.^ = literal]
~real # Promotes int and float to double, passes on double unchanged, rejects others
~real # Promotes int to double, passes on double unchanged, rejects others
# Out-of-range ints (too big or too small) become various double infinities
# Converting high-magnitude ints causes loss of precision
~int # Converts float and double to closest integer, where possible
~int # Converts double to closest integer, where possible
# NaN and infinities are rejected
bool # Type filters
float
double
int
string
@ -116,9 +115,9 @@ engines](https://www.regular-expressions.info/engine.html)"; (2) it
should be very widely implemented; (3) it should cover regular
languages and no more; (4) it should be easy to implement.
Design choice: How should comparison work? Should `lt 1.0f` accept not only `0.9f` but also
`#t` and `#f` (since `Boolean` comes before `Float` in the Preserves total ordering)? Should
`lt 1.0f` accept `0.9` and `0` as well as `0.9f`?
Design choice: How should comparison work? Should `lt 1.0` accept not only `0.9` but also
`#t` and `#f` (since `Boolean` comes before `Double` in the Preserves total ordering)? Should
`lt 1.0` accept `0` as well as `0.0`?
## Functions

@ -324,7 +324,7 @@ The `any` pattern matches any input `Value`:
Specifying the name of a kind of `Atom` matches that kind of atom:
AtomKindPattern = "bool" / "float" / "double" / "int" / "string" / "bytes" / "symbol"
AtomKindPattern = "bool" / "double" / "int" / "string" / "bytes" / "symbol"
Embedded input `Value`s are matched with embedded patterns. The
portion under the `#!` prefix is the *interface* schema for the
@ -625,7 +625,7 @@ Simple patterns are as described above:
# any
/ =any
# special builtins: bool, float, double, int, string, bytes, symbol
# special builtins: bool, double, int, string, bytes, symbol
/ <atom @atomKind AtomKind>
# matches an embedded value in the input: #!p
@ -648,7 +648,6 @@ Simple patterns are as described above:
.
AtomKind = =Boolean
/ =Float
/ =Double
/ =SignedInteger
/ =String
@ -755,7 +754,6 @@ metaschema.
AtomKind: <or [
["Boolean", <lit Boolean>],
["Float", <lit Float>],
["Double", <lit Double>],
["SignedInteger", <lit SignedInteger>],
["String", <lit String>],
@ -878,7 +876,6 @@ definitions for the metaschema.
export type AtomKind = (
{"_variant": "Boolean"} |
{"_variant": "Float"} |
{"_variant": "Double"} |
{"_variant": "SignedInteger"} |
{"_variant": "String"} |
@ -911,7 +908,6 @@ definitions for the metaschema.
(struct AtomKind-String () #:prefab)
(struct AtomKind-SignedInteger () #:prefab)
(struct AtomKind-Double () #:prefab)
(struct AtomKind-Float () #:prefab)
(struct AtomKind-Boolean () #:prefab)
(struct Bundle (modules) #:prefab)

@ -180,8 +180,8 @@ including embedded escape syntax, except using a bar or pipe character
Alternatively, a `Symbol` may be written in a “bare” form[^cf-sexp-token].
The grammar for numeric data is a subset of the grammar for bare `Symbol`s,
so if a `SymbolOrNumber` also matches the grammar for `Float`, `Double` or
`SignedInteger`, then it must be interpreted as one of those, and otherwise
so if a `SymbolOrNumber` also matches the grammar for `Double` or
`SignedInteger` then it must be interpreted as one of those, and otherwise
it must be interpreted as a bare `Symbol`.
SymbolOrNumber = 1*(ALPHA / DIGIT / sympunct / symuchar)
@ -197,14 +197,12 @@ it must be interpreted as a bare `Symbol`.
Numeric data follow the [JSON
grammar](https://tools.ietf.org/html/rfc8259#section-6) except that leading
zeros are permitted and an optional leading `+` sign is allowed. The
addition of a trailing “f” distinguishes a `Float` from a `Double` value.
`Float`s and `Double`s always have either a fractional part or an exponent
zeros are permitted and an optional leading `+` sign is allowed.
`Double`s always have either a fractional part or an exponent
part, where `SignedInteger`s never have
either.[^reading-and-writing-floats-accurately]
[^arbitrary-precision-signedinteger]
Float = flt %i"f"
Double = flt
SignedInteger = int
@ -244,14 +242,13 @@ either.[^reading-and-writing-floats-accurately]
values for equality or ordering will not yield results that match
the expected semantics of the data model.
Some valid IEEE 754 `Float`s and `Double`s are not covered by the grammar
Some valid IEEE 754 `Double`s are not covered by the grammar
above, namely, the several million NaNs and the two infinities. These are
represented as raw hexadecimal strings similar to hexadecimal
`ByteString`s. Implementations are free to use hexadecimal floating-point
syntax wherever convenient, even for values representable using the
grammar above.[^rationale-no-general-machine-syntax]
Float =/ "#xf" DQUOTE 4(ws 2HEXDIG) ws DQUOTE
Double =/ "#xd" DQUOTE 8(ws 2HEXDIG) ws DQUOTE
[^rationale-no-general-machine-syntax]: **Rationale.** Previous versions
@ -332,12 +329,11 @@ syntax.
## Appendix. Regular expressions for bare symbols and numbers
When parsing, if a token matches both `SymbolOrNumber` and `Number`, it's a
number; use `Float`, `Double` and `SignedInteger` to disambiguate. If it
number; use `Double` and `SignedInteger` to disambiguate. If it
matches `SymbolOrNumber` but not `Number`, it's a "bare" `Symbol`.
SymbolOrNumber: ^[-a-zA-Z0-9~!$%^&*?_=+/.]+$
Number: ^([-+]?\d+)(((\.\d+([eE][-+]?\d+)?)|([eE][-+]?\d+))([fF]?))?$
Float: ^([-+]?\d+)(((\.\d+([eE][-+]?\d+)?)|([eE][-+]?\d+))[fF])$
Number: ^([-+]?\d+)(((\.\d+([eE][-+]?\d+)?)|([eE][-+]?\d+)))?$
Double: ^([-+]?\d+)(((\.\d+([eE][-+]?\d+)?)|([eE][-+]?\d+)))$
SignedInteger: ^([-+]?\d+)$
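
As a rough illustration of the disambiguation rule above, the post-change regular expressions can be applied in Python like this. The function name `classify` is hypothetical, and the sketch covers only the ASCII character set listed in the `SymbolOrNumber` regular expression.

```python
import re

# Regular expressions copied from the appendix (post-change versions).
SYMBOL_OR_NUMBER = re.compile(r"^[-a-zA-Z0-9~!$%^&*?_=+/.]+$")
DOUBLE = re.compile(r"^([-+]?\d+)(((\.\d+([eE][-+]?\d+)?)|([eE][-+]?\d+)))$")
SIGNED_INTEGER = re.compile(r"^([-+]?\d+)$")


def classify(token: str) -> str:
    """Return "SignedInteger", "Double" or "Symbol" for a bare token."""
    # fullmatch avoids Python's "$ matches before a trailing newline" behaviour.
    if not SYMBOL_OR_NUMBER.fullmatch(token):
        raise ValueError(f"not a SymbolOrNumber token: {token!r}")
    if SIGNED_INTEGER.fullmatch(token):
        return "SignedInteger"
    if DOUBLE.fullmatch(token):
        return "Double"
    # Matches SymbolOrNumber but not Number: a bare Symbol.
    return "Symbol"
```

Under these expressions as written, a token such as `1.0f` no longer matches `Number`, so this sketch classifies it as a bare `Symbol`.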

@ -39,7 +39,7 @@ follows:
(Compounds) Record < Sequence < Set < Dictionary
(Atoms) Boolean < Float < Double < SignedInteger
(Atoms) Boolean < Double < SignedInteger
< String < ByteString < Symbol
**Equivalence.**<a name="equivalence"></a> Two `Value`s are equal if
@ -92,14 +92,18 @@ less-than the “true” value.
### IEEE floating-point values.
`Float`s and `Double`s are single- and double-precision IEEE 754
floating-point values, respectively. `Float`s, `Double`s and
`SignedInteger`s are disjoint; by the rules [above](#total-order), every
`Float` is less than every `Double`, and every `SignedInteger` is
greater than both. Two `Float`s or two `Double`s are to be ordered by
the `totalOrder` predicate defined in section 5.10 of [IEEE Std
`Double`s are double-precision IEEE 754 floating-point
values.[^other-ieee754-precisions] `Double`s and `SignedInteger`s are
disjoint; by the rules [above](#total-order), every `Double` is less than
every `SignedInteger`. Two `Double`s are to be ordered by the `totalOrder`
predicate defined in section 5.10 of [IEEE Std
754-2008](https://dx.doi.org/10.1109/IEEESTD.2008.4610935).
[^other-ieee754-precisions]: Every value inhabiting a smaller IEEE 754
type (e.g. single- or half-precision) can be injected into and
projected from double-precision losslessly and in an order-preserving
way.
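
A small Python check of the claim in the footnote, for a handful of non-NaN single-precision bit patterns. This is a sketch only: the helper names are hypothetical, and NaN payloads are deliberately left out because their handling during conversion can vary by platform.

```python
import struct


def single_bits_to_double(bits32: bytes) -> float:
    """Inject a big-endian binary32 bit pattern into a Python float (binary64)."""
    return struct.unpack(">f", bits32)[0]


def double_to_single_bits(value: float) -> bytes:
    """Project a Python float back to a big-endian binary32 bit pattern."""
    return struct.pack(">f", value)


def round_trips(bits32: bytes) -> bool:
    """True if injecting and then projecting recovers the original 4 bytes."""
    return double_to_single_bits(single_bits_to_double(bits32)) == bits32
```

For example, `round_trips(bytes.fromhex("3f800000"))` checks the pattern for single-precision `1.0`.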
### Records.
A `Record` is a *labelled* tuple of `Value`s, the record's *fields*. A
@ -239,7 +243,6 @@ The total ordering specified [above](#total-order) means that the following stat
| `1.0f` | 87 04 3F 80 00 00 |
| `1.0` | 87 08 3F F0 00 00 00 00 00 00 |
| `-1.202e300` | 87 08 FE 3C B7 B7 59 BF 04 26 |
| `#xf"7f800000"`, positive `Float` infinity | 87 04 7F 80 00 00 |
| `#xd"fff0000000000000"`, negative `Double` infinity | 87 08 FF F0 00 00 00 00 00 00 |
The next example uses a non-`Symbol` label for a record.[^extensibility2] The `Record`

@ -19,10 +19,10 @@ at version 0.990).
Q. Should we go for trying to make the data ordering line up with the
encoding ordering? We'd have to only use streaming forms, and avoid
the small integer encoding, and not store record arities, and sort
sets and dictionaries, and mask floats and doubles (perhaps
sets and dictionaries, and mask doubles (perhaps
[like this](https://stackoverflow.com/questions/43299299/sorting-floating-point-values-using-their-byte-representation)),
and perhaps pick a specific `NaN`, and I don't know what to do about
SignedIntegers. Perhaps make them more like float formats, with the
SignedIntegers. Perhaps make them more like floating-point formats, with the
byte count acting as a kind of exponent underneath the sign bit.
- Perhaps define separate additional canonicalization restrictions?
@ -31,11 +31,6 @@ byte count acting as a kind of exponent underneath the sign bit.
- Canonicalization and early-bailout-equivalence-checking are in
tension with support for streaming values.
Q. To remain compatible with JSON, portions of the text syntax have to
remain case-insensitive (`%i"..."`). However, non-JSON extensions do
not. There's only one (?) at the moment, the `%i"f"` in `Float`;
should it be changed to case-sensitive?
Q. Should `IOList`s be wrapped in an identifying unary record constructor?
Q. Whitespace - is having `,` as whitespace sensible or not? I can

@ -19,7 +19,7 @@ affect comparisons of that `Value` to others in any way.
## JavaScript.
- `Boolean` ↔ `Boolean`
- `Float` and `Double` ↔ numbers
- `Double` ↔ numbers
- `SignedInteger` ↔ numbers or `BigInt` (see [here](https://developers.google.com/web/updates/2018/05/bigint) and [here](https://github.com/tc39/proposal-bigint))
- `String` ↔ strings
- `ByteString` ↔ `Uint8Array`
@ -34,7 +34,7 @@ affect comparisons of that `Value` to others in any way.
## Scheme/Racket.
- `Boolean` ↔ booleans
- `Float` and `Double` ↔ inexact numbers (Racket: single- and double-precision floats)
- `Double` ↔ inexact numbers
- `SignedInteger` ↔ exact numbers
- `String` ↔ strings
- `ByteString` ↔ byte vector (Racket: "Bytes")
@ -47,7 +47,7 @@ affect comparisons of that `Value` to others in any way.
## Java.
- `Boolean` ↔ `Boolean`
- `Float` and `Double` ↔ `Float` and `Double`
- `Double` ↔ `Double`
- `SignedInteger` ↔ `Integer`, `Long`, `BigInteger`
- `String` ↔ `String`
- `ByteString` ↔ `byte[]`
@ -61,7 +61,7 @@ affect comparisons of that `Value` to others in any way.
## Erlang.
- `Boolean` ↔ `true` and `false`
- `Float` and `Double` ↔ floats (unsure how Erlang deals with single-precision)
- `Double` ↔ floats
- `SignedInteger` ↔ integers
- `String` ↔ pair of `utf8` and a binary
- `ByteString` ↔ a binary
@ -84,7 +84,6 @@ or `Record`s.
## Python.
- `Boolean` ↔ `True` and `False`
- `Float` ↔ a `Float` wrapper-class for a double-precision value
- `Double` ↔ float
- `SignedInteger` ↔ int and long
- `String` ↔ `unicode`
@ -98,7 +97,6 @@ or `Record`s.
## Squeak Smalltalk.
- `Boolean` ↔ `true` and `false`
- `Float` ↔ perhaps a subclass of `Float`?
- `Double` ↔ `Float`
- `SignedInteger` ↔ `Integer`
- `String` ↔ `WideString`