From e438085e745e5c6998b14f4a62f3c559335a8bdb Mon Sep 17 00:00:00 2001
From: Tony Garnock-Jones
Date: Mon, 24 Sep 2018 18:34:07 +0100
Subject: [PATCH] Tweaks; python mapping

---
 syndicate/mc/preserve.md | 42 +++++++++++++++++++++++++++-------------
 1 file changed, 29 insertions(+), 13 deletions(-)

diff --git a/syndicate/mc/preserve.md b/syndicate/mc/preserve.md
index ab3bde8..237c549 100644
--- a/syndicate/mc/preserve.md
+++ b/syndicate/mc/preserve.md
@@ -171,7 +171,7 @@ indicates `SignedInteger`s, not `Float`s or `Double`s.
 ### Records.

 A `Record` is a *labelled* tuple of zero or more `Value`s, called the
-record's *fields*. A record's label is, itself, a `Value`, though it
+record's *fields*. A record's label is itself a `Value`, though it
 will usually be a `Symbol`.[^extensibility] [^iri-labels] `Record`s
 are compared lexicographically as if they were just tuples; that is,
 first by their labels, and then by the remainder of their fields. We
@@ -180,7 +180,7 @@ sequence of their label `Value` followed by their field `Value`s.

   [^extensibility]: The [Racket](https://racket-lang.org/) programming
     language defines
-    [“prefab”](http://docs.racket-lang.org/guide/define-struct.html#(part._prefab-struct))
+    “[prefab](http://docs.racket-lang.org/guide/define-struct.html#(part._prefab-struct))”
     structure types, which map well to our `Record`s. Racket supports
     record extensibility by encoding record supertypes into record
     labels as specially-formatted lists.
@@ -828,12 +828,12 @@ should both be identities.

 ### JavaScript

- - `SignedInteger` ↔ numbers or `BigInt` [[1](https://developers.google.com/web/updates/2018/05/bigint), [2](https://github.com/tc39/proposal-bigint)]
+ - `Boolean` ↔ `Boolean`
+ - `Float` and `Double` ↔ numbers
+ - `SignedInteger` ↔ numbers or `BigInt` (see [here](https://developers.google.com/web/updates/2018/05/bigint) and [here](https://github.com/tc39/proposal-bigint))
  - `String` ↔ strings
  - `ByteString` ↔ `Uint8Array`
  - `Symbol` ↔ `Symbol.for(...)`
- - `Boolean` ↔ `Boolean`
- - `Float` and `Double` ↔ numbers,
  - `Record` ↔ `{ "_label": theLabel, "_fields": [field0, ..., fieldN] }`, plus convenience accessors
  - `(undefined)` ↔ the undefined value
  - `(rfc3339 F)` ↔ `Date`, if `F` matches the `date-time` RFC 3339 production
@@ -843,12 +843,12 @@ should both be identities.

 ### Scheme/Racket

+ - `Boolean` ↔ booleans
+ - `Float` and `Double` ↔ inexact numbers (Racket: single- and double-precision floats)
  - `SignedInteger` ↔ exact numbers
  - `String` ↔ strings
  - `ByteString` ↔ byte vector (Racket: "Bytes")
  - `Symbol` ↔ symbols
- - `Boolean` ↔ booleans
- - `Float` and `Double` ↔ inexact numbers (Racket: single- and double-precision floats)
  - `Record` ↔ structures (Racket: prefab struct)
  - `Sequence` ↔ lists
  - `Set` ↔ Racket: sets
@@ -856,19 +856,22 @@ should both be identities.

 ### Java

+ - `Boolean` ↔ `Boolean`
+ - `Float` and `Double` ↔ `Float` and `Double`
  - `SignedInteger` ↔ `Integer`, `Long`, `BigInteger`
  - `String` ↔ `String`
  - `ByteString` ↔ `byte[]`
  - `Symbol` ↔ a simple data class wrapping a `String`
- - `Boolean` ↔ `Boolean`
- - `Float` and `Double` ↔ `Float` and `Double`
  - `Record` ↔ in a simple implementation, a generic `Record` class; else perhaps a bean mapping?
+ - `(mime T B)` ↔ an implementation of `javax.activation.DataSource`?
  - `Sequence` ↔ an implementation of `java.util.List`
  - `Set` ↔ an implementation of `java.util.Set`
  - `Dictionary` ↔ an implementation of `java.util.Map`

 ### Erlang

+ - `Boolean` ↔ `true` and `false`
+ - `Float` and `Double` ↔ floats (unsure how Erlang deals with single-precision)
  - `SignedInteger` ↔ integers
  - `String` ↔ tuple of `utf8` and a binary
  - `ByteString` ↔ a binary
@@ -876,13 +879,28 @@ should both be identities.
     some kind of an "unsafe" mode is set on the decoder (because Erlang
     atoms are not GC'd); otherwise perhaps a tuple of `symbol` and a
     binary of the utf-8
- - `Boolean` ↔ `true` and `false`
- - `Float` and `Double` ↔ floats (unsure how Erlang deals with single-precision)
  - `Record` ↔ a tuple with the label in the first position, and the fields in subsequent positions
  - `Sequence` ↔ a list
  - `Set` ↔ a `sets` set (is this unambiguous? Maybe a [map][erlang-map] from elements to `true`?)
  - `Dictionary` ↔ a [map][erlang-map] (new in Erlang/OTP R17)

+This is an unsatisfactory mapping: it conflates `"hello"` with `(utf8 #"hello")`,
+and `true` with `#t`.
+
+### Python
+
+ - `Boolean` ↔ `True` and `False`
+ - `Float` ↔ a `Float` wrapper-class for a double-precision value
+ - `Double` ↔ float
+ - `SignedInteger` ↔ int
+ - `String` ↔ `unicode`
+ - `ByteString` ↔ `bytes`
+ - `Symbol` ↔ a simple data class wrapping a `unicode`
+ - `Record` ↔ something like `namedtuple`, but that doesn't care about class identity?
+ - `Sequence` ↔ `list`
+ - `Set` ↔ `set`
+ - `Dictionary` ↔ `dict`
+

 ## Appendix. Why not Just Use JSON?

@@ -1041,8 +1059,6 @@ encodings that are irrecoverably ambiguous.
 Q. Should "symbols" instead be URIs? Relative, usually; relative to
 what? Some domain-specific base URI?

-Q. Are the language mappings reasonable? How about one for Python?
-
 Q. Literal small integers: are they pulling their weight? They're not
 absolutely necessary. They mess up the connection between
 value-ordering and repr-ordering!
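
The Python mapping proposed in this patch might be sketched roughly as follows. This is only an illustration, not code from the patch: the class names `Symbol`, `Float`, and `Record` and the helper `record` are hypothetical, and Python 3 is assumed, where `str` plays the role the patch assigns to `unicode`. The `Record` class here compares by label and fields only, which is one possible reading of "doesn't care about class identity".

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Symbol:
    """Hypothetical wrapper keeping a `Symbol` distinct from a `String`."""
    name: str


@dataclass(frozen=True)
class Float:
    """Hypothetical wrapper marking a value as a `Float`, not a `Double`."""
    value: float


@dataclass(frozen=True)
class Record:
    """Hypothetical generic record: equality uses only label and fields."""
    label: Any
    fields: Tuple[Any, ...]


def record(label, *fields):
    # Store the fields as a tuple so that records stay hashable.
    return Record(label, tuple(fields))


point_a = record(Symbol("point"), 1, 2)
point_b = record(Symbol("point"), 1, 2)

# Two records built independently compare equal, field by field.
assert point_a == point_b
# The wrappers keep `Symbol` and `Float` apart from `String` and `Double`.
assert Symbol("hello") != "hello"
assert Float(1.5) != 1.5
```

Because the wrappers are frozen dataclasses, they are hashable and can be used as `Set` members or `Dictionary` keys, as long as their contents are hashable too.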