---
title: "P-expressions"
---

Tony Garnock-Jones <tonyg@leastfixedpoint.com>  
February 2024. Version 0.3.1.

[text syntax]: preserves-text.html

**Experimental.**
This document defines a grammar called *Preserves Expressions*
(*P-expressions*, *pexprs*) that includes [ordinary Preserves text
syntax][text syntax] but offers extensions sufficient to support a Lisp-
or Haskell-like programming notation.

**Motivation.** The [text syntax][] for Preserves works well for writing
`Value`s, i.e. data. However, in some contexts, Preserves applications
need a broader grammar that allows interleaving of *expressions* with
data. Two examples are the [Preserves Schema
language](preserves-schema.html) and the [Synit configuration scripting
language](https://synit.org/book/operation/scripting.html), both of
which (ab)use Preserves text syntax as a kind of programming notation.

## Preliminaries

The P-expression grammar includes by reference the definition of `Atom` from the
[text syntax][], as well as the definitions that `Atom` depends on.

<a id="whitespace"></a>
**Whitespace.** Whitespace `ws` is, as in the text syntax, defined as
any number of spaces, tabs, carriage returns, or line feeds.

                ws = *(%x20 / %x09 / CR / LF)

No changes to [the Preserves semantic model](preserves.html) are made.
Every Preserves text-syntax term can be parsed as a valid P-expression,
but in general P-expressions must be rewritten or otherwise interpreted
before a meaningful Preserves value can be arrived at ([see
below](#reading-preserves)).

## Grammar

Standalone documents containing P-expressions are sequences of
individual `Expr`s, followed by annotations, comments, and/or whitespace.

          Document = *Expr Trailer ws

A single P-expression `Expr` can be an `Atom` from the [text syntax][],
a compound expression, special punctuation, an `Embedded` expression, or
an `Annotated` expression. The class `SimpleExpr` includes all of `Expr`
except special punctuation.

              Expr = ws (SimpleExpr / Punct)
        SimpleExpr = Atom / Compound / Embedded / Annotated

Embedded and annotated values are as in the text syntax, differing only
in that uses of `Value` are replaced with `SimpleExpr`.

           Embedded = "#:" SimpleExpr
          Annotated = Annotation SimpleExpr
         Annotation = "@" SimpleExpr / "#" [(%x20 / %x09) linecomment] (CR / LF)
        linecomment = *<any unicode scalar value except CR or LF>

P-expression special punctuation marks are comma, semicolon, and
sequences of one or more colons.[^greedy-colons]

             Punct = "," / ";" / 1*":"

[^greedy-colons]: Colon matching is greedy: when reading, all adjacent
    colons are always taken into a single token, and when writing,
    adjacent colon-sequence punctuation marks must be written with
    whitespace separating them.
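
The greedy-colon rule can be illustrated with a small Python sketch. The names `PUNCT`, `read_punct` and `write_puncts` are hypothetical helpers introduced here for illustration; they are not part of the specification or of any Preserves library.

```python
import re

# Hypothetical helper, not part of the specification: a Punct is a comma,
# a semicolon, or a run of one or more colons. The `:+` alternative is
# greedy, so all adjacent colons end up in a single token.
PUNCT = re.compile(r",|;|:+")


def read_punct(text, pos=0):
    """Return (token, next_pos) if a Punct starts at `pos`, else None."""
    m = PUNCT.match(text, pos)
    if m is None:
        return None
    return m.group(0), m.end()


def write_puncts(tokens):
    """Write Punct tokens separated by a space, so that adjacent colon
    runs are read back as separate tokens rather than merged."""
    return " ".join(tokens)
```

Under these assumptions, reading `":: x"` yields the single token `::`, and writing the two tokens `:` and `::` yields `": ::"` rather than `":::"`.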

Compound expressions are sequences of `Expr`s with optional trailing
`Annotation`s, surrounded by various kinds of parentheses.

          Compound = Sequence / Record / Block / Group / Set
          Sequence =  "[" *Expr Trailer ws "]"
            Record =  "<" *Expr Trailer ws ">"
             Block =  "{" *Expr Trailer ws "}"
             Group =  "(" *Expr Trailer ws ")"
               Set = "#{" *Expr Trailer ws "}"

In an `Annotated` P-expression, annotations and comments attach to the
term following them, just as in the ordinary text syntax. However, it is
common in programming notations to allow comments at the end of a file
or other sequential construct. The ordinary text syntax forbids comments
in these positions, but P-expressions allow them.

           Trailer = *(ws Annotation)

## <a id="encoding-pexprs"></a>Encoding P-expressions as Preserves

We write ⌜*p*⌝ for the encoding into Preserves of a P-expression *p*.

{:.pseudocode.equations}
| ⌜·⌝ : **Expr**   | ⟶ | **Value** |
| ⌜`[`*p* ...`]`⌝  | = | `[`⌜*p*⌝ ...`]`         |
| ⌜`<`*p* ...`>`⌝  | = | `<r` ⌜*p*⌝ ...`>`       |
| ⌜`{`*p* ...`}`⌝  | = | `<b` ⌜*p*⌝ ...`>`       |
| ⌜`(`*p* ...`)`⌝  | = | `<g` ⌜*p*⌝ ...`>`       |
| ⌜`#{`*p* ...`}`⌝ | = | `<s` ⌜*p*⌝ ...`>`       |
| ⌜`#:`*p*⌝        | = | `#:`⌜*p*⌝               |
| ⌜`@`*p* *q*⌝     | = | `@`⌜*p*⌝ ⌜*q*⌝          |
| ⌜*p*⌝            | = | *p* | when *p* ∈ **Atom** |
| ⌜`,`⌝            | = | `<p |,|>`               |
| ⌜`;`⌝            | = | `<p |;|>`               |
| ⌜`:` ...⌝        | = | `<p |:` ...`|>`         |
| ⌜*t*⌝            | = | `@`⌜*a*⌝ ... `<a>` | where `@`*a* ... are the annotations in *t* and *t* ∈ **Trailer** |

The record `<a>` acts as an anchor for the annotations in a `Trailer`.
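
As a rough illustration, the `Expr` cases of the table above might be sketched in Python as follows. The classes `Atom`, `Punct`, `Compound`, `Embedded` and `Annotated` and the function `encode` are assumptions made for this example only. Atoms are assumed to be kept as already-written text-syntax strings, and `Trailer`s are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    text: str            # the atom exactly as written in text syntax


@dataclass(frozen=True)
class Punct:
    mark: str            # ",", ";" or a run of colons


@dataclass(frozen=True)
class Compound:
    kind: str            # opening bracket: "[", "<", "{", "(" or "#{"
    items: tuple = ()


@dataclass(frozen=True)
class Embedded:
    expr: object


@dataclass(frozen=True)
class Annotated:
    annotation: object   # assumed to be an expression, e.g. a string Atom
    expr: object


# Record labels used for each non-sequence compound.
LABELS = {"<": "r", "{": "b", "(": "g", "#{": "s"}


def encode(p):
    """Return the encoding of P-expression `p` as Preserves text."""
    if isinstance(p, Atom):
        return p.text
    if isinstance(p, Punct):
        return "<p |" + p.mark + "|>"
    if isinstance(p, Embedded):
        return "#:" + encode(p.expr)
    if isinstance(p, Annotated):
        return "@" + encode(p.annotation) + " " + encode(p.expr)
    if isinstance(p, Compound):
        parts = [encode(q) for q in p.items]
        if p.kind == "[":
            return "[" + " ".join(parts) + "]"
        return "<" + " ".join([LABELS[p.kind]] + parts) + ">"
    raise TypeError("not a P-expression: " + repr(p))
```

With this representation, the empty group `Compound("(")` encodes to `<g>` and `Punct(";")` encodes to `<p |;|>`.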

We overload the ⌜·⌝ notation for encoding whole `Document`s into
sequences of Preserves values.

{:.pseudocode.equations}
| ⌜·⌝ : **P-expression Document** | ⟶ | **Preserves Sequence**                    |
| ⌜*p* ...⌝                       | = | `[`⌜*p*⌝ ...`]`                           |
| ⌜*p* ... `@`*a* ...⌝            | = | `[`⌜*p*⌝ ... `@`⌜*a*⌝ ... `<a>]`          |
|                                 |   | where `@`*a* ... are trailing annotations |

## <a id="reading-preserves"></a>Interpreting P-expressions as Preserves

The [previous section](#encoding-pexprs) discussed ways of representing
P-expressions using Preserves. Here, we discuss *interpreting*
P-expressions *as* Preserves so that (1) a Preserves datum (2) written
using Preserves text syntax and then (3) read as a P-expression can be
(4) interpreted from that P-expression to yield the original datum.

 1. Every `(`..`)` or `;` that appears is an error.
 2. Every `:`, `::`, `:::`, ... is an error, except in the context of `Block`s as described below.
 3. Every `,` that appears is discarded.
 4. Every `Trailer` that appears is an error.[^discard-trailers-instead-of-error]
 5. Every `Record` with no values in it is an error.
 6. Every `Block` must contain zero or more repeating triplets of
    `SimpleExpr`, `:`, `SimpleExpr`. Any `Block` not following this
    pattern is an error. Each `Block` following the pattern is
    translated to a `Dictionary` containing a key/value pair for each
    triplet. Any `Block` with duplicate keys (under interpretation) is
    an error.
 7. Every `Set` containing any duplicate expressions (under
    interpretation) is an error.

[^discard-trailers-instead-of-error]: **Implementation note.** When
    implementing parsing of P-expressions into Preserves, consider
    offering an optional mode where trailing annotations `Trailer` are
    *discarded* instead of causing an error to be signalled.
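
The rules above can be sketched as a small Python function. Everything named here (`Punct`, `Compound`, `InterpretError`, `interpret`, `interpret_block`, and the `discard_trailers` flag) is a hypothetical illustration rather than an existing API. The sketch assumes that atoms are plain Python values, leaves out `Embedded` and `Annotated` expressions, and uses Python equality as a stand-in for Preserves equality, which is only an approximation (for example, Python treats `1` and `1.0` as equal). Dictionaries nested inside sets or used as keys are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Punct:
    mark: str                 # ",", ";" or a run of colons


@dataclass(frozen=True)
class Compound:
    kind: str                 # "[", "<", "{", "(" or "#{"
    items: tuple = ()
    trailer: tuple = ()       # trailing annotations, if any


class InterpretError(ValueError):
    pass


def interpret(p, discard_trailers=False):
    if isinstance(p, Punct):
        # Rules 1 and 2: stray punctuation (commas are dropped below).
        raise InterpretError(f"unexpected punctuation {p.mark!r}")
    if not isinstance(p, Compound):
        return p                                    # an atom stands for itself
    if p.trailer and not discard_trailers:
        raise InterpretError("trailing annotations")            # rule 4
    if p.kind == "(":
        raise InterpretError("groups have no interpretation")   # rule 1
    items = [q for q in p.items if q != Punct(",")]             # rule 3
    if p.kind == "{":
        return interpret_block(items, discard_trailers)         # rule 6
    values = [interpret(q, discard_trailers) for q in items]
    if p.kind == "[":
        return tuple(values)
    if p.kind == "<":
        if not values:
            raise InterpretError("empty record")                # rule 5
        return ("record", values[0], tuple(values[1:]))
    if p.kind == "#{":
        if len(set(values)) != len(values):
            raise InterpretError("duplicate set element")       # rule 7
        return frozenset(values)
    raise InterpretError(f"unknown compound {p.kind!r}")


def interpret_block(items, discard_trailers):
    # Rule 6: zero or more triplets of SimpleExpr, ":", SimpleExpr.
    if len(items) % 3 != 0:
        raise InterpretError("block is not a series of key: value triplets")
    result = {}
    for i in range(0, len(items), 3):
        k, colon, v = items[i:i + 3]
        if colon != Punct(":") or isinstance(k, Punct) or isinstance(v, Punct):
            raise InterpretError("block is not a series of key: value triplets")
        key = interpret(k, discard_trailers)
        if key in result:
            raise InterpretError("duplicate dictionary key")
        result[key] = interpret(v, discard_trailers)
    return result
```

The `discard_trailers` flag corresponds to the optional mode suggested in the implementation note: when it is set, trailing annotations are dropped instead of being reported as errors.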

## Appendix: Examples

Examples are given as pairs of P-expressions and their Preserves
text-syntax encodings.

### Individual P-expression `Expr`s

```preserves
 ⌜<date 1821 (lookup-month "February") 3>⌝
= <r date 1821 <g lookup-month "February"> 3>
```

```preserves
 ⌜<>⌝
= <r>
```

```preserves
 ⌜(begin (println! (+ 1 2)) (+ 3 4))⌝
= <g begin <g println! <g + 1 2>> <g + 3 4>>
```

```preserves
 ⌜()⌝
= <g>

 ⌜[() () ()]⌝
= [<g>, <g>, <g>]
```

```preserves
 ⌜{
      setUp();
      # Now enter the loop
      loop: {
          greet("World");
      }
      tearDown();
  }⌝
= <b
      setUp <g> <p |;|>
      # Now enter the loop
      loop <p |:|> <b
          greet <g "World"> <p |;|>
      >
      tearDown <g> <p |;|>
  >
```

```preserves
 ⌜[1 + 2.0, print "Hello", predicate: #t, foo, #:remote, bar]⌝
= [1 + 2.0 <p |,|> print "Hello" <p |,|> predicate <p |:|> #t <p |,|>
   foo <p |,|> #:remote <p |,|> bar]
```

```preserves
 ⌜#{1 2 3}⌝
= <s 1 2 3>

 ⌜#{(read) (read) (read)}⌝
= <s <g read> <g read> <g read>>
```

```preserves
 ⌜{
      optional name: string,
      address: Address,
  }⌝
= <b
      optional name <p |:|> string <p |,|>
      address <p |:|> Address <p |,|>
  >
```

### Whole `Document`s

```preserves
 ⌜{
      key: value
      # example of a comment at the end of a dictionary
  }
  # example of a comment at the end of the input file⌝
= [ <b
        key <p |:|> value
        @"example of a comment at the end of a dictionary" <a>
    >
    @"example of a comment at the end of the input file"
    <a>
  ]
```

## Appendix: Reading vs. Parsing

Lisp systems first *read* streams of bytes into S-expressions and then
*parse* those S-expressions into more abstract structures denoting
various kinds of program syntax. [Separation of reading from parsing is
what gives Lisp its syntactic
flexibility.](http://calculist.org/blog/2012/04/17/homoiconicity-isnt-the-point/)

Similarly, the Apple programming language
[Dylan](https://en.wikipedia.org/wiki/Dylan_(programming_language))
included a reader-parser split, with the Dylan reader producing
*D-expressions* that are somewhat similar to P-expressions.

Finally, the Racket dialects
[Honu](https://docs.racket-lang.org/honu/index.html) and
[Something](https://github.com/tonyg/racket-something) use a
reader-parser-macro setup, where the reader produces Racket data, the
parser produces "syntax" and is user-extensible, and Racket's own
modular macro system rewrites this "syntax" down to core forms to be
compiled to machine code.

Similarly, when using P-expressions as the foundation for a language, a
generic P-expression reader can then feed into special-purpose
*parsers*. The reader captures the coarse syntactic structure of a
program, and the parser refines this.

Often, a parser will wish to extract structure from sequences of
P-expression `Expr`s.

 - A simple technique is repeated splitting of sequences; first by
   `;`, then by `,`, then by increasingly high binding-power
   operators.

 - More refined is to use a Pratt parser or similar
   ([1](https://en.wikipedia.org/wiki/Operator-precedence_parser),
   [2](https://matklad.github.io/2020/04/13/simple-but-powerful-pratt-parsing.html),
   [3](https://github.com/tonyg/racket-something/blob/f6116bf3861b76970f5ce291a628476adef820b4/src/something/pratt.rkt))
   to build a parse tree using an extensible specification of the pre-,
   in-, and postfix operators involved.

 - Finally, if you treat sequences of `Expr`s as pre-lexed token
   streams, almost any parsing formalism (such as [PEG
   parsing](https://en.wikipedia.org/wiki/Parsing_expression_grammar),
   [OMeta](https://en.wikipedia.org/wiki/OMeta), etc.) can be used to
   extract further syntactic structure.
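
The first technique, repeated splitting, might look like the following Python sketch. The `Punct` class and the functions `split_on` and `statements` are hypothetical names chosen for this example, and atoms are assumed to be plain Python values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Punct:
    mark: str                 # ",", ";" or a run of colons


SEMI, COMMA = Punct(";"), Punct(",")


def split_on(items, mark):
    """Split a list of Exprs into groups at each occurrence of `mark`."""
    groups, current = [], []
    for item in items:
        if item == mark:
            groups.append(current)
            current = []
        else:
            current.append(item)
    groups.append(current)
    return groups


def statements(items):
    """Split by ';' first, then split each non-empty statement by ','."""
    return [split_on(stmt, COMMA) for stmt in split_on(items, SEMI) if stmt]
```

A further pass over each innermost group could then split on operators in order of increasing binding power, which is where a Pratt parser becomes the more convenient tool.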

## <a id="reading-preserves-equationally"></a>Appendix: Equations for interpreting P-expressions as Preserves

The function **uncomma**(*p*) removes all occurrences of `,` from a
P-expression *p* ∈ `Expr` − {`,`}.

{:.pseudocode.equations}
| **uncomma** : **Expr** − {`,`} | ⟶ | **Expr**                             |                                       |
| **uncomma**(`[`*p* ...`]`)     | = | `[`**uncomma**(*p*) ...`]`           | omitting any *p* = `,`                |
| **uncomma**(`<`*p* ...`>`)     | = | `<`**uncomma**(*p*) ...`>`           | omitting any *p* = `,`                |
| **uncomma**(`{`*p* ...`}`)     | = | `{`**uncomma**(*p*) ...`}`           | omitting any *p* = `,`                |
| **uncomma**(`(`*p* ...`)`)     | = | `(`**uncomma**(*p*) ...`)`           | omitting any *p* = `,`                |
| **uncomma**(`#{`*p* ...`}`)    | = | `#{`**uncomma**(*p*) ...`}`          | omitting any *p* = `,`                |
| **uncomma**(`#:`*p*)           | = | `#:`**uncomma**(*p*)                 |                                       |
| **uncomma**(`@`*p* *q*)        | = | `@`**uncomma**(*p*) **uncomma**(*q*) |                                       |
| **uncomma**(*p*)               | = | *p*                                  | if *p* ∈ **Atom** ∪ **Punct** − {`,`} |
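
A direct transcription of these equations into Python could look as follows. The classes and the `uncomma` function are assumptions made for illustration, with atoms represented as plain Python values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Punct:
    mark: str                 # ",", ";" or a run of colons


@dataclass(frozen=True)
class Compound:
    kind: str                 # "[", "<", "{", "(" or "#{"
    items: tuple = ()


@dataclass(frozen=True)
class Embedded:
    expr: object


@dataclass(frozen=True)
class Annotated:
    annotation: object
    expr: object


COMMA = Punct(",")


def uncomma(p):
    """Remove every ',' nested inside `p`; `p` itself must not be ','."""
    if p == COMMA:
        raise ValueError("uncomma is not defined on a bare comma")
    if isinstance(p, Compound):
        # Omit commas at this level, then recurse into what remains.
        kept = tuple(uncomma(q) for q in p.items if q != COMMA)
        return Compound(p.kind, kept)
    if isinstance(p, Embedded):
        return Embedded(uncomma(p.expr))
    if isinstance(p, Annotated):
        return Annotated(uncomma(p.annotation), uncomma(p.expr))
    return p                  # an atom, or punctuation other than ','
```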

We write ⌞·⌟ for the partial function mapping a P-expression in
`Expr` − {`,`} to a corresponding Preserves `Value`. The interpretation
of a P-expression *p* is then ⌞**uncomma**(*p*)⌟.

{:.pseudocode.equations}
| ⌞·⌟ : **Expr** − {`,`} | ⇀ | **Value**               |                               |
| ⌞`[`*p* ...`]`⌟        | = | `[`⌞*p*⌟ ...`]`         |                               |
| ⌞`<`ℓ *p* ...`>`⌟      | = | `<`⌞ℓ⌟ ⌞*p*⌟ ...`>`     |                               |
| ⌞`{`*k*`:`*v* ...`}`⌟  | = | `{`⌞*k*⌟`:`⌞*v*⌟ ...`}` | if all ⌞*k*⌟ ... are distinct |
| ⌞`#{`*p* ...`}`⌟       | = | `#{`⌞*p*⌟ ...`}`        | if all ⌞*p*⌟ ... are distinct |
| ⌞`#:`*p*⌟              | = | `#:`⌞*p*⌟               |                               |
| ⌞`@`*p* *q*⌟           | = | `@`⌞*p*⌟ ⌞*q*⌟          |                               |
| ⌞*p*⌟                  | = | *p*                     | when *p* ∈ **Atom**           |

## Notes
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
+								---
 								title: "P-expressions"
 								---
 								Tony Garnock-Jones <tonyg@leastfixedpoint.com>
-												Switch from `#!` to `#:` for embedded values

											
										
										
											2024-02-05 21:38:49 +00:00
+								February 2024. Version 0.3.1.
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
 								[text syntax]: preserves-text.html
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
-												Cheatsheets for pexprs

											
										
										
											2023-11-01 14:15:24 +00:00
+								**Experimental.**
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
+								This document defines a grammar called *Preserves Expressions*
 								(*P-expressions*, *pexprs*) that includes [ordinary Preserves text
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								syntax][text syntax] but offers extensions sufficient to support a Lisp-
 								or Haskell-like programming notation.
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								**Motivation.** The [text syntax][] for Preserves works well for writing
 								`Value`s, i.e. data. However, in some contexts, Preserves applications
 								need a broader grammar that allows interleaving of *expressions* with
 								data. Two examples are the [Preserves Schema
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
+								language](preserves-schema.html) and the [Synit configuration scripting
 								language](https://synit.org/book/operation/scripting.html), both of
 								which (ab)use Preserves text syntax as a kind of programming notation.
 								## Preliminaries
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								The P-expression grammar includes by reference the definition of `Atom` from the
 								[text syntax][], as well as the definitions that `Atom` depends on.
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								<a id="whitespace">
-												Restrict usage of commas. Bump spec version to 0.992

											
										
										
											2023-11-01 13:13:36 +00:00
+								**Whitespace.** Whitespace `ws` is, as in the text syntax, defined as
 								any number of spaces, tabs, carriage returns, or line feeds.
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
-												Restrict usage of commas. Bump spec version to 0.992

											
										
										
											2023-11-01 13:13:36 +00:00
+								                ws = *(%x20 / %x09 / CR / LF)
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								No changes to [the Preserves semantic model](preserves.html) are made.
 								Every Preserves text-syntax term can be parsed as a valid P-expression,
 								but in general P-expressions must be rewritten or otherwise interpreted
 								before a meaningful Preserves value can be arrived at ([see
 								below](#reading-preserves)).
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
 								## Grammar
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								Standalone documents containing P-expressions are sequences of
-												Fix bugs wrt pexpr trailers

											
										
										
											2023-11-01 14:35:59 +00:00
+								individual `Expr`s, followed by annotations, comments, and/or whitespace.
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
-												Fix bugs wrt pexpr trailers

											
										
										
											2023-11-01 14:35:59 +00:00
+								          Document = *Expr Trailer ws
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								A single P-expression `Expr` can be an `Atom` from the [text syntax][],
 								a compound expression, special punctuation, an `Embedded` expression, or
-												Restrict usage of commas. Bump spec version to 0.992

											
										
										
											2023-11-01 13:13:36 +00:00
+								an `Annotated` expression. The class `SimpleExpr` includes all of `Expr`
 								except special punctuation.
-												Empty records

											
										
										
											2023-11-01 10:20:33 +00:00
-												Cheatsheets for pexprs

											
										
										
											2023-11-01 14:15:24 +00:00
+								              Expr = ws (SimpleExpr / Punct)
 								        SimpleExpr = Atom / Compound / Embedded / Annotated
-												Empty records

											
										
										
											2023-11-01 10:20:33 +00:00
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								Embedded and annotated values are as in the text syntax, differing only
-												Restrict usage of commas. Bump spec version to 0.992

											
										
										
											2023-11-01 13:13:36 +00:00
+								in that uses of `Value` are replaced with `SimpleExpr`.
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
-												Switch from `#!` to `#:` for embedded values

											
										
										
											2024-02-05 21:38:49 +00:00
+								           Embedded = "#:" SimpleExpr
-												Restrict usage of commas. Bump spec version to 0.992

											
										
										
											2023-11-01 13:13:36 +00:00
+								          Annotated = Annotation SimpleExpr
 								         Annotation = "@" SimpleExpr / "#" [(%x20 / %x09) linecomment] (CR / LF)
-												Cheatsheets for pexprs

											
										
										
											2023-11-01 14:15:24 +00:00
+								        linecomment = *<any unicode scalar value except CR or LF>
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
-												write-pexpr (not quite right yet)

											
										
										
											2023-11-05 19:43:23 +00:00
+								P-expression special punctuation marks are comma, semicolon, and
 								sequences of one or more colons.[^greedy-colons]
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								             Punct = "," / ";" / 1*":"
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
-												write-pexpr (not quite right yet)

											
										
										
											2023-11-05 19:43:23 +00:00
+								[^greedy-colons]: Colon matching is greedy: when reading, all adjacent
 								    colons are always taken into a single token, and when writing,
 								    adjacent colon-sequence punctuation marks must be written with
 								    whitespace separating them.
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								Compound expressions are sequences of `Expr`s with optional trailing
 								`Annotation`s, surrounded by various kinds of parentheses.
-												Trailing comments

											
										
										
											2023-10-31 18:32:06 +00:00
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								          Compound = Sequence / Record / Block / Group / Set
-												Restrict usage of commas. Bump spec version to 0.992

											
										
										
											2023-11-01 13:13:36 +00:00
+								          Sequence =  "[" *Expr Trailer ws "]"
 								            Record =  "<" *Expr Trailer ws ">"
 								             Block =  "{" *Expr Trailer ws "}"
 								             Group =  "(" *Expr Trailer ws ")"
 								               Set = "#{" *Expr Trailer ws "}"
-												Trailing comments

											
										
										
											2023-10-31 18:32:06 +00:00
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								In an `Annotated` P-expression, annotations and comments attach to the
 								term following them, just as in the ordinary text syntax. However, it is
 								common in programming notations to allow comments at the end of a file
 								or other sequential construct. The ordinary text syntax forbids comments
 								in these positions, but P-expressions allow them.
-												Trailing comments

											
										
										
											2023-10-31 18:32:06 +00:00
-												Fix bugs wrt pexpr trailers

											
										
										
											2023-11-01 14:35:59 +00:00
+								           Trailer = *(ws Annotation)
-												Trailing comments

											
										
										
											2023-10-31 18:32:06 +00:00
-												Tweaks

											
										
										
											2023-10-31 17:06:05 +00:00
+								## <a id="encoding-pexprs"></a>Encoding P-expressions as Preserves
-												Nuance

											
										
										
											2023-11-01 12:27:26 +00:00
+								We write ⌜*p*⌝ for the encoding into Preserves of a P-expression *p*.
-												Tweaks

											
										
										
											2023-10-31 17:06:05 +00:00
 								{:.pseudocode.equations}
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								| ⌜·⌝ : **Expr**   | ⟶ | **Value** |
 								| ⌜`[`*p* ...`]`⌝  | = | `[`⌜*p*⌝ ...`]`         |
 								| ⌜`<`*p* ...`>`⌝  | = | `<r` ⌜*p*⌝ ...`>`       |
 								| ⌜`{`*p* ...`}`⌝  | = | `<b` ⌜*p*⌝ ...`>`       |
 								| ⌜`(`*p* ...`)`⌝  | = | `<g` ⌜*p*⌝ ...`>`       |
 								| ⌜`#{`*p* ...`}`⌝ | = | `<s `⌜*p*⌝ ...`>`       |
-												Switch from `#!` to `#:` for embedded values

											
										
										
											2024-02-05 21:38:49 +00:00
+								| ⌜`#:`*p*⌝        | = | `#:`⌜*p*⌝               |
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								| ⌜`@`*p* *q*⌝     | = | `@`⌜*p*⌝ ⌜*q*⌝          |
 								| ⌜*p*⌝            | = | *p* | when *p* ∈ **Atom** |
 								| ⌜`,`⌝            | = | `<p |,|>`               |
 								| ⌜`;`⌝            | = | `<p |;|>`               |
 								| ⌜`:` ...⌝        | = | `<p |:` ...`|>`         |
-												Correct encoding functions for trailers

											
										
										
											2023-11-01 14:42:25 +00:00
+								| ⌜*t*⌝            | = | `@`⌜*a*⌝ ... `<a>` | where `@`*a* ... are the annotations in *t* and *t* ∈ **Trailer** |
-												Trailing comments

											
										
										
											2023-10-31 18:32:06 +00:00
-												Use records instead of dictionaries

											
										
										
											2023-11-01 10:57:23 +00:00
+								The record `<a>` acts as an anchor for the annotations in a `Trailer`.
-												Trailing comments

											
										
										
											2023-10-31 18:32:06 +00:00
 								We overload the ⌜·⌝ notation for encoding whole `Document`s into
 								sequences of Preserves values.
 								{:.pseudocode.equations}
-												Correct encoding functions for trailers

											
										
										
											2023-11-01 14:42:25 +00:00
+								| ⌜·⌝ : **P-expression Document** | ⟶ | **Preserves Sequence**                    |
 								| ⌜*p* ...⌝                       | = | `[`⌜*p*⌝ ...`]`                           |
 								| ⌜*p* ... `@`*a* ...⌝            | = | `[`⌜*p*⌝ ... `@`⌜*a*⌝ ... `<a>]`          |
 								|                                 |   | where `@`*a* ... are trailing annotations |
-												Tweaks

											
										
										
											2023-10-31 17:06:05 +00:00
 								## <a id="reading-preserves"></a>Interpreting P-expressions as Preserves
 								The [previous section](#encoding-pexprs) discussed ways of representing
 								P-expressions using Preserves. Here, we discuss *interpreting*
-												Nuance

											
										
										
											2023-11-01 12:27:26 +00:00
+								P-expressions *as* Preserves so that (1) a Preserves datum (2) written
-												Restrict usage of commas. Bump spec version to 0.992

											
										
										
											2023-11-01 13:13:36 +00:00
+								using Preserves text syntax and then (3) read as a P-expression can be
 								(4) interpreted from that P-expression to yield the original datum.
-												Tweaks

											
										
										
											2023-10-31 17:06:05 +00:00
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+. Every `(`..`)` or `;` that appears is an error.
 . Every `:`, `::`, `:::`, ... is an error, except in context of `Block`s as described below.
 . Every `,` that appears is discarded.
-												Empty records

											
										
										
											2023-11-01 10:20:33 +00:00
+. Every `Trailer` that appears is an error.[^discard-trailers-instead-of-error]
 . Every `Record` with no values in it is an error.
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+. Every `Block` must contain zero or more repeating triplets of
-												Restrict usage of commas. Bump spec version to 0.992

											
										
										
											2023-11-01 13:13:36 +00:00
+								    `SimpleExpr`, `:`, `SimpleExpr`. Any `Block` not following this
 								    pattern is an error. Each `Block` following the pattern is
 								    translated to a `Dictionary` containing a key/value pair for each
 								    triplet. Any `Block` with duplicate keys (under interpretation) is
 								    an error.
 . Every `Set` containing any duplicate expressions (under
 								    interpretation) is an error.
-												Nuance

											
										
										
											2023-11-01 12:27:26 +00:00
-												Trailing comments

											
										
										
											2023-10-31 18:32:06 +00:00
+								[^discard-trailers-instead-of-error]: **Implementation note.** When
 								    implementing parsing of P-expressions into Preserves, consider
 								    offering an optional mode where trailing annotations `Trailer` are
 								    *discarded* instead of causing an error to be signalled.
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
+								## Appendix: Examples
 								Examples are given as pairs of P-expressions and their Preserves
 								text-syntax encodings.
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								### Individual P-expression `Expr`s
-												Trailing comments

											
										
										
											2023-10-31 18:32:06 +00:00
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
+								```preserves
 								 ⌜<date 1821 (lookup-month "February") 3>⌝
-												Use records instead of dictionaries

											
										
										
											2023-11-01 10:57:23 +00:00
+								= <r date 1821 <g lookup-month "February"> 3>
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
+								```
-												Empty records

											
										
										
											2023-11-01 10:20:33 +00:00
+								```preserves
 								 ⌜<>⌝
-												Use records instead of dictionaries

											
										
										
											2023-11-01 10:57:23 +00:00
+								= <r>
-												Empty records

											
										
										
											2023-11-01 10:20:33 +00:00
+								```
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
+								```preserves
 								 ⌜(begin (println! (+ 1 2)) (+ 3 4))⌝
-												Use records instead of dictionaries

											
										
										
											2023-11-01 10:57:23 +00:00
+								= <g begin <g println! <g + 1 2>> <g + 3 4>>
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
+								```
 								```preserves
 								 ⌜()⌝
-												Use records instead of dictionaries

											
										
										
											2023-11-01 10:57:23 +00:00
+								= <g>
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
 								 ⌜[() () ()]⌝
-												Use records instead of dictionaries

											
										
										
											2023-11-01 10:57:23 +00:00
+								= [<g>, <g>, <g>]
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
+								```
 								```preserves
 								 ⌜{
 								      setUp();
 								      # Now enter the loop
 								      loop: {
 								          greet("World");
 								      }
 								      tearDown();
 								  }⌝
-												Use records instead of dictionaries

											
										
										
											2023-11-01 10:57:23 +00:00
+								= <b
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								      setUp <g> <p |;|>
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
+								      # Now enter the loop
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								      loop <p |:|> <b
 								          greet <g "World"> <p |;|>
-												Use records instead of dictionaries

											
										
										
											2023-11-01 10:57:23 +00:00
+								      >
-												This is way better

											
										
										
											2023-11-01 12:06:15 +00:00
+								      tearDown <g> <p |;|>
-												Use records instead of dictionaries

											
										
										
											2023-11-01 10:57:23 +00:00
+								  >
-												preserves-expressions.md

											
										
										
											2023-10-31 16:37:09 +00:00
+								```
```preserves
 ⌜[1 + 2.0, print "Hello", predicate: #t, foo, #:remote, bar]⌝
= [1 + 2.0 <p |,|> print "Hello" <p |,|> predicate <p |:|> #t <p |,|>
   foo <p |,|> #:remote <p |,|> bar]
```

```preserves
 ⌜#{1 2 3}⌝
= <s 1 2 3>
 ⌜#{(read) (read) (read)}⌝
= <s <g read> <g read> <g read>>
```

```preserves
 ⌜{
      optional name: string,
      address: Address,
  }⌝
= <b
      optional name <p |:|> string <p |,|>
      address <p |:|> Address <p |,|>
  >
```

### Whole `Document`s

```preserves
 ⌜{
      key: value
      # example of a comment at the end of a dictionary
  }
  # example of a comment at the end of the input file⌝
= [ <b
        key <p |:|> value
        @"example of a comment at the end of a dictionary" <a>
    >
    @"example of a comment at the end of the input file"
    <a>
  ]
```

## Appendix: Reading vs. Parsing

Lisp systems first *read* streams of bytes into S-expressions and then
*parse* those S-expressions into more abstract structures denoting
various kinds of program syntax. [Separation of reading from parsing is
what gives Lisp its syntactic
flexibility.](http://calculist.org/blog/2012/04/17/homoiconicity-isnt-the-point/)

Similarly, the Apple programming language
[Dylan](https://en.wikipedia.org/wiki/Dylan_(programming_language))
included a reader-parser split, with the Dylan reader producing
*D-expressions* that are somewhat similar to P-expressions.

Finally, the Racket dialects
[Honu](https://docs.racket-lang.org/honu/index.html) and
[Something](https://github.com/tonyg/racket-something) use a
reader-parser-macro setup, where the reader produces Racket data, the
parser produces "syntax" and is user-extensible, and Racket's own
modular macro system rewrites this "syntax" down to core forms to be
compiled to machine code.

Similarly, when using P-expressions as the foundation for a language, a
generic P-expression reader can then feed into special-purpose
*parsers*. The reader captures the coarse syntactic structure of a
program, and the parser refines this.

Often, a parser will wish to extract structure from sequences of
P-expression `Expr`s.

 - A simple technique is repeated splitting of sequences; first by
   `Semicolon`, then by `Comma`, then by increasingly high binding-power
   operators.

 - More refined is to use a Pratt parser or similar
   ([1](https://en.wikipedia.org/wiki/Operator-precedence_parser),
   [2](https://matklad.github.io/2020/04/13/simple-but-powerful-pratt-parsing.html),
   [3](https://github.com/tonyg/racket-something/blob/f6116bf3861b76970f5ce291a628476adef820b4/src/something/pratt.rkt))
   to build a parse tree using an extensible specification of the pre-,
   in-, and postfix operators involved.

 - Finally, if you treat sequences of `Expr`s as pre-lexed token
   streams, almost any parsing formalism (such as [PEG
   parsing](https://en.wikipedia.org/wiki/Parsing_expression_grammar),
   [OMeta](https://en.wikipedia.org/wiki/OMeta), etc.) can be used to
   extract further syntactic structure.

## <a id="reading-preserves-equationally"></a>Appendix: Equations for interpreting P-expressions as Preserves

The function **uncomma**(*p*) removes all occurrences of `,` from a
P-expression *p* ∈ `Expr` − {`,`}.

{:.pseudocode.equations}
| **uncomma** : **Expr** − {`,`} | ⟶ | **Expr**                             |                                       |
| **uncomma**(`[`*p* ...`]`)     | = | `[`**uncomma**(*p*) ...`]`           | omitting any *p* = `,`                |
| **uncomma**(`<`*p* ...`>`)     | = | `<`**uncomma**(*p*) ...`>`           | omitting any *p* = `,`                |
| **uncomma**(`{`*p* ...`}`)     | = | `{`**uncomma**(*p*) ...`}`           | omitting any *p* = `,`                |
| **uncomma**(`(`*p* ...`)`)     | = | `(`**uncomma**(*p*) ...`)`           | omitting any *p* = `,`                |
| **uncomma**(`#{`*p* ...`}`)    | = | `#{`**uncomma**(*p*) ...`}`          | omitting any *p* = `,`                |
| **uncomma**(`#:`*p*)           | = | `#:`**uncomma**(*p*)                 |                                       |
| **uncomma**(`@`*p* *q*)        | = | `@`**uncomma**(*p*) **uncomma**(*q*) |                                       |
| **uncomma**(*p*)               | = | *p*                                  | if *p* ∈ **Atom** ∪ **Punct** − {`,`} |

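
As a rough illustration, the **uncomma** equations could be transcribed
into Python along the following lines. The representation is an
assumption made for this sketch and is not taken from any Preserves
library: compound expressions are `(kind, items)` tuples with `kind` one
of `"seq"`, `"rec"`, `"block"`, `"group"`, or `"set"`; `#:`*p* is
`("emb", p)`; `@`*p* *q* is `("ann", p, q)`; punctuation is
`("punct", text)`; and atoms are plain Python values.

```python
# Hypothetical tuple encoding of P-expressions (see the lead-in text).
COMMA = ("punct", ",")
COMPOUND = {"seq", "rec", "block", "group", "set"}


def uncomma(p):
    """Remove every comma nested inside p; p itself must not be a comma."""
    if p == COMMA:
        raise ValueError("uncomma is not defined on a bare comma")
    if isinstance(p, tuple):
        kind = p[0]
        if kind in COMPOUND:
            # Omit any direct child that is a comma, recurse into the rest.
            return (kind, [uncomma(q) for q in p[1] if q != COMMA])
        if kind == "emb":
            return ("emb", uncomma(p[1]))
        if kind == "ann":
            return ("ann", uncomma(p[1]), uncomma(p[2]))
    # Atoms and punctuation other than the comma are returned unchanged.
    return p


example = ("seq", [1, "+", 2.0, COMMA, "print", "Hello"])
print(uncomma(example))
# → ('seq', [1, '+', 2.0, 'print', 'Hello'])
```
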
We write ⌞**uncomma**(*p*)⌟ for the partial function mapping a
P-expression *p* ∈ `Expr` − {`,`} to a corresponding Preserves `Value`.

{:.pseudocode.equations}
| ⌞·⌟ : **Expr** − {`,`} | ⇀ | **Value**               |                               |
| ⌞`[`*p* ...`]`⌟        | = | `[`⌞*p*⌟ ...`]`         |                               |
| ⌞`<`ℓ *p* ...`>`⌟      | = | `<`⌞ℓ⌟ ⌞*p*⌟ ...`>`     |                               |
| ⌞`{`*k*`:`*v* ...`}`⌟  | = | `{`⌞*k*⌟`:`⌞*v*⌟ ...`}` | if all ⌞*k*⌟ ... are distinct |
| ⌞`#{`*p* ...`}`⌟       | = | `#{`⌞*p*⌟ ...`}`        | if all ⌞*p*⌟ ... are distinct |
| ⌞`#:`*p*⌟              | = | `#:`⌞*p*⌟               |                               |
| ⌞`@`*p* *q*⌟           | = | `@`⌞*p*⌟ ⌞*q*⌟          |                               |
| ⌞*p*⌟                  | = | *p*                     | when *p* ∈ **Atom**           |

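
The partiality of ⌞·⌟ can be made concrete with a small Python sketch
that raises an error wherever no equation applies. Everything here is
hypothetical and chosen only for illustration: the input is assumed to
be comma-free and encoded as `(kind, items)` tuples, with `("emb", p)`
for `#:`*p*, `("ann", p, q)` for `@`*p* *q*, and `("punct", text)` for
punctuation; the output uses tagged tuples and `frozenset`s so that
results are hashable. One caveat: Python treats `1`, `1.0`, and `True`
as equal, so the distinctness checks below are coarser than Preserves
equality.

```python
# Hypothetical encoding; input is assumed to have had its commas removed.
COLON = ("punct", ":")


def to_value(p):
    """Interpret a comma-free P-expression as a value, or raise ValueError."""
    if not isinstance(p, tuple):
        return p  # atoms denote themselves
    kind = p[0]
    if kind == "seq":
        return ("seq", tuple(to_value(q) for q in p[1]))
    if kind == "rec":
        if not p[1]:
            raise ValueError("a record needs a label")
        label, *fields = (to_value(q) for q in p[1])
        return ("rec", label, tuple(fields))
    if kind == "block":
        items = p[1]
        starts = range(0, len(items), 3)
        if len(items) % 3 != 0 or any(items[i + 1] != COLON for i in starts):
            raise ValueError("block is not a sequence of key : value triples")
        pairs = [(to_value(items[i]), to_value(items[i + 2])) for i in starts]
        if len({k for k, _ in pairs}) != len(pairs):
            raise ValueError("duplicate dictionary key")
        return ("dict", frozenset(pairs))
    if kind == "set":
        elems = [to_value(q) for q in p[1]]
        if len(set(elems)) != len(elems):
            raise ValueError("duplicate set element")
        return ("set", frozenset(elems))
    if kind == "emb":
        return ("emb", to_value(p[1]))
    if kind == "ann":
        return ("ann", to_value(p[1]), to_value(p[2]))
    # Groups and punctuation have no equation, so no value.
    raise ValueError(f"no value for {kind!r}")


print(to_value(("rec", ["point", 1, 2])))
# → ('rec', 'point', (1, 2))
```
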
## Notes
