From 23e0e59dafca603a1a8e1f4f4ea9c4000f1029ac Mon Sep 17 00:00:00 2001
From: Tony Garnock-Jones
Date: Tue, 31 Oct 2023 19:32:06 +0100
Subject: [PATCH] Trailing comments

---
 preserves-expressions.md | 78 +++++++++++++++++++++++++++++++++++-----
 preserves-text.md        |  5 +--
 2 files changed, 72 insertions(+), 11 deletions(-)

diff --git a/preserves-expressions.md b/preserves-expressions.md
index 7901621..abcc252 100644
--- a/preserves-expressions.md
+++ b/preserves-expressions.md
@@ -3,7 +3,7 @@ title: "P-expressions"
 ---
 
 Tony Garnock-Jones
-October 2023. Version 0.1.0.
+October 2023. Version 0.1.1.
 
 This document defines a grammar called *Preserves Expressions*
 (*P-expressions*, *pexprs*) that includes [ordinary Preserves text
@@ -51,7 +51,7 @@ to the syntax class `Value`.
 Now that colon is in `Value`, the syntax for `Dictionary` is replaced
 with `Block` everywhere it is mentioned.
 
-    Block = "{" *Value ws "}"
+    Block = "{" *Value ws "}"
 
 New syntax for explicit uninterpreted grouping of sequences of values is
 introduced, and added to class `Value`.
@@ -70,15 +70,39 @@ P-expressions must be rewritten or otherwise interpreted before a
 meaningful Preserves value can be arrived at ([see
 below](#reading-preserves)).
 
+## Annotations and Comments
+
+Annotations and comments attach to the term following them, just as in
+the ordinary text syntax. However, it is common in programming notations
+to allow comments at the end of a file or other sequential construct:
+
+    {
+        key: value
+        # example of a comment at the end of a dictionary
+    }
+    # example of a comment at the end of the input file
+
+While the ordinary text syntax forbids comments in these positions,
+P-expressions allow them:
+
+    Document =/ *Value Trailer ws
+    Record =/ "<" Value *Value Trailer ws ">"
+    Sequence =/ "[" *Value Trailer ws "]"
+    Set =/ "#{" *Value Trailer ws "}"
+    Block =/ "{" *Value Trailer ws "}"
+
+    Trailer = 1*Annotation
+
 ## Encoding P-expressions as Preserves
 
 We write ⌜*p*⌝ for the encoding into Preserves of P-expression *p*.
 
 {:.pseudocode.equations}
-| ⌜·⌝ | : | **P-expression** ⟶ **Preserves** |
+| ⌜·⌝ : **P-expression** | ⟶ | **Preserves** |
 
-Aside from the special classes `Group`, `Block`, `Comma`, `Semicolon` or
-`Colons`, P-expressions are encoded directly as Preserves data.
+Aside from the special classes `Group`, `Block`, `Comma`, `Semicolon`,
+`Colons`, or `Trailer`, P-expressions are encoded directly as Preserves
+data.
 
 {:.pseudocode.equations}
 | ⌜`[`*p* ...`]`⌝ | = | `[`⌜*p*⌝ ...`]` |
@@ -86,10 +110,10 @@ Aside from the special classes `Group`, `Block`, `Comma`, `Semicolon` or
 | ⌜`#{`*p* ...`}`⌝ | = | `#{`⌜*p*⌝ ...`}` |
 | ⌜`#!`*p*⌝ | = | `#!`⌜*p*⌝ |
 | ⌜`@`*p* *q*⌝ | = | `@`⌜*p*⌝ ⌜*q*⌝ |
-| ⌜*p*⌝ | = | *p* **when** *p* ∈ **Atom** |
+| ⌜*p*⌝ | = | *p* when *p* ∈ **Atom** |
 
-All members of the special classes are encoded as Preserves text
-`Dictionary`[^encoding-rationale] values.
+All members of the special classes are encoded as Preserves
+dictionaries[^encoding-rationale].
 
 [^encoding-rationale]: In principle, it would be nice to use *records*
   for this purpose, but if we did so we would have to also encode
@@ -101,6 +125,17 @@ All members of the special classes are encoded as Preserves text
 | ⌜`,`⌝ | = | `{s:|,|}` |
 | ⌜`;`⌝ | = | `{s:|;|}` |
 | ⌜`:` ...⌝ | = | `{s:|:` ...`|}` |
+| ⌜*t*⌝ | = | ⌜*a*⌝ ... `{}`, where *a* ... are the annotations in *t* and *t* ∈ **Trailer** |
+
+The empty dictionary `{}` acts as an anchor for the annotations in a
+`Trailer`.
+
+We overload the ⌜·⌝ notation for encoding whole `Document`s into
+sequences of Preserves values.
+
+{:.pseudocode.equations}
+| ⌜·⌝ : **P-expression Document** | ⟶ | **Preserves Sequence** |
+| ⌜*p* ...⌝ | = | `[`⌜*p*⌝ ...`]` |
 
 ## Interpreting P-expressions as Preserves
 
@@ -117,17 +152,25 @@ classes mentioned above.
 
  1. Every `Group` or `Semicolon` that appears is an error.
  2. Every `Colons` with two or more colons in it is an error.
- 3. Every `Comma` that appears is removed from its container.
+ 3. Every `Comma` that appears is discarded.
+ 3. Every `Trailer` that appears is an error.[^discard-trailers-instead-of-error]
  4. Every `Block` must contain triplets of `Value`, `Colons` (with a
     single colon), `Value`. Any `Block` not following this pattern is an
     error. Each `Block` following the pattern is translated to a
     `Dictionary` containing a key/value pair for each triplet.
 
+[^discard-trailers-instead-of-error]: **Implementation note.** When
+  implementing parsing of P-expressions into Preserves, consider
+  offering an optional mode where trailing annotations `Trailer` are
+  *discarded* instead of causing an error to be signalled.
+
 ## Appendix: Examples
 
 Examples are given as pairs of P-expressions and their Preserves
 text-syntax encodings.
 
+### Individual P-expression `Value`s
+
 ```preserves
  ⌜⌝
 =
@@ -182,6 +225,23 @@ text-syntax encodings.
     ]}
 ```
 
+### Whole `Document`s
+
+```preserves
+ ⌜{
+    key: value
+    # example of a comment at the end of a dictionary
+  }
+  # example of a comment at the end of the input file⌝
+= [ {b:[
+      key {s:|:|} value
+      @"example of a comment at the end of a dictionary" {}
+    ]}
+    @"example of a comment at the end of the input file"
+    {}
+  ]
+```
+
 ## Appendix: Reading vs. Parsing
 
 Lisp systems first *read* streams of bytes into S-expressions and then
diff --git a/preserves-text.md b/preserves-text.md
index b7dfef0..00942a4 100644
--- a/preserves-text.md
+++ b/preserves-text.md
@@ -273,7 +273,8 @@ value. Each annotation is, in turn, a `Value`, and may itself have
 annotations. The ordering of annotations attached to a `Value` is
 significant.
 
-    Value =/ ws "@" Value Value
+    Value =/ ws Annotation Value
+    Annotation = "@" Value
 
 Each annotation is preceded by `@`; the underlying annotated value
 follows its annotations. Here we extend only the syntactic nonterminal
@@ -283,7 +284,7 @@ named “`Value`” without altering the semantic class of `Value`s.
 interpreted as comments associated with that value. Comments are
 sufficiently common that special syntax exists for them.
 
-    Value =/ ws ";" linecomment (CR / LF) Value
+    Annotation =/ ";" linecomment (CR / LF)
     linecomment = *
 
 When written this way, everything between the `;` and the end of the line