Trailing comments

Tony Garnock-Jones 2023-10-31 19:32:06 +01:00
parent c18e9dd1fe
commit 23e0e59daf
2 changed files with 72 additions and 11 deletions


@@ -3,7 +3,7 @@ title: "P-expressions"
---
Tony Garnock-Jones <tonyg@leastfixedpoint.com>
October 2023. Version 0.1.0.
October 2023. Version 0.1.1.
This document defines a grammar called *Preserves Expressions*
(*P-expressions*, *pexprs*) that includes [ordinary Preserves text
@@ -51,7 +51,7 @@ to the syntax class `Value`.
Now that colon is in `Value`, the syntax for `Dictionary` is replaced
with `Block` everywhere it is mentioned.
Block = "{" *Value ws "}"
Block = "{" *Value ws "}"
New syntax for explicit uninterpreted grouping of sequences of values is
introduced, and added to class `Value`.
@@ -70,15 +70,39 @@ P-expressions must be rewritten or otherwise interpreted before a
meaningful Preserves value can be arrived at ([see
below](#reading-preserves)).
## <a id="annotations"></a>Annotations and Comments
Annotations and comments attach to the term following them, just as in
the ordinary text syntax. However, it is common in programming notations
to allow comments at the end of a file or other sequential construct:
{
key: value
# example of a comment at the end of a dictionary
}
# example of a comment at the end of the input file
While the ordinary text syntax forbids comments in these positions,
P-expressions allow them:
Document =/ *Value Trailer ws
Record =/ "<" Value *Value Trailer ws ">"
Sequence =/ "[" *Value Trailer ws "]"
Set =/ "#{" *Value Trailer ws "}"
Block =/ "{" *Value Trailer ws "}"
Trailer = 1*Annotation
## <a id="encoding-pexprs"></a>Encoding P-expressions as Preserves
We write ⌜*p*⌝ for the encoding into Preserves of P-expression *p*.
{:.pseudocode.equations}
| ⌜·⌝ | : | **P-expression** ⟶ **Preserves** |
| ⌜·⌝ : **P-expression** | ⟶ | **Preserves** |
Aside from the special classes `Group`, `Block`, `Comma`, `Semicolon` or
`Colons`, P-expressions are encoded directly as Preserves data.
Aside from the special classes `Group`, `Block`, `Comma`, `Semicolon`,
`Colons`, or `Trailer`, P-expressions are encoded directly as Preserves
data.
{:.pseudocode.equations}
| ⌜`[`*p* ...`]`⌝ | = | `[`⌜*p*⌝ ...`]` |
@@ -86,10 +110,10 @@ Aside from the special classes `Group`, `Block`, `Comma`, `Semicolon` or
| ⌜`#{`*p* ...`}`⌝ | = | `#{`⌜*p*⌝ ...`}` |
| ⌜`#!`*p*⌝ | = | `#!`⌜*p*⌝ |
| ⌜`@`*p* *q*⌝ | = | `@`⌜*p*⌝ ⌜*q*⌝ |
| ⌜*p*⌝ | = | *p* **when** *p* ∈ **Atom** |
| ⌜*p*⌝ | = | *p* when *p* ∈ **Atom** |
All members of the special classes are encoded as Preserves text
`Dictionary`[^encoding-rationale] values.
All members of the special classes are encoded as Preserves
dictionaries[^encoding-rationale].
[^encoding-rationale]: In principle, it would be nice to use *records*
for this purpose, but if we did so we would have to also encode
@@ -101,6 +125,17 @@ All members of the special classes are encoded as Preserves text
| ⌜`,`⌝ | = | `{s:|,|}` |
| ⌜`;`⌝ | = | `{s:|;|}` |
| ⌜`:` ...⌝ | = | `{s:|:` ...`|}` |
| ⌜*t*⌝ | = | ⌜*a*⌝ ... `{}`, where *a* ... are the annotations in *t* and *t* ∈ **Trailer** |
The empty dictionary `{}` acts as an anchor for the annotations in a
`Trailer`.
We overload the ⌜·⌝ notation for encoding whole `Document`s into
sequences of Preserves values.
{:.pseudocode.equations}
| ⌜·⌝ : **P-expression Document** | ⟶ | **Preserves Sequence** |
| ⌜*p* ...⌝ | = | `[`⌜*p*⌝ ...`]` |
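
The special-class encodings above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the class names (`Atom`, `Punct`, `Seq`, `Group`, `Block`, `Trailer`) and the functions `encode` and `encode_document` are hypothetical, atoms are assumed to already be valid Preserves text, records and sets are omitted, and a `Trailer` is assumed to hold the values that follow each `@`.

```python
from dataclasses import dataclass

# Hypothetical in-memory model of a P-expression (not defined by the spec).
@dataclass
class Atom:
    text: str          # assumed to already be valid Preserves text

@dataclass
class Punct:
    text: str          # ",", ";", or a run of one or more colons

@dataclass
class Seq:
    items: list

@dataclass
class Group:
    items: list

@dataclass
class Block:
    items: list

@dataclass
class Trailer:
    annotations: list  # assumption: the value following each "@"

def _join(items):
    return " ".join(encode(x) for x in items)

def encode(p) -> str:
    """Return Preserves text for the encoding of P-expression p."""
    if isinstance(p, Atom):
        return p.text
    if isinstance(p, Punct):
        return "{s:|" + p.text + "|}"
    if isinstance(p, Seq):
        return "[" + _join(p.items) + "]"
    if isinstance(p, Group):
        return "{g:[" + _join(p.items) + "]}"
    if isinstance(p, Block):
        return "{b:[" + _join(p.items) + "]}"
    if isinstance(p, Trailer):
        # The empty dictionary anchors the trailing annotations.
        return " ".join(["@" + encode(a) for a in p.annotations] + ["{}"])
    raise TypeError(f"unsupported P-expression: {p!r}")

def encode_document(items) -> str:
    """Encode a whole document as a Preserves sequence."""
    return "[" + _join(items) + "]"
```

Because every special class becomes an ordinary dictionary, the output of `encode_document` can be handed to any reader of the ordinary text syntax.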
## <a id="reading-preserves"></a>Interpreting P-expressions as Preserves
@@ -117,17 +152,25 @@ classes mentioned above.
1. Every `Group` or `Semicolon` that appears is an error.
2. Every `Colons` with two or more colons in it is an error.
3. Every `Comma` that appears is removed from its container.
3. Every `Comma` that appears is discarded.
3. Every `Trailer` that appears is an error.[^discard-trailers-instead-of-error]
4. Every `Block` must contain triplets of `Value`, `Colons` (with a
single colon), `Value`. Any `Block` not following this pattern is an
error. Each `Block` following the pattern is translated to a
`Dictionary` containing a key/value pair for each triplet.
[^discard-trailers-instead-of-error]: **Implementation note.** When
implementing parsing of P-expressions into Preserves, consider
offering an optional mode where trailing annotations `Trailer` are
*discarded* instead of causing an error to be signalled.
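
The reading rules above, including the optional trailer-discarding mode from the implementation note, might be sketched as follows. Everything here is an assumption made for illustration: P-expressions are modelled as hypothetical `(kind, payload)` tuples, `read` and `ReadError` are invented names, annotations on ordinary values are not modelled, and a single colon outside a `Block` is rejected even though the rules quoted here do not address that case.

```python
class ReadError(ValueError):
    """Raised when a P-expression has no Preserves interpretation."""

def _contents(items, discard_trailers):
    """Apply the container-level rules: drop commas, handle trailers."""
    kept = []
    for kind, payload in items:
        if kind == "punct" and payload == ",":
            continue                      # every Comma is discarded
        if kind == "trailer":
            if discard_trailers:
                continue                  # optional lenient mode
            raise ReadError("trailing annotations are not permitted")
        kept.append((kind, payload))
    return kept

def read(p, discard_trailers=False):
    """Interpret a P-expression as a plain Python stand-in for a Preserves value.

    Sequences become tuples and Blocks become dicts; keys that are
    themselves dictionaries are unhashable in Python and are not
    supported by this sketch.
    """
    kind, payload = p
    if kind == "atom":
        return payload
    if kind == "seq":
        return tuple(read(x, discard_trailers)
                     for x in _contents(payload, discard_trailers))
    if kind == "block":
        items = _contents(payload, discard_trailers)
        if len(items) % 3 != 0:
            raise ReadError("Block must contain key, colon, value triplets")
        result = {}
        for i in range(0, len(items), 3):
            key, colon, value = items[i:i + 3]
            if colon != ("punct", ":"):
                raise ReadError("expected a single colon between key and value")
            result[read(key, discard_trailers)] = read(value, discard_trailers)
        return result
    # Group, Semicolon, Colons with two or more colons, and (by assumption)
    # a stray single colon all end up here.
    raise ReadError(f"{kind} {payload!r} has no Preserves interpretation here")
```

Calling `read` with `discard_trailers=True` silently drops trailing annotations; the default reports them as errors.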
## Appendix: Examples
Examples are given as pairs of P-expressions and their Preserves
text-syntax encodings.
### Individual P-expression `Value`s
```preserves
<date 1821 (lookup-month "February") 3>
= <date 1821 {g:[lookup-month "February"]} 3>
@@ -182,6 +225,23 @@ text-syntax encodings.
]}
```
### Whole `Document`s
```preserves
⌜{
key: value
# example of a comment at the end of a dictionary
}
# example of a comment at the end of the input file⌝
= [ {b:[
key {s:|:|} value
@"example of a comment at the end of a dictionary" {}
]}
@"example of a comment at the end of the input file"
{}
]
```
## Appendix: Reading vs. Parsing
Lisp systems first *read* streams of bytes into S-expressions and then


@@ -273,7 +273,8 @@ value. Each annotation is, in turn, a `Value`, and may itself have
annotations. The ordering of annotations attached to a `Value` is
significant.
Value =/ ws "@" Value Value
Value =/ ws Annotation Value
Annotation = "@" Value
Each annotation is preceded by `@`; the underlying annotated value
follows its annotations. Here we extend only the syntactic nonterminal
@@ -283,7 +284,7 @@ named “`Value`” without altering the semantic class of `Value`s.
interpreted as comments associated with that value. Comments are
sufficiently common that special syntax exists for them.
Value =/ ws ";" linecomment (CR / LF) Value
Annotation =/ ";" linecomment (CR / LF)
linecomment = *<any unicode scalar value except CR or LF>
When written this way, everything between the `;` and the end of the line