# Configuration scripting language
The `syndicate-server` program includes a mechanism that was originally intended for populating
a dataspace with assertions, for use in configuring the server, but which has since grown into
a small Syndicated Actor Model scripting language in its own right. This seems to be the
destiny of "configuration formats"—why fight it?—but the current language is inelegant and
artificially limited in many ways. I have an as-yet-unimplemented sketch of a more refined
design to replace it. Please forgive the ad-hoc nature of the actually-implemented language
described below, and be warned that this is an unstable area of the Synit design.
See near the end of this document for [a few illustrative examples](#examples).
## Evaluation model
The language consists of sequences of instructions. For example, one of the most important
instructions simply publishes (asserts) a value at a given entity (which will often be a
[dataspace](../glossary.md#dataspace)).

The language evaluation context includes an environment mapping variable names to Preserves
[`Value`](../guide/preserves.md#values-and-types)s.

Variable references are lexically scoped.

Each source file is interpreted in a top-level environment. The top-level environment is
supplied by the context invoking the script, and is generally non-empty. It frequently includes
a binding for the variable `config`, which happens to be the default [target variable
name](#the-active-target).
## Source file syntax
*Program* = *Instruction* ...

A configuration source file is a file whose name ends in `.pr` that contains zero or more
Preserves [text-syntax](https://preserves.dev/preserves-text.html)
values, which are together interpreted as a sequence of *Instruction*s.
**Comments.** [Preserves
comments](https://preserves.dev/conventions.html#comments) are ignored. One
unfortunate wart is that because Preserves comments are really
[annotations](https://preserves.dev/preserves-text.html#annotations), they are
required by the Preserves data model to be attached to some other value. Syntactically, this
manifests as the need for *some non-comment following every comment*. In scripts written to
date, often an empty *SequencingInstruction* serves to anchor comments at the end of a file:
```preserves
# A comment
# Another comment
# The following empty sequence is needed to give the comments
# something to attach to
[]
```
## Patterns, variable references, and variable bindings
Symbols are treated specially throughout the language. Perl-style
[sigils](https://en.wikipedia.org/wiki/Sigil_(computer_programming)) control the interpretation
of any given symbol:
- `$`*var* is a **variable reference**. The variable *var* will be looked up in the
environment, and the corresponding value substituted.
- `?`*var* is a **variable binder**, used in pattern-matching. The value being matched at that
position will be captured into the environment under the name *var*.
- `_` is a **discard** or **wildcard**, used in pattern-matching. The value being matched at
that position will be accepted (and otherwise ignored), and pattern matching will continue.
- `=`*sym* denotes the **literal symbol** *sym*. It is used wherever syntactic ambiguity
could prevent use of a bare literal symbol. For example, `=?foo` denotes the literal symbol
`?foo`, where `?foo` on its own would denote a variable binder for the variable named `foo`.
- all other symbols are **bare literal symbols**, denoting just themselves.

The special variable `.` (referenced using `$.`) denotes "the current environment, as a
dictionary".
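
The sigils can be seen side by side in the following sketch. It assumes the usual top-level binding for `config`; the record labels beginning with `example-` are hypothetical and exist only for illustration.

```preserves
# ?n binds the variable n to the matched value
let ?n = 42
# $n substitutes the value bound to n
<example-count $n>
# _ accepts the second field without binding it
let <example-pair ?first _> = <example-pair 1 2>
# =?foo is the literal symbol `?foo`, not a binder
<example-symbol =?foo>
```

The `let` and record-assertion forms used here are described under [Instructions](#instructions).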
## The active target
During loading and compilation (!) of a source file, the compiler maintains a compile-time
register called the *active target* (often simply the "target"), containing the *name* of a
variable that will be used at runtime to select an [entity reference](../glossary.md#reference)
to act upon. At the beginning of compilation, it is set to the name `config`, so that whatever
is bound to `config` in the initial environment at runtime is used as the default target for
targeted *Instruction*s.
This is one of the awkward parts of the current language design.
## Instructions
*Instruction* =
    *SequencingInstruction* |
    *RetargetInstruction* |
    *AssertionInstruction* |
    *SendInstruction* |
    *ReactionInstruction* |
    *LetInstruction* |
    *ConditionalInstruction*
### Sequencing
*SequencingInstruction* = `[`*Instruction*...`]`

A sequence of instructions is written as a Preserves sequence. The carried instructions are
compiled and executed in order. NB: to publish a sequence of values, use the `+=` form of
*AssertionInstruction*.
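
As a rough illustration of the difference between running a sequence and publishing one (the `example-` record labels are hypothetical):

```preserves
# A sequence in instruction position: two instructions, executed in order
[
  <example-first 1>
  <example-second 2>
]
# The `+=` form: the sequence [1 2 3] itself is published as a single value
+= [1 2 3]
```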
### Setting the active target
*RetargetInstruction* = `$`*var*

The target is set with a variable reference standing alone. After compiling such an
instruction, the active target register will contain the variable name *var*. NB: to publish
the contents of a variable, use the `+=` form of *AssertionInstruction*.
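
A minimal sketch of retargeting, assuming `config` is bound in the top-level environment; the variable `ds` and the label `example-record` are made up for this example:

```preserves
let ?ds = dataspace
# Select `ds` as the active target; this line publishes nothing
$ds
# Asserted at the dataspace bound to `ds`
<example-record 1>
# Switch back to `config`, then publish the value bound to `ds` there
$config
+= $ds
```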
### Publishing an assertion
*AssertionInstruction* =
    `+= `*ValueExpr* |
    *AttenuationExpr* |
    `<`*ValueExpr*` `*ValueExpr*...`>` |
    `{`*ValueExpr*`:`*ValueExpr*` `...`}`
The most general form of *AssertionInstruction* is "`+= `*ValueExpr*". When executed, the
result of evaluating *ValueExpr* will be published (asserted) at the entity denoted by the
active target register.
As a convenient shorthand, the compiler also interprets every Preserves record or dictionary in
*Instruction* position as denoting a *ValueExpr* to be used to produce a value to be asserted.
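
The shorthand and the general form might be used together like this (hypothetical values, published at whatever the active target happens to be):

```preserves
# Shorthand: a record in instruction position is asserted
<example-user "alice">
# Shorthand: a dictionary in instruction position is asserted
{ name: "alice" admin: #f }
# General form, for values that are neither records nor dictionaries
+= "a plain string"
+= #{1 2 3}
```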
### <span id="SendInstruction"></span>Sending a message
*SendInstruction* = `! `*ValueExpr*

When executed, the result of evaluating *ValueExpr* will be sent as a message to the entity
denoted by the active target register.
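
For instance, with a hypothetical `example-ping` record as the message body:

```preserves
# Sent once to the active target; unlike an assertion, nothing remains afterwards
! <example-ping "hello">
```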
### Reacting to events
*ReactionInstruction* =
    *DuringInstruction* |
    *OnMessageInstruction* |
    *OnStopInstruction*
These instructions establish event handlers of one kind or another.
#### Subscribing to assertions and messages
*DuringInstruction* = `? `*PatternExpr*` `*Instruction*

*OnMessageInstruction* = `?? `*PatternExpr*` `*Instruction*

These instructions publish assertions of the form `<Observe `*pat*` #:`*ref*`>` at the entity
denoted by the active target register, where *pat* is the [dataspace
pattern](../glossary.md#dataspace-pattern) resulting from evaluation of *PatternExpr*, and
*ref* is a fresh [entity](../glossary.md#entity) whose behaviour is to execute *Instruction* in
response to assertions (resp. messages) carrying captured values from the binding-patterns in
*pat*.
When the active target denotes a [dataspace](../glossary.md#dataspace) entity, the `Observe`
record establishes a subscription to matching assertions and messages.
Each time a matching assertion or message arrives at *ref*, a new [facet](../glossary.md#facet) is
created, and *Instruction* is executed in the new facet. If the instruction creating the facet
is a *DuringInstruction*, then the facet is automatically terminated when the triggering
assertion is retracted. If the instruction is an *OnMessageInstruction*, the facet is not
automatically terminated.[^automatic-termination]
Programs can react to facet termination using *OnStopInstruction*s, and can trigger early facet
termination themselves using the `facet` form of *ConvenienceExpr* (see below).
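
The difference in facet lifetime can be sketched as follows. The `example-` labels are hypothetical, and the comments restate the behaviour described above rather than anything verified against the implementation.

```preserves
# While an <example-present ...> assertion exists at the active target, the
# corresponding <example-greeting ...> assertion is maintained; it is withdrawn
# when the triggering assertion is retracted
? <example-present ?who> <example-greeting $who>
# The facet created for each <example-poke ...> message is not terminated
# automatically, so the assertion it makes stays in place
?? <example-poke ?n> <example-poked $n>
```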
#### Reacting to facet termination
*OnStopInstruction* = `?- `*Instruction*

This instruction installs a "stop handler" on the facet active during its execution. When the
facet terminates, *Instruction* is run.
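
A small sketch of a stop handler inside a *DuringInstruction*, using hypothetical `example-session` records. The handler body is wrapped in a sequence here purely to keep the extent of the instruction obvious.

```preserves
? <example-session ?id> [
  <example-session-active $id>
  # Runs when this facet terminates, for example when the triggering
  # assertion is retracted
  ?- [
    ! <example-session-ended $id>
  ]
]
```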
### Destructuring-bind and convenience expressions
*LetInstruction* = `let `*PatternExpr*` = `*ConvenienceExpr*
*ConvenienceExpr* =
    `dataspace` |
    `timestamp` |
    `facet` |
    `stringify` *ConvenienceExpr* |
    *ValueExpr*
Values can be destructured and new variables introduced into the environment with `let`, which
is a "destructuring bind" or "pattern-match definition" statement. When executed, the result of
evaluating *ConvenienceExpr* is matched against the result of evaluating *PatternExpr*. If the
match fails, the actor crashes. If the match succeeds, the resulting binding variables (if any)
are introduced into the environment.
The right-hand-side of a `let`, after the equals sign, is either a normal *ValueExpr* or one of
the following special "convenience" expressions:
- <span id="expr:dataspace">`dataspace`: Evaluates to a fresh, empty
[dataspace](../glossary.md#dataspace) entity.</span>
- `timestamp`: Evaluates to a string containing an
[RFC-3339](https://datatracker.ietf.org/doc/html/rfc3339)-formatted timestamp.
- `facet`: Evaluates to a fresh entity representing the current facet. Sending the message
`stop` to the entity (using e.g. the *SendInstruction* "`! stop`") triggers termination of
its associated facet. The entity does not respond to any other assertion or message.
- `stringify`: Evaluates its argument, renders the resulting value using Preserves
text syntax, and yields the resulting string.
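
The forms above might be combined as in the following sketch. The variable names and the `example-point` label are hypothetical.

```preserves
# Bind a fresh dataspace, a timestamp string, and a handle on the current facet
let ?ds = dataspace
let ?now = timestamp
let ?self = facet
# Destructure a record; the actor crashes if the value does not match
let <example-point ?x ?y> = <example-point 3 4>
# Render a value as Preserves text; `text` is bound to a string
let ?text = stringify <example-point $x $y>
```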
### Conditional execution
*ConditionalInstruction* = `$`*var*` =~ `*PatternExpr*` `*Instruction*` `*Instruction* ...

When executed, the value in variable *var* is matched against the result of evaluating
*PatternExpr*.
- If the match succeeds, the resulting bound variables are placed in the environment and
execution continues with the first *Instruction*. The subsequent *Instruction*s are not
executed in this case.
- If the match fails, then the first *Instruction* is skipped, and the subsequent
*Instruction*s are executed.
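
A minimal sketch with one instruction for each outcome (the `example-` labels are hypothetical). Going by the rules above, the first sequence runs here because the value matches, and the second would run only if it did not.

```preserves
let ?val = <example-ok 42>
$val =~ <example-ok ?n> [
  <example-result $n>
] [
  <example-no-match>
]
```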
## Value Expressions
*ValueExpr* =
    `#t` | `#f` | *double* | *int* | *string* | *bytes* |
    `$`*var* | `=`*symbol* | *bare-symbol* |
    *AttenuationExpr* |
    `<`*ValueExpr*` `*ValueExpr*...`>` |
    `[`*ValueExpr*...`]` |
    `#{`*ValueExpr*...`}` |
    `{`*ValueExpr*`:`*ValueExpr*` `...`}`
Value expressions are recursively evaluated and yield a Preserves
[`Value`](../guide/preserves.md#values-and-types). Syntactically, they consist of literal
non-symbol atoms, compound data structures (records, sequences, sets and dictionaries), plus
special syntax for *[attenuated](../glossary.md#attenuation) entity references*, *variable references*, and literal symbols:
- *AttenuationExpr*, described below, evaluates to an entity reference with an attached
[attenuation](../glossary.md#attenuation).
- `$`*var* evaluates to the binding for *var* in the environment, if there is one, or crashes
the actor, if there is not.
- `=`*symbol* and *bare-symbol* (i.e. any symbols except [a binding, a reference, or a
discard](#patterns-variable-references-and-variable-bindings)) denote literal symbols.
## Attenuation Expressions
*AttenuationExpr* = `<* $`*var*` [`*Caveat* ...`]>`
*Caveat* =
    `<or [`*Rewrite* ...`]>` |
    `<reject `*PatternExpr*`>` |
    *Rewrite*
*Rewrite* =
    `<accept `*PatternExpr*`>` |
    `<rewrite `*PatternExpr*` `*TemplateExpr*`>`
An attenuation expression looks up *var* in the environment, checks that it is an entity
reference *orig*, and returns a new entity reference *ref*, like *orig* but
[attenuated](../glossary.md#attenuation) with zero or more *Caveat*s. The result of evaluation
is *ref*, the new attenuated entity reference.
When an assertion is published or a message arrives at *ref*, the sequence of *Caveat*s is
executed **right-to-left**, transforming and possibly discarding the asserted value or message
body. If all *Caveat*s succeed, the final transformed value is forwarded on to *orig*. If any
*Caveat* fails, the assertion or message is silently ignored.
A *Caveat* can be one of three possibilities:
- An `or` of multiple alternative *Rewrite*s. The first *Rewrite* to accept (and possibly
transform) the input value causes the whole `or` *Caveat* to succeed. If all the *Rewrite*s
in the `or` fail, the `or` itself fails. Supplying a *Caveat* that is an `or` containing
zero *Rewrite*s will reject *all* assertions and messages.
- A `reject`, which allows all values through unchanged except those matching *PatternExpr*.
- A simple *Rewrite*.

A *Rewrite* can be one of two possibilities:
- A `rewrite`, which matches input values with *PatternExpr*. If the match fails, the
*Rewrite* fails. If it succeeds, the resulting bindings are used along with the current
environment to evaluate *TemplateExpr*, and the *Rewrite* succeeds, yielding the resulting
value.
- An `accept`, which is the same as `<rewrite <?`*v*` `*PatternExpr*`> $`*v*`>` for some fresh
*v*.
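
To illustrate the right-to-left ordering, here is a sketch that combines a `reject` with an `or`. The variable `limited` and the `example-` labels are hypothetical, and the comments follow the evaluation rules as described above.

```preserves
let ?limited = <* $config [
  # Runs second: drops anything that looks like <example-secret ...>
  <reject <example-secret _>>
  # Runs first: rewrites notes, and passes every other value through unchanged
  <or [
    <rewrite <example-note ?text> <example-audited-note $text>>
    <accept _>
  ]>
]>
```

Under those rules, an `<example-note "hi">` published at `$limited` would reach `$config` as `<example-audited-note "hi">`, while an `<example-secret ...>` assertion would be silently ignored.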
## Pattern Expressions
*PatternExpr* =
    `#t` | `#f` | *double* | *int* | *string* | *bytes* |
    `$`*var* | `?`*var* | `_` | `=`*symbol* | *bare-symbol* |
    *AttenuationExpr* |
    `<?`*var*` `*PatternExpr*`>` |
    `<`*PatternExpr*` `*PatternExpr*...`>` |
    `[`*PatternExpr*...`]` |
    `{`*literal*`:`*PatternExpr*` `...`}`
Pattern expressions are recursively evaluated to yield a [dataspace
pattern](../glossary.md#dataspace-pattern). Evaluation of a *PatternExpr* is like evaluation of
a *ValueExpr*, except that binders and wildcards are allowed, set syntax is not allowed, and
dictionary keys are constrained to being literal values rather than *PatternExpr*s.
Two kinds of binder are supplied. The more general is `<?`*var*` `*PatternExpr*`>`, which
evaluates to a pattern that succeeds, capturing the matched value in a variable named *var*,
only if *PatternExpr* succeeds. For the special case of `<?`*var*` _>`, the shorthand form
`?`*var* is supported.
The pattern `_` ([*discard*, *wildcard*](#patterns-variable-references-and-variable-bindings))
always succeeds, matching any value.
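
The binder forms and dictionary patterns might look like this in use (hypothetical `example-` labels, subscribing at the active target):

```preserves
# ?name is shorthand for <?name _>: capture whatever appears in that position
? <example-user ?name _> <example-seen $name>
# <?whole ...> captures the matched value only if the inner pattern succeeds
? <?whole <example-user _ admin>> <example-admin-record $whole>
# Dictionary patterns use literal keys; `role` is a literal symbol key here
? <example-settings {role: ?r}> <example-role $r>
```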
## Template Expressions
*TemplateExpr* =
    `#t` | `#f` | *double* | *int* | *string* | *bytes* |
    `$`*var* | `=`*symbol* | *bare-symbol* |
    *AttenuationExpr* |
    `<`*TemplateExpr*` `*TemplateExpr*...`>` |
    `[`*TemplateExpr*...`]` |
    `{`*literal*`:`*TemplateExpr*` `...`}`
Template expressions are used in [attenuation expressions](#attenuation-expressions) as part of
value-rewriting instructions. Evaluation of a *TemplateExpr* is like evaluation of a
*ValueExpr*, except that set syntax is not allowed and dictionary keys are constrained to being
literal values rather than *TemplateExpr*s.
Additionally, record template labels (just after a "`<`") must be "literal-enough". If any
sub-part of the label *TemplateExpr* refers to a variable's value, the variable must have been
bound in the environment surrounding the *AttenuationExpr* that the *TemplateExpr* is part of,
and must not be any of the capture variables from the *PatternExpr* corresponding to the
template. This is a constraint stemming from the definition of the [syntax used for expressing
capability attenuation](../protocol.md#attenuation-of-authority) in the underlying Syndicated
Actor Model.
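
As a sketch of the "literal-enough" rule, assuming the hypothetical variables `label` and `tagger` and the hypothetical label `example-item`: the record label in the template below refers to `$label`, which is bound outside the *AttenuationExpr*, so it should be acceptable under the constraint just described.

```preserves
let ?label = example-tagged
let ?tagger = <* $config [
  # The label $label is bound by the enclosing `let`, not by the pattern
  <rewrite <example-item ?x> <$label $x>>
]>
```

By contrast, a template such as `<$l $x>` paired with the pattern `<example-item ?l ?x>` would take its label from a capture variable, which the rule forbids.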
## Examples
<span id="example-1">**Example 1.**</span> The simplest example uses no variables, publishing
constant assertions to the implicit default target, `$config`:
```preserves
<require-service <daemon console-getty>>
<daemon console-getty "getty 0 /dev/console">
```
<span id="example-2">**Example 2.**</span> A more complex example subscribes to two kinds of
`service-state` assertion at the dataspace named by the default target, `$config`, and in
response to their existence asserts a rewritten variation on them:
```preserves
? <service-state ?x ready> <service-state $x up>
? <service-state ?x complete> <service-state $x up>
```
In prose, it reads as "during any assertion at `$config` of a `service-state` record with state
`ready` for any service name `x`, assert (also at `$config`) that `x`'s `service-state` is `up`
in addition to `ready`," and similar for state `complete`.
<span id="example-3">**Example 3.**</span> The following example first attenuates `$config`,
binding the resulting capability to `$sys`. Any `require-service` record published to `$sys` is
rewritten into a `require-core-service` record; other assertions are forwarded unchanged.
```preserves
let ?sys = <* $config [<or [
<rewrite <require-service ?s> <require-core-service $s>>
<accept _>
]>]>
```
Then, `$sys` is used to build the initial environment for a [configuration
tracker](./builtin/config-watcher.md), which executes script files in the `/etc/syndicate/core`
directory using the environment given.
```preserves
<require-service <config-watcher "/etc/syndicate/core" {
config: $sys
gatekeeper: $gatekeeper
log: $log
}>>
```
<span id="example-4">**Example 4.**</span> The final example executes a script in response to
an `exec` record being sent as a message to `$config`. The use of `??` indicates a
message-event-handler, rather than `?`, which would indicate an assertion-event-handler.
```preserves
?? <exec ?argv ?restartPolicy> [
let ?id = timestamp
let ?facet = facet
let ?d = <temporary-exec $id $argv>
<run-service <daemon $d>>
<daemon $d {
argv: $argv,
readyOnStart: #f,
restart: $restartPolicy,
}>
? <service-state <daemon $d> complete> [$facet ! stop]
? <service-state <daemon $d> failed> [$facet ! stop]
]
```
First, the current timestamp is bound to `$id`, and a fresh entity representing the facet
established in response to the `exec` message is created and bound to `$facet`. The variable
`$d` is then initialized to a value uniquely identifying this particular `exec` request. Next,
`run-service` and `daemon` assertions are placed in `$config`. These assertions communicate
with the [built-in program execution and supervision service](./builtin/daemon.md), causing a
Unix subprocess to be created to execute the command in `$argv`. Finally, the script responds
to `service-state` assertions from the execution service by terminating the facet by sending
its representative entity, `$facet`, a `stop` message.
## Programming idioms
**Conventional top-level variable bindings.** Besides `config`, many scripts are executed in a
context where `gatekeeper` names a server-wide [gatekeeper](./builtin/gatekeeper.md) entity,
and `log` names an entity that logs messages of a certain shape that are delivered to it.
**Setting the active target register.** The following *pairs* of *Instruction*s first set and
then use the [active target register](#the-active-target):
```preserves
$log ! <log "-" { line: "Hello, world!" }>
```
```preserves
$config ? <configure-interface ?ifname <dhcp>> [
<require-service <daemon <udhcpc $ifname>>>
]
```
```preserves
$config ? <service-object <daemon interface-monitor> ?cap> [
$cap {
machine: $machine
}
]
```
In the last one, `$cap` is captured from `service-object` records at `$config` and is then used
as a target for publication of a dictionary (containing key `machine`).
**Using conditionals.** The syntax of *ConditionalInstruction* is such that it can be easily
chained:
```preserves
$val =~ pat1 [ ... if pat1 matches ...]
$val =~ pat2 [ ... if pat2 matches ...]
... if neither pat1 nor pat2 matches ...
```
**Using dataspaces as ad-hoc entities.** Constructing a dataspace, attaching subscriptions to
it, and then passing it to somewhere else is a useful trick for creating scripted entities able
to respond to a few different kinds of assertion or message:
```preserves
let ?ds = dataspace # create the dataspace
$config += <my-entity $ds> # send it to peers for them to use
$ds [ # select $ds as the active target for the instructions inside the [...]
? pat1 [ ... ] # respond to assertions of the form `pat1`
? pat2 [ ... ] # respond to assertions of the form `pat2`
?? pat3 [ ... ] # respond to messages of the form `pat3`
?? pat4 [ ... ] # respond to messages of the form `pat4`
]
```
---
#### Notes
[^automatic-termination]: This isn't quite true. If, after execution of *Instruction*, the new
facet is "inert"—roughly speaking, has published no assertions and has no subfacets—then it
is terminated. However, since inert facets are unreachable and cannot interact with
anything or affect the future of a program in any way, this is operationally
indistinguishable from being left in existence, and so serves only to release memory for
later reuse.