# RacketMQ: An implementation of W3C WebSub

This is an implementation of a W3C WebSub Hub in Racket, using the
actor-style research language [Syndicate](http://syndicate-lang.org/).

## What is WebSub?

On the 20th of October 2016, the W3C released a First Public Working
Draft of (what was called at the time) PubSub, later renamed to
WebSub.

See the specification of the W3C WebSub protocol at
<https://www.w3.org/TR/pubsub/> (and track its development at
<https://github.com/w3c/pubsub>).
**N.B.: These URLs will eventually have `websub` in place of `pubsub`.**

[w3cspec]: https://www.w3.org/TR/pubsub/

## Quick Start

1. Install Racket from <http://download.racket-lang.org/>
2. Install RacketMQ by running `raco pkg install --auto racketmq`
3. Start the hub by running `racketmq --baseurl http://localhost:7827/ --listen localhost 7827`

To install from git, replace the `raco pkg install ...` step above
with an invocation of `make link` from the top directory of your git
checkout.

## Features

- Offers both *local topics*, topics whose canonical hub is this hub,
  and *remote topics*, topics whose canonical hub is some other
  ("upstream") hub.

- Supports polling and push notification for remote topics, with a
  configurable poll interval; this allows *hub chaining*.

- Uses HTTP `Link` headers when retrieving a topic to determine
  canonical hub and topic URLs; does not extract `link` elements from
  any kind of XML or HTML document, nor does it implement
  `.host-meta` discovery.

- Offers [WebSocket][]-based subscriptions to WebSub topics, in
  addition to the usual
  [WebHook](https://en.wikipedia.org/wiki/Webhook)-based
  subscriptions.

[WebSocket]: https://tools.ietf.org/html/rfc6455

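
As a rough illustration of the `Link`-header discovery mentioned in the feature list, the following Python sketch extracts `rel="hub"` and `rel="self"` targets from header values. It is a deliberately simplified parser written for this example, not RacketMQ's own code; the function names `parse_link_header` and `discover` are hypothetical, and quoted parameters containing `<` or `,` are not handled.

```python
import re


def parse_link_header(value):
    """Return {rel: [url, ...]} for one Link header value (simplified)."""
    rels = {}
    # Each link-value looks like: <URL>; param=value; ...
    for match in re.finditer(r'<([^>]*)>([^<]*)', value):
        url, params = match.group(1), match.group(2)
        rel_match = re.search(r'\brel\s*=\s*"?([^";,]+)"?', params, re.IGNORECASE)
        if rel_match is None:
            continue
        # A single rel parameter may list several space-separated relations.
        for rel in rel_match.group(1).split():
            rels.setdefault(rel.lower(), []).append(url)
    return rels


def discover(link_header_values):
    """Collect hub and self URLs from all Link headers of a response."""
    hubs, selfs = [], []
    for value in link_header_values:
        rels = parse_link_header(value)
        hubs.extend(rels.get("hub", []))
        selfs.extend(rels.get("self", []))
    return hubs, selfs
```

A response carrying `Link: <https://hub.example.com/hub>; rel="hub"` and `Link: <https://example.com/feed>; rel="self"` (both URLs invented for the example) would yield one hub URL and one topic URL.
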
## Configuration

The most important RacketMQ configuration variable is its canonical
base URL: the URL prefix used to build URLs for clients to use.

When the RacketMQ startup script is given a "`-f` *filename*" option,
it loads configuration data from the named file. The option can be
supplied more than once; all named files are imported.

For a fully-commented example configuration file, see
[`racketmq/defaults.rktd`](racketmq/defaults.rktd).

Within each file, each configuration entry should be a list (see
[Racket syntax](https://docs.racket-lang.org/reference/reader.html))
with a symbol (the "key") as its first item followed by zero or more
items. Line comments start with a semicolon (`;`), as usual for
S-expression languages.

Each configuration file is automatically reread by the server when it
is changed: if you need to make changes, consider doing so atomically
by producing an updated configuration file and using
`rename(2)`/`mv(1)` to activate it.

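
One way to follow the write-then-rename advice from a deployment script is sketched below in Python. The helper names `format_entry` and `write_config_atomically` are made up for this example, and the sketch only renders strings and numbers; it is not part of RacketMQ.

```python
import os
import tempfile


def format_entry(key, *items):
    """Render one configuration entry as an S-expression list."""
    def render(item):
        if isinstance(item, str):
            # Strings are written double-quoted; escape backslashes and quotes.
            return '"' + item.replace("\\", "\\\\").replace('"', '\\"') + '"'
        return str(item)
    return "(" + " ".join([key] + [render(i) for i in items]) + ")"


def write_config_atomically(path, entries):
    """Write entries to a temporary file, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target for
    # the rename to be atomic. mkstemp creates it with mode 0600; adjust
    # permissions if the server runs as a different user.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            for entry in entries:
                tmp.write(format_entry(*entry) + "\n")
        os.replace(tmp_path, path)  # rename(2) on POSIX systems
    except BaseException:
        os.unlink(tmp_path)
        raise
```

For instance, `write_config_atomically("hub.rktd", [("canonical-baseurl", "http://localhost:7827/"), ("http-listener", "localhost", 7827)])` would produce a two-entry file; the file name `hub.rktd` is only a placeholder.
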
### Required configuration data

    (canonical-baseurl "http://localhost:7827/")

Exactly one "canonical-baseurl" key, containing a URL string naming
the base URL used for constructing URLs that are given out to third
parties, such as subscription endpoints for upstream hubs to use.

This is *just* for URL construction, and does NOT create any HTTP
listeners. Those are configured with "http-listener" keys:

    (http-listener "localhost" 7827)
    ;; (http-listener "localhost" 80)
    ;; (http-listener "www.example.com" 7827)
    ;;
    ;; etc.

At least one "http-listener" key is required. These cause an HTTP
server to be spun up for each mentioned port number. Traffic will only
be accepted for HTTP Host headers mentioned in these keys.

Since these are the only mandatory configuration items, RacketMQ can
run without any configuration file at all if the server is started
with the `--baseurl` and `--listen` command-line arguments:

    racketmq --baseurl http://localhost:7827/ --listen localhost 7827

### Fine tuning

You will seldom want to alter these settings.

    (max-upstream-redirects 5)

When performing discovery / upstream content retrieval, the hub
will follow this many redirects before deciding it has had enough.

    (default-lease 86400) ;; 86400 seconds = one day
    (max-lease 604800) ;; 604800 seconds = one week

If a subscription request arrives with no specified
`hub.lease_seconds`, then `default-lease` is used. If a requested
lease duration exceeds `max-lease`, then `max-lease` is used
instead.

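
The lease rule can be restated as a few lines of Python. This is only a reading of the two sentences above, using the example values as defaults; the function name `effective_lease` is hypothetical and the hub's actual code may differ in detail.

```python
DEFAULT_LEASE = 86400  # seconds, the example default-lease value
MAX_LEASE = 604800     # seconds, the example max-lease value


def effective_lease(requested=None, default_lease=DEFAULT_LEASE, max_lease=MAX_LEASE):
    """Return the lease duration, in seconds, for a subscription request.

    `requested` is the hub.lease_seconds value, or None if it was absent.
    """
    if requested is None:
        return default_lease
    # Requests longer than max-lease are clamped to max-lease.
    return min(requested, max_lease)
```
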
    (min-poll-interval 60) ;; seconds
    (default-poll-interval "none") ;; seconds, or "none"

Upstream topics will be polled from time to time, according to the
settings of each local subscription to the topic. Subscriptions may
supply `hub.poll_interval_seconds` as either a number or the string
"none". If no `hub.poll_interval_seconds` is supplied in a
subscription, `default-poll-interval` is used. If all subscriptions
to an upstream topic have "none" as their poll interval, no polling
will occur; otherwise, polling will occur at the fastest requested
rate, but never more frequently than every `min-poll-interval`
seconds.

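
The polling rule combines the per-subscription settings into a single interval. The Python sketch below is one interpretation of that paragraph; the function name `effective_poll_interval` is invented for the example, and `None` stands for a subscription that supplied no `hub.poll_interval_seconds`.

```python
def effective_poll_interval(requested, default_poll_interval="none", min_poll_interval=60):
    """Return the polling interval in seconds, or None for "do not poll".

    `requested` holds one entry per subscription to the upstream topic:
    a number of seconds, the string "none", or None if nothing was supplied.
    """
    # Subscriptions that supplied nothing fall back to the default.
    resolved = [default_poll_interval if r is None else r for r in requested]
    numeric = [r for r in resolved if r != "none"]
    if not numeric:
        return None  # every subscription asked for "none"
    # Fastest requested rate, but never faster than min-poll-interval.
    return max(min(numeric), min_poll_interval)
```
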
    (subscription-retry-delay 600) ;; seconds

If subscription to an upstream hub fails immediately, the hub
schedules a retry after this many seconds.

    (max-dead-letters 10)
    (max-delivery-retries 10)
    (initial-retry-delay 5.0) ;; seconds
    (retry-delay-multiplier 1.618)
    (max-retry-delay 30) ;; seconds

Subscriptions last until explicitly terminated by an unsubscription
request, implicitly terminated by lease expiry, or implicitly
terminated by sustained delivery failure.

When the hub sends a *content distribution request* (see the WebSub
spec) to a subscription's callback, if a success response is
returned, the delivery is considered successful.

Otherwise, the hub begins an exponential backoff process, with an
initial delay of `initial-retry-delay` seconds, increasing by a
factor of `retry-delay-multiplier` (subject to a cap of
`max-retry-delay` seconds) with each subsequent attempt until
`max-delivery-retries` attempts have been made. At that point, if
all attempts to deliver the particular content distribution request
have failed, the request is considered a "dead letter" and is
effectively discarded. Once a request has either succeeded or
become a dead letter, the hub continues with any further pending
content distribution requests for the subscription.

If more than `max-dead-letters` dead letters pile up for a
subscription, the subscription is considered too damaged to
continue to exist, and is terminated.

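
To get a feel for the backoff settings, the schedule of delays can be computed with a short Python sketch. It assumes one delay per retry and does not try to settle whether the initial delivery attempt counts toward `max-delivery-retries`, which the text above leaves open; `retry_delays` is a hypothetical name.

```python
def retry_delays(initial_retry_delay=5.0, retry_delay_multiplier=1.618,
                 max_retry_delay=30, max_delivery_retries=10):
    """Return the list of delays (seconds) between successive attempts."""
    delays = []
    delay = initial_retry_delay
    for _ in range(max_delivery_retries):
        delays.append(min(delay, max_retry_delay))
        # Grow geometrically, but never beyond the configured cap.
        delay = min(delay * retry_delay_multiplier, max_retry_delay)
    return delays
```

With the example settings, the delays start at 5 seconds, grow by a factor of 1.618 each time, and reach the 30-second cap by the fifth retry, so a failing delivery under these assumptions is abandoned after a few minutes.
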
## Hub URL layout

- `/hub`: Local subscription management; main Hub URL.

  This is the main URL for creating and deleting subscriptions to
  (local or remote) topics.

  - method `POST`: create or delete a subscription, following
    [the specification][w3cspec]. Supply `hub.mode`, `hub.topic`,
    `hub.callback` and other relevant parameters to manage
    subscriptions.

  - method `GET`, when an `upgrade` header with value `websocket` is
    present: create a streaming subscription to a topic. See below.

- `/topic/`*topic*: Local topic endpoint.

  A *local topic* is a topic managed by this hub. Publishers `POST`
  their content to the local topic endpoint, and subscribers are
  notified of the change. Local topics may be managed explicitly or
  implicitly; any subscription to a local topic will automatically
  cause it to be created, even if it has not been previously
  explicitly `PUT` into existence.

  - method `PUT`: create a local topic explicitly
  - method `DELETE`: delete an explicitly-created local topic
  - method `HEAD`: get headers associated with the most recent topic value
  - method `GET`: get the most recent topic value
  - method `POST`: update the topic value with the post body

- `/sub/`*sub-id*: Upstream subscription endpoint.

  When a subscription to a remote topic is created, if the remote
  topic has an advertised hub, this hub subscribes to the remote hub,
  and content distribution requests are `POST`ed to a fresh upstream
  subscription endpoint URL.

  - method `GET`: for verification-of-intent requests from upstream.
  - method `POST`: for content distribution requests from upstream.

- `/`*path/to/file/in/htdocs*: Static resource.

  The `racketmq/htdocs` subdirectory contains static resources to be
  served by the hub.

  - method `GET`: retrieve a static resource.

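
As an illustration of the `POST` to `/hub` described above, here is a minimal Python sketch that builds (but does not send) a form-encoded subscription request, the encoding the WebSub specification prescribes. The function name `subscription_request` and all URLs used with it are assumptions for the example.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request


def subscription_request(base_url, topic, callback, mode="subscribe", lease_seconds=None):
    """Build a POST request for the hub's /hub endpoint without sending it."""
    params = {"hub.mode": mode, "hub.topic": topic, "hub.callback": callback}
    if lease_seconds is not None:
        params["hub.lease_seconds"] = str(lease_seconds)
    return Request(
        urljoin(base_url, "hub"),
        data=urlencode(params).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit it to a running hub; using `mode="unsubscribe"` builds the corresponding deletion request.
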
## Streaming WebSocket-based Subscriptions

In addition to the [standard][w3cspec] WebHook-based subscriptions,
RacketMQ offers [WebSocket][]-based subscriptions.

If your server's base URL is `https://example.com/`, then connecting a
WebSocket to URL `wss://example.com/hub?hub.topic=MYTOPIC` will create
a streaming subscription to the topic `MYTOPIC`. (For plain `http:`,
use `ws:`.)

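
Deriving the WebSocket URL from the hub's base URL is mechanical, as the following Python sketch shows. The helper name `streaming_url` is hypothetical, and the sketch assumes the base URL uses either `http` or `https`.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def streaming_url(base_url, topic):
    """Return the WebSocket URL for a streaming subscription to `topic`."""
    parts = urlsplit(base_url)
    # https maps to wss, plain http maps to ws.
    scheme = {"http": "ws", "https": "wss"}[parts.scheme]
    path = parts.path.rstrip("/") + "/hub"
    query = urlencode({"hub.topic": topic})
    return urlunsplit((scheme, parts.netloc, path, query, ""))
```
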
Content will be delivered from the server as JSON messages of the form

```json
{
  "topic": "MYTOPIC",
  "link": {
    "hub": "https://example.com/hub",
    "self": "https://example.com/topic/MYTOPIC"
  },
  "content-type": "text/plain",
  "content-base64": "..."
}
```

The `link` object corresponds to the `Link` headers that would usually
be sent in a WebSub WebHook-based content distribution request, the
`content-type` string to the `Content-Type` header, and the
`content-base64` string to the base64-encoded bytes of the body. The
`topic` string is always based on the `hub.topic` parameter supplied
in the URL that the WebSocket was initially connected to.

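
A client receiving such a message mainly needs to parse the JSON and undo the base64 encoding. The Python sketch below does that for a single message text; `decode_message` is a name chosen for this example, and obtaining the text from an actual WebSocket connection is left to whichever client library is in use.

```python
import base64
import json


def decode_message(text):
    """Decode one JSON message received over a streaming subscription."""
    message = json.loads(text)
    return {
        "topic": message["topic"],
        "links": message.get("link", {}),
        "content_type": message.get("content-type"),
        # The body travels as base64 so that arbitrary bytes survive JSON.
        "body": base64.b64decode(message["content-base64"]),
    }
```

The returned `body` is a `bytes` object; whether to decode it further depends on the `content_type` value.
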
## Conformance

At the time of writing, no official list of conformance criteria
exists; however, there is a draft list of Candidate Recommendation
implementation criteria at <https://github.com/w3c/pubsub/issues/56>.

## Codebase Layout

Files at the top level of the git checkout:

- `COPYING`, `gpl.txt`, `lgpl.txt`: Licensing and copyright information
- `info.rkt`: Racket package control metadata
- `nginx.conf`: Example nginx configuration file, for running RacketMQ behind nginx

In the `racketmq/` directory are the sources for the RacketMQ server:

- `hub.rkt`: **Main entry point for RacketMQ server**
- `config.rkt`: Actor that tracks changes in config files
- `protocol.rkt`: Definitions of protocol structures for coordination among RacketMQ actors
- `hub/`: Source code for the main functions of the RacketMQ server
  - `hub/static-content.rkt`: Actor serving static content from `htdocs/`
  - `hub/subscription.rkt`: Actors implementing downstream WebHook-based subscriptions
  - `hub/websocket.rkt`: Actors implementing downstream WebSocket-based subscriptions
  - `hub/topic-demand.rkt`: Actor that analyzes a subscription topic
    URL, deciding whether it represents a local topic or a remote
    topic.
  - `hub/local-topic.rkt`: Actor implementing a local RacketMQ topic
  - `hub/remote-topic.rkt`: Actors implementing a remote RacketMQ
    topic and WebSub subscribers that relay content from upstream
    hubs (if any) to downstream subscribers

The `racketmq/` directory also contains a few other files of interest:

- `defaults.rktd`: Fully-commented RacketMQ configuration file
- `poke.rkt`: Simple interactive tool for interacting with RacketMQ
- `run`: [Daemontools](https://cr.yp.to/daemontools.html) startup script for the server
- `log/run`: Daemontools logging startup script for the server
- `htdocs/`: Static files to be served by the server
  - `htdocs/500.html`: Error document used by nginx when it cannot reach RacketMQ
  - `htdocs/client.html`: Simple interactive tool for experimenting
    with WebSocket subscriptions in the browser
  - `htdocs/client.js`: JavaScript code for `client.html`

## Bug Reports

Please report issues using this project's GitHub issues page,
<https://github.com/tonyg/racketmq/issues>.

## License

Copyright © 2016 Tony Garnock-Jones <tonyg@leastfixedpoint.com>

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this program (see the files "lgpl.txt" and
"gpl.txt"). If not, see <http://www.gnu.org/licenses/>.