

# RacketMQ: An implementation of W3C WebSub
This is an implementation of a W3C WebSub Hub in Racket, using the
actor-style research language [Syndicate](http://syndicate-lang.org/).
## What is WebSub?
On the 20th of October 2016, the W3C released a First Public Working
Draft of (what was called at the time) PubSub, later renamed to WebSub.
See the specification of the W3C WebSub protocol at
<https://www.w3.org/TR/pubsub/> (and track its development at
**N.B.: These URLs will eventually have `websub` in place of `pubsub`.**
[w3cspec]: https://www.w3.org/TR/pubsub/
## Quick Start
1. Install Racket from <http://download.racket-lang.org/>
2. Install RacketMQ by running `raco pkg install --auto racketmq`
3. Run `racketmq --baseurl http://localhost:7827/ --listen localhost 7827`
To install from git, replace the `raco pkg install ...` step above
with an invocation of `make link` from the top directory of your git
checkout.
## Features
- Offers both *local topics*, topics whose canonical hub is this hub,
and *remote topics*, topics whose canonical hub is some other
("upstream") hub
- Support for polling and push-notification for remote topics, with
configurable poll interval; this allows *hub chaining*.
- Uses HTTP `Link` headers when retrieving a topic to determine
canonical hub and topic URLs; does not extract `link` elements from
any kind of XML or HTML document, nor does it implement
`.host-meta` discovery
- [WebSocket][]-based subscriptions to WebSub topics, in
addition to the usual WebHook-based subscriptions
[WebSocket]: https://tools.ietf.org/html/rfc6455
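
As an illustration of the `Link`-header discovery mentioned in the feature list, the following Python sketch extracts `rel="hub"` and `rel="self"` targets from header values. It is not RacketMQ's own code: the function name `discover_links` is hypothetical, and the parsing is deliberately simplified (it assumes no commas or semicolons inside quoted parameter values).

```python
import re

# One link-value: <URI> followed by its ";param=value" pairs.
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
# The rel parameter, quoted or unquoted.
_REL_PARAM = re.compile(r';\s*rel\s*=\s*"?([^";]+)"?', re.IGNORECASE)


def discover_links(link_headers):
    """Collect hub and self URLs from a list of Link header values.

    Hypothetical helper for illustration only.
    """
    found = {"hub": [], "self": []}
    for header in link_headers:
        for uri, params in _LINK_VALUE.findall(header):
            match = _REL_PARAM.search(params)
            if not match:
                continue
            # A rel parameter may carry several space-separated relation types.
            for rel in match.group(1).lower().split():
                if rel in found:
                    found[rel].append(uri)
    return found
```

A topic that advertises no `hub` link would yield an empty `hub` list here, which matches the case where only polling is available.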
## Configuration
The most important RacketMQ configuration variable is its canonical
base URL: the URL prefix used to build URLs for clients to use.
When the RacketMQ startup script is given a "`-f` *filename*" option,
it loads configuration data from the named file. The option can be
supplied more than once; all named files are imported.
For a fully-commented example configuration file, see `defaults.rktd`
in the `racketmq/` directory.
Within each file, each configuration entry should be a list (see
[Racket syntax](https://docs.racket-lang.org/reference/reader.html))
with a symbol (the "key") as its first item followed by zero or more
items. Line comments start with semicolon (`;`) as usual for
S-expression languages.
Each configuration file is automatically reread by the server when it
is changed: if you need to make changes, consider doing so atomically
by producing an updated configuration file and using
`rename(2)`/`mv(1)` to activate it.
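
One way to perform such an atomic update from a deployment script is sketched below in Python. The helper name `replace_config_atomically` and the temporary-file prefix are assumptions made for illustration; the only technique relied upon is writing a complete new file and renaming it over the old one.

```python
import os
import tempfile


def replace_config_atomically(path, new_text):
    """Write new_text to a temporary file, then rename it over path.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".racketmq-config-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(new_text)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # rename(2) on POSIX systems
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this approach the server never observes a half-written configuration file: it sees either the old contents or the new ones.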
### Required configuration data

    (canonical-baseurl "http://localhost:7827/")

Exactly one "canonical-baseurl" key, containing a URL string naming
the base URL used for constructing URLs that are given out to third
parties, such as subscription endpoints for upstream hubs to use.
This is *just* for URL construction, and does NOT create any HTTP
listeners. Those are configured with "http-listener" keys:

    (http-listener "localhost" 7827)
    ;; (http-listener "localhost" 80)
    ;; (http-listener "www.example.com" 7827)
    ;; etc.

At least one "http-listener" key is required. These cause an HTTP
server to be spun up for each mentioned port number. Traffic will only
be accepted for HTTP Host headers mentioned in these keys.
Since these are the only mandatory configuration items, RacketMQ can
run without any configuration file at all if the server is started
with the `--baseurl` and `--listen` command-line arguments:

    racketmq --baseurl http://localhost:7827/ --listen localhost 7827

### Fine tuning
You will seldom want to alter these settings.

    (max-upstream-redirects 5)

When performing discovery / upstream content retrieval, the hub
will follow this many redirects before deciding it has had enough.

    (default-lease 86400) ;; 86400 seconds = one day
    (max-lease 604800) ;; 604800 seconds = one week

If a subscription request arrives with no specified
`hub.lease_seconds`, then `default-lease` is used. If a requested
lease duration exceeds `max-lease`, then `max-lease` is used.
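
The lease rule can be summarised in a few lines of Python. This is a sketch of the rule as described here, not code from the hub; the function name `effective_lease` is hypothetical, and `None` stands for a request that carries no `hub.lease_seconds`.

```python
def effective_lease(requested, default_lease=86400, max_lease=604800):
    """Return the lease duration, in seconds, granted to a subscription.

    Hypothetical helper; defaults mirror the example settings above.
    """
    if requested is None:
        # No hub.lease_seconds supplied: fall back to default-lease.
        return default_lease
    # Requests longer than max-lease are clamped to max-lease.
    return min(requested, max_lease)
```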

    (min-poll-interval 60) ;; seconds
    (default-poll-interval "none") ;; seconds, or "none"

Upstream topics will be polled from time to time, according to the
settings of each local subscription to the topic. Subscriptions may
supply `hub.poll_interval_seconds` as either a number or the string
"none". If no `hub.poll_interval_seconds` is supplied in a
subscription, `default-poll-interval` is used. If all subscriptions
to an upstream topic have "none" as their poll interval, no polling
will occur; otherwise, polling will occur at the fastest requested
rate, but never more frequently than every `min-poll-interval`
seconds.
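
A minimal Python sketch of this selection rule follows. The function name `effective_poll_interval` is hypothetical, `None` in the input stands for a subscription that supplied no `hub.poll_interval_seconds`, and a `None` result means that no polling takes place.

```python
def effective_poll_interval(requested_intervals,
                            min_poll_interval=60,
                            default_poll_interval="none"):
    """Pick the poll interval for one upstream topic.

    requested_intervals holds one entry per subscription: a number of
    seconds, the string "none", or None when nothing was supplied.
    Hypothetical helper for illustration only.
    """
    # Subscriptions without an explicit value use default-poll-interval.
    resolved = [default_poll_interval if r is None else r
                for r in requested_intervals]
    numeric = [r for r in resolved if r != "none"]
    if not numeric:
        return None  # every subscription asked for "none": do not poll
    # Fastest requested rate, but never faster than min-poll-interval.
    return max(min(numeric), min_poll_interval)
```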

    (subscription-retry-delay 600) ;; seconds

If subscription to an upstream hub fails immediately, we will
schedule a retry in this many seconds.

    (max-dead-letters 10)
    (max-delivery-retries 10)
    (initial-retry-delay 5.0) ;; seconds
    (retry-delay-multiplier 1.618)
    (max-retry-delay 30) ;; seconds

Subscriptions last until explicitly terminated by an unsubscription
request, implicitly terminated by lease expiry, or implicitly
terminated by sustained delivery failure.
When the hub sends a *content distribution request* (see the WebSub
spec) to a subscription's callback, if a success response is
returned, the delivery is considered successful.
Otherwise, the hub begins an exponential backoff process, with an
initial delay of `initial-retry-delay` seconds, increasing by a
factor of `retry-delay-multiplier` (subject to a cap of
`max-retry-delay` seconds) with each subsequent attempt until
`max-delivery-retries` attempts have been made. At that point, if
all attempts to deliver the particular content distribution request
have failed, the request is considered a "dead letter" and is
effectively discarded. Once a request has either succeeded or
become a dead letter, the hub continues with any further pending
content distribution requests for the subscription.
If more than `max-dead-letters` dead letters pile up for a
subscription, the subscription is considered too damaged to
continue to exist, and is terminated.
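
To make the backoff schedule concrete, the Python sketch below computes the sequence of delays implied by these settings. It assumes one delay per delivery attempt, which is an interpretation of the description above rather than a statement about the hub's exact bookkeeping; the function names are hypothetical.

```python
def retry_delays(max_delivery_retries=10,
                 initial_retry_delay=5.0,
                 retry_delay_multiplier=1.618,
                 max_retry_delay=30):
    """List the delay, in seconds, associated with each delivery attempt.

    Hypothetical helper; assumes one delay per attempt.
    """
    delays = []
    delay = initial_retry_delay
    for _ in range(max_delivery_retries):
        # Each delay is capped at max-retry-delay.
        delays.append(min(delay, max_retry_delay))
        delay *= retry_delay_multiplier
    return delays


def subscription_is_terminated(dead_letters, max_dead_letters=10):
    """A subscription ends once more than max-dead-letters have piled up."""
    return dead_letters > max_dead_letters
```

With the example settings, the delays grow from 5 seconds and reach the 30-second cap by the fifth attempt, after which they stay at the cap.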
## Hub URL layout
- `/hub` — Local subscription management; main Hub URL.

  This is the main URL for creating and deleting subscriptions to
  (local or remote) topics.

  - method `POST`: create or delete a subscription, following
    [the specification][w3cspec]. Supply `hub.mode`, `hub.topic`,
    `hub.callback` and other relevant parameters to manage
    subscriptions.
  - method `GET`, when an `upgrade` header with value `websocket` is
    present: create a streaming subscription to a topic. See below.

- `/topic/`*topic* — Local topic endpoint.

  A *local topic* is a topic managed by this hub. Publishers `POST`
  their content to the local topic endpoint, and subscribers are
  notified of the change. Local topics may be managed explicitly or
  implicitly; any subscription to a local topic will automatically
  cause it to be created, even if it has not been previously
  explicitly `PUT` into existence.

  - method `PUT`: create a local topic explicitly
  - method `DELETE`: delete an explicitly-created local topic
  - method `HEAD`: get headers associated with the most recent topic value
  - method `GET`: get the most recent topic value
  - method `POST`: update the topic value with the post body

- `/sub/`*sub-id* — Upstream subscription endpoint.

  When a subscription to a remote topic is created, if the remote
  topic has an advertised hub, this hub subscribes to the remote hub,
  and content distribution requests are `POST`ed to a fresh upstream
  subscription endpoint URL.

  - method `GET`: for verification-of-intent requests from upstream.
  - method `POST`: for content distribution requests from upstream.

- `/`*path/to/file/in/htdocs* — Static resource.

  The `racketmq/htdocs` subdirectory contains static resources to be
  served by the hub.

  - method `GET`: retrieve a static resource.
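
As a sketch of how a client might use the `/hub` endpoint described in the list above, the following Python builds (but does not send) a form-encoded subscription request. The function name `build_subscribe_request` and the topic and callback values in the usage note are hypothetical; the parameter names `hub.mode`, `hub.topic`, `hub.callback` and `hub.lease_seconds` are those used in this document and in the WebSub specification.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_subscribe_request(base_url, topic, callback, lease_seconds=None):
    """Build a POST request asking the hub to create a subscription.

    Hypothetical helper for illustration only; it does no network I/O.
    """
    params = {
        "hub.mode": "subscribe",
        "hub.topic": topic,
        "hub.callback": callback,
    }
    if lease_seconds is not None:
        params["hub.lease_seconds"] = str(lease_seconds)
    return Request(
        base_url.rstrip("/") + "/hub",
        data=urlencode(params).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

The request could then be sent with `urllib.request.urlopen(request)`. Per the WebSub specification, the callback URL must be reachable by the hub, because the hub verifies the subscriber's intent with a `GET` to it before the subscription becomes active.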
## Streaming WebSocket-based Subscriptions
In addition to the [standard][w3cspec] WebHook-based subscriptions,
RacketMQ offers [WebSocket][]-based subscriptions.
If your server's base URL is `https://example.com/`, then connecting a
WebSocket to URL `wss://example.com/hub?hub.topic=MYTOPIC` will create
a streaming subscription to the topic `MYTOPIC`. (For plain `http:`,
use `ws:`.)
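
The mapping from base URL to WebSocket URL can be sketched in Python as follows. The function name `websocket_subscription_url` is hypothetical, and the sketch assumes the topic is passed as an ordinary URL-encoded query parameter named `hub.topic`.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def websocket_subscription_url(base_url, topic):
    """Derive the streaming-subscription URL from the hub's base URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(base_url)
    # https: maps to wss:, plain http: maps to ws:.
    scheme = {"https": "wss", "http": "ws"}[parts.scheme]
    path = parts.path.rstrip("/") + "/hub"
    query = urlencode({"hub.topic": topic})
    return urlunsplit((scheme, parts.netloc, path, query, ""))
```
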
Content will be delivered from the server as JSON messages of the form

    {
      "topic": "MYTOPIC",
      "link": {
        "hub": "https://example.com/hub",
        "self": "https://example.com/topic/MYTOPIC"
      },
      "content-type": "text/plain",
      "content-base64": "..."
    }

The `link` object corresponds to the `Link` headers that would usually
be sent in a WebSub WebHook-based content distribution request, the
`content-type` string to the `Content-Type` header, and the
`content-base64` string to the base64-encoded bytes of the body. The
`topic` string is always based on the `hub.topic` parameter supplied
in the URL that the WebSocket was initially connected to.
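
A client receiving such a message might unpack it as in the Python sketch below. The function name `decode_delivery` and the shape of the returned dictionary are hypothetical; only the JSON field names come from the message format described above.

```python
import base64
import json


def decode_delivery(text):
    """Unpack one JSON delivery message received over the WebSocket.

    Hypothetical helper for illustration only.
    """
    message = json.loads(text)
    return {
        "topic": message["topic"],
        "links": message.get("link", {}),
        "content_type": message.get("content-type"),
        # The body travels base64-encoded so that arbitrary bytes survive JSON.
        "body": base64.b64decode(message["content-base64"]),
    }
```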
## Conformance
At the time of writing, no official list of conformance criteria
exists; however, there is a draft list of Candidate Recommendation
implementation criteria at <https://github.com/w3c/pubsub/issues/56>.
## Codebase Layout
Files at the toplevel of the git checkout:
- `COPYING`, `gpl.txt`, `lgpl.txt`: Licensing and copyright information
- `info.rkt`: Racket package control metadata
- `nginx.conf`: Example nginx configuration file, for running RacketMQ behind nginx
In the `racketmq/` directory are the sources for the RacketMQ server:
- `hub.rkt`: **Main entry point for RacketMQ server**
- `config.rkt`: Actor that tracks changes in config files
- `protocol.rkt`: Definitions of protocol structures for coordination among RacketMQ actors
- `hub/`: Source code for the main functions of the RacketMQ server
- `hub/static-content.rkt`: Actor serving static content from `htdocs/`
- `hub/subscription.rkt`: Actors implementing downstream WebHook-based subscriptions
- `hub/websocket.rkt`: Actors implementing downstream WebSocket-based subscriptions
- `hub/topic-demand.rkt`: Actor that analyzes a subscription topic
URL, deciding whether it represents a local topic or a remote
topic
- `hub/local-topic.rkt`: Actor implementing a local RacketMQ topic
- `hub/remote-topic.rkt`: Actors implementing a remote RacketMQ
topic and WebSub subscribers that relay content from upstream
hubs (if any) to downstream subscribers
The `racketmq/` directory also contains a few other files of interest:
- `defaults.rktd`: Fully-commented RacketMQ configuration file
- `poke.rkt`: Simple interactive tool for interacting with RacketMQ
- `run`: [Daemontools](https://cr.yp.to/daemontools.html) startup script for the server
- `log/run`: Daemontools logging startup script for the server
- `htdocs/`: Static files to be served by the server
- `htdocs/500.html`: Error document used by nginx when it cannot reach RacketMQ
- `htdocs/client.html`: Simple interactive tool for experimenting
with WebSocket subscriptions in the browser
- `htdocs/client.js`: JavaScript code for `client.html`
## Bug Reports
Please report issues using this project's GitHub issues page.
## License
Copyright © 2016 Tony Garnock-Jones <tonyg@leastfixedpoint.com>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with this program (see the files "lgpl.txt" and
"gpl.txt"). If not, see <http://www.gnu.org/licenses/>.