Design
This section covers the design of kannader: some internal parts, as well as some that may have an impact on regular use cases. Nothing here should be required for properly configuring a kannader instance, but it should help explain why some design choices were made.
If you find something of practical impact that is not explained well enough in the rest of the book, and had to resort to this section to properly use kannader, this is a documentation issue: please open an issue about it!
And, on the other hand, if there is an aspect of the design of kannader that you do not understand, it probably should have an answer in this section.
Guiding Principles
- Security
- Simplicity
- Flexible Configuration
High-Level Architecture Overview
Code layout
- smtp-message handles the SMTP protocol, at the parsing and serialization level.
- smtp-server exposes an interact function, that semantically takes in a configuration and a client with which it's going to interact, and handles the SMTP interaction with this single client.
- smtp-queue runs a queue for use by SMTP servers, delegating to a storage handler and a transport for sending messages that have reached their scheduled time for sending.
- smtp-queue-fs implements a storage handler for smtp-queue that relies on the filesystem.
Not yet implemented
- smtp-client relays emails to external email servers.
- kannader exposes the API of all the above crates to consumers in a more opinionated way.
- kannader-bin interacts with the API through hooks, so that changing the configuration does not require rebuilding the whole server; see the configuration format chapter for more details.
Configuration Format
Kannader's distinctive feature is its configuration format. The objective is to have the configuration be as flexible and fast as could be, yet be easily usable for users with simple needs.
Fear ye not! These two objectives are not contradictory. The idea for a solution handling these two use cases is to first build a flexible and fast configuration format. Once this is done, it then becomes possible to write wrappers around this configuration format that expose its flexibility in an easy-to-use format.
Flexibility
The first and foremost property we need for kannader's configuration format is flexibility. It is flexibility that will allow kannader to be usable in most contexts, and it is the one that will be built on to provide easy-to-use configuration formats.
As a consequence, it is the one on which no compromise is possible.
For maximal flexibility, kannader is designed as a set of libraries. However, libraries are not what a system administrator wants to have to manage. As a consequence, a more configuration-y configuration format is required.
This configuration format consists, basically, of hooks into the behavior of kannader. Every place where a hook could be used for doing something reasonably useful, a hook should be available.
Performance
Once we have hooks everywhere, a problem quickly arises: performance. If hooks are used everywhere, then hooks must be performant. Were that not the case, the whole server would be slowed down.
Performant hooks are usually written in Lua. This is what was considered in the first design of kannader's configuration format.
However, Lua has some big issues: the only good implementations that exist are all written in C, and running such code unsandboxed would go against kannader's safety motto. It would be surprising if not a single security flaw was present in Lua[1]. Another major issue with Lua is that it is exotic. While other programs, like rspamd or nginx, already use or allow Lua scripting, system administrators usually are not fluent in Lua.
This led to searching for another option. Thus arose the idea of sandboxing a Lua VM: it is possible to run any kind of interpreter from within a sandbox. And a sandbox mechanism that recently gained quite a lot of traction is WebAssembly (hereafter abbreviated wasm).
Wasm is fast. Wasm provides a sandbox. Wasm interpreters are designed for safety, as they run untrusted code found anywhere on the Internet.
Wasm is thus the preferred choice for writing the hooks: it's extremely fast, yet allows for near-perfect configurability.
Ease of use
The one thing that wasm gets really, really wrong, among our requirements, is ease of use. Yet, it is maybe one of our most important objectives with kannader's configuration format: to provide a configurable, yet easy configuration format.
This can thankfully be worked around, by using “wrappers.” These take the configurability of the wasm hooks, and turn it into an easy-to-use format.
These “wrappers” are, in fact, only pre-compiled wasm blobs that are provided alongside kannader. They provide configuration formats that vary in ease of use and configurability, allowing the user to adjust the knob between the two.
Kannader, being passed a wasm configuration blob at startup (as well as a list of which local and remote resources should be made available to the sandbox), can thus simply provide pre-built wasm configuration blobs that then parse and enforce a specific configuration format.
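The wrapper idea can be illustrated with a minimal sketch in plain Rust, leaving the wasm runtime out entirely. The Hooks trait, the Decision enum and the one-domain-per-line format below are all hypothetical names chosen for illustration; they are not kannader's actual hook API.

```rust
/// Hypothetical outcome of a hook; not kannader's real type.
#[derive(Debug, PartialEq)]
pub enum Decision {
    Accept,
    Reject(String),
}

/// Hypothetical hook interface: the flexible, low-level layer.
pub trait Hooks {
    /// Called for each recipient address the client announces.
    fn filter_rcpt(&self, rcpt: &str) -> Decision;
}

/// A "wrapper": it parses a deliberately simple format and implements
/// the hooks on top of it, trading flexibility for ease of use.
pub struct LocalDomains {
    domains: Vec<String>,
}

impl LocalDomains {
    /// Parses one domain per line; blank lines and `#` comments are ignored.
    pub fn parse(config: &str) -> Self {
        let domains = config
            .lines()
            .map(str::trim)
            .filter(|line| !line.is_empty() && !line.starts_with('#'))
            .map(|line| line.to_ascii_lowercase())
            .collect();
        LocalDomains { domains }
    }
}

impl Hooks for LocalDomains {
    fn filter_rcpt(&self, rcpt: &str) -> Decision {
        // Everything after the last '@' is treated as the domain.
        match rcpt.rsplit_once('@') {
            Some((_, domain)) if self.domains.contains(&domain.to_ascii_lowercase()) => {
                Decision::Accept
            }
            Some((_, domain)) => Decision::Reject(format!("relay to {} denied", domain)),
            None => Decision::Reject("malformed address".to_string()),
        }
    }
}
```

In this sketch the administrator only writes a list of domains, while the server only ever sees the hook interface. A different wrapper could implement the same trait from a completely different format.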
Configuration blob examples
An example of what such configuration blobs could do would be to read a format that mimics OpenSMTPD's configuration format, and then translate it into the appropriate kannader hooks, for easy migration. This takes a “regular” previously-existing SMTP server's configuration file, and turns it into a kannader configuration.
But more interesting things can be done with this scheme. For instance, it is possible to have the wasm configuration blob read a Lua file, and then run it through a Lua interpreter, thus making the total flexibility available to the end-user like direct Lua hooks would have done, at the expense of some performance lost to running the Lua VM[2].
Or, another wasm configuration blob could read a Python file, and run it through MicroPython[3]. This provides a language that will probably be more familiar to the sysadmin, yet usable for most non-maximal-performance use cases.
But it is possible to do more than just have configuration blobs that read a generic configuration file and convert it into kannader hooks. It is also possible to have the choice of the configuration blob itself be part of the configuration. For instance, a configuration blob could handle the “local server with local users only” use case, that would set up authentication for all sending that's not directed towards local users (using callbacks provided by e.g. MicroPython code for things like knowing whether a user is local), antispam and antivirus if configured to do so, and maybe even automatically validate the From: header if configured to.
This is a case of encoding domain-specific knowledge in the configuration blob, in order to make it much easier to write. It includes almost no flexibility, but should make the configuration as simple to write as possible.
[1] Or in the configuration Lua code: a vulnerability that the attacker could then exploit over the network, with this issue being more likely the less the configuration code is sandboxed.
[2] The wasm sandbox's overhead is probably lower than LuaJIT's sandbox's overhead, while providing much better security, which makes it a tool of choice for flexibility and performance. However, the fact that LuaJIT is most likely very far from being available on wasm does mean that only a non-JIT version of a Lua VM can be run on top of wasm, which implies that Lua interpretation will be much slower. This should hopefully not be a problem: for all non-intensive use cases the time spent in hooks will probably not be a concern, and users with intensive needs that are not covered by an existing configuration blob will probably write and compile their own.
[3] While it may appear surprising to suggest MicroPython instead of a regular CPython instance, this is based on the fact that kannader needs to have one interpreter instance per message in flight, to make sure there is no interference between two messages. As such, the CPython resident memory size would probably be prohibitive for use cases that see a high number of emails flow through. Use cases that only handle few emails would probably work well with CPython but, the differences between CPython and MicroPython not being that important for something that after all is nothing but a configuration format, MicroPython will probably be a better choice.
Security
No high privilege - no root
For historical reasons, SMTP servers required root privilege to run. This necessity was explained by several things:
- the ability to use TCP port 25 on Unix-like systems requires root privilege (on modern Linux, at least the CAP_NET_BIND_SERVICE capability is required)
- the ability to ignore DAC and write mail to a user's directory requires root (CAP_CHOWN and CAP_DAC_OVERRIDE)
- amusingly, the ability to drop privilege requires root too (to employ privilege separation techniques, e.g. spawn child processes and set their UID and GID to a non-privileged user, root or the CAP_SETUID and CAP_SETGID capabilities are needed)
But all of the above are relics of the past.
It is no longer a catastrophe if an unprivileged process binds to transport layer ports less than 1024. Everyone should consider reading and writing the network medium as unlimited due to hardware no longer costing a million dollars, regardless of what an operating system does.
Rise and Fall of the Operating System
The very action of opening TCP port 25 can be delegated to a very small privileged program or OS service manager. OS service managers run with root privilege anyway. The SMTP server can accept the passed descriptor and use it without ever having to escalate privilege.
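As a minimal sketch of accepting a passed descriptor, the code below assumes a systemd-style socket activation convention, where the service manager sets LISTEN_PID and LISTEN_FDS and passes descriptors starting at number 3. The text above does not name a specific service manager, so this convention, and the helper names passed_fds and inherited_listener, are assumptions made for illustration.

```rust
use std::net::TcpListener;
use std::os::unix::io::{FromRawFd, RawFd};

/// First passed descriptor in the systemd convention (SD_LISTEN_FDS_START).
const LISTEN_FDS_START: RawFd = 3;

/// Given the values of LISTEN_PID and LISTEN_FDS and our own PID, returns
/// the descriptors that were passed to this process (possibly none).
pub fn passed_fds(listen_pid: Option<&str>, listen_fds: Option<&str>, own_pid: u32) -> Vec<RawFd> {
    // The descriptors are only meant for us if LISTEN_PID names our PID.
    let pid_matches = listen_pid.and_then(|p| p.parse::<u32>().ok()) == Some(own_pid);
    // Parsing as u16 keeps the arithmetic below free of overflow.
    let count = listen_fds
        .and_then(|n| n.parse::<u16>().ok())
        .map(RawFd::from)
        .unwrap_or(0);
    if pid_matches && count > 0 {
        (LISTEN_FDS_START..LISTEN_FDS_START + count).collect()
    } else {
        Vec::new()
    }
}

/// Wraps the first inherited descriptor into a listener, without ever
/// binding port 25 (and thus without needing any privilege) ourselves.
pub fn inherited_listener() -> Option<TcpListener> {
    let pid = std::env::var("LISTEN_PID").ok();
    let fds = std::env::var("LISTEN_FDS").ok();
    let fd = *passed_fds(pid.as_deref(), fds.as_deref(), std::process::id()).first()?;
    // SAFETY: under the assumed convention, `fd` is an open listening socket
    // that nothing else in this process owns.
    Some(unsafe { TcpListener::from_raw_fd(fd) })
}
```

The parsing is kept separate from the unsafe conversion so that it can be checked without a real service manager. The server process itself never needs CAP_NET_BIND_SERVICE.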
Nowadays people do not usually log in to their mail server via ssh to check their mail. Modern mail servers run MDAs like Dovecot to present an IMAP interface to the user. Very likely the mail itself is stored in virtual mailboxes, owned by one user (usually vmail).
This practically obsoletes the necessity to support mbox or Maildir in the SMTP server. Accepted mail can be handed over to the MDA via LMTP or piped into another executable.
One might argue that privilege separation is still necessary to ensure security and separation of concerns (even Rust might have vulnerabilities discovered in the future). But this can also be delegated to the OS's service manager, a tool that is designed to orchestrate processes.
All this makes it possible to run the SMTP server as a non-privileged user (or set of non-privileged users).
Safe Programming Language - Rust
Deprecate legacy interfaces - no mbox, .forward and alike
I think several of these errata help demonstrate that principles like eliminating legacy interfaces and reducing complexity are vital to maintaining security.
rethinking openbsd security by Ted Unangst
Simplicity
Do not re-invent the wheel.
- Do one job - transfer mail. Transfer to MDAs and other SMTP servers
- Do not orchestrate processes - offload this work to the OS service manager.
- Don't do MDA's work - let MDA deliver messages to users. Hand the mail to MDA via LMTP
On the other hand, simplicity does not mean that things should be hard to get working; quite the contrary. Take the OS service manager as an example: because kannader is built from simple pieces, it is possible either to let the OS service manager handle orchestration (the preferred solution) or to run a minimal wrapper that kannader provides. The wrapper is required for the cases where the OS service manager has different abilities than the one for which kannader was designed.
How The Queue Works
High-Level Overview
Kannader, as a library design, supports any storage mechanism that can implement the Storage trait.
The core idea of this trait is that each function must either succeed or return an error for future retry.
The storage must provide basically three sets of primitives:
- For listing the emails that were left over from previous runs: list_queue, find_inflight and find_pending_cleanup
- For queuing emails and reading them from the storage: enqueue and read_inflight, that both operate on already-SMTP-encoded mails
- For handling state changes for each email:
  - reschedule
  - send_start, send_done and send_cancel
  - drop and cleanup
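The state changes behind some of these primitives can be sketched as a small state machine. This is a simplified illustration, not the real Storage trait: the MailState and InvalidTransition types are hypothetical, the transitions are inferred from the filesystem implementation described later in this section, and reschedule and drop are left out.

```rust
/// Simplified view of where a queued email currently is (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum MailState {
    Queued,
    Inflight,
    PendingCleanup,
    Removed,
}

/// Returned when a primitive is called on an email in the wrong state.
#[derive(Debug, PartialEq)]
pub struct InvalidTransition {
    pub from: MailState,
    pub op: &'static str,
}

impl MailState {
    /// Moves to `next` if the current state is `expected`, errors otherwise.
    fn step(
        self,
        op: &'static str,
        expected: MailState,
        next: MailState,
    ) -> Result<MailState, InvalidTransition> {
        if self == expected {
            Ok(next)
        } else {
            Err(InvalidTransition { from: self, op })
        }
    }

    /// A queued email starts being sent.
    pub fn send_start(self) -> Result<MailState, InvalidTransition> {
        self.step("send_start", MailState::Queued, MailState::Inflight)
    }

    /// An in-flight email goes back to the queue.
    pub fn send_cancel(self) -> Result<MailState, InvalidTransition> {
        self.step("send_cancel", MailState::Inflight, MailState::Queued)
    }

    /// An in-flight email was sent and now awaits cleanup.
    pub fn send_done(self) -> Result<MailState, InvalidTransition> {
        self.step("send_done", MailState::Inflight, MailState::PendingCleanup)
    }

    /// An email pending cleanup is removed from the storage.
    pub fn cleanup(self) -> Result<MailState, InvalidTransition> {
        self.step("cleanup", MailState::PendingCleanup, MailState::Removed)
    }
}
```

Each of the three non-terminal states matches one of the listing primitives (list_queue, find_inflight and find_pending_cleanup), which is what lets a restarted instance pick up where a previous run stopped.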
This being said, we do not expect system administrators to write their own storage systems, unless they have very particular needs. As a consequence, an implementation is provided with kannader, and bundled in kannader the executable, that works with a local filesystem queue like most other SMTP servers do by default.
Provided implementation: queueing with the local filesystem
File Structure
- <queue>/data: location for the contents and metadata of the emails in the queue
- <queue>/queue: folder for holding symlinks to the emails
- <queue>/inflight: folder for holding symlinks to the emails that are currently in flight
- <queue>/cleanup: folder for holding symlinks to the emails that are currently being deleted after being successfully sent

<queue>/data is the only directory that holds things that are not symbolic links. All other folders only hold symbolic links that must point into <queue>/data as relative links.
Assumptions
- Moving a symlink to another folder is atomic between <queue>/queue, <queue>/inflight and <queue>/cleanup
- Moving a file is atomic between files in the same <queue>/data/** folder
- Creating a symlink in the <queue>/queue folder is atomic
- Once a write is flushed without error, it is guaranteed not to be changed by something other than a kannader instance (or another system aware of kannader's protocol and guarantees)
<queue>/data
Each email in <queue>/data is a folder that consists of:
- <mail>/contents: the RFC5322 content of the email
- <mail>/<dest>/metadata: the JSON-encoded MailMetadata<U>
- <mail>/<dest>/schedule: the JSON-encoded ScheduleInfo couple
Both <mail>/<dest>/metadata and <mail>/<dest>/schedule could change over time. In this case, the replacement gets written by writing a <filename>.{{random_uuid}} then renaming it in-place.
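The write-then-rename replacement can be sketched as follows. This is an illustration under assumptions, not kannader's code: the function name replace_atomically is hypothetical, and the suffix is taken as a parameter where the real implementation uses a random UUID, so that the sketch needs nothing beyond the standard library.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Replaces `target` by writing `<target>.<suffix>` next to it, then
/// renaming that temporary file over `target`.
pub fn replace_atomically(target: &Path, suffix: &str, contents: &[u8]) -> io::Result<()> {
    // Build `<target>.<suffix>` in the same folder, so that the rename
    // below stays within one directory (see the assumptions above).
    let mut tmp_name = target.as_os_str().to_owned();
    tmp_name.push(".");
    tmp_name.push(suffix);
    let tmp = PathBuf::from(tmp_name);

    let mut file = File::create(&tmp)?;
    file.write_all(contents)?;
    // Flush to disk before the rename makes the new contents visible.
    file.sync_all()?;
    drop(file);

    // Readers see either the old file or the new one, never a partial write.
    fs::rename(&tmp, target)
}
```

If the process stops before the rename, the original file is untouched and only a stray temporary file is left behind.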
Enqueuing Process
When enqueuing, the process is:
- Create <queue>/data/<uuid>, hereafter named <mail>
- For each destination (i.e. recipient email address):
  - Create <mail>/<uuid>, hereafter named <mail>/<dest>
  - Write <mail>/<dest>/schedule and <mail>/<dest>/metadata
- Give out the Enqueuer to the user for writing <mail>/contents
- Wait for the user to commit the Enqueuer
- Create a symlink from <queue>/queue/<uuid> to <mail>/<dest> for each destination
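The steps above can be sketched with the standard library alone. Several things here are assumptions made for illustration: the enqueue function signature, the caller-provided identifiers standing in for random UUIDs, the placeholder JSON, writing the contents in one call instead of handing out an Enqueuer, and reusing the destination identifier as the symlink name.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::Path;

/// Enqueues one email for the given destinations. `mail_id` and `dest_ids`
/// stand in for the random UUIDs of the real implementation.
pub fn enqueue(queue: &Path, mail_id: &str, dest_ids: &[&str], contents: &[u8]) -> io::Result<()> {
    // Create <queue>/data/<uuid>, the <mail> folder.
    let mail = queue.join("data").join(mail_id);
    fs::create_dir_all(&mail)?;

    // One <mail>/<dest> folder per destination, with its two JSON files.
    for dest in dest_ids {
        let dest_dir = mail.join(dest);
        fs::create_dir(&dest_dir)?;
        // Placeholder JSON; the real files hold ScheduleInfo and MailMetadata.
        fs::write(dest_dir.join("schedule"), b"{}")?;
        fs::write(dest_dir.join("metadata"), b"{}")?;
    }

    // Stands in for the user writing through the Enqueuer and committing it.
    fs::write(mail.join("contents"), contents)?;

    // Only now does the email become visible in the queue.
    fs::create_dir_all(queue.join("queue"))?;
    for dest in dest_ids {
        // Relative target, resolved from inside <queue>/queue.
        let target = Path::new("..").join("data").join(mail_id).join(dest);
        symlink(&target, queue.join("queue").join(dest))?;
    }
    Ok(())
}
```

Creating the symlinks last means that a crash halfway through leaves nothing in <queue>/queue, so a partially written email is never picked up for sending.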
Starting and Cancelling Sends
When starting to send or cancelling a send, the process is:
- Move <queue>/queue/<id> to <queue>/inflight/<id> (or back)
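As a minimal sketch, both operations come down to a single rename of the symlink itself. The free functions send_start and send_cancel below are illustrative stand-ins, not the signatures of the real storage implementation.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Marks the email as in flight by moving its symlink.
pub fn send_start(queue: &Path, id: &str) -> io::Result<()> {
    fs::rename(queue.join("queue").join(id), queue.join("inflight").join(id))
}

/// Puts an in-flight email back into the queue.
pub fn send_cancel(queue: &Path, id: &str) -> io::Result<()> {
    fs::rename(queue.join("inflight").join(id), queue.join("queue").join(id))
}
```

Because <queue>/queue and <queue>/inflight sit at the same depth, a relative link into <queue>/data keeps resolving to the same folder after the move.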
Cleaning Up
When done with sending a mail and it thus needs to be removed from disk, the process is:
- Move <queue>/inflight/<id> to <queue>/cleanup/<id>
- Remove <queue>/cleanup/<id>/* (which actually are in <queue>/data/<mail>/<dest>/*)
- Remove the target of <queue>/cleanup/<id> (the folder in <queue>/data/<mail>)
- Check whether only <queue>/data/<mail>/contents remains, and if so remove it as well as the <queue>/data/<mail> folder
- Remove the <queue>/cleanup/<id> symlink
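The cleanup steps can be sketched as below. This is a simplified illustration with a hypothetical cleanup function: it runs the steps once, in order, and does not handle resuming a cleanup that was interrupted halfway, which a real implementation has to do for emails found by find_pending_cleanup.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Removes a sent email's per-destination data, and the whole email once
/// no destination is left.
pub fn cleanup(queue: &Path, id: &str) -> io::Result<()> {
    // Move <queue>/inflight/<id> to <queue>/cleanup/<id>.
    let link = queue.join("cleanup").join(id);
    fs::rename(queue.join("inflight").join(id), &link)?;

    // Resolve the link to <queue>/data/<mail>/<dest> while it is still valid.
    let dest_dir = fs::canonicalize(&link)?;

    // Remove the files inside the destination folder, then the folder.
    for entry in fs::read_dir(&dest_dir)? {
        fs::remove_file(entry?.path())?;
    }
    fs::remove_dir(&dest_dir)?;

    // If only <mail>/contents remains, remove it and the <mail> folder.
    if let Some(mail) = dest_dir.parent() {
        let contents = mail.join("contents");
        let mut only_contents = contents.exists();
        for entry in fs::read_dir(mail)? {
            if entry?.path() != contents {
                only_contents = false;
            }
        }
        if only_contents {
            fs::remove_file(&contents)?;
            fs::remove_dir(mail)?;
        }
    }

    // Finally remove the (now dangling) symlink itself.
    fs::remove_file(&link)
}
```

The symlink is removed last so that, until the very end, something in <queue>/cleanup still records that this email has data left to delete.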
Community
Kannader's community is currently mostly on IRC/Matrix. Feel free to drop by, be it for asking questions, giving ideas, raising issues or just saying hi!
The main channel is #kannader on libera.chat.
Here are a few ways to join it:
- Join #kannader:libera.chat from Matrix, e.g. using the online client, or
- Use a temporary online client.