Changes

anarcat · 98df3763
--- a/service/email.md
+++ b/service/email.md
@@ -929,3 +929,977 @@ time:
  * [LDAP is a SPOF for MX and SRS](https://gitlab.torproject.org/tpo/tpa/team/-/issues/41888)
  * [We're susceptible to spoofing](https://gitlab.torproject.org/tpo/tpa/team/-/issues/41889)
+## Installation
+### DKIM configuration
+Hosts which generate outbound mail should be configured to add `DKIM-Signature`
+headers. This is done by including the `profile::postfix::dkim` class on Puppet
+nodes which handle mail and configuring Postfix's `main.cf` to use it as a
+milter. When the node's `profile::postfix::mail_processing` flag is
+set to true, this is done automatically.
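+As a minimal sketch, and assuming the flag is a class parameter that can be
+set through Hiera's automatic parameter lookup (an assumption, not confirmed
+by the text above), enabling it in a node's or role's data file could look
+like this:
+```yaml
+# assumption: the flag is resolved from Hiera; adjust to wherever
+# it is actually set in the Puppet code base
+profile::postfix::mail_processing: true
+```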
+This class will install the `opendkim` and `opendkim-tools` packages, manage
+the `/etc/opendkim.conf` configuration file, generate a private key under
+`/etc/opendkim/keys` and set up what is needed for OpenDKIM and Postfix to
+communicate.
+By default, the class configures OpenDKIM to only sign email where the sender
+domain part is identical to the node's FQDN. However, this is often not the
+case, and the OpenDKIM `Domain` parameter must then be specified.
+To do this, add a `profile::opendkim::domain` key in Hiera, ideally in the
+YAML file that provides data for the node's role, e.g.
+`hiera/roles/email::submission.yaml` for the submission server. The value must
+be specified as an array containing one or more email sender domains. For
+example, if a node sends mail as `gettor@torproject.org`, the
+`profile::opendkim::domain` must be an array that contains `torproject.org`.
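+A minimal sketch of that Hiera data, using the key, file path and domain
+mentioned above (the comments and the layout of the rest of the file are
+assumptions):
+```yaml
+# hiera/roles/email::submission.yaml
+# sender domains OpenDKIM should sign for; must be an array,
+# even when there is a single domain
+profile::opendkim::domain:
+  - torproject.org
+```
+A node sending mail for several domains would list each of them as an
+additional array entry.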
+Once this configuration is applied on the node, at least one new DNS entry must
+be created for external hosts to be able to validate DKIM signatures. For
+example, on the `submit-01` host, a `/etc/opendkim/keys/2022-submit-01.txt`
+file is automatically generated containing a DNS entry appropriate for
+validating mails signed for the `@torproject.org` sender domain:
+        2022-submit-01._domainkey       IN      TXT     ( "v=DKIM1; h=sha256; k=rsa; "
+                  "p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA6iJa25pvx5qatYV4u44zolHLMiqkWZBvF6UJcX8hrSNES/aw/k4zXiliChs3eUtGHjf5aSNC6TrOndfQqRxMxNn+XhSEsUa6zFhZeeAOIi0A3TPRd5qL8aWfHH2NtHZAnZ5lodkA6HjJ/HpyrJvFuyuJ94yNL/bjvRWu+bMwixBIYr6znDoJYGTPC5YHZt48bJgvg3lAb3vIwD"
+                  "bkBw+bMkZCbRjSQuOM52pg6uQLSBiBeQHqWkSd03vp4A906jWaMLDHMfVZDDrXLg+QG2nAOoJmZ0l5argoIRiEG/8GO72FI2dEKJaXgXYqpSXGCtzZJNIr8schHFZBirZBLljbEwIDAQAB" )  ; ----- DKIM key 2022-submit-01 for torproject.org
+If a subdomain prefix is used (e.g. `crm.torproject.org`) make sure the
+`_domainkey` is under that prefix:
+        2022-crm-int-01._domainkey.crm IN      TXT     ( "v=DKIM1; h=sha256; k=rsa; "
+                  "p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAtSNGCjHmZnGrnBb9nCsPUc6MjZd5QueGKV+iXwcRNfU0LapFZMi5t7WE/kTPJsRWIF8AMHymNqLA5835m5LwaBBXZdu1utNARKSXDzGsEjxuDiAnSqD0Rb1px1JA+Eex0RC3thYZuyIIAxK31pXxJt2mowXtrhIkuKFB2YpE0yUudKuDZIZZ3YNH025czK/jFLD6TH+5xD9Cej"
+                  "H0MB6tE4O41rCjZUjSZ7Ar7BjVID6foCmlbr/3EG7dbzQv6YqH19OX6YgL0UMfG2RhvhWEUNYghS6K88vTelnHx/ShUzIeu05jd6mi9OLCA/Hl2bFRsa0f1ttHKpnzuC+ecn0sWwIDAQAB" )  ; ----- DKIM key 2022 for crm.torproject.org
+The key may now be tested with the command below. Make sure to use the correct
+sender domain for the `-d` command line argument:
+        opendkim-testkey -d torproject.org -s 2022-$(hostname) -vv
+This should show, once DNS has propagated:
+        root@submit-01:/etc/opendkim/keys# opendkim-testkey -d torproject.org -s 2022-$(hostname) -vv
+        opendkim-testkey: using default configfile /etc/opendkim.conf
+        opendkim-testkey: key loaded from /etc/opendkim/keys/2022-submit-01.private
+        opendkim-testkey: checking key '2022-submit-01._domainkey.torproject.org'
+        opendkim-testkey: key secure
+If you see `record not found`, it's because the DNS record wasn't found. See if
+DNS has propagated properly, maybe flush negative responses with
+`unbound-control flush-negative`.
+The `key not secure` message means you are not using DNSSEC, which should
+*not* happen in our configuration. Investigate if you do see the warning.
+At this point it's a good idea to [test outgoing mail](#testing-outgoing-mail):
+        echo "this is a test email" | mail -r postmaster@torproject.org -s 'test email from anarcat' -- check-auth@verifier.port25.com
+#### DKIM signing on the mail relay
+On some hosts it's not practical to configure OpenDKIM because the host
+generates some mail but otherwise doesn't process mail, so its
+`mail_processing` flag is not enabled. Usually such hosts route their outbound
+email through `mta-dal-01`, so we can use it to sign email on behalf of other hosts.
+#### Manual DKIM configuration (deprecated)
+This was a rushed OpenDKIM deployment procedure that was used in
+[tpo/tpa/team#40981][] and [tpo/tpa/team#40988][] (eugeni and
+submit-01). It has been added to Puppet in [tpo/tpa/team#40989][].
+This procedure is DEPRECATED. Hosts MUST be configured with Puppet
+(above) instead. The procedure is kept only for historical reference.
+ 1. install OpenDKIM:
+        apt install opendkim opendkim-tools
+ 2. ensure you have those lines in `/etc/opendkim.conf`:
+        LogWhy          yes
+        Mode            s # sign only, use sv to also check incoming
+        Domain          torproject.org
+        Selector        2022-submit-01 # 2022-submit-01._domainkey.torproject.org
+        Keyfile         /etc/opendkim/keys/2022-submit-01.private
+        Socket          local:/var/spool/postfix/opendkim/opendkim.sock
+    Note that the `Selector` and `Domain` fields are prone to change
+    if the server is sending mail from more than one domain. For
+    example, on `crm-int-01`, we also put `crm-int-01.torproject.org`
+    and `crm.torproject.org` in there. The selector is also based on
+    the year of creation (`2022` in this case) and the short hostname
+    of the server the key belongs to (`submit-01`) so that we don't
+    have to copy that private key around.
+ 3. generate the keys and directories:
+        mkdir -p /etc/opendkim/keys &&
+        mkdir /var/spool/postfix/opendkim &&
+        chown opendkim /var/spool/postfix/opendkim &&
+        opendkim-genkey --directory=/etc/opendkim/keys/ --selector=2022-submit-01 --domain=torproject.org --verbose
+ 4. grant Postfix access to the OpenDKIM socket:
+        adduser postfix opendkim
+ 5. restart the server:
+        service opendkim restart
+ 6. add the keys from `/etc/opendkim/keys/2022-submit-01.txt` into
+    DNS, e.g.:
+        2022-submit-01._domainkey       IN      TXT     ( "v=DKIM1; h=sha256; k=rsa; "
+                  "p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA6iJa25pvx5qatYV4u44zolHLMiqkWZBvF6UJcX8hrSNES/aw/k4zXiliChs3eUtGHjf5aSNC6TrOndfQqRxMxNn+XhSEsUa6zFhZeeAOIi0A3TPRd5qL8aWfHH2NtHZAnZ5lodkA6HjJ/HpyrJvFuyuJ94yNL/bjvRWu+bMwixBIYr6znDoJYGTPC5YHZt48bJgvg3lAb3vIwD"
+                  "bkBw+bMkZCbRjSQuOM52pg6uQLSBiBeQHqWkSd03vp4A906jWaMLDHMfVZDDrXLg+QG2nAOoJmZ0l5argoIRiEG/8GO72FI2dEKJaXgXYqpSXGCtzZJNIr8schHFZBirZBLljbEwIDAQAB" )  ; ----- DKIM key 2022-submit-01 for torproject.org
+    If you had a subdomain prefix (e.g. in [tpo/tpa/team#40981][] we
+    used `crm.tpo`), make sure the `_domainkey` is under that prefix:
+        2022._domainkey.crm IN      TXT     ( "v=DKIM1; h=sha256; k=rsa; "
+                  "p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAtSNGCjHmZnGrnBb9nCsPUc6MjZd5QueGKV+iXwcRNfU0LapFZMi5t7WE/kTPJsRWIF8AMHymNqLA5835m5LwaBBXZdu1utNARKSXDzGsEjxuDiAnSqD0Rb1px1JA+Eex0RC3thYZuyIIAxK31pXxJt2mowXtrhIkuKFB2YpE0yUudKuDZIZZ3YNH025czK/jFLD6TH+5xD9Cej"
+                  "H0MB6tE4O41rCjZUjSZ7Ar7BjVID6foCmlbr/3EG7dbzQv6YqH19OX6YgL0UMfG2RhvhWEUNYghS6K88vTelnHx/ShUzIeu05jd6mi9OLCA/Hl2bFRsa0f1ttHKpnzuC+ecn0sWwIDAQAB" )  ; ----- DKIM key 2022 for crm.torproject.org
+ 7. then test the key with:
+        opendkim-testkey -d torproject.org -s 2022-submit-01 -vv
+    ... which should show, once DNS has propagated:
+        root@submit-01:/etc/opendkim/keys# opendkim-testkey -d torproject.org -s 2022-submit-01 -vv
+        opendkim-testkey: using default configfile /etc/opendkim.conf
+        opendkim-testkey: key loaded from /etc/opendkim/keys/2022-submit-01.private
+        opendkim-testkey: checking key '2022-submit-01._domainkey.torproject.org'
+        opendkim-testkey: key secure
+    If you see `record not found`, it's because the DNS record wasn't
+    found. See if DNS has propagated properly, maybe flush negative
+    responses with `unbound-control flush-negative`.
+    The `key not secure` message means you are not using DNSSEC,
+    which should *not* happen in our configuration. Investigate if you
+    do see the warning.
+ 8. stop Puppet to keep it from reverting the manual Postfix configuration:
+        puppet agent --disable 'rush opendkim deployment by hand, tpo/tpa/team#40988'
+ 9. hook into postfix:
+        postconf -e milter_default_action=accept &&
+        postconf -e smtpd_milters=local:opendkim/opendkim.sock &&
+        postconf -e non_smtpd_milters=local:opendkim/opendkim.sock
+ 10. reload postfix (warning: this will retry the queue):
+        service postfix reload
+ 11. [test outgoing mail](#testing-outgoing-mail):
+     ```
+     echo "this is a test email" | mail -r postmaster@torproject.org -s 'test email from anarcat' -- check-auth@verifier.port25.com
+     ```
+[tpo/tpa/team#40981]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40981
+[tpo/tpa/team#40988]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40988
+[tpo/tpa/team#40989]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40989
+If you get double OpenDKIM signatures, consider adding
+`receive_override_options=no_milters` to the server in `master.cf`, if
+it already does some content filtering.
+It's possible that some mail doesn't get signed when injected from
+Mailman. In that case, consider changing `InternalHosts` to:
+    InternalHosts   lists-01.torproject.org,lists.torproject.org,127.0.0.1,::1,localhost # cargo-culted, to remove?
+If the signatures come from another server and cause problems (for
+example with Mailman), you can strip those with:
+    # strip incoming sigs, typically from submit-01 or other servers on mailman
+    RemoveOldSignatures yes
+### SPF records
+In [tpo/tpa/team#40990](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40990), anarcat deployed "soft" [SPF](https://en.wikipedia.org/wiki/Sender_Policy_Framework) records
+for all outgoing mail servers under `torproject.org`. The full
+specification of SPF is in [RFC7208](https://www.rfc-editor.org/rfc/rfc7208). Here is a condensed
+interpretation of our (current, 2025) policy:
+#### torproject.org
+```
+@			IN	TXT	"v=spf1 a:crm-int-01.torproject.org a:submit-01.torproject.org a:rdsys-frontend-01.torproject.org a:polyanthum.torproject.org a:srs-dal-01.torproject.org a:mta-dal-01.torproject.org ~all"
+```
+This is a "soft" (`~all`) record that will tell servers to downgrade
+the reputation of mail sent with a `*@torproject.org` envelope sender
+(`MAIL FROM`) when the connecting server doesn't match any of the preceding mechanisms.
+We use the `a:` mechanism to point at 6 servers that normally should
+be sending mail as `torproject.org`:
+ * `crm-int-01`, the [CRM](service/crm) server
+ * `submit-01`, the submission mail server
+ * `rdsys-frontend-01`, the rdsys server
+ * `polyanthum`, the bridges server
+ * `srs-dal-01`, the sender-rewriting server
+ * `mta-dal-01`, our MTA
+The `a` mechanism tells SPF-compatible servers to check the `A` and
+`AAAA` records of the given server to see if they match the address of
+the connecting server.
+We use the `a:` mechanism instead of the (somewhat more common) `ip4:`
+mechanism because we do not want to list both the IPv4 and IPv6
+addresses of each server by hand.
+#### db.torproject.org: a
+Some servers have a record like this:
+```
+db			IN	A	49.12.57.132				; alberti
+			IN	AAAA	2a01:4f8:fff0:4f:266:37ff:fea1:4d3	; alberti
+			IN	MX	0 alberti
+			IN	TXT	"v=spf1 a ~all"
+```
+This is also a "soft" record that tells servers to check the `A` or
+`AAAA` records (`a`) to see if it matches the connecting server. It
+will match only if the connecting server has an IP matching the `A` or
+`AAAA` record for `db.torproject.org`.
+#### lists.torproject.org: mx
+```
+lists		IN	TXT	"v=spf1 mx ~all"
+```
+This is also a "soft" record that tells servers to check the Mail
+Exchanger record (`MX`) to see if it matches the connecting server.
+It might be better to use an `a:` mechanism here to avoid a DNS lookup,
+but it might actually be possible that the MX for lists is in a
+different location than the web interface, for example.
+#### gitlab: a and CNAME
+Some mail servers have a record like:
+```
+gitlab       IN    CNAME    gitlab-02
+gitlab-02    IN    TXT      "v=spf1 a ~all" ; no one else than gitlab can send for gitlab
+```
+This is similar to the `db` record in that it uses an `a` mechanism,
+but the actual name is behind a `CNAME` record.
+In other words, only that server can send email as
+`gitlab-02.torproject.org`. But since there's also a `CNAME` from
+`gitlab` to `gitlab-02`, this policy actually also applies to
+`gitlab.torproject.org`:
+    $ dig -t txt gitlab.torproject.org +nostats +nocomments +nocmd +noquestion
+    gitlab.torproject.org.	1415	IN	CNAME	gitlab-02.torproject.org.
+    gitlab-02.torproject.org. 3496	IN	TXT	"v=spf1 a ~all"
+#### CRM: hard record
+Finally, one last example is the CiviCRM records:
+```
+crm			IN	A	116.202.120.186	; crm-int-01
+			IN	AAAA	2a01:4f8:fff0:4f:266:37ff:fe4d:f883
+			IN	TXT	"v=spf1 a -all"
+			IN	MX	0 crm
+```
+Those are similar to the `db.torproject.org` records except they are
+"hard" (`-all`) which should, in theory, make other servers completely
+reject attempts from servers not in the `A` or `AAAA` record of
+`crm.torproject.org`.
+#### Debugging SPF
+[vamsoft's SPF checker](https://vamsoft.com/support/tools/spf-policy-tester) is good to quickly diagnose issues.
+### DMARC records
+[DMARC](https://en.wikipedia.org/wiki/DMARC) records glue together SPF and DKIM to tell which *policy*
+to apply once the rules defined above check out (or not). DMARC is
+defined in [RFC7489](https://www.rfc-editor.org/rfc/rfc7489.html) and has a [friendly homepage](https://dmarc.org/) with a [good
+introduction](https://dmarc.org/overview/). Note that DMARC usage has been growing steadily
+since 2018 and more steeply since 2021, see the [usage stats](https://dmarc.org/stats/dmarc/). See
+also the [Alexa top site usage](https://dmarc.org/stats/alexa-top-sites/).
+Our current DMARC policy is:
+    _dmarc  IN  TXT "v=DMARC1;p=none;pct=100;rua=mailto:postmaster@torproject.org"
+That is a "soft" policy (`p=` is `none` instead of `quarantine` or
+`reject`) that applies to all email (`pct=100`) and sends reports to
+the `postmaster@` address.
+Note that this applies to *all* subdomains by default. To change the
+subdomain policy, the `sp=` tag would be used (same syntax as
+`p=`, e.g. `sp=quarantine` would apply the `quarantine` policy to
+subdomains, independently of the top domain policy). See [RFC 7489
+section 6.6.3](https://www.rfc-editor.org/rfc/rfc7489.html#section-6.6.3) for more details on discovery.
+### Submission server
+To set up a new submission mail server, create a machine with the
+`email::submission` role in Puppet. Ideally, it should be on a network
+with a good IP reputation.
+In `letsencrypt.git`, add an entry for that host's specific TLS
+certificate. For example, the `submit-01.torproject.org` server has a
+line like this:
+    submit-01.torproject.org submit.torproject.org
+Those domains are glued together in DNS with:
+    submission              IN      CNAME   submit-01
+    _submission._tcp        IN      SRV     0 1 465 submission
+This implies there is only *one* `submission.torproject.org`, because
+one cannot have multiple `CNAME` records, of course. But it should
+make replacing the server transparent for end-users.
+The latter SRV record is actually specified in [RFC6186](https://datatracker.ietf.org/doc/html/rfc6186), but may
+not be sufficient for all automatic configuration. We do *not* go
+deeper into auto-discovery, because that typically implies IMAP
+servers and so on. But if we did, we could consider using [this
+software which tries to support all of them](https://github.com/Monogramm/autodiscover-email-settings) (e.g. [Microsoft](https://docs.microsoft.com/en-us/exchange/architecture/client-access/autodiscover?view=exchserver-2019),
+[Mozilla](https://developer.mozilla.org/en-US/docs/Mozilla/Thunderbird/Autoconfiguration/FileFormat/HowTo), [Apple](https://developer.apple.com/library/archive/featuredarticles/iPhoneConfigurationProfileRef/index.html)). For now, we'll only stick with the SRV
+record.
+### Mailman server
+See the [mailman documentation](service/lists#installation).
+## Upgrades
+Upgrades should generally be covered by the normal Debian package
+workflow.
+## SLA
+There is no SLA specific to this service, but mail delivery is
+generally considered to be high priority. Complaints about delivery
+failure should be filed as [issues in our ticket tracker](#issues) and
+addressed.
+## Design and architecture
+The submission email service allows users to submit mail as if they
+were on a `torproject.org` machine. Concretely, it is a Postfix server
+which relays email to anywhere once [SASL authentication](https://en.wikipedia.org/wiki/Simple_Authentication_and_Security_Layer) is
+passed.
+Most of the code is glue code in Puppet, along with a small set of
+patches to ud-ldap which were sent (and mostly accepted) upstream.
+### Email password sync flow
+This horrid diagram describes the way email passwords are set from
+LDAP to the submission server:
+![](email/submit-mail-password-sync-diagram.png)
+## Services
+The "submission" port (587) was previously used in the documentation
+by default because it is typically *less* blocked by ISP firewalls
+than the "smtps" port (465), but both are supported. Lately, the
+documentation has been changed to suggest port 465 first instead.
+The TLS server is authenticated using the regular Let's Encrypt CA
+(see [TLS documentation](howto/tls)).
+## Storage
+Mail services currently do not involve any sort of storage other than
+mail queues (below).
+## Queues
+Mail servers typically transfer emails into a queue on reception, and
+flush them out of the queue when the email is successfully
+delivered. Temporary delivery failures are retried for 5 days
+([`bounce_queue_lifetime`](https://www.postfix.org/postconf.5.html#bounce_queue_lifetime) and [`maximal_queue_lifetime`](https://www.postfix.org/postconf.5.html#maximal_queue_lifetime)).
+We use the Postfix defaults for those settings, which may vary from
+the above.
+## Interfaces
+Most of Postfix and Dovecot operations are done through the
+commandline interface.
+## Authentication
+On the submission server, SASL authentication is delegated to a dummy
+Dovecot server which is *only* used for authentication (i.e. it
+doesn't provide IMAP or POP storage). Username/password pairs are
+deployed by [ud-ldap](howto/ldap) into `/etc/dovecot/private/mail-passwords`.
+The LDAP server stores those passwords in a `mailPassword` field and
+the web interface is used to modify those passwords. Passwords are
+(currently) hashed with salted MD5 because of compatibility
+problems between the Perl/ud-ldap implementation and Dovecot which
+haven't been resolved yet.
+## Implementation
+Most software in this space is written in C (Postfix, Dovecot, OpenDKIM).
+## Related services
+The submission and mail forwarding services both rely on the [LDAP
+service](howto/ldap), for secrets and aliases, respectively.
+The [mailing list service](services/lists) and [schleuder](service/schleuder) both depend on basic
+email services for their normal operations. The [CiviCRM service](service/crm)
+is also a particularly large mail sender.
+## Issues
+There is no issue tracker specifically for this project. [File][] or
+[search][] for issues in the [team issue tracker][search] with the
+~Email label.
+When reporting email issues, do mind the [reporting email problems](doc/reporting-email-problems)
+documentation.
+ [File]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/new
+ [search]: https://gitlab.torproject.org/tpo/tpa/team/-/issues?label_name%5B%5D=Email
+The submission project was coordinated and launched in [ticket
+#30608][]. 
+The emergency changes to the infrastructure (including DKIM, DMARC,
+and SPF records) were done as part of [TPA-RFC-44](policy/tpa-rfc-44-email-emergency-recovery)
+([tpo/tpa/team#40981](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40981)).
+## Maintainer
+This service is mostly written as a set of Puppet manifests. It was
+built by anarcat, and is maintained by TPA.
+Some parts of the mail services (the submission service, in
+particular) depend on patches to `userdir-ldap` that were partially
+merged upstream, see [LDAP docs](howto/ldap#maintainer-users-and-upstream) for details.
+## Users
+Users of this service are mostly core Tor members. But effectively,
+any email user on the internet can interact with our mail servers in
+one way or another.
+## Upstream
+Upstreams vary.
+Most of the work done in our mail services is performed by
+[Postfix](https://www.postfix.org/), which is an active project and de-facto standard for new
+mail servers out there. Postfix was written by [Wietse Venema](https://www.porcupine.org/wietse/)
+while working at [IBM research](http://www.research.ibm.com/).
+The [Dovecot mail server](https://dovecot.org/) was written by [Timo Sirainen](https://en.wikipedia.org/wiki/Timo_Sirainen) and is
+one of the most widely used IMAP servers out there. It is an active
+upstream as well.
+[OpenDKIM](https://github.com/trusteddomainproject/OpenDKIM) is not in such good shape: it hasn't had a commit or
+release in over 4 years (as of late 2022).
+## Monitoring and metrics
+The Postfix server is monitored by Prometheus, as with all servers in the
+`publicmail` group. This only checks that the SMTP port is open. We do
+not have end-to-end delivery monitoring just yet; that is part of the
+[improve mail services milestone](https://gitlab.torproject.org/groups/tpo/tpa/-/milestones/4), specifically [issue 40494](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40494).
+The submission server is monitored like other mail servers that have
+`profile::postfix::mtail_monitor` enabled, which means it runs the `mtail` exporter
+(`profile::prometheus::postfix_mtail_exporter`). The [Grafana
+dashboard](https://grafana.torproject.org/d/Ds5BxBYGk/postfix-mtail?orgId=1&var-node=submit-01.torproject.org) should provide shiny graphs.
+## Tests
+### Submission server
+See [Testing the submission server](#testing-the-submission-server).
+## Logs
+Mail logs are in `/var/log/mail.log` and probably systemd
+journals. They contain PII like IP addresses and usernames and are
+regularly purged. 
+Mails incoming on the submission server are scanned by fail2ban to ban
+IP addresses trying to bruteforce account passwords.
+## Backups
+No special backup of this service is required.
+If we eventually need to host mailboxes, those *may* require special
+handling as large Maildir folders are known to create problems with
+backup software.
+## Other documentation
+This service was set up following some or all of these documents:
+ * [Anarcat's home email setup](https://anarc.at/services/mail/)
+ * [Postfix SASL howto](http://www.postfix.org/SASL_README.htm)
+ * [Dovecot configuration](https://doc.dovecot.org/)
+   * [VirtualUsers](https://wiki.dovecot.org/VirtualUsers)
+   * [passwd file](https://doc.dovecot.org/configuration_manual/authentication/passwd_file/)
+   * [password databases](https://doc.dovecot.org/configuration_manual/authentication/password_databases_passdb/)
+   * [user databases](https://doc.dovecot.org/configuration_manual/authentication/user_databases_userdb/)
+ * [RFC821](https://tools.ietf.org/html/rfc821) (SMTP, 1982) and [RFC5321](https://tools.ietf.org/html/rfc5321) (SMTP, 2008)
+ * [RFC6409](https://www.rfc-editor.org/rfc/rfc6409) (Email submission, 2011)
+ * [RFC8314](https://www.rfc-editor.org/rfc/rfc8314) ("Cleartext Considered Obsolete: Use of Transport
+   Layer Security (TLS) for Email Submission and Access", makes port
+   465 legitimate for submission)
+ * [RFC5598](https://www.rfc-editor.org/rfc/rfc5598) (Internet Mail Architecture, 2009)
+ * [RFC6186](https://datatracker.ietf.org/doc/html/rfc6186) (SRV auto-discovery)
+ * [ticket 30608](https://gitlab.torproject.org/tpo/tpa/team/-/issues/30608), the original request for the submission service
+ * [first discussion of the submission service](meeting/2019-06-03#new-mail-service-requests), the project was
+   actually agreed upon at the Stockholm meeting in 2019
+# Discussion
+The mail services at Tor have been rather neglected,
+traditionally. No effort was made to adopt modern standards (SPF,
+DKIM, DMARC) which led to significant deliverability problems in late
+2022.
+## Overview
+<!-- describe the overall project. should include a link to a ticket -->
+<!-- that has a launch checklist -->
+<!-- if this is an old project being documented, summarize the known -->
+<!-- issues with the project. --> 
+## Security and risk assessment
+No audit was ever performed on the mail services. 
+The lack of SPF records and DKIM signatures means that users must rely
+on out-of-band mechanisms (like OpenPGP) to authenticate incoming
+emails. Given that such solutions (especially OpenPGP) are not widely
+adopted, in effect it means that anyone can easily impersonate
+`torproject.org` users.
+We have heard regular reports of phishing attempts against our users
+as well ([tpo/tpa/team#40596](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40596)), sometimes coming from our own
+domain. Inbound mail filters are expected to improve that situation
+([tpo/tpa/team#40539](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40539)).
+## Technical debt and next steps
+The next step in this project is to rebuild a proposal to follow up on
+the long term plan from TPA-RFC-44 (TPA-RFC-45, [issue
+tpo/tpa/team#41009](https://gitlab.torproject.org/tpo/tpa/team/-/issues/41009)). This will mean either outsourcing mail
+services or building a proper mail hosting service.
+## Proposed Solutions
+We went through a number of proposals to improve mail services over
+time:
+ * [TPA-RFC-15: Email services](policy/tpa-rfc-15-email-services) (rejected, replaced with
+   TPA-RFC-31)
+ * [TPA-RFC-31: outsource email services](policy/tpa-rfc-31-outsource-email) (rejected as well, in
+   favor of TPA-RFC-44 and following)
+ * [TPA-RFC-44: Email emergency recovery](policy/tpa-rfc-44-email-emergency-recovery) (DKIM, SPF, DMARC
+   records, long term plan postponed)
+ * TPA-RFC-45: Mail architecture (long term plans from TPA-RFC-44)
+## High availability notes
+If we do host our own IMAP servers eventually, we would like them to
+be highly available, without human intervention. That means having an
+"active-active" mirror setup where the failure of one host doesn't
+affect users at all and doesn't require human intervention to restore
+services.
+We already know quite well how to do an active/passive setup: DRBD
+allows us to replicate entire disks between machines. It *might* be
+possible to do the same with active/active setups in DRBD, in theory,
+but in practice this quickly runs into filesystem limitations, as
+(e.g.) ext4 is *not* designed to be accessed by multiple machines
+simultaneously.
+Dovecot has a [replication system called dsync](https://doc.dovecot.org/configuration_manual/replication/) that replicates
+mailboxes over a pipe. There are examples for TCP, TLS and SSH. [This
+blog post](http://blog.dovecot.org/2012/02/dovecot-clustering-with-dsync-based.html) explains the design as well. A pair of [director](https://doc.dovecot.org/admin_manual/director/dovecotdirector/)
+processes could be used to direct users to the right server. [This
+tutorial](http://web.archive.org/web/20201111212844/https://blog.le-vert.net/?p=160) seems to have been useful for people.
+Dovecot also shows a [HAProxy configuration](https://doc.dovecot.org/configuration_manual/haproxy/).  A script called
+[poolmon](https://github.com/brandond/poolmon/tree/master) seems to be used by some folks to remove/re-add backends
+to the director when they go unhealthy. Dovecot now ships a
+[dovemon](https://doc.dovecot.org/configuration_manual/dovemon/) program that works similarly, but it's only available in
+the non-free "Pro" version.
+There's also a [ceph plugin](https://github.com/ceph-dovecot/dovecot-ceph-plugin) to store emails in a Ceph backend.
+It also seems possible to store mailbox and index objects in an
+[object storage backend](https://doc.dovecot.org/admin_manual/dovecot_backend/), a configuration documented in the
+[Dovecot Cluster Architecture](https://doc.dovecot.org/admin_manual/dovecot_cluster_architecture/). It seems that, unfortunately, this
+is part of the "Pro" version of Dovecot, not usable in the free
+version (see [mailbox formats](https://doc.dovecot.org/admin_manual/mailbox_formats/)). There's also someone who
+implemented a [syncthing backend](https://github.com/fragtion/dovecot-core).
+## Submission server proposal
+Note: this proposal was discussed inline in the old
+[howto/submission](howto/submission) page, before the TPA-RFC process existed. It is
+kept here for historical reference.
+The idea is to create a new server to deal with delivery problems
+torproject.org email users are currently seeing. While they can
+receive email through their `user@torproject.org` forwards without too
+much problem, their emails often get dropped on the floor when
+*sending* from that email address.
+It is suspected that users are having those problems because the
+originating servers are not in the `torproject.org` domain. The hope
+is that setting up a new server inside that domain would help with
+delivery. There's anecdotal evidence (see [this comment](https://gitlab.torproject.org/legacy/trac/-/issues/30608#note_2325187) for
+example) that delivering emails from existing servers (over SSH to
+`iranicum`, in that example) improves reliability of email delivery
+significantly.
+This project came out of [ticket #30608][], which has the launch
+checklist. 
+[ticket #30608]: https://bugs.torproject.org/30608
+Note: [this article](https://news.purelymail.com/posts/blog/2019-06-21-deliverability-for-the-rest-of-us.html) has a good overview of deliverability issues
+faced by autonomous providers, which we already face on eugeni, but
+might be accentuated by this project.
+### Goals
+#### Must have
+ * basic compatibility with major clients (Thunderbird, Mail.app,
+   Outlook, Gmail?)
+ * delivery over secure (TLS + password) SMTP
+ * credentials stored in LDAP
+#### Nice to have
+ * automatic client configuration
+ * improved delivery over current federated configuration
+ * delivery reliability monitoring with major providers (e.g. hotmail,
+   gmail, yahoo)
+ * pretty graphs
+ * formalized SSH-key delivery to avoid storing cleartext passwords on
+   clients
+#### Non-Goals
+ * 100%, infallible, universal delivery to all providers (i.e. emails
+   will still be lost)
+ * mailbox management (i.e. no incoming email, IMAP, POP, etc)
+ * spam filtering (i.e. we won't check outgoing emails)
+ * no DKIM, SPF, DMARC, or ARC for now, although maybe a "null" SPF
+   record if it helps with delivery
+### Approvals required
+Approved by vegas, requested by network team, agreed with TPA at the
+Stockholm meeting.
+### Proposed Solution
+The proposed design is to set up a new email server in the [howto/ganeti](howto/ganeti)
+cluster (currently `gnt-fsn`) with the user list synchronized from
+LDAP, using a new password field (named `mailPassword`). The access
+would therefore be granted only to LDAP users, and LDAP accounts would
+be created as needed. In the short term, LDAP can be used to modify
+that password but in the mid-term, it would be modifiable through the
+web interface like the `webPassword` or `rtcPassword` fields.
+#### Current inventory
+ * active LDAP accounts: 91
+ * non-LDAP forwards (to real people): 24
+ * role forwards (to other @torproject.org emails): 76
+Forward targets:
+ * riseup.net: 30
+ * gmail.com: 21
+ * other: 93 (only 4 domains have more than one forward)
+Delivery rate: SMTP, on eugeni, is around 0.5qps, with a max of 8qps
+in the last 7 days (2019-06-06). But that includes mailing lists as
+well. During that period, around 27000 emails were delivered to
+@torproject.org aliases.
+### Cost
+Labor and `gnt-fsn` VM costs. To be detailed.
+Below is an evaluation of the various Alternatives that were considered.
+#### External hosting cost evaluation
+ * Google: 8$/mth/account? (to be verified?)
+ * riseup.net: anarcat requested price quotation
+ * koumbit.org:  default pricing: 100$/year on shared hosting and 50GB
+   total, possibly no spam filter. 1TB disk: 500$/year. disk
+   encryption would need to be implemented, quoted 2000-4000$ setup
+   fee to implement it in the AlternC opensource control panel.
+ * self-hosting: ~4000-5000EUR setup, 5000EUR-7500EUR/year, liberal
+   estimate (will probably be less)
+ * [mailfence](https://mailfence.com/en/secure-business-email.jsp): 1750 setup cost and 2.5 euros per user/year
+Note that the self-hosting cost evaluation is for the fully-fledged
+service. Option 2, above, of relaying email, has overall negligible
+costs although that theory has been questioned by members of the
+sysadmin team.
+#### Internal hosting cost evaluation
+This is a back-of-the-napkin calculation of what it would cost to host
+actual email services at TPA infrastructure itself. We consider this
+to be a “liberal” estimate, ie. costs would probably be less and time
+estimates have been padded (doubled) to cover for errors.
+Assumptions:
+ * each mailbox is, on average, at most 10GB
+ * 100 mailboxes maximum at first (so 1TB of storage required)
+ * LUKS full disk encryption
+ * IMAP and basic webmail (Roundcube or Rainloop)
+ * “Trees” mailbox encryption out of scope for now
+Hardware:
+ * Hetzner px62nvme 2x1TB RAID-1 64GB RAM 75EUR/mth, 900EUR/yr
+ * Hetzner px92 2x1TB SSD RAID-1 128GB RAM 115EUR/mth, 1380EUR/yr
+ * Total hardware: 2280EUR/yr, ~200EUR setup fee
+This assumes hosting the server on a dedicated server at Hetzner.  It
+might be possible (and more reliable) to ensure further cost savings
+by hosting it on our shared virtualized infrastructure. Calculations
+for this haven’t been performed by the team, but I would guess we
+might save around 25 to 50% of the above costs, depending on the
+actual demand and occupancy on the mail servers.
+Staff:
+ * LDAP password segregation: 4 hours*
+ * Dovecot deployment and LDAP integration: 8 hours
+ * Dovecot storage optimization: 8 hours
+ * Postfix mail delivery integration: 8 hours
+ * Spam filter deployment: 8 hours
+ * 100% cost overrun estimate: 36 hours
+ * Total setup costs: 72 hours @ 50EUR/hr: 3600EUR one time
+This is the most imprecise evaluation. Most email systems have been
+built incrementally.  The biggest unknown is the extra labor
+associated with running the IMAP server and spam filter. A few
+hypotheses:
+ * 1 hour a week: 52 hours @ 50EUR/hr: 2600EUR/yr
+ * 2 hours a week: 5200EUR/yr
+I would be surprised if the extra work went beyond one hour a week;
+it will probably be less. This also does not include 24/7 response
+time, but no service provider evaluated provides that level of service
+anyways.
+Total:
+ * One-time setup: 3800EUR (200EUR hardware, 3600EUR staff)
+ * Recurrent: roughly between 5000EUR and 7500EUR/year, majority in staff
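The totals above can be re-derived from the line items. The following is a minimal sketch of that arithmetic; the constant and function names are illustrative and not part of the original estimate.

```python
# Figures copied from the estimate above; the names are illustrative only.
HOURLY_RATE_EUR = 50
HARDWARE_MONTHLY_EUR = {"px62nvme": 75, "px92": 115}
HARDWARE_SETUP_EUR = 200
SETUP_TASK_HOURS = {
    "LDAP password segregation": 4,
    "Dovecot deployment and LDAP integration": 8,
    "Dovecot storage optimization": 8,
    "Postfix mail delivery integration": 8,
    "Spam filter deployment": 8,
}


def setup_cost_eur() -> int:
    """One-time cost: task hours doubled for the 100% overrun, plus hardware setup."""
    padded_hours = 2 * sum(SETUP_TASK_HOURS.values())
    return padded_hours * HOURLY_RATE_EUR + HARDWARE_SETUP_EUR


def yearly_cost_eur(hours_per_week: int) -> int:
    """Recurring cost: a year of hardware plus weekly maintenance labor."""
    hardware = 12 * sum(HARDWARE_MONTHLY_EUR.values())
    labor = 52 * hours_per_week * HOURLY_RATE_EUR
    return hardware + labor


print(setup_cost_eur())    # 3800
print(yearly_cost_eur(1))  # 4880
print(yearly_cost_eur(2))  # 7480
```

The two yearly figures (4880EUR and 7480EUR) are what the "roughly between 5000EUR and 7500EUR/year" range rounds to.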
+### Alternatives considered
+There are three dimensions to our “decision tree”:
+ 1. Hosting mailboxes or only forwards: this means that instead of
+    just forwarding emails to some other providers, we actually allow
+    users to store emails on the server. The current situation is
+    that we only do forwards.
+ 2. SMTP authentication: this means allowing users to submit email
+    using a username and password over the standard SMTP (technically
+    “submission”) port. This is currently not allowed, although some
+    have figured out they can do this over SSH already.
+ 3. Self-hosted or hosted elsewhere: whether we host the email service
+    ourselves or not. The current situation is we allow
+    inbound messages but we do not store them. Mailbox storage is
+    delegated to each individual choice of email provider, which also
+    handles SMTP authentication.
+Here is the breakdown of pros and cons of each approach. Note that
+there are multiple combinations of those possible, for example we
+could continue not having mailboxes but allow SMTP authentication, and
+delegate this to a third party. Obviously, some combinations (like no
+SMTP authentication and mailboxes) are a little absurd and should be
+taken with a grain of salt.
+#### TP full hosting: mailboxes, SMTP authentication
+Pros: 
+ * Easier for TPA to diagnose email problems than if email is hosted
+   by an undetermined third party
+ * People’s personal email is not mixed up with Tor email.
+ * Easier delegation between staff on rotations
+ * Control over where data is stored and how
+ * Full control of our infrastructure
+ * Fewer trust issues
+Cons:
+ * probably the most expensive option
+ * requires more skilled staff
+ * high availability harder to achieve
+ * high costs
+#### TP not hosting mailboxes; TP hosting outgoing SMTP authentication server
+Pros:
+ * No data retention issues: TP not responsible for legal issues
+   surrounding mailboxes contents
+ * Solves delivery problem and nothing else (minimal solution)
+ * We’re already running an SMTP server
+ * SSH tunnels already let our lunatic-fringe do a version of this
+ * Staff keeps using own mail readers (eg gmail UI) for receiving mail
+ * Federated solution
+ * probably the cheapest option
+ * Work email cannot be accessed by TP staff
+Cons:
+ * SMTP-AUTH password management (admin effort and risk)
+ * Possible legal requests to record outgoing mail? (SSH
+   lunatic-fringe already at risk, though)
+ * DKIM/SPF politics vs “slippery slope”
+ * Forces people to figure out some good ISP to host their email
+ * Shifts the support burden to individuals
+ * Harder to diagnose email problems
+ * Staff or “role” email accounts cannot be shared
+#### TP pays third party (riseup, protonmail, mailfence, gmail??) for full service (mailboxes, delivery)
+Pros:
+ * Less admin effort
+ * Less/no risk to TP infrastructure (legal or technical)
+ * Third party does not hold email data hostage; only handles outgoing
+ * We know where data is hosted instead of being spread around
+Cons:
+ * Not a federated solution
+ * Implicitly accepts email cartel model of “trusted” ISPs
+ * Varying levels of third party data management trust required
+ * Some third parties require custom software (protonmail)
+ * Single point of failure.
+ * Might force our users to pick a provider they dislike
+ * All eggs in the same basket
+#### Status quo (no mailboxes, no authentication)
+Pros:
+ * Easy. Fast. Cheap. Pick three.
+Cons:
+ * Shifts burden of email debugging to users, lack of support
+Details of the chosen alternative (SMTP authentication):
+ * Postfix + offline LDAP authentication (current proposal)
+ * Postfix + direct LDAP authentication: discarded because it might
+   fail when the LDAP server goes down. LDAP server is currently not
+   considered to be critical and can be restarted for maintenance
+   without affecting the rest of the infrastructure.
+ * reusing existing field like `webPassword` or `rtcPassword` in LDAP:
+   considered a semantic violation.
+See also internal Nextcloud document.
+No benchmark considered necessary.
+## New mail setup as per 2024-11-25
+Our new MX (mx-dal-01.torproject.org) and SRS (srs-dal-01.torproject.org) servers try to fit into the picture presented in TPA-RFC-44:
+![](../policy/tpa-rfc-44-email-emergency-recovery/architecture-post.png)
+SRS would classify as 'other TPA mail server' in this picture. It notably *does* send mail to internet non-TPO mail hosts.
+### Routing
+Our main domain name is `torproject.org`. There are numerous subdomains and domain variants (e.g., `nevii.torproject.org`, `torproject.net`, etc.). These are all alias domains, meaning all addresses will be aliased to their `torproject.org` counterpart.
+Lacking mailboxes, a `torproject.org` e-mail address can either be defined as an alias or as a forward.
+Aliases are defined in Hiera.
+Domain aliases are defined in Hiera and through puppet exported resources.
+Forwards are defined in Hiera and in LDAP.
+The MX resolves all aliases. It does *not* resolve forwards, but transports them to the SRS server(s). It does *not* deliver mail to internet non-TPO mail servers.
+The SRS server resolves all forwards, applies sender rewriting when necessary, and sends the mail out into the world.
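The split of responsibilities described above can be sketched as follows. This is only an illustration: the sample addresses, the table contents and the function names are hypothetical, the real tables live in Hiera, Puppet exported resources and LDAP, and rejecting unknown addresses on the MX is an assumption.

```python
MAIN_DOMAIN = "torproject.org"
# Hypothetical sample data, for illustration only.
ALIAS_DOMAINS = {"nevii.torproject.org", "torproject.net"}
ALIASES = {"role@torproject.org": ["alice@torproject.org"]}
FORWARDS = {"alice@torproject.org": "alice@example.net"}


def canonicalize(address: str) -> str:
    """Map an address in an alias domain to its torproject.org counterpart."""
    local, _, domain = address.lower().rpartition("@")
    if domain in ALIAS_DOMAINS:
        domain = MAIN_DOMAIN
    return f"{local}@{domain}"


def mx_route(recipient: str) -> list[tuple[str, str]]:
    """Resolve aliases on the MX; forwards are handed to the SRS host unresolved."""
    pending = [canonicalize(recipient)]
    seen = set()
    routes = []
    while pending:
        address = pending.pop()
        if address in seen:  # guard against alias loops
            continue
        seen.add(address)
        if address in ALIASES:
            pending.extend(canonicalize(a) for a in ALIASES[address])
        elif address in FORWARDS:
            routes.append(("srs", address))
        else:
            routes.append(("reject", address))  # assumption: unknown address
    return routes


def srs_resolve(address: str) -> str:
    """Resolve a forward on the SRS host, which then delivers to the internet."""
    return FORWARDS[address]


print(mx_route("role@torproject.net"))
print(srs_resolve("alice@torproject.org"))
```

In this sketch, only `srs_resolve` ever returns an address outside `torproject.org`, which mirrors the rule that the MX does not deliver to non-TPO hosts.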
+### DNS
+MX servers need to be part of the `torproject.org` MX record.
+SRS servers need to be part of the `srs.torproject.org` MX record.
+SRS servers need to be added to the `torproject.org` SPF record.
+SRS servers need to have their DKIM keys added to `torproject.org` DNS.
+All mail servers need reverse DNS records.
+All mail servers need a TLSA record.
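The requirements above amount to a per-role checklist. Here is a minimal sketch of it; the function name and the wording of each entry are illustrative, and the `_25._tcp.` owner name assumes the TLSA record covers SMTP on port 25.

```python
def required_dns_records(hostname: str, role: str) -> list[str]:
    """List the DNS records needed for a mail host, given its role ("mx" or "srs")."""
    # Needed by every mail server, regardless of role.
    records = [
        f"PTR record resolving back to {hostname}",
        f"TLSA record at _25._tcp.{hostname}",
    ]
    if role == "mx":
        records.append(f"torproject.org MX -> {hostname}")
    elif role == "srs":
        records += [
            f"srs.torproject.org MX -> {hostname}",
            f"torproject.org SPF record includes {hostname}",
            f"DKIM public key for {hostname} published under torproject.org",
        ]
    else:
        raise ValueError(f"unknown role: {role!r}")
    return records


for line in required_dns_records("srs-dal-01.torproject.org", "srs"):
    print(line)
```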
+### TLS
+All mail servers need a valid public CA (Let's Encrypt) certificate.
+All mail servers implement DANE (both in terms of publishing a TLSA record and checking for TLSA records when delivering mail).
+### DMARC
+DMARC checks are performed on the MX servers and Authentication-Results are added as a header. DMARC is not strictly enforced, but failure to pass results in penalties to the spam score.
+The SRS servers add DKIM signatures to all mail whose `From:` header has a `torproject.org` address (header `From`, not envelope From). They do so using rspamd.
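The distinction between the header `From` and the envelope sender can be illustrated with a short sketch. This is not how rspamd is configured; it only shows the decision rule, and it assumes an exact match on `torproject.org` (subdomains are not considered). The function name is hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr

SIGNING_DOMAIN = "torproject.org"


def should_sign(raw_message: str) -> bool:
    """True when the header From address is in the signing domain.

    The envelope sender is not part of the message headers, so it
    plays no role in this decision.
    """
    message = message_from_string(raw_message)
    _, address = parseaddr(message.get("From", ""))
    return address.lower().rpartition("@")[2] == SIGNING_DOMAIN


tpo = "From: Alice <alice@torproject.org>\nSubject: hi\n\nbody\n"
other = "From: Bob <bob@example.net>\nSubject: hi\n\nbody\n"
print(should_sign(tpo), should_sign(other))
```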
+### Spam and virus checks
+The MX server checks incoming mail for spam and viruses using rspamd and clamav.
+Viruses and very obvious spam get rejected straight away.
+Suspicion of possible spam results in greylisting, with spam results added as headers when the mail does go through.
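The three outcomes described above can be sketched as a small decision function. The thresholds, the action labels and the `retried` flag are hypothetical and chosen only for illustration; the actual scores are whatever rspamd is configured with.

```python
# Hypothetical thresholds, for illustration only.
REJECT_SCORE = 15.0
GREYLIST_SCORE = 4.0


def spam_action(score: float, has_virus: bool, retried: bool = False) -> str:
    """Pick an action for an incoming message from its scan results."""
    if has_virus or score >= REJECT_SCORE:
        return "reject"
    if score >= GREYLIST_SCORE:
        # Suspicious mail is deferred once; a retry is accepted with
        # the spam results recorded in its headers.
        return "accept with spam headers" if retried else "greylist"
    return "accept"


print(spam_action(20.0, has_virus=False))
print(spam_action(6.0, has_virus=False))
print(spam_action(6.0, has_virus=False, retried=True))
```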
+### SRS
+SRS servers perform sender rewriting, changing the envelope `From` to a `torproject.org` address where needed.
+The MX servers translate rewritten `torproject.org` addresses back to their original to process bounce mails.
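The rewrite and its reversal can be sketched as below. This is a simplified illustration of the `SRS0=hash=timestamp=domain=local` address layout, not the implementation running on the servers: the secret, the hex-encoded four-character tag and the function names are assumptions, and treating "where needed" as "the sender is not already in `torproject.org`" is also an assumption.

```python
import hashlib
import hmac

SECRET = b"hypothetical-secret"      # assumption: known to both SRS and MX hosts
FORWARDER_DOMAIN = "torproject.org"  # domain used for rewritten senders


def _tag(timestamp: str, domain: str, local: str) -> str:
    """Short keyed hash that stops third parties from forging rewritten addresses."""
    data = f"{timestamp}{domain}{local}".lower().encode()
    return hmac.new(SECRET, data, hashlib.sha1).hexdigest()[:4]


def srs_forward(sender: str, timestamp: str) -> str:
    """Rewrite an envelope sender so that bounces come back to the forwarder."""
    local, _, domain = sender.rpartition("@")
    if domain.lower() == FORWARDER_DOMAIN:
        return sender  # already ours, no rewriting needed
    tag = _tag(timestamp, domain, local)
    return f"SRS0={tag}={timestamp}={domain}={local}@{FORWARDER_DOMAIN}"


def srs_reverse(address: str) -> str:
    """Recover the original sender from a rewritten address, to route a bounce."""
    local_part = address.rpartition("@")[0]
    prefix, tag, timestamp, domain, local = local_part.split("=", 4)
    if prefix != "SRS0" or not hmac.compare_digest(tag, _tag(timestamp, domain, local)):
        raise ValueError("not a valid rewritten address")
    return f"{local}@{domain}"


rewritten = srs_forward("bob@example.net", "AB")
print(rewritten)
print(srs_reverse(rewritten))
```

The timestamp is passed in as an opaque string here to keep the sketch deterministic; real implementations derive it from the current date so that old rewritten addresses expire.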