Fix formatting a bit authored by lelutin's avatar lelutin
we get rid of a couple of markdownlint issues with this. we'll use
fenced code blocks in order to have syntax highlighting: that should
make it a bit easier to read in the rendered version.

I also changed the point about keeping the psql session open as a
sub-point since it's not really something that needs to be explicitly
done. so the list is one point shorter now.
......@@ -246,12 +246,12 @@ send an email containing with a random string to the requester to prove that the
control the email address.
Secondly, the redaction request must be precise and not overly broad. For
example, redacting all instances of "Joe" from the mail archives would
not be acceptable.
example, redacting all instances of "Joe" from the mail archives would not be
acceptable.
Once all that is established, the actual redaction can proceed.
If the requests is limited to one or few messages, then the first compliance
If the request is limited to one or few messages, then the first compliance
option would be to simply delete the messages from the archives. This can be
done using an admin account directly from the web interface.
......@@ -265,165 +265,168 @@ Such "surgical" redaction is done using SQL statements against the `mailman3`
database directly, as mailman doesn't offer any similar compliance mechanism.
In this example, we'll pretend to handle a request to redact the name "Foo Bar"
and an associated email address, "foo@bar.com":
and an associated email address, `foo@bar.com`:
0. Login to `lists-01`, run `sudo -u postgres psql` and `\c mailman3`
1. Backup the affected database rows to temporary tables:
```
CREATE TEMP TABLE hyperkitty_attachment_redact AS
SELECT * FROM hyperkitty_attachment
```sql
CREATE TEMP TABLE hyperkitty_attachment_redact AS
SELECT * FROM hyperkitty_attachment
WHERE
content_type = 'text/html'
and email_id IN
content_type = 'text/html'
and email_id IN
(SELECT id FROM hyperkitty_email
WHERE content LIKE '%Foo Bar%'
OR content LIKE '%foo@bar.com%');
OR content LIKE '%foo@bar.com%');
CREATE TEMP TABLE hyperkitty_email_redact AS
SELECT * from hyperkitty_email
WHERE content LIKE '%Foo Bar%'
OR content LIKE '%foo@bar.com.com%';
CREATE TEMP TABLE hyperkitty_email_redact AS
SELECT * from hyperkitty_email
WHERE content LIKE '%Foo Bar%'
OR content LIKE '%foo@bar.com.com%';
CREATE TEMP TABLE hyperkitty_sender_redact AS
SELECT * from hyperkitty_sender
WHERE address = 'foo@bar.com';
CREATE TEMP TABLE hyperkitty_sender_redact AS
SELECT * from hyperkitty_sender
WHERE address = 'foo@bar.com';
CREATE TEMP TABLE address_redact AS
SELECT * FROM address
WHERE display_name = 'Foo Bar'
OR email = 'foo@bar.com';
CREATE TEMP TABLE address_redact AS
SELECT * FROM address
WHERE display_name = 'Foo Bar'
OR email = 'foo@bar.com';
CREATE TEMP TABLE user_redact AS
SELECT * from "user"
WHERE display_name = 'Foo Bar';
```
CREATE TEMP TABLE user_redact AS
SELECT * from "user"
WHERE display_name = 'Foo Bar';
```
2. Run the actual modifications inside a transaction:
BEGIN;
-- hyperkitty_attachment --
-- redact the name and email in html attachments
-- (only if found in plaintext email)
UPDATE hyperkitty_attachment
SET content = convert_to(
replace(
convert_from(content, 'UTF8'),
'Foo Bar',
'[REDACTED]'
),
'UTF8')
WHERE
content_type = 'text/html'
AND email_id IN
(SELECT id FROM hyperkitty_email
WHERE content LIKE '%Foo Bar%');
UPDATE hyperkitty_attachment
SET content = convert_to(
replace(
convert_from(content, 'UTF8'),
'foo@bar.com',
'[REDACTED]'
), 'UTF8')
WHERE
content_type = 'text/html'
AND email_id IN
(SELECT id FROM hyperkitty_email WHERE content LIKE '%foo@bar.com%');
-- --- hyperkitty_email ---
-- redact the name and email in plaintext emails
UPDATE hyperkitty_email
SET content = REPLACE(content,
'Foo Bar <foo@bar.com>',
'[REDACTED]')
WHERE content LIKE '%Foo Bar <foo@bar.com>%';
UPDATE hyperkitty_email
SET content = REPLACE(content,
'Foo Bar',
'[REDACTED]')
WHERE content LIKE '%Foo Bar%';
UPDATE hyperkitty_email
SET content = REPLACE(content,
'foo@bar.com',
'[REDACTED]')
WHERE content LIKE '%foo@bar.com%';
UPDATE hyperkitty_email -- done
SET sender_name = '[REDACTED]'
WHERE sender_name = 'Foo Bar';
-- obfuscate the sender_id, must be unique
-- combines the two updates to satisfy foreign key constraints:
WITH sender AS (
UPDATE hyperkitty_sender
SET address = encode(sha256(address::bytea), 'hex')
WHERE address = 'foo@bar.com'
RETURNING address
) UPDATE hyperkitty_email
SET sender_id = encode(sha256(sender_id::bytea), 'hex')
WHERE sender_id = 'foo@bar.com';
-- address --
-- redact the name and email
-- email must match the identifier used in hyperkitty_sender.address
UPDATE address -- done
SET display_name = '[REDACTED]'
WHERE display_name = 'Foo Bar';
UPDATE address -- done
SET email = encode(sha256(email::bytea), 'hex')
WHERE email = 'foo@bar.com';
-- user --
-- redact the name
-- use double quotes around the table name
-- redact display_name in user table
UPDATE "user"
SET display_name = '[REDACTED]'
WHERE display_name = 'Foo Bar';
```sql
BEGIN;
-- hyperkitty_attachment --
-- redact the name and email in html attachments
-- (only if found in plaintext email)
UPDATE hyperkitty_attachment
SET content = convert_to(
replace(
convert_from(content, 'UTF8'),
'Foo Bar',
'[REDACTED]'
),
'UTF8')
WHERE
content_type = 'text/html'
AND email_id IN
(SELECT id FROM hyperkitty_email
WHERE content LIKE '%Foo Bar%');
UPDATE hyperkitty_attachment
SET content = convert_to(
replace(
convert_from(content, 'UTF8'),
'foo@bar.com',
'[REDACTED]'
), 'UTF8')
WHERE
content_type = 'text/html'
AND email_id IN
(SELECT id FROM hyperkitty_email WHERE content LIKE '%foo@bar.com%');
-- --- hyperkitty_email ---
-- redact the name and email in plaintext emails
UPDATE hyperkitty_email
SET content = REPLACE(content,
'Foo Bar <foo@bar.com>',
'[REDACTED]')
WHERE content LIKE '%Foo Bar <foo@bar.com>%';
UPDATE hyperkitty_email
SET content = REPLACE(content,
'Foo Bar',
'[REDACTED]')
WHERE content LIKE '%Foo Bar%';
UPDATE hyperkitty_email
SET content = REPLACE(content,
'foo@bar.com',
'[REDACTED]')
WHERE content LIKE '%foo@bar.com%';
UPDATE hyperkitty_email -- done
SET sender_name = '[REDACTED]'
WHERE sender_name = 'Foo Bar';
-- obfuscate the sender_id, must be unique
-- combines the two updates to satisfy foreign key constraints:
WITH sender AS (
UPDATE hyperkitty_sender
SET address = encode(sha256(address::bytea), 'hex')
WHERE address = 'foo@bar.com'
RETURNING address
) UPDATE hyperkitty_email
SET sender_id = encode(sha256(sender_id::bytea), 'hex')
WHERE sender_id = 'foo@bar.com';
-- address --
-- redact the name and email
-- email must match the identifier used in hyperkitty_sender.address
UPDATE address -- done
SET display_name = '[REDACTED]'
WHERE display_name = 'Foo Bar';
UPDATE address -- done
SET email = encode(sha256(email::bytea), 'hex')
WHERE email = 'foo@bar.com';
-- user --
-- redact the name
-- use double quotes around the table name
-- redact display_name in user table
UPDATE "user"
SET display_name = '[REDACTED]'
WHERE display_name = 'Foo Bar';
```
3. Look around the modified tables, do `COMMIT;` if all good, otherwise `ROLLBACK;`
4. Ending the `psql` session discards the temporary tables, so keep it open
5. Look at the archives if everything is ok
6. End the `psql` session
To rollback changes after the transaction has been committed to the database,
using the temporary tables:
* Ending the `psql` session discards the temporary tables, so keep it open
UPDATE hyperkitty_attachment hka
SET content = hkar.content
FROM hyperkitty_attachment_redact hkar WHERE hka.id = hkar.id;
4. Look at the archives to confirm that everything is ok
UPDATE hyperkitty_email hke
SET content = hker.content,
sender_id = hker.sender_id,
sender_name = hker.sender_name
FROM hyperkitty_email_redact hker WHERE hke.id = hker.id;
5. End the `psql` session
UPDATE hyperkitty_sender hks
SET address = hksr.address
FROM hyperkitty_sender_redact hksr WHERE hks.mailman_id = hksr.mailman_id;
UPDATE address a
SET email = ar.email,
display_name = ar.display_name
FROM address_redact ar WHERE a.id = ar.id;
To rollback changes after the transaction has been committed to the database,
using the temporary tables:
UPDATE "user" u
SET display_name = ur.display_name
FROM user_redact ur WHERE u.id = ur.id;
```sql
UPDATE hyperkitty_attachment hka
SET content = hkar.content
FROM hyperkitty_attachment_redact hkar WHERE hka.id = hkar.id;
UPDATE hyperkitty_email hke
SET content = hker.content,
sender_id = hker.sender_id,
sender_name = hker.sender_name
FROM hyperkitty_email_redact hker WHERE hke.id = hker.id;
UPDATE hyperkitty_sender hks
SET address = hksr.address
FROM hyperkitty_sender_redact hksr WHERE hks.mailman_id = hksr.mailman_id;
UPDATE address a
SET email = ar.email,
display_name = ar.display_name
FROM address_redact ar WHERE a.id = ar.id;
UPDATE "user" u
SET display_name = ur.display_name
FROM user_redact ur WHERE u.id = ur.id;
```
The next time such a request occur, it might be best to deploy the
above formula as a simple "noop" Fabric task.
......
......