Hi all,
SecureDrop is migrating from GPG to Sequoia in our upcoming 2.7.0 release. I want to give a brief overview of how SecureDrop works and then a request for review if anyone has some time.
For those not familiar, SecureDrop is a whistleblower submission system. Sources upload documents and send messages; journalists review them and decide whether to act on the information. There's a high-level architecture overview at https://docs.securedrop.org/en/stable/what_is_securedrop.html.
== Our PGP operations ==
Submissions are encrypted for journalists using PGP. The journalist secret key is kept offline on an airgapped machine, while the public key is on the SecureDrop server and generally publicly available.
When a source logs in, they are given a diceware passphrase ("codename" in SD documentation). The server generates a PGP keypair protected by that passphrase. When the source submits a document/message, it is encrypted for the journalist key and stored. (Note: sources can pre-encrypt their submission locally and upload that, in which case we don't encrypt it again, but our impression is that this is rarely done.)
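As a rough illustration of the codename step (not SecureDrop's actual implementation), a diceware-style passphrase can be drawn with Python's `secrets` module. The word list, the seven-word default, and the function name are assumptions made for this sketch.

```python
import secrets


def generate_codename(wordlist, num_words=7):
    """Pick num_words words uniformly at random using the OS CSPRNG."""
    return " ".join(secrets.choice(wordlist) for _ in range(num_words))


# Stand-in word list for illustration only; a real diceware list
# contains 7776 distinct, human-readable words.
DEMO_WORDLIST = [f"word{i:04d}" for i in range(7776)]

codename = generate_codename(DEMO_WORDLIST)
```

The important property is that the words come from a cryptographically secure random source rather than from the `random` module.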
If the journalist chooses to reply to a source, that message is encrypted to both the specific source and the journalist (so journalists can read messages they sent). When a source logs in with their passphrase, we decrypt any journalist replies and display them to the source.
Currently this is all done using GPG, using a vendored copy of the unmaintained pretty_bad_protocol library: https://github.com/freedomofpress/securedrop/tree/develop/securedrop/pretty_bad_protocol.
== Migration to Sequoia ==
Any new source gets a keypair generated by Sequoia, which is stored in our SQLite database in armored format instead of the GPG keyring.
Upon upgrade we run a one-time migration that iterates over the GPG keyring, exporting public keys into our database.
When sources log in, we use their passphrase to export their secret key out of GPG and into our database as well.
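The two-stage migration described above might be modelled roughly as follows. This is a sketch using Python's built-in `sqlite3` module; the table and column names, the function names, and the placeholder key strings are all hypothetical, and the GPG export itself is not shown.

```python
import sqlite3


def create_schema(db):
    # Hypothetical schema: one row per source, keys stored as armored text.
    db.execute(
        "CREATE TABLE sources ("
        " id INTEGER PRIMARY KEY,"
        " fingerprint TEXT UNIQUE NOT NULL,"
        " public_key TEXT NOT NULL,"  # filled by the one-time migration
        " secret_key TEXT)"           # NULL until the source next logs in
    )


def migrate_public_keys(db, exported):
    """One-time step: store (fingerprint, armored public key) pairs."""
    db.executemany(
        "INSERT OR IGNORE INTO sources (fingerprint, public_key) VALUES (?, ?)",
        exported,
    )


def store_secret_key_on_login(db, fingerprint, armored_secret):
    """Lazy step: only possible at login, when the passphrase is available."""
    db.execute(
        "UPDATE sources SET secret_key = ? "
        "WHERE fingerprint = ? AND secret_key IS NULL",
        (armored_secret, fingerprint),
    )


def pending_sources(db):
    """Count sources whose secret key still lives only in the GPG keyring."""
    row = db.execute(
        "SELECT COUNT(*) FROM sources WHERE secret_key IS NULL"
    ).fetchone()
    return row[0]
```

A counter like `pending_sources` is one way to tell when no GPG-backed sources remain and the migration path is no longer exercised.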
We've also added some checks for SecureDrop admins to reject SHA-1 keys during installation using sq-keyring-linter.
To bring Sequoia into our Python application, we've written a small Rust crate ("redwood") that implements the encryption, decryption and key generation functions and can be exported to Python using the PyO3 bridge: https://github.com/freedomofpress/securedrop/tree/develop/redwood.
== Review request ==
Thanks to Wiktor, who has already reviewed some of our code, to Neal for input on issues, and to everyone who replied to my questions on IRC.
We'd appreciate if anyone could take a look at our Sequoia-interfacing code, with a focus on the following areas:
* Are we creating PGP key pairs correctly?
* Is this (armored text) a safe format to store keys in?
* Are we generally using the correct Sequoia APIs / are we using those APIs correctly?
All of our Rust code is at https://github.com/freedomofpress/securedrop/tree/develop/redwood and the main Python<-->Rust integration is at https://github.com/freedomofpress/securedrop/blob/develop/securedrop/encryption.py.
We'd also welcome hearing about any other concerns, or about issues other migrations have run into that we might not have considered.
== Thanks ==
As a general note, it has been fantastic to use Sequoia. So thank you to everyone who has worked on it!
-- Kunal, on behalf of the SecureDrop team
Hi Kunal,
Thanks for reaching out!
On Wed, 11 Oct 2023 19:19:06 +0200, Kunal Mehta via Devel wrote:
> SecureDrop is migrating from GPG to Sequoia in our upcoming 2.7.0 release.
That's exciting news for us!
> When a source logs in, they are given a diceware passphrase ("codename" in SD documentation). The server generates a PGP keypair protected by that passphrase.
The generate_source_key_pair function takes an email address:
https://github.com/freedomofpress/securedrop/blob/develop/redwood/src/lib.rs...
Is it really a good idea to identify sources by their email address? Note: OpenPGP doesn't require a self-signed user ID, and Sequoia (unlike gpg) supports using certificates without any user IDs.
> When sources log in, we use their passphrase to export their secret key out of GPG and into our database as well.
Do keys for sources expire eventually? If not, does that mean you're planning on keeping this migration code around forever? If so, we should probably prioritize this issue:
https://gitlab.com/sequoia-pgp/sequoia/-/issues/928
How are you authenticating the public key material? Are you assuming that the password is the trust root, and using that to authenticate the user's public key? If so, you might be vulnerable to the KO attacks:
KO attacks are possible when the secret key material is not stored on a trusted medium.
> To bring Sequoia into our Python application, we've written a small Rust crate ("redwood") that implements the encryption, decryption and key generation functions and can be exported to Python using the PyO3 bridge: https://github.com/freedomofpress/securedrop/tree/develop/redwood.
I took a look at redwood and filed a few minor issues. I think this is a really great example of a point solution done well. We've been reluctant to write generic wrappers for different languages as there is so much API surface to cover, and have instead advocated for exactly this approach. It's good to see that it works.
> We'd appreciate if anyone could take a look at our Sequoia-interfacing code, with a focus on the following areas:
> - Are we creating PGP key pairs correctly?
generate_source_key_pair looks fine. As mentioned above, I wonder whether using a user ID is really sensible.
https://github.com/freedomofpress/securedrop/blob/develop/redwood/src/lib.rs...
Where are the keys for journalists generated?
> - Is this (armored text) a safe format to store keys in?
Good question!
Here's what a password protected key looks like:
  $ sq key generate --with-password --output - --rev-cert /dev/null | sq packet dump
  No user ID given, using direct key signature
  Enter password to protect the key:
  Repeat the password once more:
  Error: File FileOrStdout(Some("/dev/null")) exists, use "sq --force ..." to overwrite
  Secret-Key Packet, new CTB, 134 bytes
      Version: 4
      Creation time: 2023-10-13 09:19:43 UTC
      Pk algo: EdDSA
      Pk size: 256 bits
      Fingerprint: 3813AF675F10B96FE7E8045B736052477A1D3BE2
      KeyID: 736052477A1D3BE2

      Secret Key:

        Encrypted
        S2K: Iterated
          Hash: SHA256
          Salt: 4E9DC4C911A889C7
          Hash bytes: 65011712
        Sym. algo: AES-256
The password is stretched using OpenPGP's "iterated and salted s2k" KDF. You can read about it here:
https://datatracker.ietf.org/doc/html/rfc4880#section-3.7.1.3
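For readers who want to see what that KDF does, here is a minimal Python sketch of the iterated and salted S2K from RFC 4880, section 3.7.1.3, restricted to SHA-256. The function names are made up for this example, and it is an illustration rather than a vetted implementation. Note that the "Hash bytes: 65011712" in the dump corresponds to the maximum coded count, 255.

```python
import hashlib


def decode_count(coded_count: int) -> int:
    """Expand the one-octet coded count into a byte count (RFC 4880, 3.7.1.3)."""
    return (16 + (coded_count & 15)) << ((coded_count >> 4) + 6)


def iterated_salted_s2k(passphrase: bytes, salt: bytes, coded_count: int,
                        key_len: int = 32) -> bytes:
    """Derive key_len bytes from a passphrase using SHA-256."""
    if len(salt) != 8:
        raise ValueError("the salt must be exactly 8 octets")
    data = salt + passphrase
    # At least one full copy of salt + passphrase is always hashed.
    count = max(decode_count(coded_count), len(data))
    # Feed the hash in large chunks that are a whole number of copies of data.
    chunk = data * max(1, 65536 // len(data))
    key = b""
    context = 0
    while len(key) < key_len:
        # Additional contexts are preloaded with zero octets when the
        # requested key is longer than one hash output.
        h = hashlib.sha256(b"\x00" * context)
        remaining = count
        while remaining >= len(chunk):
            h.update(chunk)
            remaining -= len(chunk)
        full, rest = divmod(remaining, len(data))
        h.update(data * full + data[:rest])
        key += h.digest()
        context += 1
    return key[:key_len]
```

The count is the number of octets fed to the hash, not the number of hash invocations, so even the maximum setting amounts to hashing about 65 MB once per guess.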
It's not a great KDF anymore. The crypto refresh introduces an Argon2 variant, which has better security properties:
https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-11.html#na...
In practice, I don't think it matters, as you are using a high entropy password (diceware).
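A back-of-the-envelope check of that claim: each word drawn uniformly from a standard 7776-word diceware list contributes log2(7776), about 12.9 bits. The seven-word length below is only an example, not a statement about SecureDrop's configuration, and the function name is made up for this sketch.

```python
import math


def diceware_entropy_bits(num_words: int, wordlist_size: int = 7776) -> float:
    """Entropy of num_words words chosen independently and uniformly."""
    return num_words * math.log2(wordlist_size)


# Example only: a hypothetical seven-word passphrase.
bits = diceware_entropy_bits(7)
```

With that much entropy in the input, an attacker's cost is dominated by the size of the passphrase space rather than by the per-guess cost of the KDF.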
> - Are we generally using the correct Sequoia APIs / are we using those APIs correctly?
My impression after carefully reviewing the redwood code is yes. I have reported the minor issues that I found.
It would be helpful for us if you could share how you approached using Sequoia's API, and any issues you had along the way. Did you mostly look at the documentation? the examples? Did you look at the implementation? other programs (sq?) that use sequoia-openpgp? What types of difficulties did you have? What could have made working with sequoia easier? Any other suggestions?
> As a general note, it has been fantastic to use Sequoia. So thank you to everyone who has worked on it!
:D Thanks for the feedback! One of our major goals was to make the API usable and pleasant to work with, and it sounds like we achieved that!
:) Neal
Hi,
On 10/13/23 03:21, Neal H. Walfield wrote:
> The generate_source_key_pair function takes an email address:
> https://github.com/freedomofpress/securedrop/blob/develop/redwood/src/lib.rs...
> Is it really a good idea to identify sources by their email address?
It's not actually their email; it's a randomly generated ID. We're just using the "email" terminology because that's what GPG used.
> Note: OpenPGP doesn't require a self-signed user ID, and Sequoia (unlike gpg) supports using certificates without any user IDs.
I neglected to mention this in my initial email (but later clarified on IRC): we still need to be GPG-compatible until we've ported the SecureDrop Workstation to Sequoia as well (https://github.com/freedomofpress/securedrop-workstation/issues/812).
>> When sources log in, we use their passphrase to export their secret key out of GPG and into our database as well.
> Do keys for sources expire eventually? If not, does that mean you're planning on keeping this migration code around forever? If so, we should probably prioritize this issue:
Source keys never expire, but best practice is to delete the source and their keypair once the journalist is done communicating with them, so in theory instances should eventually get rid of all remaining GPG-backed sources. The migration code won't be around forever, though, probably just a few years: our long-term plan is to have SecureDrop adopt an end-to-end encrypted protocol that most likely won't be OpenPGP-based (see https://securedrop.org/news/future-directions-for-securedrop/).
> How are you authenticating the public key material? Are you assuming that the password is the trust root, and using that to authenticate the user's public key? If so, you might be vulnerable to the KO attacks:
> KO attacks are possible when the secret key material is not stored on a trusted medium.
TIL about this. We're storing the key pair in our database, so an attacker would first need a SQL injection attack to modify the key. Beyond that, we're not vulnerable to the specific attacks described in the paper, since we don't generate signatures (just encryption/decryption) and Sequoia isn't vulnerable to the second decryption attack.
> I took a look at redwood and filed a few minor issues. I think this is a really great example of a point solution done well. We've been reluctant to write generic wrappers for different languages as there is so much API surface to cover, and have instead advocated for exactly this approach. It's good to see that it works.
Thanks for the issues and review :-) I'll add that one of the secondary reasons for writing the limited Python bindings ourselves was so we could begin the process of shipping our own Rust code, along with all the tooling that goes with it.
> Here's what a password protected key looks like:
<snip>
> In practice, I don't think it matters, as you are using a high entropy password (diceware).
Ack.
> It would be helpful for us if you could share how you approached using Sequoia's API, and any issues you had along the way. Did you mostly look at the documentation? the examples? Did you look at the implementation? other programs (sq?) that use sequoia-openpgp? What types of difficulties did you have? What could have made working with sequoia easier? Any other suggestions?
Sure! I started with https://docs.sequoia-pgp.org/sequoia_guide/chapter_02/index.html and basically copied that into our code, iteratively adjusting it as needed by using the API reference docs and occasionally peeking at the underlying code when I needed to figure out a type's name. I don't recall looking at the code of other programs that use Sequoia.
As a person who is reasonably familiar with GPG, I mostly struggled with understanding lower-level OpenPGP concepts. For example, I had no idea what a PKESK is, so it took me a while to figure out why the code failed to decrypt messages encrypted for multiple recipients (https://github.com/freedomofpress/securedrop/pull/6891) - the example code in the guide (reasonably) just assumed it was the first/only recipient.
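For anyone else who trips over the same thing: a PKESK is a Public-Key Encrypted Session Key packet, and a message encrypted for several recipients carries one per recipient, each wrapping the same session key. The toy Python model below (plain dataclasses, not Sequoia's API; every name in it is hypothetical) shows the difference between taking the first PKESK and searching for the one addressed to a key we hold.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Pkesk:
    """Toy stand-in for a PKESK packet: who it is for, plus the wrapped key."""
    recipient_key_id: str
    encrypted_session_key: bytes


def select_pkesk(pkesks: Iterable[Pkesk], our_key_ids: Set[str]) -> Optional[Pkesk]:
    """Return the first PKESK addressed to one of our keys, or None."""
    for pkesk in pkesks:
        if pkesk.recipient_key_id in our_key_ids:
            return pkesk
    return None


# A reply encrypted to both the journalist and the source carries two PKESKs.
message_pkesks = [
    Pkesk("JOURNALIST-KEY-ID", b"wrapped-for-journalist"),
    Pkesk("SOURCE-KEY-ID", b"wrapped-for-source"),
]

# Assuming the first packet is ours picks the journalist's PKESK and fails
# for the source; matching on key ID finds the right one.
chosen = select_pkesk(message_pkesks, {"SOURCE-KEY-ID"})
```

This sketch ignores hidden recipients (wildcard key IDs), where a real decryptor has to try each available secret key in turn.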
So while not specific to Sequoia, I'd suggest it would be nice if there was some guide to OpenPGP concepts that is less intimidating than the RFC. Hope that helps, I don't have any actual critiques for Sequoia itself!
-- Kunal