003-extra - Matrix's Cryptography

from trash cat tech chat

14 Jun 2022

Listen: ogg | mp3

Show Notes

trash cat (they/them) and Juliana (she/her) talk about the cryptography used in Matrix.

Timestamps

0:00.000 Introductory notes
2:52.309 [Theme]
3:01.252 Diffie-Hellman
11:05.985 [Transition]
11:10.772 Security properties of an encryption protocol (PGP)
13:45.825 Deniable authentication
20:11.040 Forward and backward secrecy
29:03.772 Matrix's hybrid Megolm+Olm protocol
42:07.426 Which keys are long-term?
44:19.092 [Outro]

Transcript

Introductory notes

trash cat: Hey, editing trash cat here. So, we had this whole conversation about the encryption protocol used in Matrix, and throughout this conversation, I was like, "Yeah, I don't want to explain Diffie-Hellman right now. We're just gonna... I'm just gonna hand-wave away the 'How do we do some parts of this?'" And when Juliana said "I'm curious how this works in practice" I'm like, "Well, it's Diffie-Hellman, but I don't really want to get into Diffie-Hellman right now." So we had that whole conversation about Matrix's Megolm and Olm algorithms and how they work together and all that stuff. And then, after that, we took a break, and when we got back, I was like, "Okay, well, Juliana, I want to explain to you how Diffie-Hellman works because the math is not that complicated, and it's really cool. Whatever." And so I... We sat down, and while we were still recording, I explained, "Here's how Diffie-Hellman works." And I decided those things would make more sense in the opposite order, if the Diffie-Hellman part went first so that you, when I make reference to it later, understand what Diffie-Hellman is. And then we have the Matrix-specific part. And so, the Diffie-Hellman part is more math-oriented. The math is not very complicated, but it's more math-oriented than the rest of the crypto conversation. And yeah, if you're not interested in the Diffie-Hellman-specific part -- I'll put time stamps in the show notes, but it should be about 10 minutes in. You can just skip ahead and get to the Matrix-specific part. The other thing I want to note is just, the Diffie-Hellman explanation is for the standard Diffie-Hellman with just... integers, the discrete log problem, etc. In all of the examples that we're actually talking about, in Signal, in Matrix, in whatever, it's using elliptic curve cryptography, so it's Diffie-Hellman over an elliptic curve. It's... It works the same way: You're just doing scalar multiplication of points on an elliptic curve rather than doing exponentiation of integers, but it's worth saying it is elliptic curve Diffie-Hellman, not just Diffie-Hellman. Yeah, so I hope you enjoy this extra segment all about cryptography.

[Theme]

tc: You're listening to trash cat tech chat, a Librepunk podcast.

Diffie-Hellman

tc: Diffie-Hellman relies on the idea that the discrete log problem is hard. So, do you know... Are you comf-- I mean, you're a programmer, so you know what a modulus is.

Juliana: Yeah.

tc: [laughs] Obviously, so...

J: I did pass geometry. I know what a log is, kind of sort of too.

tc: Yeah. Well, so a log as you would think about it is the exponent that gets you to whatever value, right?

J: Yeah.

tc: But what we're gonna talk about here is a discrete log, discrete meaning it's gonna be an integer, basically, rather than like a... somewhere falling on a continuous line of numbers. So, what that means is... Say we're working in like mod 5 to be simple, or something, and you want to know... the discrete log base 3 of 4 in mod 5. You take 3, and you raise it to some exponent power. And because we're in mod 5 -- So, what you do -- I mean, the answer is gonna be 2. But -- just for simplicity. But what you do is you take 3, and you raise it to, say, the 1st power. And you get 3. And then you raise it to the second power instead. You try that, and you get 9, but we're in mod 5, so that's 4. Right?

J: Yes, that makes sense.

tc: So the question is: What exponent do you need to raise your base to in order to get the given output when you're working with a modulus.

J: Okay, I think I see where -- Okay, yeah.

tc: And this is really hard... This is in that class of problems that are so integral to cryptography where it's very difficult to compute this answer, but it's very easy to check. Right? 'Cause if...

J: Yeah.

tc: If the answer is, you know, 7 billion, and you're working with some like, really high modulus and whatever... The answer's 7 billion. That's really hard to compute 'cause you... Because it wraps around regularly, because of the modulus...

J: Yeah.

tc: ...it's really hard to estimate in the way that you can with like a continuous logarithm, right?

J: Yeah.

tc: And so... That's the idea. It's very hard to compute, but it's easy to check. If you know the answer, you just raise it to that power instead of having to try every single possibility, and you say, "Oh yeah, it works. Cool." And like, all of cryptography is based on problems that are difficult to do one way but easy to verify.

J: That's cool.

tc: And that's the idea is you, I mean... So... the way that Diffie-Hellman works is we start with the assumption that the discrete log problem is hard, that for large values, it's very difficult to know what number, what exponent power a given base was raised to to achieve this result under this modulus. And we're gonna be working with a prime modulus so we get like, everything is relatively prime to the modulus, and that gives us some useful properties and whatever. So... Does that make sense so far?

J: I believe so, yes. To clarify, though, the key would be the value that... It would be the log, right? The...

tc: Yeah, the exponent.

J: Yeah. Yes, the exponent. Yeah, okay.

tc: Mhm. And the log is an exponent, so yeah. [laughs]

J: That's cool. I didn't realize it was just like... just some number, I guess. I mean, obviously everything is some number, but I don't know. Yeah. Anyway, that's super cool.

tc: So the way Diffie-Hellman works is: Alice and Bob agree upon some base and some modulus. And... I'm gonna call those g and p. g is the base, and p is the modulus. Okay?

J: Okay.

tc: And Alice computes a, her private key, which is just a random number, and Bob computes b, his private... sorry, I say "compute". They... I want to use the word "generate". They generate these random numbers. So Alice has a, her private key. Bob has b, his private key. Okay?

J: Okay.

tc: They compute their own public keys by raising g (I'm calling it g 'cause it's a generator in, like, abstract algebra. But um... Or, it's a generator of a field or whatever, of a group...

J: Okay.

tc: ...in abstract algebra. It's been a while since I did this abstract stuff, but it like, it doesn't matter.) The base. They raise the base to their own private key power, and that yeil-- with a modulus. And that yields their public key. So A, Alice's public key in this case, is g^a mod p.

J: Oh, that's super cool. That's... That is way simpler than I realized.

tc: Yeah, it's super simple. So what you have there is, right? You can't go backwards because the discrete log problem is hard. You can't take A and compute a from it, but, you know, computing A from a is really easy. Are you on board so far?

J: Yes.

tc: Excellent. So, what they do... The goal of Diffie-Hellman, right? is for them to compute a shared secret that only the two of them know. And so what they do is Alice sends her public key to Bob. Bob sends his public key to Alice. Alice takes B, Bob's public key, which, remember, is g^b, and she raises it to the a power, which, you know, because of commutativity and whatever, ends up being g^(ab). And Bob does the same thing but in reverse. He takes A, Alice public key, which is g^a, and raises it to the b power, which gives him g^(ab).

J: Okay, nice.

tc: And so they have the same shared secret. However, from only observing them, the only values an outsider, a third party would know are g^a and g^b. And if you raise one of those to the other power, you don't get that shared secret. Because, you know, g^(ag^b) is not the same ting as g^(ab). And the discrete log problem's hard, you can't compute the private keys from the public keys, etc.

J: That's super cool, actually.

tc: Yeah. So the math is actually really simple. Writing it out, right? it's like 3 steps. And the most... I mean, the only things that you really need to know are... understand how a modulus affects things and that... Basically, accept that the discrete log problem is hard, and then after that, you just... it's just exponentiation. It's just manipulating exponents. It's so simple. I love it because it's so, like, elegant.

Security properties of an encryption protocol (PGP)

tc: So, do you want to talk about Matrix cryptography? Because I am really excited to talk about that, and also I'm gonna do my best not to spend too much time on it. [laughs]

J: I would love to talk about Matrix cryptography, but I... Literally everything I know about Matrix cryptography above like, the basics of key exchange I have learned from you, so it might be better if you just do the thing.

tc: Okay. [laughs] So um... I want to start with like, What does an encrypted messaging protocol give us? What propertie-- what security properties do we get from that? And if we look at something like PGP or something similar to that, the basic properties that we get in like all the options... The basic properties that we get are confidentiality, meaning when Alice sends a message to Bob, only Alice and Bob should be able to read the contents of the message. It should be private between the two of them. And then integrity and authenticity. And what those mean is when Alice sends a m-- I'm pairing those together because we don't really need to worry about the difference between them. I mean, sorry. They are different things. The difference between them matters. They go together in every instance that I'm going to be talking about them. We get both of them at the same time from whatever mechanism we're using. So I'm just going to put them together in the same box.

J: What were those two things again?

tc: Integrity and authenticity.

J: Okay.

tc: And what that means is when Alice sends a message to Bob, Bob knows that Alice was the author of this message. That's the authenticity. He knows it's authentic. And he knows that Alice... that the message that he received is the same as the message that Alice sent, that it hasn't been modified, tampered with, manipulated, or just corrupted along the way.

J: Okay.

tc: And that's integrity. And we get those two things together in the way that they're implemented. So, with like PGP for example, they would be implemented with a cryptographic signature, and the signature provides both properties. It provides both authenticity and integrity. So we have encryption for confidentiality, and we have signatures for authenticity and integrity. Okay?

J: Okay. That's a handy pairing, yeah.

Deniable authentication

tc: And the problem with PGP is, well, there are actually many problems with PGP. The ones we're gonna deal with right now are 1. I'm gonna start with the authenticity, actually. The way that it accomplishes authenticity (and integrity) is with a cryptographic signature. So Alice signs her message to Bob with her own private key. If you know anything about encryption, er, cryptography, you'll know what I'm talking about, probably. Alice signs her message to Bob with her own private key, and then she encrypts it for Bob and sends it on and whatever, blah blah blah. What Bob ends up with in this model is a message that was signed by Alice's private key that only Alice knows.

J: Okay.

tc: And this gives Bob (or someone who compromises Bob's phone or whatever) the power to prove cryptographically that Alice was the author of that message, which means Bob can take this message to a third party, like a judge or something, and say, "Look, I have proof that Alice wrote this message." Our goal, what we generally will want to accomplish and what Matrix wants to accomplish, is "off-the-record" communication. Alice and Bob want to be able to have an off-the-record conversation that doesn't have that kind of proof of who said what and what they said.

J: Interesting. Okay.

tc: So PGP is not a good tool for that kind of thing. So what they do instead... But they do want the authenticity and the integrity that comes with it! Bob does want to be able to know himself 100% confidence that Alice wrote the message, but he doesn't want -- Alice doesn't want Bob to be able to prove to someone else that Alice wrote the message. That's the key distinction. Does that make sense?

J: Yes.

tc: So what Bob does, or what Alice and Bob do, is rather than Alice signing the message with her own private key that is exclusive to her, they negotiate a shared -- it's a symmetric key, but -- they negotiate a shared secret that both of them know. And Alice uses this secret to authenticate the message using something like a signature. It's a MAC, a message authentication code, but it's like a signature where it proves that the message was written (or authorized or whatever) by someone who knew that shared key, that number that only Alice and Bob know. So when Al-- when Bob receives a message that has been MACed (think of it as signed, but), that has been authenticated with this key that only Alice and Bob know, Bob looks at it and says, "Okay." He checks the MAC. He says, "This is valid. It means it came from this key. Only Alice and I know this key. I know that I myself did not write this message. Therefore, it must have been Alice."

J: Ohhh that's an interesting twist.

tc: Right? And that argument only works for Bob. Bob is the only person who can make that argument because Bob is the only one who knows that Bob did not write the message.

J: Yeah.

tc: And then there's some other stuff that goes on. What I'm talking about here is like, OTR, Off-the-Record Communication. It's a messaging encryption protocol from 2004, and then Signal, which is based on OTR. These protocols use this thing, this kind of construct, and they do some other stuff to kind of further afford deniability in who knows that key and whatever, but that's kind of the essential idea is you can't prove which party that knows the key wrote the message, and you may not be able to prove who knows the key in the first place. You may not even be able to narrow it down to just Alice and Bob as potential authors. And also, interestingly, if Bob were to show this message to someone else and say, "Look, if you take this key, you can authenticate that the message came from someone who knows this key, and therefore it must have been Alice or Bob" and whatever, blahblahblah. In doing so, he would have to give the key to that person, and they would then -- that third party would then become one of the people on that list of people who could have created the MAC. It's interesting. [laughs]

J: That is extremely interesting.

tc: But that's the basic idea. We get this property called deniable authentication. We can still authenticate each other, and we still get that, you know, Bob knows that Alice wrote the message, and he knows that it hasn't been tampered with. They still have that integrity property. But it's deniable. Alice -- if Bob tries to say "Look, Alice wrote this", Alice can say, "I didn't write this. How do you know Bob didn't? How do you know it wasn't Bob who wrote it?" So that's the first issue, the first property that we want is deniable authentication, and then there are two more that we'll get from this protocol that are not offered by PGP. So are we good so far? Are you clear on deniable authentication? Do you have any questions?

J: I think I get it. I mean, I'm curious about implementations, but that's kind of beyond scope.

tc: Yeah. Very briefly, because uh -- You know about Diffie-Hellman?

J: Yes.

tc: So essentially, take Alice's key and Bob's key. Do a Diffie-Hellman key exchange. You get a symmetric key that both of them know and that both of them know came from the other person's key, right?

J: Oh, okay.

tc: And then you use that. And it's more complicated. There end up being a bunch of different keys in this protocol that we're gonna be using. But that's like, the basic idea. Yeah.

J: Gotcha.

Forward and backward secrecy

tc: Then the other two properties kind of go hand-in-hand. I'm gonna talk about them together. They are forward secrecy and the same thing but in reverse, which has many names. So, the other problem with a PGP-like scheme is that in PGP, messages are -- So, Alice sends a message to Bob. She encrypts it with Bob's public key. I'm abstracting a little bit. There's a symmetric encryption thing going on. But essentially, she encrypts it with Bob's public key. Someone with Bob's private key can decrypt the message. That's how it works. If Bob has the same public key for like a decade or something, if he has it for a long time, all the messages that are ever sent to him are all going to be encrypted with the same public key.

J: Yeah.

tc: And what that means is someone who first captures and stores Bob's encrypted messages over a period of time, and then second, compromises Bob's private key at some point in time, say, after Bob has been receiving messages for 10 years, can go back and decrypt every message that's ever been sent to Bob.

J: Yeah, that sounds pretty bad.

tc: Yeah, and we don't want that. And so [laughs] And that's one of the big problems with PGP as a model is that it works like that. And a lot of other things work like that too. So what we want instead is this thing called forward secrecy. And very briefly, forward secrecy is like, you change the key regularly so that someone who, you know, captures all of Bob's messages over 10 years and then compromises Bob's key after 10 years can't go back and decrypt the old messages because they were using different keys. They can only decrypt like, the current message and maybe future messages going on from there, depending on how the forward secrecy is implemented. But they can't go backwards.

J: That sounds a lot better.

tc: Yeah, it's... It is a much better idea for a system. So... and there's... I recommend reading, for anyone interested who is also interested in like, the cryptography, I recommend reading the paper Off-the-Record Communication, or Why Not To Use PGP. It talks about this stuff. It introduces OTR, which is -- which was, like, The Thing for, I don't know, a decade from 2004 to when Signal came out and improved upon it, basically.

J: Yeah.

tc: But yeah, it's um... We want forward secrecy.

J: We do.

tc: Yeah, and Signal... I'm gonna start talking about Signal specifically now. Matrix's thing is going to be partially based on Signal, so this is important background. Signal provides deniable authentication with the way that it, uh... It does this like, extended triple-Diffie-Hellman key exchange initially to provide I guess a strong level of deniability, and then from there it does that kind of Diffie-Hellman to get a shared symmetric key where both parties have authenticated each other; however, it is deniable authentication from the X3DH handshake, and then they use that shared key to authenticate the messages, and you get the deniability there. I'm hand-waving that stuff away. Don't worry about it. So, Signal... Well, let me step back. A mechanism for forward secrecy is referred to as a "ratchet" in reference to, you know, the mechanical tool a ratchet that importantly cannot move backwards. Right? And that's the idea of forward secrecy is you can't move backwards. You can't take a current key and use it to decrypt old messages.

J: Yeah.

tc: Signal uses 2 ratchets. [laughs] And it's... That's why its algorithm is called the "Double Ratchet Algorithm". And so, one of these ratchets derives future keys -- or, I'm gonna abstract away some of the details again, but future key data is derived from past key data...

J: Makes sense.

tc: ...using a one-way function. It's a hashing function. I don't really want to explain cryptographic hash functions right now, but it's a one-way function. So key n comes from key n-1. Again, that's not, like, 100% accurate. There's like a KDF chain thing, a key derivation function chain which derives both a key and like, the state for the next chain which can be used to derive the next key and so one. But basically think of it as each key comes from the previous key. Okay?

J: Makes sense.

tc: So because each key comes from the previous key, and the function for deriving the next key is a one-way function, you can always get the next key if you have the current key, but you can never go back and get the previous key. Does that make sense?

J: That does.

tc: Okay. So that's one of the ratchets. It's a hashing ratchet. It provides forward secrecy because the current key is based on the previous key, but you can't go backwards and get the previous key or any of the keys before it. So if you compromise the current key, all previous messages encrypted with previous keys are still secure.

J: That's progress. That's good. I like this.

tc: The other ratchet that Signal uses... Well, let me step back. So, the problem with the hashing ratchet is it provides forward secrecy, but it doesn't provide the analogous backward secrecy. (Or, it's also called future secrecy or self-healing or post-compromise security. There are a lot of names for this idea.) But essentially, if you, rather than deterministically deriving the future keys from previous keys, if you introduce randomness into the system regularly -- think analogously like, you and your friend go to the park and create a new secret every week that is new and random. Even if someone compromises your current key, they will eventually get locked back out. Right?

J: Yeah.

tc: And so that's the goal of backward secrecy/future secrecy/self-healing/whatever you wanna call it. The hashing ratchet does not provide that because all the keys are generated deterministically from previous key data. But this other ratchet that Signal uses, which... What it actually does is it involves each party... Each time they reply to the other party, they generate a new ephemeral key, and then they do a Diffie-Hellman with the other party's previous -- er, the other party's current key and their new key, and then what that means is Alice and Bob are both introducing new randomness into the system at regular intervals. But if you're not, like, well-versed in cryptography, just think about it this way: They regularly introduce new randomness into the system which an attacker who compromises, at a given point in time, their key does not have access to these new sources of randomness, and so they get locked back out. Does that make sense?

J: It does make sense, yeah.

tc: So Signal has this Double Ratchet Algorithm. It has these two ratchets which work in tandem with each other to ensure that basically, if you compromise a single key or a single message, the absolute least amount of possible data gets exposed from that. Like, just that message or just that message and maybe the reply but not further than that, or whatever.

Matrix's hybrid Megolm+Olm protocol

J: Okay. So Signal tends to be one-to-one, right? And then Matrix, by contrast, focuses on group chats.

tc: Signal does have a group chat function. So does XMPP, which has a... There's a thing called OMEMO which is the Signal Protocol adapted for use with XMPP. You can do OMEMO with group chats. But these tend to face an issue with scaling because essentially, again, Signal has those two ratchets. And one of them is the hashing ratchet where you just derive future key data from past key data. That one is really cheap to do. It doesn't take a lot of computational power to do that process. The other one, which involves creating new random numbers and then doing Diffie-Hellman key exchanges and then using those keys, that one's more expensive, and you have to do it with... You have to do, in a group chat, that process with all the different participants in the conversation. And so when you have a lot of different participants in the conversation, it becomes a very expensive operation. Like, it's fine to do if you're in a conversation with 4 people or whatever, but when you want to have like a thousand people in a large encrypted room together, which is one of the goals of Matrix, it just becomes prohibitively computationally costly.

J: Yeah. There's a saying in programming that "Math is free, and communication is prohibitively expensive." So that's the cost coming in.

tc: That's funny. [laughs] So yeah, we've got these two ratchets from Signal, and one of them -- I mean, these were both things prior to Signal. Signal put them together. Whatever. Everything is based on previous work, right? "Standing on the shoulders of giants." We've got these two ratchets that are in Signal, and one of them is expensive but more effective, and one of them is cheaper but less effective because it doesn't give us that post-compromise security, the self-healing. And so, what Matrix does is interesting. It uses two different protocols and combines them together. So, one of them is called Olm, O-L-M. Uh, which, I feel like I should explain where that name comes from. That's another name for a proteus, the like, aquatic salamander thing. And that's, I think, in reference to -- Well, the important thing about them is they have regenerative protocol-- er, they have a regenerative property. This, I think... I think this name is given in reference to Signal... I'm unclear about whether it's the entire Signal Protocol or just the Double Ratchet Algorithm, but regardless, Signal was originally named -- I'm gonna do my best to pronounce this word correctly -- for the axolotl, which is the word that looks like "ax-o-lot-l", but from what I understand, it is a Nahuatl word that we've sort of adopted into English, and, I don't know, I read an argument once that it's, for like, preservation-of-Indigenous-languages reasons, it's important to not change the pronunciation of that word, not anglicize it.

J: That's awesome, and I haven't heard about this, and I'll make an effort to pronounce it correctly. I didn't even, you know, think about the fact that we're grossly mispronouncing it in English, but that is important. I agree with you.

tc: Yeah. I didn't think about it either until I read someone else say it. But yeah, so the Signal Protocol, or the Double Ratchet Algorithm, one of the two, was originall called the Axolotl protocol, and it was called that because those salamander things, axolotl, have a self-healing property, and Signal, from one of those two ratchets, the more expensive one, has a self-healing property. It was in reference to that. So an olm is another type of aquatic salamander creature with a regenerative property. And I think it's named for the Signal thing.

J: New Scientist has an article about one that didn't move for 7 years. So that's cool.

tc: [laughs]

J: Love these lil guys!

tc: Yeah. Anyway, so Matrix has a thing called Olm, named for that proteus salamander thing, and this is just an adaptation of the Signal Protocol for Matrix. It's a little bit different. I believe it only does a triple-Diffie-Hellman exchange; it doesn't do the extended triple-Diffie-Hellman key exchange which includes like a 4th Diffie-Hellman optionally. But basically it's the same thing as the Signal Protocol; it's just for use in Matrix. And then Matrix also has a second protocol called Megolm. (Which I think is just supposed to be like "mega olm" or something. I don't know.) And what that is is it includes the cheaper ratchet where you derive future keys from past key data, but it doesn't include the more expensive ratchet. So it provides forward secrecy but not backward secrecy. And because of the way in actual implementation Matrix handles this, it also arguably provides a weaker form of forward secrecy and whatever. I'm not gonna deal with that right now. That's outside the scope of this conversation. But basically, Megolm provides forward secrecy but not backward secrecy, not self-healing, and Olm provides both and deniability. Megolm interestingly does not provide deniability. It uses non-repudiable (non-deniable) signatures like PGP does. But I'll explain how they combine together. It's interesting. So, when you go to send a message -- an end-to-end encrypted message on Matrix, you encrypt it using the Megolm protocol, the, I'm going to say, "weaker" one, okay? It's more efficient, but it's weaker.

J: Okay.

tc: And also, it should be said, Megolm is actually very efficient because -- The way it works is essentially you send... It's actually very simple. You send your key... I'm not going to talk about how you send it, but you send your key that you're going to be using, or your key data from which keys are derived, to all the members of a room, and you send your public key that you'll be using for authenticity. You'll be signing with the corresponding private key. And so you send the same data to all the members of the room. Okay? Then you encrypt your message with your own -- that key that you sent to everyone, that's unique to you, and you sign the message with that, with the private key that's unique to you. And everyone in the room receives the exact same message. It's encrypted with the same key, it's signed with the same key for everyone. So you can see why this would be very cheap compared to something where you have to individually handshake and encrypt separately for each person.

J: Yeah. So like... I can't think of a good analogy for the handshaking for everyone, but as opposed to like a peer-to-peer model?

tc: Yeah. I mean, very simply, think of it just as like, level of complexity in... Say you want to distribute a message to everyone. Level of complexity in posting it on a bulletin board, versus making one copy for each person and handing them out individually by hand.

J: Yeah.

tc: Like, very different levels of complexity.

J: Indeed.

tc: So, when you send a message with Megolm, you... You have your unique encryption and signing keys. The encryption key is shared. It's a symmetric key, so everyone could encrypt messages themselves using that key if they wanted, but you sign yours with your own key so that they -- with your own private key so that they can't spoof that. You send a message, encrypted, signed, all that stuff, and then the next message that you send out is encrypted with the next key. You just derive the next key using that forwarding ratchet, and you just keep doing that. And that's how Megolm works. So it provides forward secrecy. It does not provide deniability. It does not provide self-healing. But it's very efficient. Right?

J: Okay. Yeah.

tc: And then, there's a question of "Well, how do you distribute the keys?" And this is the part that's really interesting. And the answer is you distribute the keys with Olm, and you periodically -- it's, I think, by default once per week, or after you've sent 100 messages -- you "refresh" this key, so to speak. You like, end the current Megolm session that you're in with the encryption keys that are all ratcheted from the same origin and the signing key, and you start a new session with a new signing key and a new initial encryption key ratchet state, from which all the rest of the keys are derived. And so what we get from that combination of things... This is similar to like, you use the same key for a week (again, you're forwarding it, but it doesn't get the self-healing property). You use the same kind of key for a week, and then after a week, you meet up through some, you know, secret channel or whatever that's... And you derive a new key that you're going to use for the next week. I feel like that was not a clear explanation, but what you get from this, from this combination is you get forward secrecy because Megolm actually provides forward secrecy. You get self-healing on a weekly basis because every week you're changing the encryption key ratchet state thing. You're adding new randomness to the system once per week. So you don't have as strong, as regular a self-healing property, but you do still have it. It's not like "Oh, over the course of this 10-year period, if you compromise the first key, you compromise everything." And you get deniability through some interesting nuance because while you are non-repudiably (non-deniably) signing all of your messages, the key that you are using to sign those messages was negotiated with everyone via a deniable channel, i.e., the Olm protocol. And so, while you can't deny that that key was used to encrypt -- or, sorry, to sign all the messages, you can deny that the key is yours.

J: Interesting. Okay.

tc: Right? And so that's how Matrix's hybrid Megolm-and-Olm encryption protocol works, basically.

J: Nice. Well-explained.

tc: Thank you. I ended up going into way more detail than I intended to, but it's fine. We'll just do like, episode 3a, episode 3b or something. [laughs]

J: Yeah, that's alright.

tc: Okay, at this stage, I think I'm done explaining how the crypto works. [laughs] Do you have any questions? Anything you want me to elaborate on? Or do you wanna move on from that?

J: I don't... think so.

Which keys are long-term?

J: But just to be 100% clear: There's no key that you hold onto that doesn't change, right?

tc: Correct.

J: Okay.

tc: Well, um, sorry. [laughs] There's no encryption key... You have like, identity keys, which are different from your encryption keys. So you have long-term identity keys, which are used to like, verify who you are and are used in the creation of the encryption keys, but through the complicated triple-Diffie-Hellman key exchange thing that provides you deniability of the resulting keys, of the resulting ephemeral keys.

J: Okay.

tc: Um, so if you go into Element or whatever, and you look at your... whatever it is, the privacy settings or something, and it has the session keys...

J: Yeah.

tc: ...for your device, and it's -- I think Element just lists your own device's one, but whatever. So you've got your session key. That is a long-term key for your -- for that device. As long as you are using that device, that key should not change unless you manually change it. But that key is not used for encryption; it's used for identity. And because it's not used for encryption, it's okay that it's long-term because you're not... You can't use it to retroactively compromise things, basically.

J: That makes sense.

tc: Yeah. That is an important distinction. Maybe outside the scope of what listeners really need to know, but it is important in the context of the cryptography, right? Which keys are safe to be long-term, and which ones are not?

J: Yeah.

tc: But yeah, no encryption keys are ever long-term. They're all ephemeral. It's just a question of how long do you keep them around for? Kind of.

[Outro]

tc: You've reached the end of this episode of trash cat tech chat. Check out the show notes for links and other information. This podcast is licensed under a Creative Commons Attribution-ShareAlike 4.0 license. Music by Karl Casey @ White Bat Audio.

Credits

Music by Karl Casey @ White Bat Audio

Librepunk Podcasts

003-extra - Matrix's Cryptography

Show Notes

Timestamps

Transcript

Introductory notes

[Theme]

Diffie-Hellman

Security properties of an encryption protocol (PGP)

Deniable authentication

Forward and backward secrecy

Matrix's hybrid Megolm+Olm protocol

Which keys are long-term?

[Outro]

Links

Credits