The Features of 3.5: Hybrid ML-KEM

This is the fourth in a series of posts about the features of OpenSSL 3.5. Its target audience is people who are curious about internet security, but who don’t recognize the acronyms in that list.

  1. QUIC server
  2. External QUIC library interface
  3. Post-quantum cryptography
  4. Hybrid ML-KEM in TLS v1.3

Previously we discussed how quantum computing represents a threat to the cryptographic systems that secure digital communication. In order to ensure privacy and well into the future, we need to find new ways of encrypting data.The ML in ML-KEM is a new type of public key designed to do just that. In the field of cryptography, ML stands for “module lattice”1 and it’s used in one method standardized by the National Institute of Standards and Technology (NIST) for encrypting keys.

Suppose you have a secret key that consists of a handful of variables assigned to specific values:

Secret key
    x = 3
    y = 1
    

The next step is to create a system of linear equations that can be solved with the secret key. You might remember problems like this from an algebra class:

2x +  6y = 12
6x - 10y =  8

It’s easy to generate these problems and nearly as easy to solve them. A system of equations can usually be solved when the number of equations equals the number of variables. So on their own, linear equations don’t seem promising for creating a cryptographic system. But Oded Regev discovered that by adding random errors it was possible to create a series of problems that are nearly impossible to solve without already having the answers.

2x +  6y = 11 (+1)
6x - 10y =  9 (-1)
7x +  3y = 25 (-1)
8x - 14y =  9 (+1)

The process is simple: create a series of math problems, calculate the right side of the equation with our secret key and introduce a small error. If the error is kept secret, none of the equations are exactly correct so they can’t be solved with the usual methods. But we can reframe the problem. Instead of finding the values for the variables that get the right answers, the problem is to find the values that get the closest to solving the equations.

My example isn’t so hard since there are only two variables and I gave them relatively small values. You might be able to guess the secret key just by trying a few possibilities. With a few more variables, the problem becomes tricky. In addition adding more equations (with random errors) makes it harder to find a solution, not easier.

To add one more twist, the method uses modular arithmetic. This sounds complicated, but you use it every day when calculating times. If you go to bed at 11pm and sleep 8 hours, what time do you wake up? With normal math, the answer would be 19am. We don’t say it that way because our clocks reset at midnight. Instead we calculate the remainder when dividing 19 by 12 and say we woke up at 7am.

People who express time using a 24-hour clock also use modular arithmetic except they reset on 24 instead of 12. If you go to bed at 23:00 and wake up 8 hours later, that’s 07:00. Whichever number we pick is called the “modulus”. When you reach the modulus, just reset to 0 and start counting up from there. Most of us don’t even know we are doing modular arithmetic when we calculate times.

Martinvl, CC BY-SA 4.0
, via Wikimedia
Commons

Instead of 12 or 24, we can reset at any number. So in this example, let’s use 17. In mathematical notation, the modulus is written (mod 17) to remind the reader that numbers reset to 0 when they reach 17.

2x +  6y ≈ 11 (mod 17)
6x - 10y ≈  9 (mod 17)
7x +  3y ≈  8 (mod 17)
8x - 14y ≈  9 (mod 17)

Given this simplified example, our public key might be:

2x +  6y = 11
6x - 10y =  9
7x +  3y =  8
8x - 14y =  9

It’s not easy to determine the secret values of x and y from this set of equations because of the random errors. But it turns out to be relatively easy to send a message that only someone who holds the secret key can read. The process starts by selecting a sample of the equations in the public key and adding them together:

  2x +  6y = 11
+ 8x - 14y =  9

 10x -  8y = 3 (mod 17)

The sender doesn’t know either the secret key or the error. But that doesn’t matter. If senders want to send the message “0”, they can just send this equation as-is. If they want to send the message “1”, they add half of 17 to the equation:

0:
 10x -  8y =  3 
1:
 10x -  8y = 12 

Someone who intercepts this message can’t guess whether the sender meant 1 or 0 from seeing this equation but the receiver uses the secret key to get an answer:

  10x -  8y = 5 (mod 17)

Because of the errors, the receiver probably won’t get exactly the same result as the sender sent. But if the secret provides an answer close to what the sender sent, we can be pretty sure they meant 0 and if the sender’s equation is closer to the answer plus half of 17, we know they meant 1. Obviously real encryption will use more variables, more equations and more bits.

Sending an entire equation to represent either 0 or 1 will quickly become inefficient as the message grows. But remember we aren’t using this method to send the entire message, but just another key that will be more efficient to use. Further, this is the easy-to-understand version of the algorithm. The actual ML-KEM algorithm as implemented in OpenSSL 3.5 is more efficient.

Cryptographers have not found a way for quantum computers to break the ML-KEM algorithm. Nor have they found any vulnerabilities using conventional computing. Unlike prime factorization, the new algorithm hasn’t endured years of testing in real-world applications. So which algorithm should people chose? OpenSSL 3.5 answers “Why not both?”

That’s what the “hybrid” in “Hybrid ML-KEM” signifies. Starting with OpenSSL 3.5, the preferred method to send keys is to encode them with a traditional algorithm2 and also with ML-KEM. It’s a form of Swiss cheese security since an attack would need to successfully defeat two different encryption methods in order to get at the secret key.

This change is not without cost. The handshake benchmark increased from 13ms to 15.5ms after this feature was incorporated into the master branch. Recall, however, that this cost only occurs at the start of a TLS connection. After the handshake is done, the more efficient symmetric key is used. For applications in which every millisecond counts, OpenSSL can be configured to accept faster (though potentially less secure) key exchange methods.

The OpenSSL Foundation thanks everyone who contributed this feature, including Michael Baentsch, Viktor Dukhovni, Shane Lontis, Paul Dale, David Kelsey and Martin Schmatz.

To go deeper:

Next time we discuss the EVP_SKEY interface.

Clock image courtesy of Martinvl, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons


  1. Slightly confusingly, ML can also stand for Machine Learning, which is commonly associated with artificial intelligence. To make things worse, the modular lattice method mathematically equivalent to learning with errors (LWE), which is the mathematical problem explained in this post. To add one more “ML” to the mix, the specific technique used by ML-KEM is called the Module Learning With Errors (MLWE) problem. ↩︎

  2. In this case, the algorithm is called X25519 and it’s not a prime number factorization problem, but is an example of elliptic-curve cryptography (ECC)↩︎