Facebook Fizz memory leak vulnerability (CVE-2019-11924) reproduction and analysis

Posted on 2020-03-30

Fizz is an open source TLS 1.3 implementation developed by Facebook. This post is about a memory leak vulnerability (CVE-2019-11924) in the Fizz project. There’s almost no information about it on the internet other than a brief advisory and the patch commits on GitHub. In this post, I’ll show you how to reproduce the vulnerability and analyze its root cause.

1. Vulnerability information

The only information you can find about this vulnerability on the internet is as follows.

A brief advisory made by Facebook:

Description: A peer could send empty handshake fragments containing only padding which would be kept in memory until a full handshake was received, resulting in memory exhaustion. This issue affects versions v2019.01.28.00 and above of fizz, until v2019.08.05.00.

Two patch commits on GitHub:

Summary:
Zero length (all padding) handshake are forbidden by RFC. Allowing these was a regression in D13754697 (2c6f78a).

This is a partial fix for CVE-2019-11924

This commit patched the EncryptedReadRecordLayer::read() function.

Summary:
It is possible that a peer might send us records in a manner such that there is a 16KB record and only 1 byte of handshake message in each record. Since we normally just trim the IOBuf, we would end up holding 16K of data per actual byte of data. To prevent this we allocate a contiguous buffer to copy over these bytes for handshake messages for now.

This is a partial fix for CVE-2019-11924

This commit patched the ReadRecordLayer::readEvent() function.

We can get a few key points about the vulnerability from the advisory and the patch commits:

It causes a memory leak;
It occurs during the TLS 1.3 handshake;
A zero-length (all-padding) handshake message can trigger it;
It is related to encrypted messages rather than plaintext messages.

2. Reproduce the vulnerability

In this section, I’ll show you how to reproduce this vulnerability.

2.1. TLS 1.3 Handshake overview

First of all, we need to know more about the TLS 1.3 handshake procedure. The purpose of the handshake in TLS is to negotiate a protocol version, select cryptographic algorithms, optionally authenticate each other, and establish shared secret keying material between client and server. According to RFC 8446, the basic message flow for a full TLS handshake is as follows:

       Client                                           Server

Key  ^ ClientHello
Exch | + key_share*
     | + signature_algorithms*
     | + psk_key_exchange_modes*
     v + pre_shared_key*       -------->
                                                  ServerHello  ^ Key
                                                 + key_share*  | Exch
                                            + pre_shared_key*  v
                                        {EncryptedExtensions}  ^  Server
                                        {CertificateRequest*}  v  Params
                                               {Certificate*}  ^
                                         {CertificateVerify*}  | Auth
                                                   {Finished}  v
                               <--------  [Application Data*]
     ^ {Certificate*}
Auth | {CertificateVerify*}
     v {Finished}              -------->
       [Application Data]      <------->  [Application Data]

+  Indicates noteworthy extensions sent in the previously noted message.
*  Indicates optional or situation-dependent messages/extensions that are not always sent.
{} Indicates messages protected using keys derived from a [sender]_handshake_traffic_secret.
[] Indicates messages protected using keys derived from [sender]_application_traffic_secret_N.

As you can see, there are many different types of messages in the handshake procedure. The question is: which specific message can trigger the vulnerability?

In TLS 1.3, all handshake messages after the ServerHello are encrypted. Combined with the key points from the vulnerability information section (the trigger is an encrypted handshake message), the Client Finished message seems to be the answer.

2.2. Normal Client Finished message

TLS consists of two primary components:

Handshake protocol: Authenticates the communicating parties, negotiates cryptographic modes and parameters, and establishes shared keying material.

Record protocol: Uses the parameters established by the handshake protocol to protect traffic between the communicating peers. The record protocol divides traffic up into a series of records, each of which is independently protected using the traffic keys.

In this section, I’ll show you what a normal Client Finished message looks like and how it is generated.

2.2.1. Client Finished message in handshake layer

Here is an example of a plaintext Client Finished message.

0x14 0x00 0x00 0x30

// Handshake Header
// 0x14 - handshake message type 0x14 (finished)
// 0x00 0x00 0x30 - 0x30 (48) bytes of handshake finished data follow

0x38 0x41 0x86 0x0a 0x64 0x5c 0x35 0x51 0x98 0x7b 0x01 0x3a 0x93 0xf7 0xdd 0x5b
0x38 0x76 0xb7 0x7c 0x7b 0x13 0xea 0x7b 0x59 0xb0 0x4f 0x29 0x3f 0x8f 0xd1 0x14
0xb7 0x5f 0x50 0x7b 0xe5 0xef 0x9a 0x5c 0xc0 0xf3 0x39 0x7d 0x58 0x24 0xd0 0xa9

// Verify Data

The structure can be divided into two parts.

It starts with a four-byte Handshake Header: every handshake message starts with a one-byte type and a three-byte length. The rest of the message is verify_data, which is computed as follows:

verify_data = HMAC(
    finished_key,
    Transcript-Hash(Handshake Context, Certificate*, CertificateVerify*)
)

* Only included if present.

The finished_key is computed from the BaseKey using the HKDF-Expand-Label function:

finished_key = HKDF-Expand-Label(BaseKey, "finished", "", Hash.length)

The BaseKey for the Client Finished message is client_handshake_traffic_secret, which is calculated in the previous cryptographic negotiation steps.
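To make the two formulas concrete, here is a minimal Python sketch using only the standard library. It assumes SHA-384 as the hash (which matches the 48-byte verify_data in the example above) and uses placeholder values for the BaseKey and the transcript hash. The function names are mine and are not taken from Fizz.

```python
import hashlib
import hmac


def hkdf_expand(secret: bytes, info: bytes, length: int, hash_name: str) -> bytes:
    """HKDF-Expand (RFC 5869): chain HMAC blocks until `length` bytes exist."""
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(secret, block + info + bytes([counter]), hash_name).digest()
        output += block
        counter += 1
    return output[:length]


def hkdf_expand_label(secret: bytes, label: str, context: bytes,
                      length: int, hash_name: str) -> bytes:
    """HKDF-Expand-Label (RFC 8446, section 7.1)."""
    full_label = b"tls13 " + label.encode("ascii")
    # HkdfLabel = uint16 length, length-prefixed label, length-prefixed context.
    hkdf_label = (
        length.to_bytes(2, "big")
        + bytes([len(full_label)]) + full_label
        + bytes([len(context)]) + context
    )
    return hkdf_expand(secret, hkdf_label, length, hash_name)


def build_finished(base_key: bytes, transcript_hash: bytes,
                   hash_name: str = "sha384") -> bytes:
    """Return the plaintext Finished handshake message (header + verify_data)."""
    hash_len = hashlib.new(hash_name).digest_size
    finished_key = hkdf_expand_label(base_key, "finished", b"", hash_len, hash_name)
    verify_data = hmac.new(finished_key, transcript_hash, hash_name).digest()
    # Handshake header: 1-byte type (0x14 = finished) + 3-byte length.
    return b"\x14" + len(verify_data).to_bytes(3, "big") + verify_data


# Placeholder inputs for illustration only; a real client takes these
# from its key schedule and handshake transcript.
base_key = bytes(48)
transcript_hash = hashlib.sha384(b"handshake messages so far").digest()
message = build_finished(base_key, transcript_hash)
print(message[:4].hex())  # 14000030
```

The printed header is the same 0x14 0x00 0x00 0x30 prefix as in the example message, since a SHA-384 HMAC is always 48 bytes long.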

2.2.2. Client Finished message in record layer

In the record layer, the plaintext Client Finished message is encrypted.

The structure of an encrypted TLS 1.3 record is as follows:

struct {
    ContentType opaque_type = application_data; /* 23 */
    ProtocolVersion legacy_record_version = 0x0303; /* TLS v1.2 */
    uint16 length;
    opaque encrypted_record[TLSCiphertext.length];
} TLSCiphertext;

Here is an example of an encrypted TLS record.

0x17 0x03 0x03 0x00 0x45

// Record Header
// The TLS 1.3 record is encrypted into a TLS 1.2 record "wrapper" that looks like application data.
// 0x17 - type is 0x17 (application data)
// 0x03 0x03 - legacy protocol version of "3,3" (TLS 1.2)
// 0x00 0x45 - 0x45 (69) bytes of wrapped data follows

0x3e 0xbf 0x4b 0xf2 0xf7 0x18 0xfb 0xee 0x06 0xf7 0x7d 0xbd 0x43 0xa2 0xde 0xe5
0xe3 0x41 0x54 0xdc 0xf2 0x89 0x1b 0xd7 0xa1 0x57 0x9e 0xb0 0xee 0xe6 0x11 0x8e
0x29 0x7e 0xc6 0x21 0x72 0x23 0xcf 0x6c 0x5a 0x24 0xfa 0xa6 0x77 0x43 0x32 0x71
0x01 0xe3 0x37 0xd2 0x18

// Encrypted Data
// This data is encrypted with the client handshake key.

0xc8 0x19 0x07 0x3b 0xe2 0x76 0xe5 0xf9 0xa7 0x3f 0x40 0x7b 0x27 0x06 0xff 0x96

// Auth Tag
// This is the AEAD authentication tag that protects the integrity of the encrypted data and the record header.

The encrypted_record is computed as follows:

AEADEncrypted = AEAD-Encrypt(write_key, nonce, additional_data, plaintext)

The write_key here is client_write_key, which is calculated in the previous cryptographic negotiation steps. The nonce is not a random number: it is derived by XORing the record sequence number (left-padded with zeros to the IV length) with client_write_iv.
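The two non-secret inputs to AEAD-Encrypt can be sketched in a few lines of Python. Per RFC 8446 section 5.2, additional_data is the 5-byte record header, and section 5.3 describes the per-record nonce. The 12-byte write_iv below is a placeholder value, and the helper names are my own. The AEAD call itself is left out because it needs a third-party crypto library.

```python
def parse_record_header(header: bytes):
    """Split a 5-byte TLS record header into (type, legacy_version, length)."""
    if len(header) != 5:
        raise ValueError("a TLS record header is exactly 5 bytes")
    content_type = header[0]
    legacy_version = int.from_bytes(header[1:3], "big")
    length = int.from_bytes(header[3:5], "big")
    return content_type, legacy_version, length


def per_record_nonce(write_iv: bytes, sequence_number: int) -> bytes:
    """XOR the zero-padded, big-endian sequence number with the write IV."""
    padded_seq = sequence_number.to_bytes(len(write_iv), "big")
    return bytes(iv_byte ^ seq_byte for iv_byte, seq_byte in zip(write_iv, padded_seq))


# Record header from the example above; it doubles as the AEAD additional_data.
header = bytes([0x17, 0x03, 0x03, 0x00, 0x45])
additional_data = header
print(parse_record_header(header))  # (23, 771, 69)

write_iv = bytes(range(12))  # placeholder IV, not a real key-schedule output
print(per_record_nonce(write_iv, 0) == write_iv)  # True
```

Because the sequence number increases by one for every record, each crafted record sent later in the attack has to be encrypted with its own nonce.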

2.3. Crafted Client Finished message

To trigger the vulnerability, we need to construct a Client Finished message that satisfies the following conditions:

It must be a valid handshake message;
It can be decrypted correctly by the server;
It must not be a real Client Finished message, otherwise it can only be received once;
It must not be empty, but at the same time its length must be 0.

How can this be done? After several attempts, I found the right way. The message structure we need to modify is the TLSPlaintext. The normal TLSPlaintext structure is as follows:

struct {
    ContentType type;
    ProtocolVersion legacy_record_version;
    uint16 length;
    opaque fragment[TLSPlaintext.length];
} TLSPlaintext;

The crafted TLSPlaintext can be described as:

bytearray(b'\x16') + bytearray(b'\x00' * 1024 * 16)

It’s simply one ContentType byte (0x16 indicates handshake) followed by 16K bytes of 0x00 (padding). In RFC 8446 terms, this pre-encryption layout is a TLSInnerPlaintext (content, then the ContentType byte, then zero padding) whose content is empty. 16K is the maximum TLS record payload size. It’s a valid TLS record, but not a real Client Finished message. After that, the message should be encrypted into a TLSCiphertext correctly and then sent to the server.

Note: Except for the modification to TLSPlaintext, everything else should follow the TLS 1.3 protocol standards.
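As a small sketch of the layout just described, the following Python builds the crafted plaintext from its three logical parts (empty content, ContentType byte, zero padding). The function and constant names are illustrative only. Encrypting the result and wrapping it in a record header is not shown.

```python
HANDSHAKE = 0x16          # ContentType value for handshake records
MAX_PADDING = 16 * 1024   # 16K of zero padding, as used in this post


def craft_all_padding_plaintext(padding_len: int = MAX_PADDING) -> bytes:
    """Build content || ContentType || zeros with an empty content field."""
    content = b""  # zero-length handshake content: this is the whole trick
    return content + bytes([HANDSHAKE]) + bytes(padding_len)


crafted = craft_all_padding_plaintext()
print(len(crafted))  # 16385
```

The result is byte-for-byte the same as the bytearray expression above, just with each field named.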

2.3.1. How does the Fizz server determine the length of an encrypted message

As I mentioned above, our goal is to make the message not empty while its length is 0. Why can an all-padding handshake message do this?

When the server receives the encrypted message from the socket, the message is decrypted and assigned to a TLSMessage variable. Let’s dive into the TLSMessage assignment logic in the Fizz server. The related code is in the EncryptedReadRecordLayer::read() function.

TLSMessage msg;
auto currentBuf = decryptedBuf->get();
bool nonZeroFound = false;
do {
    currentBuf = currentBuf->prev();
    size_t i = currentBuf->length();
    while (i > 0 && !nonZeroFound) {
        nonZeroFound = (currentBuf->data()[i - 1] != 0);
        i--;
    }
    if (nonZeroFound) {
        msg.type = static_cast<ContentType>(currentBuf->data()[i]);
    }
    currentBuf->trimEnd(currentBuf->length() - i);
} while (!nonZeroFound && currentBuf != decryptedBuf->get());
if (!nonZeroFound) {
    throw std::runtime_error("No content type found");
}
msg.fragment = std::move(*decryptedBuf);

Before this code runs, the encrypted message has already been decrypted into decryptedBuf.

What this piece of code does is iterate over decryptedBuf from end to start, looking for the first non-zero byte. That byte is assigned to msg.type (after decryption, the ContentType byte sits at the end of the buffer, followed only by padding). Then folly::IOBuf::trimEnd() is called to trim the ContentType byte and the trailing 0x00 padding from the end of the buffer. Finally, the trimmed buffer is assigned to msg.fragment.

msg.fragment is a folly::IOBuf object whose length is a member variable managed by the class itself. The folly::IOBuf::trimEnd() function is as follows:

void trimEnd(std::size_t amount) {
    DCHECK_LE(amount, length_);
    length_ -= amount;
}

As you can see, the trimEnd operation only decreases the recorded length and does not free or modify any data in the buffer. Thus, the length of msg.fragment is 0, but it still holds 16K of data.
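The effect is easy to reproduce with a toy model. The Python below is not Fizz or folly code: BufModel is a hypothetical stand-in for a single, unchained folly::IOBuf that tracks only its backing storage and logical length. strip_padding mirrors the scan-and-trim loop shown above for that single-buffer case.

```python
class BufModel:
    """Toy single buffer: fixed backing storage plus a logical length."""

    def __init__(self, data: bytes) -> None:
        self.storage = bytes(data)
        self.length = len(data)

    def trim_end(self, amount: int) -> None:
        # Like trimEnd(): only the logical length shrinks, storage is kept.
        assert amount <= self.length
        self.length -= amount


def strip_padding(buf: BufModel) -> int:
    """Find the last non-zero byte, treat it as ContentType, trim from there."""
    i = buf.length
    non_zero_found = False
    while i > 0 and not non_zero_found:
        non_zero_found = buf.storage[i - 1] != 0
        i -= 1
    if not non_zero_found:
        raise ValueError("No content type found")
    content_type = buf.storage[i]
    buf.trim_end(buf.length - i)
    return content_type


crafted = BufModel(b"\x16" + bytes(16 * 1024))
content_type = strip_padding(crafted)
print(content_type, crafted.length, len(crafted.storage))  # 22 0 16385

# A normal Finished message: 4-byte header, 48 bytes of data, ContentType.
normal = BufModel(b"\x14\x00\x00\x30" + b"\xaa" * 48 + b"\x16")
print(strip_padding(normal), normal.length)  # 22 52
```

Both inputs are classified as handshake (22 is 0x16), but the crafted one ends up with a logical length of 0 while its 16385 bytes of storage stay allocated.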

2.3.2. The memory leak problem

A complete attack flow can be described as follows:

The client follows the TLS 1.3 protocol as usual;
At the point where the normal Client Finished message would be sent, it sends the crafted Client Finished message repeatedly.

On the server side, you will see the memory usage of the Fizz server process grow rapidly. In my test, the crafted Client Finished message was sent 10000 times.

The initial memory usage of the Fizz server process was about 13M.

After sending the crafted Client Finished message 10000 times, the memory usage was about 210M.
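As a rough sanity check on these numbers, the retained plaintext alone can be estimated from the record size. This is only a lower-bound estimate under the assumption that each record pins exactly its 16385 decrypted bytes. It ignores the AEAD tag, per-buffer bookkeeping and allocator rounding, which I did not measure.

```python
records_sent = 10_000
bytes_per_record = 1 + 16 * 1024           # ContentType byte + 16K padding
retained_bytes = records_sent * bytes_per_record
retained_mib = retained_bytes / (1024 * 1024)

observed_growth_m = 210 - 13               # from the measurements above

print(f"{retained_mib:.0f} MiB estimated, about {observed_growth_m}M observed")
# 156 MiB estimated, about 197M observed
```

The estimate and the observed growth are of the same order of magnitude, which is consistent with every crafted record being kept in memory.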

3. Root cause analysis

The memory leak exists in the ReadRecordLayer::readEvent() function.

case ContentType::handshake: {
    unparsedHandshakeData_.append(std::move(message->fragment));
    auto param = decodeHandshakeMessage(unparsedHandshakeData_);
    if (param) {
        VLOG(8) << "Received handshake message "
                << toString(boost::apply_visitor(EventVisitor(), *param));
        return param;
    } else {
        continue;
    }
}

The unparsedHandshakeData_ here is a folly::IOBufQueue object, which is basically a chain of folly::IOBuf objects. Every new message is appended to unparsedHandshakeData_, and normally the memory is freed after the handshake message is decoded.

In this attack scenario, the Fizz server is waiting for a Client Finished message to finish the handshake. But we send a crafted message which is not a real Client Finished message, so the Fizz server stays stuck in the handshake message read loop because the handshake never completes.

Furthermore, the memory-freeing logic in decodeHandshakeMessage() is never reached, because every message->fragment (a folly::IOBuf object) appended to unparsedHandshakeData_ has length 0, so there is never enough data to decode a handshake message. At the same time, each message->fragment still holds 16K of memory (as explained above), even though its length is 0.

Taking all the above factors into consideration, the memory leak problem is easy to understand.
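The interaction can be summarized in a simplified Python model. It is not the Fizz implementation: Fragment and HandshakeQueueModel are hypothetical stand-ins for folly::IOBuf and the unparsedHandshakeData_ queue. The decode step is reduced to "needs at least the 4-byte handshake header", which is enough to show why nothing is ever released.

```python
from dataclasses import dataclass
from typing import List, Optional

HANDSHAKE_HEADER_LEN = 4  # 1-byte type + 3-byte length


@dataclass
class Fragment:
    length: int    # logical length after trimming
    capacity: int  # bytes of backing storage still allocated


class HandshakeQueueModel:
    """Toy model of the queue of not-yet-decoded handshake data."""

    def __init__(self) -> None:
        self.fragments: List[Fragment] = []

    def append(self, fragment: Fragment) -> None:
        self.fragments.append(fragment)

    def chain_length(self) -> int:
        return sum(f.length for f in self.fragments)

    def retained_bytes(self) -> int:
        return sum(f.capacity for f in self.fragments)

    def try_decode(self) -> Optional[str]:
        # Without a full header there is nothing to decode and nothing freed.
        if self.chain_length() < HANDSHAKE_HEADER_LEN:
            return None
        self.fragments.clear()
        return "handshake message"


queue = HandshakeQueueModel()
results = []
for _ in range(3):
    queue.append(Fragment(length=0, capacity=1 + 16 * 1024))
    results.append(queue.try_decode())

print(results)                                        # [None, None, None]
print(queue.chain_length(), queue.retained_bytes())   # 0 49155
```

Every iteration adds 16385 bytes of storage but zero bytes of decodable data, so the retained size grows linearly with the number of crafted records.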

4. Patch analysis

The official patch fixes two functions. For the memory leak problem in ReadRecordLayer::readEvent(), the patch code is as follows:

-    unparsedHandshakeData_.append(std::move(message->fragment));
+    std::unique_ptr<folly::IOBuf> handshakeMessage =
+        unparsedHandshakeData_.move();
+    message->fragment->coalesce();
+    constexpr size_t kExtraAlloc = 1024;
+    if (!handshakeMessage) {
+        handshakeMessage =
+            folly::IOBuf::create(message->fragment->length() + kExtraAlloc);
+    } else if (handshakeMessage->tailroom() < message->fragment->length()) {
+        handshakeMessage->reserve(
+            0, message->fragment->length() + kExtraAlloc);
+    }
+    memcpy(
+        handshakeMessage->writableTail(),
+        message->fragment->data(),
+        message->fragment->length());
+    handshakeMessage->append(message->fragment->length());
+    unparsedHandshakeData_.append(std::move(handshakeMessage));

The new code allocates a contiguous buffer and copies the handshake bytes into it. In addition, kExtraAlloc extra bytes are allocated to avoid reallocating many times when a lot of small messages arrive. So the maximum memory usage is length + kExtraAlloc, rather than the size of the record the bytes arrived in. For an all-padding handshake message, whose fragment length is 0, that is only about kExtraAlloc (1024) bytes instead of 16K.
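The growth rule of the patched code can be modelled in Python to see the bound. This is a sketch of the allocation logic only, under the assumption that create() and reserve() allocate exactly what is requested (a real allocator may round up). ContiguousBufferModel is a name I made up for illustration.

```python
K_EXTRA_ALLOC = 1024


class ContiguousBufferModel:
    """Tracks capacity and length of the single copy-target buffer."""

    def __init__(self) -> None:
        self.allocated = False
        self.capacity = 0
        self.length = 0

    def append_fragment(self, fragment_length: int) -> None:
        tailroom = self.capacity - self.length
        if not self.allocated:
            # First fragment: create(fragment_length + kExtraAlloc).
            self.capacity = fragment_length + K_EXTRA_ALLOC
            self.allocated = True
        elif tailroom < fragment_length:
            # reserve(0, n) ensures at least n bytes of tailroom.
            self.capacity = self.length + fragment_length + K_EXTRA_ALLOC
        # memcpy + append(): only the real handshake bytes are kept.
        self.length += fragment_length


all_padding = ContiguousBufferModel()
for _ in range(10_000):
    all_padding.append_fragment(0)   # zero-length fragments
print(all_padding.length, all_padding.capacity)  # 0 1024

one_byte = ContiguousBufferModel()
for _ in range(10_000):
    one_byte.append_fragment(1)      # 1 handshake byte per 16K record
print(one_byte.length, one_byte.capacity - one_byte.length <= K_EXTRA_ALLOC)
# 10000 True
```

In both cases the memory held no longer depends on how large the incoming records were, only on how many handshake bytes they actually carried.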

The fix in EncryptedReadRecordLayer::read() is more fundamental. It rejects zero-length (all-padding) handshake messages outright, because they are forbidden by the RFC.

-   if (!msg.fragment) {
+   if (!msg.fragment || msg.fragment->empty()) {
    if (msg.type == ContentType::application_data) {
        msg.fragment = folly::IOBuf::create(0);
    } else {

RFC 8446, section 5.4 (Record Padding), says:

Application Data records may contain a zero-length TLSInnerPlaintext.content if the sender desires. This permits generation of plausibly sized cover traffic in contexts where the presence or absence of activity may be sensitive. Implementations MUST NOT send Handshake and Alert records that have a zero-length TLSInnerPlaintext.content; if such a message is received, the receiving implementation MUST terminate the connection with an “unexpected_message” alert.
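The rule quoted above boils down to a small check on the content that remains after padding removal. The Python below is a sketch of that rule as stated in the RFC, not a port of the Fizz patch. The exception and function names are hypothetical.

```python
APPLICATION_DATA = 0x17
HANDSHAKE = 0x16
ALERT = 0x15


class UnexpectedMessage(Exception):
    """Stands in for terminating the connection with an unexpected_message alert."""


def check_inner_content(content_type: int, content: bytes) -> bytes:
    """Reject zero-length Handshake and Alert content; allow it for app data."""
    if len(content) == 0 and content_type in (HANDSHAKE, ALERT):
        raise UnexpectedMessage("zero-length handshake or alert content")
    return content


# Empty application data is allowed (cover traffic).
print(check_inner_content(APPLICATION_DATA, b""))  # b''

# The crafted all-padding handshake record is rejected.
try:
    check_inner_content(HANDSHAKE, b"")
    rejected = False
except UnexpectedMessage:
    rejected = True
print(rejected)  # True
```

With such a check in place, the crafted record ends the connection on the first attempt instead of being queued.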

OK, that’s all for this post. During my research, I found some very useful resources, which I have listed in the reference section. Have fun.

Reference

RFC 8446: The Transport Layer Security (TLS) Protocol Version 1.3
https://tools.ietf.org/html/rfc8446
The New Illustrated TLS Connection
https://tls13.ulfheim.net/
Awesome SSL/TLS Hacks
https://github.com/lenny233/awesome-tls-hacks