HTTP/2 and How it Works

TLS helps improve security. Now it is time for performance enhancement, the focus of HTTP/2 design.

The first challenge is making it compatible with HTTP/1.x.

To achieve it, HTTP/2 inherits semantics from HTTP/1.x, including request methods, status codes, headers, etc.

Most importantly, it inherits the scheme design. There are no http2 or http2s schemes; we continue to use http and https. This decision avoids a great deal of trouble when upgrading the protocol.

While keeping the semantics as is, HTTP/2 introduces new “syntax”, featuring:

  • connection preface,
  • header compression,
  • binary-encoded frames, and
  • streams.

Connection Preface

After the TLS handshake, a browser must send a 24-byte connection preface to the server. It confirms that the browser wants to use HTTP/2.

The connection preface is simple plain text in ASCII.

In an early draft of HTTP/2 on May 29, 2013, the connection preface was

FOO * HTTP/2.0\r\n\r\nBA\r\n\r\n.

Since the July 8, 2013 draft, it has been

PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n.

Besides the separator \r\n\r\n and HTTP/2.0, the remaining letters spell the keyword PRISM, making people think of the "secret program".

A fun fact: if you capture the connection preface in Wireshark, it simply names it “Magic.”

HyperText Transfer Protocol 2
Stream: Magic
Magic: PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n

Once the server receives the “magic,” it will expect to receive and send messages based on HTTP/2 protocol standards. At this moment, the browser can start preparing its first request.
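As a quick check of the 24-byte claim, here is a minimal Python sketch. The constant and function names are illustrative, not taken from any library.

```python
# The client connection preface as raw bytes (name chosen for illustration).
CONNECTION_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"


def starts_with_preface(data: bytes) -> bool:
    """Return True if the received bytes begin with the HTTP/2 preface."""
    return data.startswith(CONNECTION_PREFACE)


print(len(CONNECTION_PREFACE))  # 24
print(starts_with_preface(b"GET / HTTP/1.1\r\n"))  # False
```

A server could run a check like this on the first bytes it reads after the handshake to decide whether the peer really speaks HTTP/2.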

Header Compression

Header compression is an essential change in the protocol syntax.

In HTTP/1.x, we can compress the body of a request or response with gzip, but the often sizable headers are sent as is.

Here is part of the request headers sent when requesting the Medium homepage.

accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9,pt-BR;q=0.8,pt;q=0.7,zh-TW;q=0.6,zh;q=0.5,fr;q=0.4
cache-control: no-cache
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4445.0 Safari/537.36

That’s a lot of bytes merely in headers.

HTTP/2 introduces a new compression algorithm, HPACK, specially designed for header compression.

How does HPACK compress the headers? Here comes the first syntax update.

HTTP/2 drops the request line and status line of HTTP/1.x and uses pseudo-header fields instead.

A typical status-line is HTTP/1.1 200 OK, including the protocol version, a status code, and a status text.

In HTTP/2, it becomes :status: 200, using a pseudo-header :status for the status code and removing the version and the useless status text.

There are 3 key features here:

  • The pseudo-header fields start with a :.
  • All header field names in HTTP/2, pseudo-headers included, must be lowercase. This eliminates ambiguity.
  • All data in headers are now in key-value pairs.
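To make the change concrete, here is a minimal sketch that turns an HTTP/1.1 status line and header set into HTTP/2-style key-value pairs. The function name is hypothetical, and the conversion is deliberately simplified: it only shows the :status pseudo-header and the lowercase names.

```python
def to_http2_headers(status_line: str, headers: dict) -> list:
    """Turn an HTTP/1.1 status line and headers into HTTP/2-style pairs."""
    # "HTTP/1.1 200 OK" -> version, status code, status text (text is dropped)
    _version, code, *_status_text = status_line.split(" ")
    pairs = [(":status", code)]  # the pseudo-header goes first
    for name, value in headers.items():
        pairs.append((name.lower(), value))  # field names become lowercase
    return pairs


print(to_http2_headers("HTTP/1.1 200 OK", {"Content-Type": "text/html"}))
# [(':status', '200'), ('content-type', 'text/html')]
```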

HPACK is a stateful algorithm. The browser and the server both hold the same predefined, read-only static table.

There are 61 entries in the static table. Either end can send an index in place of a header name and its corresponding value.

For example, the server can send index 8, which stands for :status: 200, instead of spelling out HTTP/1.1 200 OK, drastically reducing the headers' size.

Moreover, a dynamic table is introduced for new pairs. Its indexes continue where the static table ends, starting at 62.

Take the user-agent header as an example. The browser and the server both add it to their dynamic tables and index it with a number, let's say, 62.

Next time, the browser can send 62 instead of hundreds of bytes repetitively.

The dynamic table grows as more messages are exchanged. Eventually, a once large header could shrink to dozens of bytes.
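The indexing idea can be sketched in a few lines of Python. This is a toy model, not a real HPACK encoder: the class and function names are made up, only two static entries are listed, and Huffman coding, table size limits, and eviction are left out. As in HPACK, the newest dynamic entry takes index 62 and older entries shift up.

```python
from collections import deque

# Two entries of the 61-entry static table; the rest are omitted for brevity.
STATIC_TABLE = {
    2: (":method", "GET"),
    8: (":status", "200"),
}
STATIC_SIZE = 61


class ToyHeaderTable:
    """Toy model of HPACK indexing (hypothetical helper, not a real encoder)."""

    def __init__(self):
        self.dynamic = deque()  # newest entry first, so it gets index 62

    def lookup(self, name, value):
        for index, entry in STATIC_TABLE.items():
            if entry == (name, value):
                return index
        for offset, entry in enumerate(self.dynamic):
            if entry == (name, value):
                return STATIC_SIZE + 1 + offset
        return None

    def encode(self, name, value):
        index = self.lookup(name, value)
        if index is not None:
            return ("indexed", index)
        # First sighting: send it literally and remember it for next time.
        self.dynamic.appendleft((name, value))
        return ("literal", name, value)


def indexed_field(index: int) -> bytes:
    """One-byte indexed field: a 1 bit followed by the index in 7 bits."""
    if not 0 < index < 127:
        raise ValueError("only indexes below 127 fit in a single byte")
    return bytes([0x80 | index])


table = ToyHeaderTable()
agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"
print(table.encode(":status", "200"))       # ('indexed', 8)
print(table.encode("user-agent", agent)[0])  # literal
print(table.encode("user-agent", agent))     # ('indexed', 62)
print(indexed_field(8).hex())                # 88
```

The last line shows why the saving is so large: the whole :status: 200 pair travels as the single byte 0x88.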

Binary-Encoded Frames

The compressed headers and body are ready — time to send the message.

HTTP/2 decides to use the binary format for its messages instead of ASCII codes, making messages computer-friendly.

When parsing plain text, a computer needs to handle ambiguities, such as upper and lower cases and various kinds of whitespace. By contrast, binary data leaves no such ambiguity.

Let’s forget the typical headers-body structure in HTTP/1.x and see how HTTP/2 handles a message.

The message is broken up into pieces. Each of them is a frame.

  • HEADERS Frame is for header data
  • DATA Frame is for body data

The protocol doesn’t see messages anymore. It looks at frames.

Here is the structure of a frame:

  • A Length (24 bits) gives the size of the frame payload in bytes, not counting the 9-byte frame header. By default, a payload is at most 2^14 bytes (16 KB); the receiver can raise that limit up to 2^24 − 1 bytes.
  • A Type (8 bits) shows the frame's type, such as data-carrying frames (HEADERS frame, DATA frame) and control frames (SETTINGS frame, PRIORITY frame, etc.). HTTP/2 defines 10 types, but the field leaves room for up to 2^8. Extensions can define custom types when needed.
  • A Flags field (8 bits) carries simple control signals, such as END_HEADERS indicating the end of the headers data.
  • A Stream Identifier (31 bits, preceded by 1 reserved bit) marks the stream identity. Its value can be up to 2^31 − 1. We will see how the stream works soon.
  • The Data is the frame payload.
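The fixed part of this layout is 9 bytes long. Here is a minimal sketch of packing and unpacking it with Python's struct module. The function names are illustrative, and the sample values (a DATA frame, type 0x0, with the END_STREAM flag 0x1 on stream 3) are chosen only for the demonstration.

```python
import struct


def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Build the 9-byte header: 24-bit length, 8-bit type, 8-bit flags,
    then 1 reserved bit and a 31-bit stream identifier."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2**31:
        raise ValueError("stream ID must fit in 31 bits")
    # struct has no 3-byte integer: pack 4 bytes and drop the leading one.
    length_bytes = struct.pack(">I", length)[1:]
    return length_bytes + struct.pack(">BBI", frame_type, flags, stream_id)


def unpack_frame_header(header: bytes):
    """Split a 9-byte header back into (length, type, flags, stream_id)."""
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags, raw_id = struct.unpack(">BBI", header[3:9])
    return length, frame_type, flags, raw_id & 0x7FFFFFFF  # clear reserved bit


header = pack_frame_header(5, 0x0, 0x1, 3)
print(header.hex())                 # 000005000100000003
print(unpack_frame_header(header))  # (5, 0, 1, 3)
```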

Let’s look at an example of the frame header.

Stream: HEADERS, Stream ID: 1, Length 196, 200 OK
Length: 196
Type: HEADERS (1)
Flags: 0x04, End Headers
00.0 ..0. = Unused: 0x00
..0. .... = Priority: False
.... 0... = Padded: False
.... .1.. = End Headers: True
.... ...0 = End Stream: False
0... .... .... .... .... .... .... .... = Reserved: 0x0
.000 0000 0000 0000 0000 0000 0000 0001 = Stream Identifier: 1
.....
.....

In the example,

  • The length shows that the frame’s payload is 196 bytes.
  • The type is 1, meaning it is a HEADERS frame.
  • The flags are 0x04 (End Headers), meaning this frame carries the end of the headers and no further frame continues them.
  • The Stream Identifier is 1, meaning this response is in the stream[1].
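The same reading can be reproduced in code. The sketch below decodes the 9 header bytes that correspond to the capture above (length 196, type 1, flags 0x04, stream 1) and names the flags that are set. The function name is hypothetical; the bit values follow the flag layout shown in the capture.

```python
# Flag bits used by HEADERS frames, as laid out in the capture above.
HEADERS_FLAGS = {
    0x01: "END_STREAM",
    0x04: "END_HEADERS",
    0x08: "PADDED",
    0x20: "PRIORITY",
}


def describe_frame_header(raw: bytes) -> dict:
    """Decode a 9-byte frame header into readable fields."""
    flags = raw[4]
    return {
        "length": int.from_bytes(raw[0:3], "big"),
        "type": raw[3],
        "flags": [name for bit, name in HEADERS_FLAGS.items() if flags & bit],
        # The top bit of the last 4 bytes is reserved, so mask it off.
        "stream_id": int.from_bytes(raw[5:9], "big") & 0x7FFFFFFF,
    }


raw = bytes.fromhex("0000c4010400000001")
print(describe_frame_header(raw))
# {'length': 196, 'type': 1, 'flags': ['END_HEADERS'], 'stream_id': 1}
```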

This feature leads to how HTTP/2 handles delivering frames — using streams.

Streams

After breaking messages into pieces, a browser sends all frames to the server end.

On the receiving end, frames that belong to different messages arrive interleaved on the same connection.

How can we assemble all related frames into a message?

When breaking up the message, the browser marks all of its frames with the same stream ID (the Stream Identifier). Because TCP delivers bytes in order, frames that share a stream ID arrive in the order they were sent, headers first. With this information, frames can easily be reassembled on the receiving end.

It feels like each frame is sent in its corresponding “stream.”

Moreover, a request and its response share the same stream ID. With it, the browser can match responses to requests.

Though streams exist only conceptually, you can treat them as real.

In the same connection, you can have multiple HTTP conversations simultaneously. This is known as multiplexing.

In this way, each stream is decoupled, and there are (almost) no head-of-line blocking issues at the HTTP level. A lost TCP packet can still stall every stream on the connection.
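A minimal sketch of the receiving side might look like this. Frames are modeled as simple tuples, and the payloads are readable placeholders (real HEADERS payloads are HPACK-encoded). All names are illustrative.

```python
from collections import defaultdict

# (stream_id, frame_type, payload): two responses interleaved on one connection.
frames = [
    (1, "HEADERS", b":status: 200"),
    (3, "HEADERS", b":status: 404"),
    (1, "DATA", b"<html>"),
    (3, "DATA", b"not found"),
    (1, "DATA", b"</html>"),
]


def demultiplex(frames):
    """Group frames by stream ID, keeping arrival order inside each stream."""
    streams = defaultdict(list)
    for stream_id, frame_type, payload in frames:
        streams[stream_id].append((frame_type, payload))
    return dict(streams)


def body_of(stream_frames):
    """Join the DATA payloads of one stream into the message body."""
    return b"".join(payload for kind, payload in stream_frames if kind == "DATA")


streams = demultiplex(frames)
print(body_of(streams[1]))  # b'<html></html>'
print(body_of(streams[3]))  # b'not found'
```

Neither response has to wait for the other to finish, which is the point of multiplexing.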

Stream State

Flags in a frame, such as END_STREAM, can change the stream state.

There are 5 states (the specification further splits two of them into local and remote variants):

  1. Idle: the stream has not yet been created.
  2. Open: the stream is created and running.
  3. Half-Closed: one end has finished sending; for example, the browser has sent its request and is waiting for the response.
  4. Closed: the stream is ended.
  5. Reserved: the stream is reserved for server-push messages.

To simplify the case, let’s skip the 5th and only look into the first 4 states.

When the browser sends the HEADERS frame, a stream is created, and a stream ID is assigned.

The stream enters the Open state, and both ends can send and receive data.

When the browser has finished sending its request, it sets the END_STREAM flag on its last frame, telling the server that nothing more will come from the browser on this stream. The stream is now half-closed.

The server understands and sends the last frame of its response with an END_STREAM flag.

Finally, the communication ends, and the stream is closed.
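The walk-through above can be sketched as a tiny state machine seen from the browser's side. This is a simplification under stated assumptions: it covers only the four states discussed here, follows only the request-then-response path, and the class and method names are made up for illustration.

```python
from enum import Enum


class StreamState(Enum):
    IDLE = "idle"
    OPEN = "open"
    HALF_CLOSED = "half-closed"
    CLOSED = "closed"


class ToyStream:
    """Simplified stream lifecycle from the browser's point of view."""

    def __init__(self, stream_id: int):
        self.stream_id = stream_id
        self.state = StreamState.IDLE

    def send_headers(self):
        # Sending the HEADERS frame creates the stream.
        self._expect(StreamState.IDLE)
        self.state = StreamState.OPEN

    def send_end_stream(self):
        # The browser has nothing more to send on this stream.
        self._expect(StreamState.OPEN)
        self.state = StreamState.HALF_CLOSED

    def receive_end_stream(self):
        # The server has sent the last frame of its response.
        self._expect(StreamState.HALF_CLOSED)
        self.state = StreamState.CLOSED

    def _expect(self, state: StreamState):
        if self.state is not state:
            raise RuntimeError(f"stream {self.stream_id} is {self.state.value}")


stream = ToyStream(1)
stream.send_headers()
stream.send_end_stream()
stream.receive_end_stream()
print(stream.state)  # StreamState.CLOSED
```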

HTTP/2 doesn’t reuse stream IDs. Therefore, the lifecycle of a stream corresponds to one request-response exchange in HTTP/1.x.

New IDs are assigned until they reach the maximum, 2³¹ − 1. Once the IDs run out, the browser opens a new connection for further requests, where the stream IDs start over. A server that runs out of IDs sends a GOAWAY frame so that the browser does the same.

Some other features of streams are worth mentioning.

  • Both the browser and the server can initiate a stream independently.
  • Stream IDs are assigned in increasing order. Odd numbers are for streams initiated by the browser, and even numbers are for the server. Therefore, you see odd numbers more often.
  • Stream 0 is reserved for connection-level control frames. It cannot be closed.
  • It is possible to prioritize streams. A server can respond to requests for CSS files before the ones for images.
  • HTTP/2 assumes a persistent connection. Therefore, no Connection: keep-alive is required.
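The ID rules can be sketched with a small allocator. The class name and the error raised on exhaustion are assumptions for illustration; a real implementation would switch to a new connection at that point.

```python
MAX_STREAM_ID = 2**31 - 1  # largest value a 31-bit identifier can hold


class StreamIdAllocator:
    """Hands out stream IDs: odd for the browser, even for the server."""

    def __init__(self, is_client: bool):
        # Stream 0 is reserved, so the client starts at 1 and the server at 2.
        self.next_id = 1 if is_client else 2

    def allocate(self) -> int:
        if self.next_id > MAX_STREAM_ID:
            raise RuntimeError("stream IDs exhausted; a new connection is needed")
        stream_id = self.next_id
        self.next_id += 2  # IDs only grow and are never reused
        return stream_id


browser = StreamIdAllocator(is_client=True)
server = StreamIdAllocator(is_client=False)
print([browser.allocate() for _ in range(3)])  # [1, 3, 5]
print([server.allocate() for _ in range(3)])   # [2, 4, 6]
```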

An Updated Protocol Stack

On top of TLS, HTTP/2 adds a new layer with HPACK and Stream.

Moreover, it requires TLS 1.2 or later to offer enhanced security. As we saw in the TLS 1.3 improvements, some weak cipher suites are also prohibited in HTTP/2, such as those based on DES and SHA-1.
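On the client side, the TLS requirement could be expressed with Python's standard ssl module, as in the sketch below. The function name is made up. The ALPN line is an addition not covered in the text above: it is how a client offers "h2" during the TLS handshake.

```python
import ssl


def make_h2_client_context() -> ssl.SSLContext:
    """Create a TLS context suitable for an HTTP/2 client (illustrative)."""
    context = ssl.create_default_context()
    # Refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Offer HTTP/2 first and fall back to HTTP/1.1 (assumption: ALPN is used).
    context.set_alpn_protocols(["h2", "http/1.1"])
    return context


context = make_h2_client_context()
print(context.minimum_version == ssl.TLSVersion.TLSv1_2)  # True
```

After a handshake made with this context, the socket's selected_alpn_protocol() method reports which protocol the server agreed to.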

Do’s and Don’ts

With all the optimizations in HTTP/2, some techniques that helped with HTTP/1.x don’t make sense anymore.

We should retire image sprites and use individual images in <img> tags. Sprites were meant to reduce the number of requests in HTTP/1.x, and that reason is no longer valid in HTTP/2.

Moreover, sprites are expensive when it comes to caching. One changed icon could end up invalidating the entire sprite file.

We should stop embedding base64-encoded resources (images, CSS, and JS files) in HTML.

Embedding reduces the number of requests but increases the size of the HTML file. Besides, embedded resources cannot be cached or prioritized on their own.

Also, we should stop using domain sharding.

It was more of a workaround for the browsers' limit of 6 connections per host. HTTP/2 resolves the issue gracefully with multiplexing.

In HTTP/2, domain sharding doesn’t improve performance. On the contrary, it increases the cost of initializing connections and maintaining the HPACK tables for each connection.
