The IP and MAC layers have limited memory for sending packets, so both limit the length of a message.
This restriction requires TCP to split a variable-length byte stream into several segments before handing them to the IP layer. Each segment must have an appropriate length.
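To make the splitting step concrete, here is a minimal sketch. The function name `split_into_segments` and the 4-byte limit are made up for illustration; a real TCP stack decides segment boundaries with more care than this.

```python
def split_into_segments(data, max_payload):
    """Cut a byte stream into payloads of at most `max_payload` bytes each."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [data[i:i + max_payload] for i in range(0, len(data), max_payload)]


# Hypothetical example: an 18-byte stream and a 4-byte payload limit.
stream = bytes(range(18))
segments = split_into_segments(stream, max_payload=4)
print([len(s) for s in segments])  # [4, 4, 4, 4, 2]
```

Only the last segment is shorter than the limit, and joining the segments in order gives back the original stream.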
Here is a simple diagram showing how segments are sent through the internet.
The client’s HTTP layer is sending an 18-byte stream to the target server.
While bytes 16–18 have yet to reach the TCP layer, bytes 12–15 have passed through it. TCP packaged them into a segment and attached a TCP header, marked in yellow.
Next, the segment was wrapped by the IP layer, sent through the internet, and delivered to the server.
Suppose a TCP segment exceeds the length supported by the layers underneath. In that case, the IP layer takes responsibility for breaking the large segment into pieces, a process known as IP fragmentation. It is expensive, so we want to avoid it.
But how does TCP determine the length? It depends on the Maximum Segment Size (MSS).
Ideally, we want to select an MSS that maximizes the amount of data, minimizes the proportion of the header, and avoids further breaking down in the IP layer.
By default, the MSS is 536 bytes. Where does this number come from?
- The default IP Maximum Transmission Unit (MTU) is 576 bytes. Anything larger is split.
- The IP header takes 20 bytes (without options).
- The TCP header takes 20 bytes (without options).
536 = 576 - 20 - 20.
You can tell that MSS only indicates the data body’s size and doesn’t include the header.
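The same subtraction works for any MTU. Here is a small sketch; the helper name `mss_for_mtu` is made up, and it assumes both headers carry no options.

```python
IP_HEADER_BYTES = 20   # IPv4 header without options
TCP_HEADER_BYTES = 20  # TCP header without options


def mss_for_mtu(mtu):
    """Payload bytes left in one packet after both headers are subtracted."""
    return mtu - IP_HEADER_BYTES - TCP_HEADER_BYTES


print(mss_for_mtu(576))   # 536, the default MSS
print(mss_for_mtu(1500))  # 1460, for the common Ethernet MTU of 1500 bytes
```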
The MSS is announced by each side during the TCP three-way handshake. Here is the TCP segment header format. The MSS sits in the TCP Options.
In the SYN message, the client suggests an MSS of 1460 bytes. It is known as the Sender Maximum Segment Size (SMSS).
In its response, the server suggests segments no larger than 1400 bytes. Since the size comes from the server, it is known as the Receiver Maximum Segment Size (RMSS).
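To see where the MSS lives, here is a rough sketch of reading it out of the raw TCP Options bytes. The MSS option is encoded as kind 2, length 4, followed by a 2-byte value. The function name `find_mss_option` and the sample option bytes are invented for illustration and are not taken from a real capture.

```python
import struct


def find_mss_option(options):
    """Return the MSS value found in raw TCP option bytes, or None."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                  # End of Option List
            break
        if kind == 1:                  # No-Operation, a single padding byte
            i += 1
            continue
        if i + 1 >= len(options):      # truncated option, stop parsing
            break
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                      # malformed option, stop parsing
        if kind == 2 and length == 4:  # Maximum Segment Size
            return struct.unpack("!H", options[i + 2:i + 4])[0]
        i += length
    return None


# Hypothetical option bytes for the two handshake messages.
client_syn_options = struct.pack("!BBH", 2, 4, 1460)
server_synack_options = bytes([1, 3, 3, 7]) + struct.pack("!BBH", 2, 4, 1400)

print(find_mss_option(client_syn_options))     # 1460
print(find_mss_option(server_synack_options))  # 1400
```

The second sample places a No-Operation byte and a window-scale option in front of the MSS, just to show that the parser skips options it is not looking for.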
To guarantee delivery, TCP needs two features:
- The receiver sends the acknowledgement (ACK) to the sender when receiving a message.
- The sender retransmits the message when it is lost.
Let’s start with a simple model.
- Initially, the sender starts a timer after sending the message.
- The 1st ACK is received before the timer expires. The timer is then reset for the 2nd message, and the process repeats.
- The 2nd timer expires before any ACK is received, so retransmission starts.
This simple design is straightforward but inefficient, as each message needs to wait until the previous ACK is returned.
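The simple model can be sketched as a toy simulation. Nothing here touches a real network: the function name `stop_and_wait` is made up, and `lost_attempts` is a hypothetical set of (message number, attempt) pairs for which no ACK arrives in time.

```python
def stop_and_wait(messages, lost_attempts):
    """Send messages one at a time, retransmitting whenever the timer expires."""
    log = []
    for number, _payload in enumerate(messages, start=1):
        attempt = 1
        # Keep retransmitting the same message until one attempt is ACKed.
        while (number, attempt) in lost_attempts:
            log.append(f"send #{number} (attempt {attempt}): timer expired, retransmit")
            attempt += 1
        log.append(f"send #{number} (attempt {attempt}): ACK received")
    return log


# The first attempt at message 2 is lost.
for line in stop_and_wait(["m1", "m2", "m3"], lost_attempts={(2, 1)}):
    print(line)
```

Message 3 cannot start until message 2 has been acknowledged, which is exactly the inefficiency described above.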
Let’s improve it.
By marking each message with an ID, we can send multiple messages without waiting for each ACK. The same ID links a message to its timer and its ACK.
If one message is lost (#3, for example), the sender retransmits it.
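Here is one way the improved design could be sketched, with one timer per message ID. The class `PipelinedSender` and its method names are hypothetical, and time is passed in as a plain number so the example stays deterministic.

```python
class PipelinedSender:
    """Tracks one retransmission timer per in-flight message ID."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.deadlines = {}  # message ID -> time at which its timer expires

    def send(self, msg_id, now):
        # Sending (or retransmitting) a message starts a fresh timer for it.
        self.deadlines[msg_id] = now + self.timeout

    def on_ack(self, msg_id):
        # An ACK cancels the timer that belongs to the same ID.
        self.deadlines.pop(msg_id, None)

    def expired(self, now):
        # IDs whose timers ran out without an ACK; these need retransmission.
        return sorted(i for i, deadline in self.deadlines.items() if deadline <= now)


sender = PipelinedSender(timeout=5)
for msg_id in (1, 2, 3, 4):
    sender.send(msg_id, now=0)  # all four go out without waiting
for msg_id in (1, 2, 4):
    sender.on_ack(msg_id)       # the ACK for #3 never arrives
print(sender.expired(now=5))    # [3]
```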
What is the “message ID” in TCP? It is the Sequence Number.
Here is a message example.
- This TCP segment length is 647 bytes.
- The sender has sent 1461 bytes before this message.
- The sender will send more bytes starting from 2108.
The sequence number is a 32-bit value, so it ranges from 0 to 2³² − 1. After that, it wraps around to 0.
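The arithmetic behind the example and the wrap-around fits in a few lines. This sketch assumes the segment in the example carries sequence number 1461. The helper name `next_sequence_number` is made up.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit values


def next_sequence_number(seq, payload_length):
    """Sequence number of the byte that follows this segment's payload."""
    return (seq + payload_length) % SEQ_SPACE


print(next_sequence_number(1461, 647))             # 2108
print(next_sequence_number(SEQ_SPACE - 100, 647))  # 547, after wrapping around
```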
This brings a problem.
Imagine sequence numbers only go up to 4, and we send 1 byte at a time.
- During the process, the 2nd byte (marked as #2) is lost. By design, #2 will be retransmitted later.
- We continue sending more bytes until the sequence number starts from #1 again. At this moment, the retransmitted #2 is sent.
- The receiver doesn’t know if this is an old #2 or a new one. This could mess things up.
The timestamp is introduced in TCP Options to resolve the issue.
The timestamp can disambiguate segments with the same sequence numbers.
Let’s attach the timestamp to each segment.
The receiver reads timestamp B and compares it with the previous timestamp E, confirming that this is a retransmission. Otherwise, the timestamp would have been larger than E.
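The comparison can be sketched with the tiny 4-number sequence space from earlier. This is a simplified illustration, not how a real stack implements it: the function name `classify` and the integer timestamps are made up, and the sketch ignores the fact that real TCP timestamps are 32-bit values that can wrap around too.

```python
def classify(arrivals):
    """Label each (sequence number, timestamp) pair as 'new' or 'old'."""
    latest_timestamp = None
    labels = []
    for seq, timestamp in arrivals:
        if latest_timestamp is not None and timestamp < latest_timestamp:
            # Older than something already seen: it belongs to the earlier pass.
            labels.append((seq, "old"))
        else:
            labels.append((seq, "new"))
            latest_timestamp = timestamp
    return labels


# #2 from the first pass (timestamp 11) shows up after the numbers wrapped.
arrivals = [(1, 10), (3, 12), (4, 13), (1, 14), (2, 11), (2, 15)]
print(classify(arrivals))
# [(1, 'new'), (3, 'new'), (4, 'new'), (1, 'new'), (2, 'old'), (2, 'new')]
```

Both segments carry sequence number 2, but the timestamps tell them apart.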
TCP Fast Retransmission
TCP has a fast retransmission feature: retransmitting the lost segment before its timer expires.
To allow fast retransmission, we need to set some rules for the sender and the receiver.
Rule 1: the receiver should always reply with the sequence number it expects to receive next.
For example, when the receiver receives segment 1, it responds with ACK2, meaning it expects to receive segment 2 in the upcoming message.
Rule 2: the sender should ignore the timer and immediately retransmit the lost segment after receiving 3 duplicated out-of-order ACKs.
The diagram shows an example of fast retransmission.
- After the first 2 segments are transmitted, segment 3 is lost.
- When receiving segment 4, the receiver sends ACK3 instead of ACK4 based on the rules. This is the 1st duplicated out-of-order ACK.
- Again, when receiving segment 5, the receiver still expects the retransmission of segment 3. Accordingly, the 2nd duplicated ACK3 is sent.
- Then, when segment 6 arrives, the 3rd duplicated ACK3 is sent.
- At this moment, the sender enters fast retransmission and retransmits segment 3.
- After receiving the lost segment, the receiver’s ACK is back in order and expects segment 7 in the upcoming message.
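The two rules and the walkthrough above can be replayed in a small simulation. It numbers whole segments, as the example does, rather than bytes. The function names `receiver_acks` and `fast_retransmit_trigger` are made up for this sketch.

```python
def receiver_acks(arrivals):
    """Rule 1: after every arrival, ACK the next segment number still missing."""
    received = set()
    expected = 1
    acks = []
    for segment in arrivals:
        received.add(segment)
        while expected in received:
            expected += 1
        acks.append(expected)
    return acks


def fast_retransmit_trigger(acks, threshold=3):
    """Rule 2: return the segment to retransmit once `threshold` duplicates arrive."""
    last_ack, duplicates = None, 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                return ack
        else:
            last_ack, duplicates = ack, 0
    return None


acks = receiver_acks([1, 2, 4, 5, 6])          # segment 3 is lost
print(acks)                                    # [2, 3, 3, 3, 3]
print(fast_retransmit_trigger(acks))           # 3
print(receiver_acks([1, 2, 4, 5, 6, 3])[-1])   # 7, once segment 3 is retransmitted
```

The first ACK3 is the normal reply to segment 2. The three that follow are the duplicates that trigger the retransmission.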
But there is a problem: the sender doesn’t know whether segments 5 and 6 arrived safely until the fast retransmission completes. If both segments are lost, the following retransmissions take longer.
The receiver can share this information with the sender through the selective acknowledgment (SACK) feature to speed up the retransmission process.
TCP Selective Acknowledgement
SACK sits in TCP options.
In SACK, we can specify the range of data that has been received beyond the acknowledgment number.
Here is an example of SACK, indicating the data from 2872 to 3393 has been received.
With it, the sender knows it doesn’t need to retransmit any bytes between these edges.
Also, the sender could figure out other lost segments and retransmit them as soon as possible.
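Here is a rough sketch of how a sender could work out the missing ranges from the acknowledgment number and the SACK blocks. The function name `missing_ranges`, the acknowledgment number 2351, and the second SACK block are made up for illustration. Each range is written as (first byte, byte after the last one).

```python
def missing_ranges(ack_number, sack_blocks):
    """Return the gaps between the cumulative ACK and the SACKed blocks."""
    holes = []
    cursor = ack_number  # everything before this is already acknowledged
    for left_edge, right_edge in sorted(sack_blocks):
        if left_edge > cursor:
            holes.append((cursor, left_edge))
        cursor = max(cursor, right_edge)
    return holes


# One SACK block, as in the example above (the ACK number is hypothetical).
print(missing_ranges(2351, [(2872, 3393)]))
# [(2351, 2872)]

# Two SACK blocks reveal a second gap between them.
print(missing_ranges(2351, [(2872, 3393), (3914, 4435)]))
# [(2351, 2872), (3393, 3914)]
```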
- Maximum Segment Size (MSS) limits the length of the data carried in a TCP segment.
- A timer helps TCP retransmit a lost segment.
- After receiving 3 duplicated ACKs, fast retransmission starts before a timer expires.
- Selective Acknowledgment (SACK) provides information on received out-of-order bytes to facilitate the fast retransmission process.