It is easier to understand it by looking at our day-to-day activity — visiting a web page.
The browser initiates an HTTP GET request. With the help of the IP and MAC layers, the request finds a route to the destination.
On the sender’s side, the HTTP layer builds the message data. At the TCP layer, the TCP header is attached to each segment. Next, the IP layer adds the IP header, and finally, MAC adds its frame header.
When a router receives the frames, it strips the MAC header and reads the IP header.
With the information in the IP header, the router knows the destination.
After deciding where the next stop is, the router keeps the IP header (updating fields such as the TTL), attaches a new MAC header for the next hop, and sends the frames on until they reach a switch. Then, the switch forwards them to the target host.
The corresponding headers are detached on the target host when the message goes up each layer, from MAC to IP and then TCP.
At the TCP layer, the server reassembles ordered segments into a request.
Finally, the message reaches the top layer — HTTP.
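The wrapping and unwrapping described above can be sketched as a toy model. The header contents here are placeholder byte strings rather than real header formats, and the names `encapsulate` and `decapsulate` are made up for illustration.

```python
# Toy model of layered encapsulation; header bytes are placeholders, not real formats.
LAYERS = [("TCP", b"[TCP]"), ("IP", b"[IP]"), ("MAC", b"[MAC]")]  # order applied by the sender


def encapsulate(http_message: bytes) -> bytes:
    """Sender side: each lower layer prepends its own header."""
    data = http_message
    for _name, header in LAYERS:
        data = header + data
    return data


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: headers are removed in reverse order, MAC first."""
    data = frame
    for name, header in reversed(LAYERS):
        if not data.startswith(header):
            raise ValueError(f"missing {name} header")
        data = data[len(header):]
    return data


frame = encapsulate(b"GET / HTTP/1.1")
print(frame)               # b'[MAC][IP][TCP]GET / HTTP/1.1'
print(decapsulate(frame))  # b'GET / HTTP/1.1'
```

The outermost header is the one added last, which is why the receiver has to peel off MAC before it can see IP, and IP before TCP.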
During the process:
- Under TCP, the IP layer and the MAC layer choose the route.
- Above TCP, the HTTP layer builds the request and response messages.
- TCP converts the variable-length request into segments and makes sure all parts arrive in their original order. This is the role TCP plays.
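To make TCP's role more concrete, here is a minimal sketch of splitting a request into numbered segments and putting them back in order. It is only an illustration: the function names, the tiny segment size, and the starting sequence number are assumptions, and real TCP also handles retransmission, acknowledgements, and sequence-number wraparound, which are left out.

```python
import random


def segment(data: bytes, mss: int, first_seq: int):
    """Split data into (sequence_number, payload) pairs of at most mss bytes."""
    return [(first_seq + offset, data[offset:offset + mss])
            for offset in range(0, len(data), mss)]


def reassemble(segments):
    """Restore the original byte order by sorting on sequence number."""
    return b"".join(payload for _seq, payload in sorted(segments))


request = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
segments = segment(request, mss=16, first_seq=1000)
random.shuffle(segments)                 # simulate out-of-order arrival
print(reassemble(segments) == request)   # True
```

Each sequence number is the byte offset of the payload plus the starting value, so the receiver can sort the segments without knowing in which order they were sent.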
TCP Header Format
Before diving into the handshake, let’s take a quick look at its segment header format. It will make our next journey straightforward.
The first two are Source Port and Destination Port, defining the TCP connection.
Source Port indicates where the TCP segment comes from, while the Destination Port marks its destination.
Take the Medium homepage, for example. medium.com is converted to the IP address 162.159.152.4 after DNS resolution. Port 443, by default, is omitted. Here, 443 is the Destination Port.
By the way, the IP address of the Medium homepage varies depending on your location. If you are interested in how DNS works, I hope this post could help.
Next are Sequence Number and Acknowledgement (ACK) Number, identifying the TCP segment. ACK Number is used to confirm the success of data reception.
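As a rough sketch of how these fields sit in the header, the fixed 20-byte part can be packed and unpacked with Python's `struct` module. The sample values (client port 51000, the sequence number, the window size) are made up, and the sketch also touches the flags and window fields that share the same fixed header.

```python
import struct

# Fixed 20-byte part of the TCP header, in network byte order:
# src port, dst port, seq, ack, offset+flags, window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")


def parse_tcp_header(raw: bytes) -> dict:
    (src_port, dst_port, seq, ack,
     offset_flags, window, _checksum, _urgent) = TCP_HEADER.unpack(raw[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": (offset_flags >> 12) * 4,  # data offset is counted in 32-bit words
        "flags": offset_flags & 0xFF,            # low 8 bits hold the control flags
        "window": window,
    }


# Made-up SYN segment: client port 51000 -> server port 443, no options.
sample = TCP_HEADER.pack(51000, 443, 3_000_000_000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(sample))
```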
The TCP Options field is optional and carries additional information.
Here are some typical values of TCP Options:
- 0 means the end of the options list.
- 1 means that it is for aligning option fields on 32-bit boundaries for better performance. It doesn't do anything to the TCP connection.
- 2 means the value is the Maximum Segment Size (MSS).
- 3 is for communicating window scale.
- 4 and 5 are for Selective Acknowledgement (SACK).
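The option kinds listed above can be walked with a small parser. This is a sketch under the usual encoding, where kinds 0 and 1 are a single byte and every other option is a kind byte, a length byte, and then the value. The function name `parse_tcp_options` and the sample bytes are invented for illustration.

```python
def parse_tcp_options(raw: bytes):
    """Return a list of (kind, value_bytes) tuples from a TCP options field."""
    options = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:            # end of the options list
            break
        if kind == 1:            # padding for alignment, carries no value
            options.append((1, b""))
            i += 1
            continue
        if i + 1 >= len(raw):
            raise ValueError("truncated option")
        length = raw[i + 1]      # length covers the kind and length bytes too
        if length < 2 or i + length > len(raw):
            raise ValueError("malformed option length")
        options.append((kind, raw[i + 2:i + length]))
        i += length
    return options


# Made-up options: MSS=1460, padding, window scale=7, padding, padding, SACK permitted.
raw = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 7, 1, 1, 4, 2])
for kind, value in parse_tcp_options(raw):
    print(kind, value.hex())
```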
TCP 3-Way Handshakes
The handshakes are about exchanging information to establish a connection. The information falls into two categories:
- Sync initial sequence numbers (ISN)
- Exchange parameters (e.g., window scale factor, Maximum Segment Size, Selective Acknowledgement algorithm)
TCP needs 3 handshakes to establish the connection:
- The client sends a SYN message
- The server replies with a SYN/ACK message
- The client responds with an ACK message
The client initializes the 1st handshake, sending over the client’s initial sequence number and params.
Two things need our attention: the client’s initial sequence number and flags.
The initial sequence number is chosen to be hard to predict, so it is effectively a random 32-bit value rather than a count that starts from zero.
Flags mark the type of the segment. Here, the SYN flag is set to 1 while the others are kept at 0, meaning it is a SYN segment.
Here is the 1st TCP handshake example, and we can see the ISN and the SYN flag.
After receiving the message, the server responds with the 2nd handshake.
This time we have two sequence numbers:
- the server’s ISN, and
- the ACK number
Again, the server’s ISN is a random number and different from the client’s initial one.
The ACK number is not random. It is based on the client’s sequence number:
server's ACK number = 1 + client's initial sequence number.
In the flags, both ACK and SYN are set to 1. If ACK remained 0, the ACK number would be treated as invalid.
This is the 2nd handshake example. We can see the ISN, ACK number, and SYN and ACK flags.
Finally, in the 3rd handshake, the client replies with an ACK number. Like the previous one:
client's ACK number = 1 + server's initial sequence number.
This time, only ACK is 1 in the flags.
In the 3rd handshake example, we can see the ACK number and ACK flag.
At this point, the TCP handshake is complete.
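The numbers exchanged in the three steps can be reproduced with a few lines of arithmetic. This is a sketch only: `three_way_handshake` is a hypothetical helper, and `random.getrandbits` merely stands in for however a real TCP stack picks its initial sequence numbers.

```python
import random

SYN, ACK = 0x02, 0x10   # flag bit values in the TCP header
MOD = 2 ** 32           # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake segments as dicts with seq, ack, and flags."""
    syn = {"seq": client_isn, "ack": 0, "flags": SYN}
    syn_ack = {"seq": server_isn, "ack": (client_isn + 1) % MOD, "flags": SYN | ACK}
    # The SYN itself consumes one sequence number, so the client continues at ISN + 1.
    ack = {"seq": (client_isn + 1) % MOD, "ack": (server_isn + 1) % MOD, "flags": ACK}
    return syn, syn_ack, ack


client_isn = random.getrandbits(32)   # stand-in for a real ISN generator
server_isn = random.getrandbits(32)
segments = three_way_handshake(client_isn, server_isn)
for name, seg in zip(("SYN", "SYN/ACK", "ACK"), segments):
    print(f"{name:8} seq={seg['seq']} ack={seg['ack']} flags={seg['flags']:#04x}")
```

The modulo keeps the "+ 1" rule valid even when an initial sequence number happens to be the largest 32-bit value.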
TCP State Transition during Handshake
Initially, both ends are in the Closed state.
To provide service, we configure the server to listen on selected ports, such as ports 80 and 443. At this moment, it transitions to the Listen state.
In the 1st handshake, the browser sends a SYN message. Immediately, the client’s TCP changes to the SYN-Sent state.
Usually, this state won’t last long, probably some hundreds of milliseconds before receiving the reply and entering the next state.
When the server receives the SYN message, its TCP enters the SYN-Received state. Then, the server sends the SYN/ACK message, the 2nd handshake.
Once the browser receives the message, its TCP transitions into the Established state. At the same moment, it sends the last ACK message to the server.
When receiving the 3rd handshake message, the server enters the Established state, as well.
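The transitions described above fit in a small lookup table. The sketch below covers only the handshake path from this section, not the full TCP state machine, and the event names are informal labels chosen for illustration.

```python
# (current state, event) -> next state, limited to the handshake path described above.
TRANSITIONS = {
    ("CLOSED", "listen"): "LISTEN",
    ("CLOSED", "send SYN"): "SYN-SENT",
    ("LISTEN", "recv SYN"): "SYN-RECEIVED",
    ("SYN-SENT", "recv SYN/ACK"): "ESTABLISHED",
    ("SYN-RECEIVED", "recv ACK"): "ESTABLISHED",
}


def run(events, state="CLOSED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


client_path = run(["send SYN", "recv SYN/ACK"])
server_path = run(["listen", "recv SYN", "recv ACK"])
print("client:", " -> ".join(client_path))
print("server:", " -> ".join(server_path))
```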
An attacker can abuse the state transition by sending crafted SYN messages but never sending the last ACK message. This is known as a SYN flood.
In this case, the server’s TCP connection is stuck in the SYN-Received state.
By sending numerous SYN messages, the attacker floods the server with connections in the SYN-Received state. These connections consume resources and stop other users from establishing connections.
Let’s zoom in and see what is happening in the server during the process.
When the server receives the SYN message, the OS kernel inserts the data into a SYN queue. At the same moment, the server's TCP changes to the SYN-Received state.
When the last ACK message reaches the server, the data will be retrieved from the SYN queue and inserted into the Accept queue, ready for our applications such as Nginx and Tomcat.
Remember the state attack we just mentioned? In that case, the SYN queue is full of connections that never complete, so the server cannot accept any new connection requests.
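The two queues and the effect of the attack can be imitated with a toy model. `ToyListener` is a made-up class for illustration, not a kernel interface, and it ignores timeouts, retransmissions, and any mitigations a real operating system applies.

```python
from collections import deque


class ToyListener:
    """Toy model of a listening socket with a bounded SYN queue and an accept queue."""

    def __init__(self, syn_queue_size: int):
        self.syn_queue_size = syn_queue_size
        self.syn_queue = {}           # half-open connections, keyed by client address
        self.accept_queue = deque()   # established connections waiting for the application

    def on_syn(self, client: str) -> bool:
        """Handle a SYN; return False if the SYN queue is full and the SYN is dropped."""
        if len(self.syn_queue) >= self.syn_queue_size:
            return False
        self.syn_queue[client] = "SYN-RECEIVED"
        return True

    def on_ack(self, client: str) -> None:
        """Handle the last ACK: move the connection to the accept queue."""
        if self.syn_queue.pop(client, None) is not None:
            self.accept_queue.append(client)


listener = ToyListener(syn_queue_size=3)

# The attacker sends SYNs from made-up addresses and never completes the handshake.
for i in range(3):
    listener.on_syn(f"attacker-{i}")

print(listener.on_syn("real-user"))   # False: the SYN queue is already full
print(len(listener.accept_queue))     # 0: nothing reached the application
```

A well-behaved client would call `on_ack` after `on_syn`, which frees its slot in the SYN queue and hands the connection to the application.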
Takeaways
- TCP is responsible for reliable, ordered message delivery, which the other layers don’t handle.
- Fulfilling that responsibility requires a connection.
- To establish a connection, TCP uses a 3-way handshake.
- The Sequence Number, the ACK Number, and the flags are the key to understanding the handshake.
- Each handshake changes the TCP connection state on both the client and server ends.