Understanding TCP, UDP, HTTP, And WebSocket Concepts PART - 1
Naruto is a legendary series in the anime world. Remember the epic fight between Naruto and Sasuke at the end of the series? That scene still gives me a lot of excitement.
Please do watch this series if you have just started watching anime and are looking for a good one.
You might be thinking: this is an HTTP and WebSocket blog, so why the hell is this guy discussing anime? I thought that before going into the conceptual details of technical things, it would be a good idea to excite the brain with some anime. Maybe it will help you focus better.
Introduction
These are the two main characters of the story: BOB the Server and PAUL the Client.
The story begins when BOB and PAUL feel lonely AF and try to communicate with each other. The two friends are far apart and live at different nodes in Internet Network City. They try different transport layer protocols (check the OSI model if you don't know it), e.g. TCP and UDP, and application layer protocols, e.g. HTTP and WebSocket. They face many challenges while communicating and also learn many things. Now they want to share their story so that everyone can know about the friends' legacy.
NOTE: This story is hypothetical, and it assumes every reader has at least figured out that BOB is the application server (backend) and PAUL is the client (frontend).
NOTE: This story is not meant to be a bedtime story for kids. It is a bedtime story for developers.
We are going to discuss the UDP, TCP, HTTP, and WebSocket protocols. It is interesting to know how UDP and TCP work. What are TCP, UDP, HTTP, and WebSocket? Why is HTTP popular and used everywhere? Why is WebSocket awesome? Can WebSocket replace HTTP completely in an application? We have a lot of questions in mind. Let's first break the blog into separate sections so that it is easy to discuss.
Discussion Topic
- UDP And TCP Protocol
- HTTP Protocol
- WebSocket Protocol
1. UDP And TCP Protocol
UDP(User Datagram Protocol)
UDP is based on a connectionless communication model, so it has no overhead for maintaining a connection. There is no guarantee that a packet (data) is delivered: it may be lost or arrive out of order, and UDP never retransmits it. It does only minimal error checking on the packet (a checksum), which makes it a simpler and faster protocol compared to TCP.
This is the simple definition of UDP. Now let us try to understand it using the story of BOB the Server and PAUL the Client.
BOB wants to send a “Hi How Are You” message to his friend PAUL.
BOB decided to send the message using UDP (User Datagram Protocol) in the network city. As a message is sent in the form of packets, BOB converted the message data to packets (byte data).
Finally, the message is sent from BOB to PAUL using UDP.
Now that BOB has sent the message, let's see how PAUL received the data. Different cases are possible on the receiver side when sending data over UDP. We will cover the basic ones.
Basic cases possible on the receiver side (PAUL):
1. PAUL gets the message properly without data loss.
2. PAUL gets the message with data loss.
3. PAUL gets the message out of order.
Happy case: PAUL received the message properly.
PAUL received the message with data loss (packet loss). The data doesn't make sense to PAUL now.
PAUL received the packets in a different order from the way they were sent. Frustrating, yeah!
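The three cases above can be played with in a tiny simulation. This is only a toy sketch, not real networking: `unreliable_channel` and `loss_rate` are made-up names for illustration, and the message is assumed to be split into one packet per word.

```python
import random

def unreliable_channel(packets, loss_rate, rng):
    """Toy model of a UDP path: drop each packet with probability loss_rate,
    then hand over the survivors in arbitrary order."""
    survivors = [p for p in packets if rng.random() >= loss_rate]
    rng.shuffle(survivors)
    return survivors

# Assumption: BOB's message is split into one packet per word.
packets = ["Hi", "How", "Are", "You"]
rng = random.Random()
received = unreliable_channel(packets, loss_rate=0.25, rng=rng)

print("BOB sent:     ", packets)
print("PAUL received:", received)
```

Run it a few times: sometimes PAUL gets everything in order, sometimes a word is missing, and sometimes the words are shuffled.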
UDP doesn’t retain message state once sent
UDP doesn't take on the overhead of retaining message state once the message is sent. There is no guarantee that the message will be delivered: it may reach the client intact, reach it with data loss, or reach it out of order.
Minimal checks on message transmission and minimal error checking make UDP faster than other protocols.
BOB is thinking: did PAUL really receive the message? There is no way for BOB to know the status of the message over UDP.
If PAUL doesn't receive the message, or some packets are lost, UDP itself gives PAUL no way to acknowledge the state of the message.
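Here is a minimal sketch of BOB's message going out over a real UDP socket, using Python's standard `socket` module. It assumes both friends live on the same machine (the loopback address `127.0.0.1`) and lets the OS pick a free port; the variable names are just for the story.

```python
import socket

# PAUL: bind a UDP socket on loopback; port 0 lets the OS pick a free port.
paul = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
paul.bind(("127.0.0.1", 0))
paul.settimeout(1.0)  # UDP gives no delivery guarantee, so never wait forever
paul_addr = paul.getsockname()

# BOB: no connect(), no handshake. Just fire the datagram at PAUL's address.
bob = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sent = bob.sendto("Hi How Are You".encode("utf-8"), paul_addr)

try:
    data, sender = paul.recvfrom(1024)
    received = data.decode("utf-8")
except socket.timeout:
    received = None  # the datagram was lost, and BOB is never told
finally:
    bob.close()
    paul.close()

print("BOB sent", sent, "bytes; PAUL received:", received)
```

Notice that `sendto` only tells BOB how many bytes left his side. Nothing in the code lets him learn whether PAUL actually got them.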
Question of the day: why and where is UDP used?
UDP has a lot of drawbacks, so why and where the hell is it used? To get the answer, we first need to think about the advantages of UDP.
UDP is Fast (like Usain Bolt):
UDP has no overhead for managing a connection or retaining message state, and it does only minimal error checking. That is what makes it very fast.
Think about running when you are not carrying anything: obviously you will run very fast. (if you are not obese)
No retransmission Delay:
UDP is a good choice when you can't afford retransmission delay. Suppose some packets are lost while sending: the client can't afford the delay of asking the server to resend them.
Let me give you a classic example: VoIP (Voice over Internet Protocol). You are on a call with your friend (buddy or GF, I know you won't have any :| ) and suddenly you hear the voice breaking up on the phone. It means some packets were lost, but that is fine. The phone (VoIP) can't afford retransmission delay.
Think about online games, video streaming, audio streaming, etc. UDP is used there too.
Suitable for Broadcast:
UDP is a good fit for broadcast messages. What the hell is a broadcast message? Suppose you are shouting in the middle of the road and everyone can hear it. That is a broadcast message: one meant for a whole group of users.
For a group of users, the server usually doesn't want the overhead of maintaining every connection. There can be millions of users at a time, and each connection consumes resources. Instead, the server broadcasts the message to everyone over UDP.
So the idea is to shout loudly as a server and let it reach all available clients.
Note: there are many other benefits of UDP, but I am only covering the basics.
Use Case of UDP:
UDP is used a lot for real-time communication. Basic examples are the Domain Name System (DNS), where speed is a priority; audio and video streaming, where losing a packet is affordable because it only degrades quality; voice transmission (VoIP); and games (real-time online games, multiplayer games, etc.).
UDP is used (basic cases):
1. DNS (Domain Name System) and common protocols like SNMP, RIP & DHCP.
2. Audio Streaming
3. Video Streaming
4. VOIP(Voice over Internet Protocol)
5. Games
Now we can move to TCP which is connection-oriented.
TCP(Transmission Control Protocol)
TCP is a connection-oriented (reliable connection) approach. It sets up the connection with a three-way handshake, retains message state, maintains the order of packets, and does error checking and retransmission. It is used where delivery is a priority over time. TCP is used for one-to-one communication.
In simpler words, TCP establishes a connection between client and server before any communication. The connection is made by a three-way handshake (we will get to it). After the connection is established, a message is sent, and TCP tracks the state of the message through acknowledgments, so if data is lost, the sender knows it needs to be sent again.
It also maintains the order of packets by numbering them, so there is no need to worry about “Hello” becoming “olelH”.
Think about a transport service where the connection is already there and the state of every message is available: how reliable that connection will be.
TCP takes on the overhead of maintaining the connection, message state, and error checking, which makes it slower than UDP.
Did You Know:
Both the TCP and UDP protocols live in the transport layer. Do read about the OSI model to take a deep dive.
Let's take the example of BOB the Server and PAUL the Client to understand how TCP works.
Three-Way Handshake (Before Connection):
A TCP connection is established via a three-way handshake. BOB and PAUL were already smart enough to figure it out.
A three-way handshake happens in three steps -:
1. SYN -: First, the client (PAUL) sends a SYN (synchronize) carrying a sequence number. Don't worry about the name; it is just some number x, e.g. 23, 354, etc.
2. SYN/ACK -: The server (BOB) sends both an ACK (acknowledgment), which is x + 1, and its own SYN (synchronize) sequence number, which is y.
3. ACK -: The client (PAUL) sends an ACK (acknowledgment), which is y + 1.
Let's put it in simpler words. There are P and Q.
step1: P sends 154 to Q.
step2: Q receives 154. It acknowledges it by sending 154 + 1, and also sends another number, 256.
step3: P receives 155 (the number is right, since P sent 154 before) and 256. It acknowledges it by sending 256 + 1.
Now both P and Q have acknowledged each other in a three-way handshake, and the connection is established successfully.
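The P and Q steps can be written down as a small simulation. This is a simplified sketch and not how an OS implements it: the function names and the tuple layout `(sender, flags, seq, ack)` are assumptions made for illustration, and real TCP sequence numbers are 32-bit values that wrap around.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake messages as (sender, flags, seq, ack) tuples."""
    syn = ("P", "SYN", client_isn, None)                    # step 1: P sends x
    syn_ack = ("Q", "SYN/ACK", server_isn, client_isn + 1)  # step 2: Q sends y and acks x + 1
    ack = ("P", "ACK", client_isn + 1, server_isn + 1)      # step 3: P acks y + 1
    return [syn, syn_ack, ack]

def is_valid(messages):
    """Check that each side acknowledged the other side's number plus one."""
    syn, syn_ack, ack = messages
    return syn_ack[3] == syn[2] + 1 and ack[3] == syn_ack[2] + 1

steps = three_way_handshake(client_isn=154, server_isn=256)
for sender, flags, seq, ack in steps:
    print(f"{sender} sends {flags}: seq={seq} ack={ack}")
print("connection established:", is_valid(steps))
```

With 154 and 256 as the starting numbers, the acknowledgments come out as 155 and 257, the same as in the steps above.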
BOB and PAUL have established a connection successfully via the three-way handshake (bro code). Now we can move to the important part: communication.
BOB the Server and PAUL the Client communicating over TCP (after connection established via three-way handshake)
BOB The Server sends “Hello” to PAUL The Client over TCP.
Let's not consider the happy case where the data is received properly (no fun). Suppose there is data loss, a data error, or the data arrives out of order.
Data loss or error: “Hello” becomes hel, haddo, heplo, hell, pello, etc. TCP checks for errors and data loss and triggers retransmission as needed.
Unordered data is received: “Hello” becomes olleH, elHlo, oellH, eolHl, etc. TCP numbers the packets, e.g. 1, 2, 3, 4…, so it can put them back in order if they arrive out of order.
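To make the numbering and error checking concrete, here is a small sketch of how a receiver could rebuild “Hello”. It is a simplification under stated assumptions: each letter travels as its own segment, `zlib.crc32` stands in for TCP's real (much simpler, 16-bit) checksum, and `make_segments` and `receive` are hypothetical helper names.

```python
import zlib

def make_segments(message):
    """Split a message into one-character segments: (seq, payload, checksum)."""
    return [(seq, ch, zlib.crc32(ch.encode())) for seq, ch in enumerate(message)]

def receive(arrived, total):
    """Keep segments whose checksum matches; report which seq numbers are missing."""
    good = {}
    for seq, payload, checksum in arrived:
        if zlib.crc32(payload.encode()) == checksum:
            good[seq] = payload  # a duplicate just overwrites the same slot
    missing = [seq for seq in range(total) if seq not in good]
    return good, missing

original = make_segments("Hello")

# Segments arrive out of order, and 'e' (seq 1) was corrupted into 'p' on the way.
arrived = [original[4], original[2], original[0], (1, "p", original[1][2]), original[3]]
good, missing = receive(arrived, total=len(original))
print("needs retransmission:", missing)

# The sender retransmits the missing segments, and the receiver merges them in.
resent, _ = receive([original[seq] for seq in missing], total=len(original))
good.update(resent)

message = "".join(good[seq] for seq in sorted(good))
print("rebuilt message:", message)
```

The checksum catches “heplo”, and sorting by sequence number fixes “olleH”.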
TCP manages the connection, so if any issue happens on the network during transmission, the status of the data is reported back to the sender.
TCP is a reliable, connection-oriented approach because both sender and receiver know the status of the connection and of the message during transmission.
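For comparison with UDP, here is a minimal sketch of BOB sending “Hello” to PAUL over a real TCP socket with Python's standard `socket` module. It assumes both ends run on the same machine (loopback) in a single script; `recv_exact` is a hypothetical helper, needed because TCP delivers a byte stream and one `recv` call may return only part of the data.

```python
import socket

def recv_exact(sock, size):
    """Read exactly `size` bytes from a TCP socket (or fewer if the peer closes)."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            break  # peer closed the connection early
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

# BOB the Server listens; port 0 lets the OS pick a free port.
bob = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
bob.bind(("127.0.0.1", 0))
bob.listen(1)
bob.settimeout(2.0)

# PAUL the Client connects; the OS performs the three-way handshake here.
paul = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
paul.settimeout(2.0)
paul.connect(bob.getsockname())

# BOB accepts the established connection and sends the message.
conn, paul_addr = bob.accept()
conn.sendall(b"Hello")

received = recv_exact(paul, 5).decode("utf-8")

conn.close()
paul.close()
bob.close()

print("PAUL received:", received)
```

Unlike the UDP case, the application never has to deal with loss or reordering itself; the OS's TCP implementation handles ordering, acknowledgments, and retransmission underneath `sendall` and `recv`.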
Let's list the benefits of TCP (basic ones):
- Connection-Oriented Approach: sender and receiver always communicate after making a connection via a three-way handshake, which ensures that the connection is active.
- Connection Termination Acknowledgement: whichever side closes the connection, both sender and receiver get an acknowledgment of the termination.
- Ordered Data Transfer: TCP manages the ordering of packets by numbering them, so the data is always delivered in order.
- Retransmission of Lost Packets: TCP tracks packets through acknowledgments. If a packet is lost, it triggers retransmission.
- Error Detection: TCP uses a checksum to detect corrupted packets and sequence numbers to discard duplicate packets.
- Timeout Based Retransmission: the sender starts a timer when it sends data, so if the acknowledgment does not arrive in the given time, it triggers retransmission (so cool).
- Flow and Congestion Control: it means don't bite off more than you can chew! Try putting a lot of food in your mouth at the same time. Receivers, too, have a limit on how much data they can accept at a moment; call it a data limit. TCP respects that limit so the client doesn't become overloaded with data (flow control), and it also slows down when the network itself is congested (congestion control).
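The acknowledgment and timeout ideas from the list can be sketched as a toy "send one packet, wait for its ACK" loop. This is a simulation under stated assumptions, not real TCP: `send_reliably` and `drop_plan` are made-up names, and a "timeout" here just means the simulated channel dropped the packet so no acknowledgment came back.

```python
def send_reliably(packets, drop_plan, max_tries=5):
    """Deliver packets in order over a lossy channel, retransmitting on timeout.

    drop_plan maps a packet's seq number to how many times the (simulated)
    channel drops it before letting it through.
    """
    delivered = []
    log = []
    for seq, payload in enumerate(packets):
        drops_left = drop_plan.get(seq, 0)
        for attempt in range(1, max_tries + 1):
            if drops_left > 0:
                drops_left -= 1  # packet lost: no ACK arrives before the timer fires
                log.append((seq, attempt, "timeout, retransmit"))
                continue
            delivered.append(payload)  # packet arrived and was acknowledged
            log.append((seq, attempt, "acked"))
            break
        else:
            raise TimeoutError(f"packet {seq} not acknowledged after {max_tries} tries")
    return delivered, log

# The channel drops the second packet ("ll") twice before it gets through.
delivered, log = send_reliably(["He", "ll", "o"], drop_plan={1: 2})
for seq, attempt, outcome in log:
    print(f"packet {seq}, try {attempt}: {outcome}")
print("PAUL got:", "".join(delivered))
```

Real TCP keeps many packets in flight at once instead of waiting for each ACK, but the retransmit-on-timeout idea is the same.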
Disadvantages:
- TCP is slow compared to UDP, so it is not good where time is a priority: as TCP takes on a lot of overhead (maintaining the connection, error checking, etc.), it is slower, and retransmission causes delay. It is not appropriate where time is the topmost priority, like multiplayer or online games, video streaming, audio streaming, etc. User experience is bad when the game lags (TCP is waiting for a lost packet to be retransmitted) or the video buffers.
- Not Suitable for Broadcast: suppose you just want to shout in the middle of the road without thinking about all the people. Now think about the case where you ask the name of every person on the road and then shout. It creates a lot of redundant work.
Think about the newspaper coming to your home. Does the newspaper company care what your name is? It just broadcasts the news to all the user addresses without making a connection.
So maintaining connections for broadcast messages over TCP gives the server a lot of redundant work. There can be millions of connections, and they can consume a lot of resources. So if the server just wants to broadcast a message to a group of users, TCP is a bad choice.
- TCP is a problem in embedded communication: TCP is complex and resource-intensive for embedded devices, where RAM and storage are limited.
It is hard to use the TCP model when hardware devices have limited computation power.
Use Case of TCP:
Listing every application is out of scope for this blog, as there are countless. I am only covering the basic ones.
1. HTTP, WebSocket, SSL/TLS, and SSH are built on top of TCP. Almost everything we use in daily life relies on TCP communication.
2. The World Wide Web uses TCP.
3. Many protocols such as FTP (File Transfer Protocol), SMTP (Simple Mail Transfer Protocol), Telnet, IMAP (Internet Message Access Protocol), etc. rely on TCP.
TCP is used everywhere on the network where data needs to be transferred in a reliable way.
Which one is best, UDP or TCP:
There is no fight between UDP and TCP over which one is best. Both are well suited for different use cases. No perfect protocol without limitations exists, or ever will. UDP is well suited for real-time applications where packet loss is affordable and retransmission delay is not. TCP, on the other hand, makes sure that data is transported correctly, so users don't need to worry about their data.
Did you know: HTTP and WebSocket are built on top of TCP, and it is used everywhere. QUIC (originally short for Quick UDP Internet Connections) is a newer transport protocol built on top of UDP, and it will be interesting to see TCP vs QUIC in the future. It is not meant to replace HTTP itself; rather, HTTP/3 runs over QUIC instead of TCP. I am thinking of covering QUIC in this blog too, but I am not sure for now.
Final Note:
This is the first part of the blog. In the next part, I will write about HTTP and WebSocket, which are built on top of TCP. It will be interesting to see how BOB the Server and PAUL the Client explore those protocols.