Saturday, September 24, 2011

Nagle's Algorithm - An Intuitive Understanding

In the TCP/IP stack, each layer adds header information so that the data is identifiable at the corresponding layer on the destination node. For instance, the TCP layer adds a TCP header containing the information needed to ensure that TCP segments arrive properly at the destination, to provide congestion control, and so on. Without the TCP header, these responsibilities would fall to an upper layer, say the application layer, which should mainly deal with the real data being sent.

The layered architecture assigns disparate responsibilities to the layers: the application layer deals with the semantics of the data, the transport layer delivers data reliably, the IP layer passes packets to the next hop, and so on. The same data is therefore treated differently at different layers. At the transport layer (TCP) the data unit is the segment; at the IP layer it is the packet.

Applications cannot determine how their data will be transmitted to the destination. In many cases this is not a problem. For instance, a file transfer hands a large amount of data to the TCP layer, and TCP transfers it to the destination in FIFO (first in, first out) order, internally dividing the file into chunks whose size matches the MSS (Maximum Segment Size). This behavior is efficient: chunks are sent in order, and the chunk size reflects network parameters such as the destination node's capability and the channel's bandwidth. As long as the file is transferred reliably, this approach is close to optimal. The only overhead in segment-based transfer is the TCP header plus the IP header, typically 40 bytes in total. If the MSS is 1000 bytes, this overhead is acceptable, since segments will usually be full-sized. Now consider a Telnet session, where each piece of command data is smaller than the header itself. Here the overhead is unacceptable, as it is far larger than the real data being passed: if only 1 byte of real data is sent, the overhead is 4000%. In a low-bandwidth environment, this becomes a bottleneck.
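The header-overhead arithmetic above can be sketched in a few lines of Python (the 40-byte figure assumes the common 20-byte TCP header plus 20-byte IP header, with no options):

```python
# Per-segment header overhead: 20 bytes TCP + 20 bytes IP (no options).
HEADER_BYTES = 40

def overhead_percent(payload_bytes):
    """Header overhead as a percentage of the real payload in one segment."""
    return HEADER_BYTES / payload_bytes * 100

print(overhead_percent(1000))  # full 1000-byte segment: 4.0% overhead
print(overhead_percent(1))     # one Telnet keystroke:   4000.0% overhead
```

This is why a stream of single-keystroke segments wastes so much of a low-bandwidth link.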

Nagle's algorithm solves this issue by buffering the data to be sent, a behavior often called Nagling. The algorithm is very intuitive: if the buffered data is enough to fill a segment of size MSS, send it across immediately; likewise, if all previously sent data has already been acknowledged, there is no need to buffer the current data at all. Otherwise, buffer the data until an ACK arrives or a full segment accumulates.
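The decision rule above can be sketched as a small function. This is a minimal illustration with hypothetical names, not a real TCP stack; the MSS value is just a common Ethernet-derived figure:

```python
MSS = 1460  # a typical maximum segment size in bytes (assumption)

def nagle_decision(buffered_bytes, unacked_bytes):
    """Decide whether buffered data should be sent now or held back."""
    if buffered_bytes >= MSS:
        return "send"    # a full segment is ready: send it now
    if unacked_bytes == 0:
        return "send"    # nothing in flight: small data goes out immediately
    return "buffer"      # earlier data still unacknowledged: keep buffering

print(nagle_decision(1500, 0))    # full segment -> "send"
print(nagle_decision(10, 0))      # small but nothing in flight -> "send"
print(nagle_decision(10, 500))    # small and data in flight -> "buffer"
```

Note that at most one undersized segment can ever be in flight at a time, which is exactly what limits the header overhead.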

Let us see how this algorithm works.

The issue discussed above is not a problem at all in high-bandwidth environments, where data should be passed along immediately however small it is; in several real-time systems this is essential behaviour. In such environments, data sent across is most probably acknowledged almost immediately, so Nagling rarely happens.

In low-bandwidth environments, TCP accumulates the data until a full MSS-sized segment is available and then sends it across, avoiding the overhead we discussed. The real issue with Nagling arises when real-time, low-volume traffic must be sent over a low-bandwidth link: data transfer does not happen immediately but is buffered, which leads to an irritating user experience.

One more issue with Nagling is "Delayed ACKs" on the receiver side, which hold back acknowledgements of segments for a certain time in order to optimize network utilization.

The idea behind Delayed ACKs undermines Nagle's optimization heuristic, since the latter relies on receiving ACKs promptly to decide whether to buffer. With Delayed ACKs, Nagling happens regardless of network bandwidth, even though it should occur only on low-bandwidth networks. Nagle's heuristic therefore breaks down when Delayed ACKs are enabled at the destination.

Almost all TCP implementations allow Nagling to be disabled using the TCP_NODELAY socket option, even though doing so is generally not suggested. At the application level, the issues caused by Nagling can be resolved in several ways, as explained in the Wikipedia page on Nagle's algorithm.
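For example, in Python the TCP_NODELAY option is set through the standard socket API:

```python
import socket

# Create a TCP socket and disable Nagle's algorithm on it.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Confirm the option took effect (getsockopt returns a nonzero value).
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
sock.close()
```

With TCP_NODELAY set, every write goes out as soon as possible, trading the header overhead back for lower latency.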

Thanks for Reading.
