Network Data Transmission Tuning

Overview

TCP/IP tuning is considered an advanced subject. The background information necessary for a complete understanding of the subject is beyond the scope of this document. If misconfigured, the product configuration options for data transmission tuning can have a negative impact on the network performance of product components as well as on the TCP/IP network.

Product components that support bulk data transfers benefit the most from the network tuning described in this section. For example, Universal Data Mover provides the ability to transfer very large files between systems, so it would be a candidate for network tuning.

The network tuning technique described in this section addresses the problem of transferring data over certain types of transmission links, primarily high-bandwidth, high-latency links. Today's transmission links can exceed 1 Gbit/s with a round trip time of 50 ms or more. The default TCP/IP buffers are not suitable for optimized data transmission over such links.

Bandwidth Delay Product

The bandwidth delay product (BDP) measures the amount of data that a transmission link can hold in transit. The BDP is used to help tune specific TCP/IP configuration options.

The BDP is calculated as the product of the maximum bandwidth and the round trip time (RTT) of the transmission link, and is expressed in bytes. The maximum bandwidth of a link is limited by the slowest part, or bottleneck, of the network route. As an example, consider a network route that starts on a server with a 100 Mbit/s network interface card connected to a 1 Gbit/s network and ends on a server with a 1 Gbit/s network interface card. The maximum bandwidth is determined by the slowest part of the route, which is the 100 Mbit/s network interface card. There is no reliable way to measure bandwidth in all cases; knowledge of the network topology is required to identify the slowest part of a network route. The RTT is measurable with the ping command: ping the remote destination and the command reports the RTT in milliseconds.

The BDP formula is shown below.

( B / 8 ) * ( T / 1000 )

where B is the bandwidth measured in bits per second and T is the RTT measured in milliseconds.

As an example, if the maximum bandwidth is 1 Gbit/s and the RTT is 60 ms, the BDP is calculated as

( 1,000,000,000 / 8 ) * ( 60 / 1000 ) = 7,500,000 bytes = 7.5 MB
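
For reference, the following Python sketch implements the formula above; the function name and sample values are illustrative only and are not part of the product.

    # Minimal sketch of the BDP calculation: (B / 8) * (T / 1000).
    # Function name and sample values are illustrative only.
    def bdp_bytes(bandwidth_bits_per_sec, rtt_ms):
        return (bandwidth_bits_per_sec / 8) * (rtt_ms / 1000)

    # 1 Gbit/s link with a 60 ms round trip time.
    print(bdp_bytes(1_000_000_000, 60))  # 7500000.0 bytes = 7.5 MB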

TCP High Performance Extensions

Originally, TCP/IP was not optimized for transmission links with a high bandwidth delay product (BDP). RFC 1323 TCP Extensions for High Performance introduced changes to the TCP protocol to improve performance over high BDP links. RFC 1323 includes a number of TCP changes, but the most relevant one for this discussion is the window scaling option.

The TCP receive window size is negotiated by the TCP implementations during the three-way handshake when the connection is opened. The window specifies the amount of buffer space the receiving TCP has available for data. The TCP sender does not send more data than the receiver's available window. The TCP window is a form of flow control that prevents the sender from sending more data than the receiver has buffer space available.

The TCP receive window is defined in the TCP header as a 16-bit field, so the maximum window size is 65,535 bytes (64 KiB). For a transmission link with a large BDP, this is only a fraction of the amount of data the transmission link will hold. Consequently, the transmission link will never fill to capacity and maximum bandwidth is never achieved. RFC 1323 added the window scaling option so that a larger TCP window can be negotiated. The window scaling option makes the TCP window size effectively a 32-bit value; however, RFC 1323 limits it to 1 GiB. The TCP implementations on both sides of the socket connection must support window scaling for it to be used.
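
Whether window scaling is available depends on the TCP implementation at each end. As a minimal sketch only, assuming a Linux system with the standard procfs layout, the kernel's window scaling setting can be read as follows; other platforms expose this setting differently.

    # Linux-only sketch: report whether TCP window scaling is enabled.
    # The procfs path below is the conventional Linux location; other platforms differ.
    PATH = "/proc/sys/net/ipv4/tcp_window_scaling"

    try:
        with open(PATH) as f:
            enabled = f.read().strip() == "1"
        print("TCP window scaling enabled:", enabled)
    except OSError:
        print("Setting not readable at", PATH, "(non-Linux system or procfs unavailable)")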

TCP Buffers

The TCP receive window used by TCP on the receiving end of a connection is typically determined from the size of the TCP receive buffer used for the connection. The default TCP receive buffer can typically be configured as part of the TCP configuration; however, the default is usually not large enough for high BDP transmission links. The TCP socket API provides an interface for the application to request specific TCP receive and send buffer sizes. The application can request any buffer size, and TCP determines the size it actually uses based on its configuration limits. If TCP on both ends of the socket connection supports RFC 1323 window scaling, the TCP window may be as large as 1 GiB if the TCP configuration permits.
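
As a general illustration of this socket API behavior (not product code), the following Python sketch requests receive and send buffers sized to the 7.5 MB BDP example and then reads back the sizes the operating system actually granted; the requested size is an example only.

    # Sketch: request larger TCP buffers and read back what the OS actually grants.
    import socket

    REQUESTED = 7_500_000  # bytes; matches the 7.5 MB BDP example

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, REQUESTED)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, REQUESTED)

    # The operating system may clamp the request to its configured limits
    # (and Linux reports a doubled value), so the granted sizes can differ.
    print("receive buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
    print("send buffer:   ", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
    sock.close()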

TCP Buffer Configuration

In general, the optimum TCP buffer size matches the BDP for the transmission link. However, TCP buffers are maintained by TCP in virtual storage. Very large buffer sizes may actually reduce transmission rates if the virtual storage requirements exceed the system memory capabilities.

Some product components provide configuration options to specify the TCP send and receive buffer sizes. The TCP_RECV_BUFFER and TCP_SEND_BUFFER options specify the TCP receive and send buffer sizes, respectively. The product components on both ends of the TCP socket connection must be configured with the TCP_RECV_BUFFER and TCP_SEND_BUFFER options. Product components typically consist of a Manager component (such as UDM Manager) and a Server component (such as UDM Server). The connection between the Manager and Server is established first with the Universal Broker component: a Manager component first establishes a socket connection with the Broker, which then starts the Server component and passes the socket connection to the Server. Consequently, the Universal Broker always requires TCP buffer configuration changes in order to tune the network performance of product components. A hedged configuration example is shown below.
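
As an illustration only, the options might be set to the 7.5 MB value from the BDP example in the configurations of the components on both ends of the connection, including the Universal Broker. The keyword-value layout shown below is an assumption; consult the component reference documentation for the exact configuration file syntax and locations.

    # Illustrative values only; exact file syntax and locations vary by component.
    TCP_RECV_BUFFER 7500000
    TCP_SEND_BUFFER 7500000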