Wayne Rash has been writing technical articles about computers and networking since the mid-1970s. He is a former columnist for Byte Magazine, a former Editor of InternetWeek, and currently performs technical reviews of networking, wireless, and data center products. He is the former Director of Network Integration for American Management Systems and is one of the founders of the Advanced Network Computing Laboratory at the University of Hawaii. He is based in Washington, DC, and can be reached at firstname.lastname@example.org.
Latency is always present in any network, but it can be controlled and managed. The secrets to managing latency are known, and they can be applied to any network, but doing so effectively depends on making the proper choices. By managing latency well, your network can support cloud services, off-site backup and mirroring, and e-commerce – adding to your company’s bottom line and making your operations more efficient with happier users.
After all this time, the story still brings hushed voices and sadness. It’s a story you’ve heard before: how, on a perfect September morning, the Boeing 767 airliner that was American Airlines flight 11 slammed into the North Tower of the World Trade Center, ending thousands of lives in an instant. High in that tower were the offices of securities trader Cantor Fitzgerald and over 700 of the company’s employees.
Yet, despite a loss that would have killed most companies, Cantor Fitzgerald was in operation two days later when the markets reopened. The company was saved by a mirrored data center in nearby Rochelle Park, New Jersey. The data center was nearby because it had to be. The physics of latency limited how far away the data center could be located and still maintain mirroring. But those same limits also meant that the surviving IT staff could operate out of the second data center and keep the company running.
The Heart of Latency
Latency is simply delay. It is the period of time required for a data packet to travel from one point to another on a network. The existence of latency is governed by the laws of physics and because of that it cannot be eliminated completely.
However, latency can be kept to a minimum, and it can be managed.
A crude measurement of latency is available to nearly anyone with a computer: Run the Ping utility and watch the results as they scroll down your screen. As this paper was being written, for example, the Ping utility on a Windows computer reported a response time of between 24 and 42 milliseconds when the Ping packet was sent to yahoo.com. If you try this, you’ll notice that each reporting of the response time is different. While not every Internet site responds to Ping, testing those that do will show that times are rarely the same. Because of the way the Internet works, latency (as measured by Ping) can vary, sometimes widely.
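As a rough illustration of how much these readings can vary, the sketch below summarizes a handful of round-trip times. The sample values are hypothetical readings chosen to fall within the 24 to 42 millisecond range quoted above, not actual measurements, and the function name is an assumption made for this example.

```python
from statistics import mean


def summarize_rtts(samples_ms):
    """Return the min, max, mean, and spread (max - min) of round-trip times in ms."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    lowest = min(samples_ms)
    highest = max(samples_ms)
    return {
        "min_ms": lowest,
        "max_ms": highest,
        "mean_ms": mean(samples_ms),
        "spread_ms": highest - lowest,
    }


# Hypothetical readings, chosen to fall in the 24-42 ms range quoted in the text.
hypothetical_rtts = [24, 31, 42, 27, 36]
summary = summarize_rtts(hypothetical_rtts)
print(summary)
```

The spread between the fastest and slowest reply is often the more useful number, because it hints at how unpredictable the path is.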
Latency is the result of two factors. The first is the speed of light through the medium over which the data packet travels. (Like the t-shirt says: “186,000 miles per second. It’s not just a good idea. It’s the law!”) The second is the result of all of the various impediments that the data packet must overcome on its way. Those impediments can be any number of things: overloaded routers, changes in media between copper and fiber, network congestion that causes a packet to be held briefly before being transmitted, or even a packet being lost, requiring a retransmission.
This latency creates two problems in the data center. The first is simply the time delay that affects performance and response time. Virtualized systems are sensitive to latency when performing off-site replication, and depending on the virtualized system, they will reach a point where the delay becomes too great. Likewise, mirrored systems are sensitive to delay, and again if the delay is too long, mirroring can’t happen.
Worse is the unpredictable nature of latency, especially when using the public Internet. The time delay depends on a vast number of factors, nearly all of which are outside of your control (especially the likelihood of your users calling to complain). The lack of predictability is in the very nature of the Internet where each packet is routed over what seems to be the best pathway at that instant.
Causes of Latency
The root cause of latency is the speed of light through the network medium, normally glass fiber or copper wire. However, network design also plays a role in latency. Every time a data packet is processed in some way, or every time it transitions from one medium to another, there is some delay. Likewise, latency can be caused by changing from one protocol to another such as from Ethernet to TDM. While each individual delay can be quite brief, the delays are additive.
Networks with poor design (such as inefficient routers, unnecessary media transitions, or routing paths that lead through low-speed networks) add to latency. So do communications problems on inefficient servers, overloaded routers, and even physical issues with the media (a pinched cable, for example). And, of course, latency can be made worse if the network end points (ranging from workstations to storage servers) aren’t properly configured, are poorly chosen, or are overloaded.
When the public Internet is used as part of the network path, latency can get dramatically worse. Because you can’t prioritize data packets over the Internet, those packets can be routed through very slow pathways, congested networks, or sent over pathways that are much longer than necessary. For example, it’s not uncommon for an endpoint in Denver to find its data routed through New York on its way to Seattle.
Effects of Latency
Too much latency can cost a business money. In the world of e-commerce, a number of studies show that the longer a site takes to load, the more likely customers are to abandon the site and move to another.
Google’s Webmaster Tools measure the time it takes to load a site, a process that reflects the total system latency. Google recommends designing sites so that total latency is shorter than 1.5 seconds, including the time it takes the webpage to load. Any longer, and customers start to switch off.
Perhaps more important, Google reports that latency affects the order of search results. If your business’s Internet presence is too slow, you may not make it into the first page of results. But the impact of latency goes beyond just page loading.
For example, commercial cloud services are strongly affected by latency. A slow pathway to data causes the same sort of customer attrition as slow-loading websites, and cloud services that are consistently slow can lose the business its customers. If your employees need to pull information from a cloud service (public or private) and the data doesn’t appear within a reasonable time, at the very least it wastes their time.
Some types of data communications applications are relatively insensitive to latency. While they’re not immune to extreme latency, some tasks function, albeit less efficiently, in a high latency environment. These include transferring mail from a server, downloading stored files for use in updates, or transferring data to a place where it’s more useful. The total transfer time may be longer, but the data still gets where it’s supposed to be…eventually.
But in a more interactive environment, latency significantly reduces functionality. In some cases, high latency prevents communication from taking place at all. E-commerce, for example, can function in a higher latency environment, but the result is lost business as customers go elsewhere. Cloud services work over a slow connection, but the result is lost productivity as employees wait for data retrieval from a cloud storage server.
Data center and server mirroring, on the other hand, can only work in a low-latency environment. Exactly how much latency is too much depends on the specific virtualization or mirroring hardware and software involved; but in either case, if a specific level of latency is exceeded, the operation does not take place. Some operations such as offsite backups or co-located storage services can handle some latency, but increased latency slows them down and degrades overall performance.
On a practical basis, physical distance has the most impact on mirroring and virtualization. Even if all other parts of a network are optimized for minimum latency, the effects of distance can’t be changed. For data center mirroring operations, the maximum workable distance turns out to be less than 30 miles, and depending on how the remaining components of latency are managed, it may be much less.
Latency in a network comes from several sources. The most obvious is the physical distance between endpoints. The distance is a factor because of the speed of light in the transmission medium. The speed of light in either copper or fiber is approximately two-thirds the speed of light in a vacuum. That in turn translates into 0.82 milliseconds for every 100 miles of fiber, according to a report by Cisco Systems.
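The per-distance figure makes propagation delay easy to estimate. The following is a minimal sketch that simply scales the 0.82 milliseconds per 100 miles quoted above; the function name and the example distances are assumptions for illustration, and real fiber routes are usually longer than the straight-line distance between two sites.

```python
MS_PER_100_MILES = 0.82  # figure quoted in the text, attributed to Cisco Systems


def propagation_delay_ms(distance_miles, round_trip=False):
    """Estimate propagation delay over fiber for a given route length in miles."""
    if distance_miles < 0:
        raise ValueError("distance must be non-negative")
    one_way_ms = distance_miles / 100.0 * MS_PER_100_MILES
    return 2 * one_way_ms if round_trip else one_way_ms


# Illustrative route lengths; 30 miles is the mirroring limit cited in the text.
for miles in (30, 100, 1000):
    one_way = propagation_delay_ms(miles)
    both_ways = propagation_delay_ms(miles, round_trip=True)
    print(f"{miles:>5} miles: {one_way:.3f} ms one way, {both_ways:.3f} ms round trip")
```

Even at 30 miles, every acknowledged write pays roughly half a millisecond of round-trip propagation delay before any other source of latency is counted.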
Added to the propagation delay (the time it takes light to travel through fiber) are processing and serialization delays. Processing delays are caused by features in the network that manage traffic, such as network address translation (NAT) or quality of service (QoS) management. Serialization delay is the amount of time required to assemble and transmit a data packet.
Processing delays in modern Ethernet switches are quite short, but they do exist. The average processing delay turns out to be around 25 microseconds per hop or more. Each endpoint, switch, router, firewall, or other network device that touches the data packet constitutes a hop. Depending on the design of the network equipment, some hops can take much longer than 25 microseconds.
Serialization delays depend on the size of the data packet and the transmission speed. For the standard 1500-byte TCP/IP data packet, serialization delay ranges from 120 microseconds at 100 Mbps to 1.2 microseconds at 10 Gigabit Ethernet speeds.
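These figures follow directly from dividing the packet size in bits by the link speed. A minimal sketch of that arithmetic, with a function name chosen for this example:

```python
def serialization_delay_us(packet_bytes, link_bps):
    """Time in microseconds to clock one packet onto a link of the given speed."""
    if link_bps <= 0:
        raise ValueError("link speed must be positive")
    bits = packet_bytes * 8
    return bits * 1_000_000 / link_bps


PACKET_BYTES = 1500  # the standard packet size used in the text

for label, bps in (("100 Mbps", 100_000_000), ("10 Gbps", 10_000_000_000)):
    delay = serialization_delay_us(PACKET_BYTES, bps)
    print(f"{label}: {delay:.1f} microseconds")
```

Because the delay scales inversely with link speed, a hundredfold increase in speed cuts serialization delay by the same factor.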
Queuing delay is the time a packet spends waiting in a device’s buffer before it can be transmitted. Because any device on the path can hold packets when it encounters congestion, the only effective way to minimize queuing delay is to engineer the network to have less congestion. When you control the entire network, this can be accomplished; but when you’re using a public network such as the Internet, avoiding congestion isn’t possible.
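One simple way to picture queuing delay is to ask how long a newly arrived packet must wait for the data already ahead of it to be sent. The sketch below assumes a single first-in, first-out queue draining at the full link speed, which is a simplification; the function name and the backlog size are hypothetical.

```python
def queuing_delay_us(backlog_bytes, link_bps):
    """Wait in microseconds for a backlog to drain from a FIFO queue at line rate."""
    if link_bps <= 0:
        raise ValueError("link speed must be positive")
    if backlog_bytes < 0:
        raise ValueError("backlog must be non-negative")
    return backlog_bytes * 8 * 1_000_000 / link_bps


# Hypothetical backlog: ten 1500-byte packets already waiting on a 100 Mbps link.
backlog = 10 * 1500
wait_us = queuing_delay_us(backlog, 100_000_000)
print(f"wait behind {backlog} queued bytes: {wait_us:.0f} microseconds")
```

On an uncongested link the backlog is close to zero and so is the wait, which is why reducing congestion is the lever that matters here.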
Finally, there are delays caused by the protocol stack in each device. It takes time to create and address a TCP/IP packet, and more time is taken by the need to create and manage overhead traffic, such as acknowledgment packets and retransmission requests. Adding capacity to a network can help reduce congestion and the queuing delays, and as a result decrease the overall network latency.
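Taken together, the components described in this section add up along the path. The following is a rough sketch of a one-way latency estimate that uses the figures quoted in the text (0.82 milliseconds per 100 miles and about 25 microseconds per hop). It makes several simplifying assumptions: serialization is counted once rather than at every hop, queuing delay is supplied by the caller, and protocol stack delay is left out because the text gives no figure for it. All names and the example path are hypothetical.

```python
US_PER_MILE = 8.2    # 0.82 ms per 100 miles, as quoted in the text
PER_HOP_US = 25.0    # typical per-hop processing delay quoted in the text


def one_way_latency_us(distance_miles, hops, packet_bytes=1500,
                       link_bps=10_000_000_000, queuing_us=0.0):
    """Estimate one-way latency in microseconds, broken down by component."""
    parts = {
        "propagation_us": distance_miles * US_PER_MILE,
        "processing_us": hops * PER_HOP_US,
        # Simplification: serialization is counted once, not per hop.
        "serialization_us": packet_bytes * 8 * 1_000_000 / link_bps,
        "queuing_us": queuing_us,
    }
    parts["total_us"] = sum(parts.values())
    return parts


# Hypothetical path: 30 miles of fiber, 4 hops, 10 Gbps links, no congestion.
estimate = one_way_latency_us(distance_miles=30, hops=4)
for name, value in estimate.items():
    print(f"{name}: {value:.1f}")
```

On this example path, propagation and per-hop processing dominate, while serialization at 10 Gbps is almost negligible.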
Network latency cannot be eliminated. As mentioned above, some latency will always be present in any network due to the limitations of physics. But latency can be kept to a minimum, it can be managed, and most importantly, it can be made predictable. In many cases, the lack of predictability causes more problems than latency itself.
The following guidelines help you keep latency on your network under control, so that your long distance and metro area networks can support advanced network uses such as data center mirroring. These guidelines also help keep other types of latency, such as slow websites, from being a problem.
- Do not use the Internet as a pathway for communications where latency can create problems. This is especially the case for cloud services, mirroring, or even off-site storage and backup.
- Do not allow your network to become congested. This means careful design with adequate bandwidth along all portions of the network where latency is an important issue. This also means that you must eliminate slow network components such as software-based routers which can introduce significant latency.
- Do not confuse bandwidth with latency. You can add bandwidth simply by adding another network link, but that does not necessarily reduce latency. While sufficient bandwidth is necessary, some sources of latency, such as network design, are not fixed by adding bandwidth.
- Network prioritization does not reduce latency, and in fact may add to it because it adds more processing to the data packet stream. However, it can control latency in some networks for latency sensitive tasks such as VoIP.
- Have as few network hops as possible when latency is an issue. In addition, process the data stream as little as possible. Each hop and each feature, such as QoS, adds latency.
- Ensure that endpoints operate at peak efficiency so that processing delays don’t add to latency. This is a frequent problem with Web servers that become overloaded and fail to respond quickly. While some management techniques such as load balancing can add their own latency, they can more than make up for this by moving network traffic away from congestion or overloaded endpoints.
- While latency and bandwidth aren’t the same thing, controlling latency means having enough bandwidth to prevent congestion, and it means having the right type of network infrastructure. A dedicated, point-to-point network with enough bandwidth is necessary to prevent latency from being a problem. For example, a 10 Gbps Ethernet network has significantly lower latency than a 100 Mbps Ethernet network because of a significant reduction in serialization delay.
- Always monitor your overall network latency. If the latency starts to increase, then you need to find out why before it gets out of hand.
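The monitoring guideline can be sketched as a small check that compares recent readings against a known baseline. This is only an illustration under stated assumptions: the class name, the 1.5x tolerance, the window size, and the sample readings are all hypothetical, and a production network would normally rely on dedicated monitoring tools.

```python
from collections import deque
from statistics import mean


class LatencyMonitor:
    """Flag when the rolling average latency climbs well above a baseline."""

    def __init__(self, baseline_ms, tolerance=1.5, window=3):
        self.baseline_ms = baseline_ms
        self.tolerance = tolerance           # alert above baseline * tolerance
        self.samples = deque(maxlen=window)  # keeps only the newest readings

    def record(self, rtt_ms):
        """Store a reading and report whether latency now looks degraded."""
        self.samples.append(rtt_ms)
        return self.is_degraded()

    def is_degraded(self):
        # Wait for a full window so a single spike does not trigger an alert.
        if len(self.samples) < self.samples.maxlen:
            return False
        return mean(self.samples) > self.baseline_ms * self.tolerance


# Hypothetical readings: steady at first, then climbing.
monitor = LatencyMonitor(baseline_ms=30)
alerts = [monitor.record(rtt) for rtt in (30, 31, 29, 60, 70, 80)]
print(alerts)
```

Averaging over a window trades a little detection speed for fewer false alarms, which suits a metric as variable as latency.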
While overall network latency may seem like a problem without a solution, the fact is that latency can be managed and kept relatively low with the right choices and a clear understanding of what you can change and what you can’t. Perhaps more importantly, latency can have a real impact on your bottom line if it translates into customers abandoning your e-commerce site or lost productivity for your employees.
To manage latency you must first monitor it, and then adopt a network design that eliminates as much latency as possible where it’s necessary. This may mean using a dedicated network connection for sensitive functions such as data replication, data center mirroring, or cloud services, while managing the latency for your public website or e-commerce site. While latency can never be eliminated, you can manage it so that latency does not interfere with your company’s operation.