A Matter Of Milliseconds: Optical Mesh Networks Recover Rapidly

FAYETTEVILLE, Ark. - Most people have no idea how much they rely on optical data networks in their daily lives until a cut in a cable or a faulty piece of equipment causes a split-second interruption in data flow. The massive power outage that crippled much of the northeastern United States in August demonstrated both how quickly optical data networks must recover from faults and the serious consequences that an interruption of even a few seconds can have on critical systems.

To prevent catastrophic failures, network designers rely on highly redundant - and expensive - ring networks. But University of Arkansas computer engineer Kazem Sohraby has demonstrated that properly designed mesh networks can be more cost-effective and just as reliable.

"Although optical mesh networks are more cost-effective and easier to expand, the issue has been restoration time," explained Sohraby. "The restoration process is somewhat more complex and time-consuming for mesh networks than for more redundant configurations. We have shown that mesh networks with a proper link restoration design can achieve the same extremely short restoration time as today’s ring-shaped Synchronous Optical Networks (SONET)."

Sohraby, professor and department head of computer science and computer engineering, conducted his study with Kamala Murti and Ramesh Nagarajan from Lucent Technologies’ Bell Labs. He presented their findings recently at the National Fiber Optic Engineers Conference in Orlando.

A ring or mesh optical data network can cover a metropolitan area, an entire state or a larger region. Service interruptions can result from equipment faults or a cut in a cable. The standard acceptable time to restore network performance is 50 milliseconds. Until now, longer restoration times have limited the adoption of mesh networks.

Although there are many kinds of data networks, most are ring networks like the SONET ring. This system actually has two circular, linked networks, one inside the other. If an interruption occurs in one of the rings, network traffic can be rapidly re-routed to the other ring. Ring networks essentially require building two complete networks. But since either ring must be able to carry the load of both, neither can run at full capacity.

A mesh network has many interlaced routes with nodes placed at various points in the mesh. If an interruption occurs between two nodes, the network re-routes traffic according to a routing scheme. Sohraby evaluated several of these schemes to determine which would give the shortest restoration time.
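The re-routing idea can be illustrated with a minimal Python sketch. It is not one of the schemes Sohraby evaluated; it simply assumes the network is a set of two-way links and searches breadth-first for the detour with the fewest hops around a failed link. The six-node mesh, the node names and the function find_detour are hypothetical.

```python
from collections import deque


def find_detour(links, src, dst, failed):
    """Return the fewest-hop path from src to dst that avoids the failed link.

    links is a list of (node, node) pairs treated as two-way links.
    Returns None if no detour exists.
    """
    # Build an adjacency map, leaving out the failed link in both directions.
    adjacency = {}
    for a, b in links:
        if {a, b} == set(failed):
            continue
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    # Breadth-first search reaches dst by a path with the fewest hops.
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the predecessor chain back to src to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical six-node mesh laid out as a 2-by-3 grid (illustration only).
MESH_LINKS = [
    ("A", "B"), ("B", "C"),
    ("A", "D"), ("B", "E"), ("C", "F"),
    ("D", "E"), ("E", "F"),
]

# The link between B and C is cut; look for a way around it.
print(find_detour(MESH_LINKS, "B", "C", failed=("B", "C")))
# prints ['B', 'E', 'F', 'C']
```

In this toy mesh the detour needs three hops where the direct link needed one. A real routing scheme would also have to weigh spare capacity on each link, which this sketch ignores.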

The redundant equipment and limited capacity make ring networks expensive. In addition, many of the existing networks have been in place 5 to 10 years and are facing the need to upgrade. But their configuration makes it difficult and expensive to expand the ring as the service area grows and demand increases. The grid-like optical mesh networks overcome many of the limitations of ring networks. Since they carry less redundancy, the initial cost is lower and there is less wasted capacity.

"In the beginning, it was about the ability to move information 1000 times faster," Sohraby said. "Now optical networks have become a cost issue. To remain competitive, Internet service providers and telecommunications companies must reduce expenses."

To determine whether restoration times for a mesh network could be reduced to match those of a ring network, the researchers developed a simulation model based on seven real-time optical mesh networks. They then tested a variety of configurations to determine the performance of these networks and identify dependencies.

They determined that the critical parameters for mesh network performance and restoration are network topology (the distance between nodes), the number of detours required to avoid the problem area and the amount of traffic on the network. While detection of a problem is essentially the same for ring and mesh networks, the recovery methods are very different.
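A rough Python sketch shows how two of these parameters, distance and number of detours, might add up against the 50-millisecond target. It is not the researchers' simulation model. It assumes restoration time is the sum of fault detection, signal propagation along the detour and a fixed processing delay at each node, and it leaves out traffic load entirely. The propagation figure of about 5 microseconds per kilometer of fiber is a common approximation; the function names and the detection and per-node delays in the example are hypothetical.

```python
# Approximate one-way propagation delay in optical fiber (about 5 microseconds per km).
FIBER_DELAY_MS_PER_KM = 0.005

# Restoration target cited in the article.
RESTORATION_TARGET_MS = 50.0


def estimate_restoration_ms(detour_km, detour_hops, per_node_ms, detect_ms):
    """Estimate restoration time in milliseconds under a simple additive model.

    detour_km   -- total fiber length of the detour
    detour_hops -- number of nodes that must act to set up the detour
    per_node_ms -- assumed processing delay at each node (hypothetical)
    detect_ms   -- assumed time to detect the fault (hypothetical)
    """
    propagation_ms = detour_km * FIBER_DELAY_MS_PER_KM
    processing_ms = detour_hops * per_node_ms
    return detect_ms + propagation_ms + processing_ms


def meets_target(detour_km, detour_hops, per_node_ms, detect_ms):
    """Return True if the estimate is within the 50-millisecond target."""
    estimate = estimate_restoration_ms(detour_km, detour_hops, per_node_ms, detect_ms)
    return estimate <= RESTORATION_TARGET_MS


# Illustrative numbers only: a 600 km detour through 3 nodes.
short_detour = estimate_restoration_ms(600, 3, per_node_ms=5.0, detect_ms=10.0)
print(round(short_detour, 3), meets_target(600, 3, 5.0, 10.0))

# The same delays over a 2,000 km detour through 8 nodes.
long_detour = estimate_restoration_ms(2000, 8, per_node_ms=5.0, detect_ms=10.0)
print(round(long_detour, 3), meets_target(2000, 8, 5.0, 10.0))
```

With these made-up delays, the short detour stays inside the target and the long one does not, which illustrates why node spacing and the number of detours matter to the design.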

"The central issue is smart design. How much capacity do you allocate?" explained Sohraby. "Nodes must be spaced appropriately and be intelligent enough to know there is a problem and what to do so they can effect real-time repairs."

Contacts

Kazem Sohraby, professor and department head, computer science and computer engineering; (479) 575-6197; sorahby@uark.edu

Carolyne Garcia, science and research communication officer, (479) 575-5555; cgarcia@uark.edu

 
