
Survivable Networks
The 2005/06 program year for Network Systems aims at continued leadership and innovation in basic network architecture and in problems related to the planning, design, evolution, operation and performance analysis of optical, MPLS, and SONET-based transport networks. Projects typically involve a mix of experimental network design or simulation studies, optimization tools such as AMPL/CPLEX and custom heuristics, and analytical or theoretical studies. The research program has received international recognition for leadership in the areas of self-organization of networks, high-availability service provisioning and network design, and origination of highly novel networking concepts such as p-cycles and distributed mesh restoration. The most recent outcomes include development of the protected working capacity envelope (PWCE) concept for dynamic automated provisioning of survivable lightpaths, and failure-independent path-protecting p-cycles (FIPP p-cycles), an extension of p-cycles that provides fully pre-configured and capacity-efficient end-to-end lightpath protection. High levels of expertise and ongoing research are also maintained on many basic topics of network architecture and transport technologies, such as ring, mesh, and hybrid networks at the IP, ATM, SONET and DWDM layers, evolution planning, and design optimization.

Protected Working Capacity Envelope
An exciting recent development from our program is the concept we now call the protected working capacity envelope (PWCE). The concept was formally proposed to the research community in an article entitled “The Protected Working Capacity Envelope Concept: An Alternate Paradigm for Automated Service Provisioning” in the January 2004 issue of IEEE Communications Magazine. What makes this concept exciting is how much simpler and more scalable it appears to be than current IETF-driven approaches to automated dynamic provisioning of protected lightpath (or other types of path) services.
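To make the simplification concrete: under PWCE each span offers a fixed pool of working channels that are already protected, so admitting a connection amounts to a shortest-path search over spans that still have a free channel. The Python sketch below illustrates only that idea. The `envelope` dictionary, the function names `provision` and `release`, and the use of hop count as the routing metric are all assumptions made for illustration, not details taken from the article.

```python
from collections import deque

def provision(envelope, src, dst):
    """Route one connection inside the envelope (illustrative sketch).

    envelope maps frozenset({a, b}) -> number of free protected working
    channels on span a-b (hypothetical data structure). Returns the node
    path, or None if the request is blocked. No protection arrangement is
    made per connection: the channels are assumed to be protected already.
    """
    # Only spans with at least one free protected channel are usable.
    adj = {}
    for span, free in envelope.items():
        if free > 0:
            a, b = sorted(span)
            adj.setdefault(a, []).append(b)
            adj.setdefault(b, []).append(a)

    # Breadth-first search gives a fewest-hop path (assumed metric).
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if dst not in parent:
        return None

    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()

    # Consume one protected working channel on every span of the path.
    for a, b in zip(path, path[1:]):
        envelope[frozenset((a, b))] -= 1
    return path

def release(envelope, path):
    """Return the channels of a departing connection to the envelope."""
    for a, b in zip(path, path[1:]):
        envelope[frozenset((a, b))] += 1

# Three-node example with one protected working channel per span.
envelope = {
    frozenset(("A", "B")): 1,
    frozenset(("B", "C")): 1,
    frozenset(("A", "C")): 1,
}
first = provision(envelope, "A", "C")   # ["A", "C"]
second = provision(envelope, "A", "C")  # ["A", "B", "C"]
third = provision(envelope, "A", "C")   # None: envelope exhausted, request blocked
```

In this sketch the only state touched per connection is the free-channel count on the spans of its own route, which is the sense in which provisioning reduces to routing.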
Under a given distribution of spare capacity, locally acting protection or restoration schemes create an “envelope” of protected working channels. Dynamic provisioning within this envelope is simplified to a shortest-path routing problem and (depending on the mode of operation) requires little or no dissemination of state changes on a per-connection basis. A key research interest in the coming year is to devise means for on-line adaptation of the capacity envelope to match evolving and uncertain demand patterns. An important property is that nothing needs to be done to arrange protection for services on the per-connection time-scale, other than routing the service itself. Arbitrarily fast-paced demand arrivals and departures can be accommodated within a static distribution of spare capacity. Adjustments to the envelope itself are required only on the time-scale on which the statistical parameters of the random demand change. This may provide an inherently more scalable solution that is less database-dependent than shared-backup path protection (SBPP), and at the very least is an additional service modality that can be offered to customers.

p-Cycles
The most recent major p-cycle development we have made is a new technique for optical network protection called failure-independent path-protecting (FIPP) p-cycles. The method extends p-cycle concepts to retain the property of full pre-connection of protection paths while adding the property of end-to-end, failure-independent path protection switching against either span or node failures. An issue with applying the popular SBPP method to an optical network is that spare channels for the backup path must be cross-connected on the fly upon failure. It takes time and signalling to make the required cross-connections but, more importantly, until all connections are made it is not actually known whether the backup optical path will have adequate transmission integrity.
Thus, speed and optical path integrity are important reasons to try to have backup paths fully pre-connected before failure. With fully pre-connected protection, not only can very fast restoration be attained, but the optical path engineering can also be assured prior to failure. FIPP p-cycles support the same failure-independent, end-node activated switching as SBPP but with the fully pre-connected protection path property of p-cycles (contrary to how it is often viewed in the industry, SBPP is not a pre-connected strategy). As a fully pre-connected and path-oriented scheme, FIPP p-cycles are therefore potentially more attractive for optical networks than SBPP. Results confirm that FIPP p-cycle network designs exhibit the capacity efficiency that is characteristic of path-oriented schemes, and may be as capacity efficient as SBPP. More conclusive comparisons on larger-scale networks await further study.

Node-Encircling p-Cycles
In early p-cycle work, node-encircling p-cycles were introduced as a method of providing recovery from router failures in an IP network. The general idea is that any p-cycle that passes through all nodes immediately adjacent to some single node is able to protect all traffic routed through that node. Having recently revisited this basic architecture, we find it quite clear that the concept can easily be extended to provide node-failure recovery (of transiting lightpaths) in an optical network as well. Because the node-encircling p-cycle is configured to pass through every neighbouring node of the node it encircles, any lightpath that transits the failed node (i.e., doesn’t originate or terminate at it) can break into the cycle on either side of the failed node and be routed around the failure. When networks are designed to be fully restorable to single span failures, some p-cycles will undoubtedly be node-encircling and thereby protect a limited number of transiting lightpaths, but this is unlikely to be widespread.
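The node-encircling condition described above is easy to test for a given cycle. The Python sketch below is a minimal illustration, assuming the topology is given as an adjacency dictionary and that the encircled node itself does not lie on the cycle (a cycle through the failed node would be broken by that failure). The function name `encircled_nodes` and the example topology are hypothetical.

```python
def encircled_nodes(adjacency, cycle):
    """Return the nodes that `cycle` encircles (illustrative sketch).

    adjacency: dict mapping each node to the set of its neighbouring nodes.
    cycle: sequence of nodes visited by a candidate p-cycle.
    A node counts as encircled when every one of its neighbours is on the
    cycle and the node itself is not (assumption stated in the lead-in).
    """
    on_cycle = set(cycle)
    return {
        node
        for node, neighbours in adjacency.items()
        if node not in on_cycle and neighbours and set(neighbours) <= on_cycle
    }

# Hypothetical topology: hub H attached to every node of the ring A-B-C-D.
adjacency = {
    "H": {"A", "B", "C", "D"},
    "A": {"H", "B", "D"},
    "B": {"H", "A", "C"},
    "C": {"H", "B", "D"},
    "D": {"H", "A", "C"},
}
ring = encircled_nodes(adjacency, ["A", "B", "C", "D"])   # {"H"}
triangle = encircled_nodes(adjacency, ["A", "B", "H"])    # set(): C and D each have a neighbour off the cycle
```

Running such a check over the p-cycles of a span-restorable design is one simple way to count how many node-encircling relationships arise incidentally.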
We are now beginning to investigate how widespread such node-encircling relationships are, and what levels of node-failure restorability they provide. Imposing a requirement that some p-cycles in a network be node-encircling could increase node-failure restorability levels, but we expect that some loss of optimality will result. Extending this requirement to the extreme, where each node has sufficient node-encircling p-cycle relationships to ensure full node-failure restorability, would be the best option from a reliability/availability point of view, but would almost certainly be cost-prohibitive. One possible trade-off is to re-configure the network in some way so that node-encircling p-cycles emerge when needed to respond to node failures. Options we are currently investigating include simply dissolving existing span-protecting p-cycles and re-using their capacity to configure node-encircling p-cycles, and designating peering points where several span-protecting p-cycles can be combined, by activating pre-planned cross-connections within the peering points, so that node-encircling p-cycles are formed.

Network Evolution
In the area of network evolution studies we are continuing to pursue ideas of value to network operators. A particularly interesting area is how to evolve from existing ring-based networks directly to a flexible and efficient multi-service network by converting existing rings into p-cycles. This includes studies at the level of basic network architecture and optimal evolution strategy, as well as at the equipment level, such as how ADM-based rings could be converted in situ to p-cycles.

Future-Proof Networking
The Network Systems research program is also concerned with several aspects of future-proof network planning.
The main problem is: how do we plan in a way that maximizes the likely usefulness and future value of what we deploy now, even though we are uncertain about future technology and future demand patterns and levels? The multi-service and multi-layer themes of our research program are developing new techniques to underpin new transport service class concepts such as pre-booked versus “walk-up” dynamic lightpath services, or rearrangeable services. This line of work also contributes to methods for operators to support customer Service Level Agreements (SLAs) with greater confidence, especially regarding ultra-high availability services.

Ultra-High Availability Services
In the area of ultra-high availability services, we have come to understand that mesh-based survivable networks can actually provide priority path services with higher availability than 1+1 APS (and at much less cost to the network operator). This striking result is realized under the “1FP-2FR” concept for operating a survivable mesh, which stands for first-failure protection, second-failure restoration. This operational concept employs an embedded distributed restoration algorithm that continually self-plans current, up-to-date protection reactions for every node against single failures. Such preplans allow a very fast “protection” reaction in which all routes are known ahead of time. Upon multiple failures, however, the same embedded multiple-path discovery protocol acts adaptively in real time to maximize the secondary restorability of priority services. Such an adaptive second-failure response gives priority paths higher availability than 1+1 APS. Our research will also include simulation-based demonstrations of this concept and its integration into very efficient multiple Quality of Protection service classes.
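A rough first-order calculation shows why an adaptive second-failure response can beat 1+1 APS. The Python sketch below rests on a deliberately simplified model that is an assumption of this illustration and not a result from the program: failures are independent, a 1+1 service is down only when its working and backup paths are down together, and under 1FP-2FR a priority path hit by such a dual failure is still recovered with probability `r2`, the second-failure restorability. The function names and the parameter `r2` are hypothetical.

```python
def unavailability_1plus1(u_working, u_backup):
    """1+1 APS: service is lost whenever both disjoint paths are down.

    Assumes independent path failures (simplifying assumption).
    """
    return u_working * u_backup

def unavailability_1fp_2fr(u_working, u_backup, r2):
    """1FP-2FR sketch: a dual failure causes outage only if the adaptive
    second-failure restoration also fails, with probability (1 - r2).

    r2 is the assumed fraction of dual-failure cases in which the
    priority path is restored; it is a model parameter, not measured data.
    """
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("r2 must lie in [0, 1]")
    return u_working * u_backup * (1.0 - r2)

# Purely illustrative inputs: equal path unavailabilities, half of all
# dual failures assumed restorable for priority paths.
u = 1e-3
aps = unavailability_1plus1(u, u)
mesh = unavailability_1fp_2fr(u, u, r2=0.5)
improved = mesh < aps  # True whenever r2 > 0 in this model
```

In this model the two schemes coincide when `r2` is zero, and any positive second-failure restorability lowers the unavailability of the priority path below that of 1+1 APS.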
Disaster Recovery
NSERC and the Department of Public Safety and Emergency Preparedness Canada (PSEPC) recently identified the national telecommunications system as one of Canada’s “ten most critical infrastructures” and have teamed to fund the Joint Infrastructure Interdependencies Research Program (JIIRP). The stated aim is to “produce new science-based knowledge and practices to better assess, manage, and mitigate risks to Canadians from critical infrastructure.” This is an opportunity for a slightly new view on the problem of network survivability, which we will also be studying this year. While a great deal of work has already been done on survivability, it has usually assumed that the potential failure scenarios are specifiable in advance (say, any single span or node, or in some cases, combinations of no more than two or three spans at a time where they share a physical duct structure). In contrast, determining how the network can be dynamically reconfigured in the event of arbitrarily widespread and catastrophic failure to maximize survivability, particularly of critical services, is a considerably different issue. With current restoration methods, working lightpaths unaffected by the failure remain untouched by the restoration process so as to minimize disruption to the network’s customers; only those customers whose lightpaths are directly struck by the failure are affected. With catastrophic failures, however, the greatest concern will be for critical (i.e., emergency) communication services. All non-critical demands of lower priority can be considered preemptible, and any pre-failure traffic, including that of critical systems, would be allowed to be rerouted as needed. Network-wide reconfiguration that maximizes the post-failure survivability of critical traffic would be the ultimate goal. Distributed self-organizing responses may be particularly well suited to this type of global reconfiguration.
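The reconfiguration policy described above can be sketched as a simple centralized heuristic: discard all pre-failure routes, then re-route demands from scratch over whatever capacity survives, serving critical demands first so that lower-priority traffic is effectively preempted. The Python sketch below is a greedy illustration under those assumptions only. It is neither an optimal method nor the distributed self-organizing response the program intends to study, and the names `shortest_path`, `reconfigure`, and the demand tuples are hypothetical.

```python
from collections import deque

def shortest_path(capacity, src, dst):
    """Fewest-hop path over surviving spans with free capacity, or None."""
    adj = {}
    for span, free in capacity.items():
        if free > 0:
            a, b = sorted(span)
            adj.setdefault(a, []).append(b)
            adj.setdefault(b, []).append(a)
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

def reconfigure(surviving_capacity, demands):
    """Greedy post-disaster re-routing (illustrative sketch, not optimal).

    surviving_capacity: frozenset({a, b}) -> channels left on span a-b.
    demands: list of (name, src, dst, critical) tuples.
    All demands are routed afresh; critical ones get first pick of capacity.
    Returns a dict mapping each demand name to its new path, or None.
    """
    capacity = dict(surviving_capacity)  # leave the caller's data untouched
    routes = {}
    # sorted() is stable, so demands keep their order within each class.
    for name, src, dst, critical in sorted(demands, key=lambda d: not d[3]):
        path = shortest_path(capacity, src, dst)
        routes[name] = path
        if path:
            for a, b in zip(path, path[1:]):
                capacity[frozenset((a, b))] -= 1
    return routes

# Only the chain A-B-C survives, with one channel per span.
surviving = {frozenset(("A", "B")): 1, frozenset(("B", "C")): 1}
demands = [
    ("bulk-transfer", "A", "C", False),
    ("emergency-line", "A", "C", True),
]
routes = reconfigure(surviving, demands)
# routes["emergency-line"] is ["A", "B", "C"]; routes["bulk-transfer"] is None
```

A greedy ordering like this can strand critical demands that a joint optimization would have carried, which is one reason network-wide reconfiguration is treated here as a research problem.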