MPLS Traffic Engineering Design
Because of the significant bandwidth cost in Globenet's global capital expenditure (capex) budget, the ability to efficiently traffic-engineer the core network was a crucial aspect of the design. To that end, Globenet conducted a detailed study to evaluate the most appropriate bandwidth optimization technique to deploy in each region, taking into account the traffic matrix, current link utilizations, and traffic growth forecasts. The objective was twofold. First, Globenet wanted to optimize bandwidth usage so as to delay any link upgrade. Second, as explained in the "Quality of Service Design" section, Globenet wanted to meet the strict QoS commitments of the five CoSs, both in steady state and under any single network element failure. It would do this through proper cooperation of traffic engineering admission control and constraint-based routing features with QoS mechanisms.
The cost of bandwidth differs significantly from one region to another. In North America, where bandwidth is relatively cheap compared to the AsiaPac region, Globenet could afford to overengineer the network core by deploying a network made of OC-3 and OC-48 links. This helped it maintain sufficient margin before reaching any point of congestion, even during failure. Because of the North America region's bandwidth-rich topology, traffic engineering generally was not required. However, when necessary, it could be achieved by means of IGP metric optimization.
In the South America, EMEA, and AsiaPac regions, more careful bandwidth provisioning could yield very significant cost savings. Hence, in those regions, Globenet elected to use DiffServ-aware MPLS Traffic Engineering (DS-TE) to ensure that the load of EF traffic is kept below 30 percent of link capacity. At the same time, the aggregate load is kept below 100 percent of link capacity (plus some overbooking on higher-speed links), even under single-failure situations. In these regions the EF traffic and the non-EF traffic are therefore carried on separate meshes of DS-TE tunnels belonging to different DS-TE class types (CT1 and CT0, respectively) so that they can be subject to different bandwidth constraints. As discussed in the section "QoS Design in the Core Network in the EMEA, AsiaPac, and South America Regions," Globenet uses the Russian Dolls Model so that the EF tunnels (CT1) are limited to their own bandwidth constraint, BC1. The non-EF tunnels (CT0) are limited together with the EF tunnels (CT1) to a shared bandwidth constraint, BC0. Also, in South America, Globenet uses DS-TE to ensure that the EF tunnels are routed only over terrestrial links and not over satellite links.
Globenet considered using MPLS Traffic Engineering within each region or across all regions. Within each region, meshes of TE tunnels would be provisioned on a per-region basis. Across all regions, a global mesh of TE tunnels between routers residing in different regions would be set up. The number of TE tunnels in a mesh equals n * (n - 1), where n is the number of routers (remember that TE LSPs are unidirectional). Because this number grows quadratically with n, a global mesh would require far more tunnels than the regional meshes combined. Therefore, a per-region MPLS TE mesh was considered the optimal compromise, allowing Globenet to efficiently engineer the traffic in each region without requiring a considerable number of TE tunnels.
In terms of TE tunnel provisioning and configuration, an important requirement was to rely on dynamic network-centric mechanisms to reduce any risk of human error. Furthermore, considering the traffic growth and the potential increase in the number of routers in the network, the addition of new equipment was not considered a rare event. Consequently, there was a requirement to automate the creation of any new TE LSPs in the network.
In each region except North America, two full meshes of TE LSPs between the P routers are required because DS-TE is used. In North America, where MPLS Traffic Engineering is used for the sole purpose of fast protection, a single full mesh of TE LSPs is required between the P routers. All these meshes of TE LSPs are automatically provisioned by means of the auto-mesh feature [see AUTO-MESH].
Another key aspect of MPLS Traffic Engineering is the ability to effectively determine the required bandwidth for each TE LSP. An important property of such an international network is that each region spans several time zones. This makes a dynamic TE LSP sizing strategy quite attractive and efficient in terms of bandwidth consumption. If the TE LSP sizes are dynamically adjusted based on the actual traffic demand, they do not have to be sized based on their peak demand but on the exact required bandwidth at any point in the day, rendering their overall placement significantly more efficient. Because the TE LSPs do not face their peak demand simultaneously (as a result of the different traffic types and multiple time zones), such a dynamic strategy avoids oversized TE LSPs, which leads to shorter TE LSP paths and consequently to a better QoS. Note that this also allows for more efficient network resource usage. This was one of the motivations for using dynamic TE LSP size adjustment techniques for the TE LSPs (except in North America, where zero-bandwidth TE LSPs are used for Fast Reroute purposes only). Another motivation was that this avoided the challenge of accurately predicting and tracking the traffic matrix, which is particularly difficult in some parts of the network that have a low and/or fast-growing customer base.
In terms of network recovery strategy, Globenet had a rich QoS service portfolio targeting telephony, videoconferencing, and mission-critical applications. Also, its IP/MPLS backbone in North America and some places in Europe carries ATM traffic. Therefore, Globenet decided to use MPLS Traffic Engineering Fast Reroute for all its services and networks in the various regions. Some statistics gathered during the last decade on its ATM and IP networks showed that more than 90 percent of its unplanned failures were link failures. Thus, Globenet decided to deploy Fast Reroute to protect against link failures only.
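To make the mesh-size argument concrete, the following minimal sketch applies the n * (n - 1) formula to per-region meshes and to a single global mesh. The router counts other than North America's 30 P routers are hypothetical values chosen only for illustration, as is the function name.

```python
def mesh_tunnels(n_routers: int, meshes: int = 1) -> int:
    """Unidirectional TE LSPs needed for `meshes` full meshes of n routers."""
    if n_routers < 0 or meshes < 0:
        raise ValueError("counts must be non-negative")
    return meshes * n_routers * (n_routers - 1)

# Hypothetical P-router counts per region (only the 30 for North America
# is stated in the text; the others are illustrative assumptions).
regions = {"North America": 30, "South America": 20, "EMEA": 50, "AsiaPac": 40}

# One mesh per region versus one mesh spanning every router.
per_region_total = sum(mesh_tunnels(n) for n in regions.values())
global_total = mesh_tunnels(sum(regions.values()))

print(per_region_total, global_total)
```

With these assumed counts the regional meshes need 5260 TE LSPs in total, whereas a single global mesh over the same 140 routers would need 19460, which illustrates why the per-region design scales better.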
Setting the Maximum Reservable Bandwidth on Each Link
The first task when enabling DiffServ-aware MPLS Traffic Engineering is to configure the bandwidth constraints on each TE-enabled link. Note that when regular TE is used, the single bandwidth constraint is also called the Maximum Reservable Bandwidth. The operator can configure these constraints to any arbitrary value. They do not have to be identical to the actual link speed.
In the case of North America, MPLS Traffic Engineering is used only for Fast Reroute. Thus, each TE LSP has a bandwidth of 0 and consequently follows the IS-IS shortest path. Although each link could have been configured with a Maximum Reservable Bandwidth of 0 (BC0=0), Globenet decided to set it to the actual link speed in case it needed to use nonzero-bandwidth TE LSPs in the future in that region.
By setting the bandwidth constraint above or below the actual link speed (or the allocated service rate of the DiffServ queue corresponding to the constrained class type), the operator can enforce overbooking or underbooking policies. This approach is called link over/underbooking. It allows over/underbooking ratios to be fine-tuned on a per-link basis (or on a per-type-of-link basis) but not on a per-LSP basis. Note that the operator can also enforce over/underbooking by factoring an over/underbooking ratio into the tunnel size relative to the traffic demand. This latter approach is called LSP over/underbooking. In contrast, it allows over/underbooking ratios to be fine-tuned on a per-LSP basis but not on a per-link basis.
As discussed in the section "QoS Design in the Core Network in the EMEA, AsiaPac, and South America Regions," the bandwidth constraint (BC1) applied to the EF TE LSPs is configured to 30 percent of link speed, and no overbooking is applied on those.
Actual Link Speed | BC0 (Reservable Bandwidth for CT0 + CT1) | BC1 (Reservable Bandwidth for CT1) |
---|---|---|
Less than 2 Mbps | 1.0 * link speed | 0.3 * link speed |
2 Mbps to 40 Mbps | 1.0 * link speed | 0.3 * link speed |
OC-3 | 1.05 * link speed | 0.3 * link speed |
OC-48 | 1.10 * link speed | 0.3 * link speed |
Example 5-18. OC-3 Configuration Template
For the sake of simplicity, all the configuration examples shown in this chapter mention raw link bandwidth. However, Globenet takes into account the available payload for each medium and uses this effective bandwidth for TE purposes. For instance, the line rate of an OC-3 link is 155.52 Mbps, but because of the SONET overhead, just 149.76 Mbps is available for the traffic payload. Furthermore, additional protocol overheads must be taken into account (for example, the PPP overhead) to precisely evaluate the effective bandwidth available for the traffic on each medium.
interface pos3/0
ip rsvp bandwidth 163000 sub-pool 46500
!
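The values in the OC-3 template follow from the per-link-type factors in the table above. As a rough sketch, the helper below derives BC0 and BC1 in kbps from a raw link speed; the category labels and the function name are illustrative assumptions, and integer percentages are used to avoid floating-point rounding.

```python
# BC0 overbooking factors in percent, taken from the table above.
# The category labels are hypothetical names for the table rows.
BC0_PERCENT = {"sub-2M": 100, "2M-40M": 100, "OC-3": 105, "OC-48": 110}
BC1_PERCENT = 30  # EF traffic (CT1) is capped at 30 percent on every link type

def bandwidth_constraints(link_kbps: int, category: str) -> tuple[int, int]:
    """Return (BC0, BC1) in kbps for a link of the given speed and category."""
    bc0 = link_kbps * BC0_PERCENT[category] // 100
    bc1 = link_kbps * BC1_PERCENT // 100
    return bc0, bc1

# Raw OC-3 speed rounded to 155,000 kbps, as the template appears to do.
print(bandwidth_constraints(155_000, "OC-3"))
```

For a raw OC-3 speed of 155,000 kbps this yields 162,750 kbps for BC0 (which the template rounds to 163000) and 46,500 kbps for BC1.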
Automatic Setup and Provisioning of a Full Mesh of TE LSPs
The auto-mesh mechanism, which allows for the automatic provisioning of full meshes of TE LSPs, was introduced in Chapter 2. As a reminder, this functionality consists of three main components:
Discovery process: IGP extensions allow each router to discover the other members of the mesh or meshes it belongs to. Each mesh of TE LSPs is identified by a mesh group number.
Local template configuration: For each mesh, a TE template is locally configured that specifies the set of TE LSP attributes for each TE LSP of the mesh.
Automatic TE LSP setup: For every other member of the mesh the router belongs to, it triggers the automatic provisioning of a corresponding TE LSP whose characteristics are specified in the TE template.
A new generic IS-IS Type-Length-Value (TLV) named Router CAPABILITY is defined in [ISIS-CAPS]. It allows each router in a routing domain to advertise some of its capabilities within a single level or to the entire routing domain by means of TLV leaking procedures. One use of such a TLV is to advertise that a router participates in a TE LSP mesh. This is achieved by carrying an additional specific sub-TLV named TE-MESH-GROUP (defined in [ISIS-TE-CAPS]). The formats of the Router CAPABILITY TLV and the TE-MESH-GROUP sub-TLV are shown in Figure 5-38.
Figure 5-38. IS-IS Router CAPABILITY TLV and TE-MESH-GROUP Sub-TLV Format

Region | Mesh Group Number (Non-EF Tunnels) | Mesh Group Number (EF Tunnels) |
---|---|---|
North America | 100 | |
South America | 200 | 201 |
EMEA | 300 | 301 |
AsiaPac | 400 | 401 |
Example 5-19. Configuration of Auto-Mesh TE on a Router
A similar configuration is adopted for the mesh group 201.
On a Cisco router, the addition of a new member is automatically detected by every other member of the same mesh group, and the mesh of TE LSPs is adjusted accordingly. Similarly, when a member leaves the mesh group, the corresponding set of TE LSPs is removed from the mesh without requiring any manual intervention.
Router(config)#mpls traffic-eng auto-tunnel mesh
Router(config)#router isis
!
!Configuration of the set of mesh-group the router belongs to (200 in this
!example)
Router(config-router)#mpls traffic-eng mesh-group 200 loopback0
!
!Create a template interface:
Router(config)#interface auto-template 1
!
!Specifies a mesh-group that a template interface uses to signal
!tunnels for all mesh-group members
Router(config-if)# tunnel destination mesh-group 200
!
!Create the interface template for mesh-group 200
Router(config)#interface auto-template 1
Router(config-if)#ip unnumbered loopback 0
Router(config-if)#tunnel mode mpls traffic-eng
Router(config-if)#tunnel mpls traffic-eng autoroute announce
Router(config-if)#tunnel mpls traffic-eng priority 4 4
Router(config-if)#tunnel mpls traffic-eng auto-bw
Router(config-if)#tunnel mpls traffic-eng path-option 1 dynamic
!
!
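The discovery and automatic setup behavior described for auto-mesh can be sketched as a small model. This is not how a router implements the feature; it only illustrates how mesh group membership learned through the IGP translates into the set of TE LSPs a headend sets up, and how that set changes when members join or leave. The class and router names are hypothetical.

```python
from collections import defaultdict

class MeshDirectory:
    """Toy stand-in for the mesh membership each router learns via the IGP."""

    def __init__(self):
        self.members = defaultdict(set)  # mesh group number -> router IDs

    def join(self, group, router):
        self.members[group].add(router)

    def leave(self, group, router):
        self.members[group].discard(router)

    def tunnels_from(self, headend):
        """(group, headend, tail) for every other member of the headend's groups."""
        return sorted(
            (group, headend, tail)
            for group, routers in self.members.items()
            if headend in routers
            for tail in routers
            if tail != headend
        )

directory = MeshDirectory()
for router in ("P1", "P2", "P3"):
    directory.join(200, router)      # non-EF mesh in South America
for router in ("P1", "P2"):
    directory.join(201, router)      # EF mesh in South America

before = directory.tunnels_from("P1")
directory.leave(200, "P3")           # P3 withdraws from mesh group 200
after = directory.tunnels_from("P1")
```

In this model, P1 first heads three TE LSPs (two in group 200, one in group 201). Once P3 leaves group 200, the TE LSP toward P3 disappears from P1's set without any change to P1's own configuration.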
Dynamic Traffic Engineering LSP Bandwidth Adjustment
Because of the presence of multiple time zones in some regions, sporadic traffic patterns, and the need for efficient bandwidth usage, Globenet elected to use a dynamic TE LSP size adjustment mechanism. (Bandwidth adjustment does not apply to the North America region, where zero-bandwidth TE LSPs are used.) As explained in Chapter 2, the basic principle of such a mechanism is to dynamically adjust the required bandwidth of each TE LSP based on the actual traffic demand. This is determined by the observed amount of traffic sent onto the TE LSP in question. The router monitors the amount of traffic sent onto each TE LSP every X minutes (where X is configurable). Then it automatically computes the average bandwidth for that TE LSP (this is also called a bandwidth sample). A second parameter called resizing frequency determines how often a TE LSP is resized. For instance, if the resizing frequency is set to Y minutes, every Y minutes each router evaluates the new bandwidth that must be signaled for each TE LSP. The current algorithm used on a Cisco router consists of selecting the highest bandwidth sample over the elapsed period of Y minutes and resizing the TE LSP accordingly should the new computed bandwidth be different from the current one.
Note: If no path satisfying the new bandwidth constraint can be found, the current TE LSP bandwidth is maintained, and the TE LSP is not modified. Instead, the bandwidth is reevaluated at the next resizing period.
There is, of course, a delicate trade-off between accurate sizing and signaling frequency. Each router can be configured to collect samples and resize each TE LSP very frequently. Setting X to a small value has the effect of carefully monitoring the amount of bandwidth used on each TE LSP. (Note that in this case, traffic peaks are not really averaged and are reflected in the samples.) If a small value is chosen for Y, each TE LSP is resized on a frequent basis. Thus, small X and Y values allow for an accurate bandwidth reservation of each TE LSP, reflecting the actual traffic pattern.
The downside of such parameter settings is the signaling overhead and potential network instability. Consider a region composed of two meshes with 50 routers. This leads to a total of approximately 5000 TE LSPs. If X and Y are configured to 1 minute and 10 minutes, respectively, a sample is collected every minute, and every TE LSP is resignaled with the new bandwidth value every 10 minutes. Every router then resignals on average ten TE LSPs per minute, leading to an average of 500 TE LSPs per minute across the network (about eight TE LSPs resignaled per second). Furthermore, each router resignals all its TE LSPs every 10 minutes with some possible global synchronization and its well-known network effects (there would unavoidably be bursts of TE LSP resignaling, leading to various undesirable race conditions). This highlights the fact that more-conservative timer values should be chosen. On the other hand, these values should not be too large, because they would result in TE LSP oversizing compared to the actual demand. This might lead to tunnels following non-shortest paths, thus affecting the delay for the traffic forwarded onto those TE LSPs. Furthermore, this may also lead to TE LSP undersizing, where more traffic may flow on TE LSPs than their reserved bandwidth. That said, even if the resizing period is set to a large value, such a strategy is significantly more efficient than any other LSP sizing strategy based on the absolute or 95th-percentile peak.
Another benefit of such a dynamic TE LSP adjustment mechanism is that it allows traffic growth to be taken into account (which is different for the two class types) without requiring any manual intervention. Thus, Globenet ran various simulations to evaluate the trade-off between sample collection frequency, resizing frequency, bandwidth utilization efficiency, and signaling load.
Three sets of values for X and Y were studied, as shown in Figures 5-39, 5-40, and 5-41. The graphs in these figures show the actual bandwidth used by a TE LSP compared to the reserved bandwidth as well as the total number of TE LSPs signaled in the network.
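The signaling-load estimate for the 50-router, two-mesh region can be reproduced with a few lines of arithmetic. The sketch below is only a back-of-the-envelope model that assumes every TE LSP is resignaled exactly once per resizing period; the function name is illustrative.

```python
def resignaling_load(routers: int, meshes: int, resize_minutes: float):
    """Return (total TE LSPs, resignals per router per minute, resignals per second)."""
    lsps_per_headend = meshes * (routers - 1)
    total_lsps = routers * lsps_per_headend
    per_router_per_minute = lsps_per_headend / resize_minutes
    network_per_second = total_lsps / (resize_minutes * 60)
    return total_lsps, per_router_per_minute, network_per_second

# Two meshes of 50 routers, resized every 10 minutes (the case in the text).
total, per_router, per_second = resignaling_load(50, 2, 10)
print(total, per_router, round(per_second, 2))

# The same region with a 60-minute resizing period, for comparison.
_, _, per_second_hourly = resignaling_load(50, 2, 60)
```

The exact figures are 4900 TE LSPs, 9.8 resignals per router per minute, and roughly 8.2 resignals per second, matching the rounded values quoted above. Stretching the resizing period to 60 minutes divides the network-wide rate by six.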
Figure 5-39. Reserved Versus Actual Bandwidth on a TE LSP with X = 5 Minutes and Y = 15 Minutes

Figure 5-40. Reserved Versus Actual Bandwidth on a TE LSP with X = 10 Minutes and Y = 60 Minutes

Figure 5-41. Total Resizing Frequency in the Network with X = 10 Minutes and Y = 120 Minutes

Additional Resizing Parameters
Example 5-20 shows how both the sampling and the resizing periods can be configured on a Cisco router. It is worth mentioning that other parameters exist, such as the minimum and maximum bandwidth a TE LSP can be dynamically resized to. Globenet decided not to configure these parameters to keep its TE LSP size as close as possible to the actual load.
Example 5-20. Configuration of a TE LSP with Auto-Bandwidth
In this example, the sampling and resizing frequencies equal 5 minutes (300 seconds) and 15 minutes (900 seconds), respectively.
!
Router(config)#interface auto-template 1
Router(config-if)#ip unnumbered loopback 0
Router(config-if)#tunnel mode mpls traffic-eng
Router(config-if)#tunnel mpls traffic-eng autoroute announce
Router(config-if)#tunnel mpls traffic-eng priority 4 4
Router(config-if)#tunnel mpls traffic-eng bandwidth sub-pool 10
Router(config-if)#tunnel mpls traffic-eng auto-bw frequency 900
Router(config-if)#tunnel mpls traffic-eng path-option 1 dynamic
!
Router(config)#mpls traffic-eng auto-bw timers frequency 300
!
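The resizing rule described in the section "Dynamic Traffic Engineering LSP Bandwidth Adjustment" (take the highest bandwidth sample of the elapsed resizing period, and keep the current reservation if no path can be found) can be sketched as follows. This is a simplified model rather than the router's implementation: the `can_place` callback is a hypothetical stand-in for the path computation, and the sample values are invented.

```python
def auto_bandwidth(samples_kbps, samples_per_period, initial_kbps=0,
                   can_place=lambda kbps: True):
    """Return the bandwidth reserved after each complete resizing period.

    samples_kbps       -- one average-rate sample per sampling interval X
    samples_per_period -- Y / X (3 for the 300 s / 900 s timers above)
    can_place          -- hypothetical check that a path exists for a bandwidth
    """
    reserved = initial_kbps
    history = []
    for start in range(0, len(samples_kbps), samples_per_period):
        window = samples_kbps[start:start + samples_per_period]
        if len(window) < samples_per_period:
            break  # incomplete period: no resize decision yet
        candidate = max(window)  # highest sample of the elapsed period
        if candidate != reserved and can_place(candidate):
            reserved = candidate
        # Otherwise the current bandwidth is kept until the next period.
        history.append(reserved)
    return history

samples = [40, 55, 48, 30, 32, 31, 90, 20, 10]  # invented samples, in kbps
unconstrained = auto_bandwidth(samples, 3)
constrained = auto_bandwidth(samples, 3, can_place=lambda kbps: kbps <= 80)
```

With three samples per period, the reservation follows the per-period peaks 55, 32, and 90 kbps. When no path can carry 90 kbps, the third decision keeps the previous 32 kbps reservation, as described in the note on failed resizing.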
Additional Advantages of Dynamic TE LSP Resizing
In addition to dynamically adjusting the reserved bandwidth to the actual traffic demand to efficiently use the network bandwidth, Globenet developed some internal scripts that gather the set of bandwidth samples on a few representative TE LSPs. This way, accurate traffic matrices could be collected, which helped further adjust the network parameters and make accurate forecasts. Note that such scripts could also have been developed without the use of the auto-bandwidth feature, but because auto-bandwidth computes such samples, the scripts simply have to collect the sample values.
TE LSP Path Computation
The path taken by TE LSPs is computed by means of a dynamic CSPF algorithm on every headend router. Based on various experiments, Globenet discovered that the CSPF computation time was negligible (on the order of a few milliseconds). Thus, having to perform path computation once every 15 minutes for every CT1 TE LSP and once an hour for every CT0 TE LSP (in addition to unpredictable network element failure events) did not have any noticeable impact on the CPU load of Globenet's routers. It also allowed for a high degree of flexibility.
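As a reminder of what a CSPF computation involves, the sketch below prunes the links that cannot satisfy the bandwidth constraint and then runs a shortest-path search on what remains. It is a deliberately simplified illustration: real implementations also handle preemption priorities, affinities, and tie-breaking rules, and the topology, metrics, and bandwidth figures here are invented.

```python
import heapq

def cspf(links, source, dest, bandwidth_kbps):
    """Return (cost, path) of the shortest path with enough bandwidth, or None.

    links -- dict mapping (from, to) to (metric, available_kbps), unidirectional
    """
    adjacency = {}
    for (u, v), (metric, available) in links.items():
        if available >= bandwidth_kbps:  # prune links that cannot carry the LSP
            adjacency.setdefault(u, []).append((v, metric))

    heap = [(0, source, [source])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dest:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, metric in adjacency.get(node, []):
            if neighbor not in visited:
                heapq.heappush(heap, (cost + metric, neighbor, path + [neighbor]))
    return None

# Invented topology: a short low-capacity path and a longer high-capacity one.
links = {
    ("A", "B"): (10, 50), ("B", "D"): (10, 50),
    ("A", "C"): (15, 200), ("C", "D"): (15, 200),
}
zero_bw = cspf(links, "A", "D", 0)      # follows the IGP shortest path
large = cspf(links, "A", "D", 100)      # forced onto the longer path
too_big = cspf(links, "A", "D", 500)    # no feasible path
```

A zero-bandwidth request follows the IGP shortest path A-B-D, a 100 kbps request is pushed onto A-C-D, and a 500 kbps request finds no path at all.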
MPLS Traffic Engineering in North America
As mentioned, the motivation for deploying MPLS TE in North America was exclusively for the fast restoration capability upon link failure. Because MPLS TE was deployed for Fast Reroute only in this region, Globenet had the option of choosing between using one-hop tunnels (as described in Chapter 3) and using a full mesh of unconstrained TE LSPs. The latter option was chosen. This decision was motivated by the following:
Considering a nonnegligible traffic growth in the North America region, Globenet did not want to preclude the use of MPLS for bandwidth optimization in case it needed to deal with network resource optimization in the future. Thus, in this case, the use of a full mesh of unconstrained TE LSPs offered an easy migration path.
A full mesh of TE LSPs allowed Globenet to easily gather traffic matrix information among its POPs.
Consequently, the chosen TE design in North America consisted of deploying a full mesh of unconstrained TE LSPs between Globenet's set of P routers (30 in total because each of the 15 POPs had two P routers). Auto-mesh was used for the provisioning of those TE LSPs with the TE template shown in Example 5-21.
Example 5-21. TE Template Used for Mesh Group 100 in North America
These TE LSPs systematically follow the IS-IS shortest path because no constraints are applied during the TE LSP path computation. IS-IS then dynamically routes all the traffic, independent of the CoS, onto these TE LSPs by means of the autoroute announce command.
Router(config)#mpls traffic-eng auto-tunnel mesh
Router(config)#router isis
Router(config-router)#mpls traffic-eng mesh-group 100 loopback0
!Create a template interface:
Router(config)#interface auto-template 1
!Specifies a mesh-group that a template interface uses to
!signal tunnels for all mesh-group members
Router(config-if)#tunnel destination mesh-group 100
!Create the interface template for mesh-group 100
Router(config)#interface auto-template 1
Router(config-if)#ip unnumbered loopback 0
Router(config-if)#tunnel mode mpls traffic-eng
Router(config-if)#tunnel mpls traffic-eng autoroute announce
Router(config-if)#tunnel mpls traffic-eng priority 7 7
Router(config-if)#tunnel mpls traffic-eng bandwidth 0
Router(config-if)#tunnel mpls traffic-eng path-option 1 dynamic
!
MPLS Traffic Engineering in the AsiaPac, EMEA, and South America Regions
In the AsiaPac, EMEA, and South America regions, MPLS traffic engineering is used not only for fast protection but also for bandwidth optimization. As explained in the "Quality of Service Design" section, Globenet decided to deploy MPLS DiffServ-aware Traffic Engineering (DS-TE) using the Russian Dolls Model (RDM) and two class types (CTs):
CT1 for the EF traffic (VPN Voice, ATM pseudowire, and IP virtual leased line). BC1 is set to 30 percent of the actual link speed. This limits the proportion of EF traffic so as to bound the delay, jitter, and loss of the EF queue.
CT0 for the rest of the traffic (VPN Video, VPN Business Latency, VPN Business Throughput, VPN Standard, and Internet). For each link, BC0 equals the link bandwidth (plus overbooking on higher-speed links, as discussed earlier and as summarized in Table 5-6).
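The two nested constraints of the Russian Dolls Model translate into a simple per-link admission test: a CT1 reservation must fit within BC1 and within BC0, whereas a CT0 reservation only has to fit within BC0, which it shares with CT1. The sketch below ignores preemption priorities and is only meant to show how the two constraints interact; the function name and the reservation figures are illustrative.

```python
def rdm_admits(bc0, bc1, reserved_ct0, reserved_ct1, request_kbps, class_type):
    """Russian Dolls admission test for one link (preemption ignored)."""
    aggregate = reserved_ct0 + reserved_ct1 + request_kbps
    if class_type == 1:
        # EF tunnels: limited by BC1 and by the shared constraint BC0.
        return reserved_ct1 + request_kbps <= bc1 and aggregate <= bc0
    if class_type == 0:
        # Non-EF tunnels: limited together with the EF tunnels by BC0.
        return aggregate <= bc0
    raise ValueError("only CT0 and CT1 are used in this design")

# OC-3 link: BC0 = 1.05 * 155,000 kbps, BC1 = 0.3 * 155,000 kbps.
BC0, BC1 = 162_750, 46_500
ct0_in_use, ct1_in_use = 100_000, 40_000   # invented current reservations

ef_small = rdm_admits(BC0, BC1, ct0_in_use, ct1_in_use, 5_000, 1)
ef_large = rdm_admits(BC0, BC1, ct0_in_use, ct1_in_use, 10_000, 1)
non_ef_small = rdm_admits(BC0, BC1, ct0_in_use, ct1_in_use, 20_000, 0)
non_ef_large = rdm_admits(BC0, BC1, ct0_in_use, ct1_in_use, 30_000, 0)
```

Here the 10,000 kbps EF request is rejected because it would push CT1 past BC1 even though the link as a whole has room, while the 30,000 kbps non-EF request is rejected because the aggregate would exceed BC0.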
As discussed in the "Quality of Service Design" section, Globenet enforces isolation between class types by using different preemption values for the two class types. TE LSPs belonging to CT1 (EF TE LSPs) have a higher preemption priority (value 3 or 4), whereas the TE LSPs belonging to CT0 (non-EF TE LSPs) have a lower preemption priority (value 7).
Two full meshes between the P routers of each region are provisioned by means of auto-mesh TE: one for the EF traffic and the other for the rest of the traffic. The corresponding TE templates are provided in Examples 5-22 and 5-23.
Example 5-22. TE Template Used for the Mesh of EF TE LSPs in AsiaPac, EMEA, and South America (Mesh Groups 201, 301, and 401). Example for Mesh Group 201.
Router(config)#mpls traffic-eng auto-tunnel mesh
Router(config)#router isis
Router(config-router)#mpls traffic-eng mesh-group 201 loopback0
Router(config-router)#exit
!Create a template interface:
Router(config)#interface auto-template 2
!Specifies a mesh-group that a template interface uses to
!signal tunnels for all mesh-group members
Router(config-if)#tunnel destination mesh-group 201
Router(config-if)#exit
!Create the interface template for mesh-group 201
Router(config)#interface auto-template 2
Router(config-if)#ip unnumbered loopback 0
Router(config-if)#tunnel mode mpls traffic-eng
Router(config-if)#tunnel mpls traffic-eng autoroute announce
Router(config-if)#tunnel mpls traffic-eng priority 4 4
Router(config-if)#tunnel mpls traffic-eng bandwidth sub-pool 10
Router(config-if)#tunnel mpls traffic-eng auto-bw
Router(config-if)#tunnel mpls traffic-eng path-option 1 dynamic
!Configure CBTS on the tunnel so that it carries traffic marked with EXP=5
Router(config-if)#tunnel mpls traffic-eng exp 5
!
!
Example 5-23. TE Template Used for the Mesh of Non-EF TE LSPs in AsiaPac, EMEA, and South America (Mesh Groups 200, 300, and 400). Example for Mesh Group 200.
TE LSPs Between PE-PSTN1 Routers" section in Chapter 4). Globenet uses one of the 32 bits associated with every link to indicate whether a link is a satellite link. Then, an additional command is included in the mesh group for EF tunnels in South America to ensure that the path computation for EF tunnels excludes all the links advertised with the "satellite" bit.
To dynamically route packets to the right tunnel from the EF tunnel mesh or the non-EF tunnel mesh, depending on the packet CoS, Globenet uses the CoS-Based Tunnel Selection (CBTS) feature. For a given TE tunnel, CBTS allows the operator to configure, on that tunnel's headend, which EXP bit values that tunnel is meant to transport. Then, on the headend router, autoroute takes that information into account to perform CoS-aware routing over the set of TE tunnels. In turn, this results in autoroute's populating extended forwarding tables operating on the basis of <prefix, CoS> pairs (instead of simply on the basis of <prefix>, as per usual routing and forwarding). A <prefix, CoS> entry points to a given tunnel when
That tunnel is the best path toward that destination prefix (according to the usual autoroute routing algorithm) and
That tunnel is configured to carry that CoS (more specifically, the EXP value for that CoS)
Router(config)#mpls traffic-eng auto-tunnel mesh
Router(config)#router isis
Router(config-router)#mpls traffic-eng mesh-group 200 loopback0
!Create a template interface:
Router(config)#interface auto-template 1
!Specifies a mesh-group that a template interface uses to
!signal tunnels for all mesh-group members
Router(config-if)#tunnel destination mesh-group 200
!Create the interface template for mesh-group 200
Router(config)#interface auto-template 1
Router(config-if)#ip unnumbered loopback 0
Router(config-if)#tunnel mode mpls traffic-eng
Router(config-if)#tunnel mpls traffic-eng autoroute announce
Router(config-if)#tunnel mpls traffic-eng priority 7 7
!Reserve bandwidth from the global pool (used when sub-pool is not specified)
Router(config-if)#tunnel mpls traffic-eng bandwidth 10
Router(config-if)#tunnel mpls traffic-eng auto-bw
Router(config-if)#tunnel mpls traffic-eng path-option 1 dynamic
!Configure CBTS on the tunnel so that it carries traffic marked
!with all EXP values which are not carried by other TE tunnels to
!the same destination
Router(config-if)#tunnel mpls traffic-eng exp default
!
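The effect of the two `exp` settings in the EF and non-EF templates can be illustrated with a toy model of the <prefix, CoS> forwarding entries that CBTS produces. The sketch assumes that every listed tunnel is already the best autoroute path toward its destination, so only the EXP selection is modeled; tunnel and router names are hypothetical.

```python
def build_cos_fib(tunnels):
    """Map (destination, EXP) to the tunnel that carries it.

    tunnels -- list of dicts with keys "name", "dest", and "exp", where "exp"
               is either a set of EXP values or the string "default".
    """
    by_dest = {}
    for tunnel in tunnels:
        by_dest.setdefault(tunnel["dest"], []).append(tunnel)

    fib = {}
    for dest, group in by_dest.items():
        explicit, default = {}, None
        for tunnel in group:
            if tunnel["exp"] == "default":
                default = tunnel["name"]
            else:
                for exp in tunnel["exp"]:
                    explicit[exp] = tunnel["name"]
        for exp in range(8):  # the EXP field is 3 bits wide
            chosen = explicit.get(exp, default)
            if chosen is not None:
                fib[(dest, exp)] = chosen
    return fib

tunnels = [
    {"name": "EF-to-P2", "dest": "P2", "exp": {5}},
    {"name": "nonEF-to-P2", "dest": "P2", "exp": "default"},
]
fib = build_cos_fib(tunnels)
```

In this model, packets toward P2 marked EXP 5 use the EF tunnel, and packets with any other EXP value fall back to the tunnel configured with `exp default`.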
Operation of CBTS in Globenet for dynamic routing of traffic over the EF TE LSP mesh and the non-EF TE LSP mesh is illustrated in Figure 5-42.
Figure 5-42. CBTS Operations Over the EF Tunnel and Non-EF Tunnel Meshes

Reoptimization of TE LSPs
To determine the most appropriate reoptimization strategy, Globenet decided to conduct a study on the placement of the TE LSPs on a per-region basis. By its nature, a distributed CSPF computation leads to some degree of unpredictability in terms of TE LSP placement because of the unsynchronized TE LSP path computation by each headend router. That said, several simulation runs were performed based on predictive bandwidth requirements analysis during peak hours to estimate the proportion of TE LSPs that would not follow their IGP shortest paths. As expected, the simulations showed that this ratio was a function of the network utilization but would stay below 5 percent during steady state. This number would be multiplied by a factor of 1.5 during single-element failure (that is, it would stay below 7.5 percent). Furthermore, in an international network such as Globenet's, propagation delays may not be negligible and may have a significant impact on SLAs. Because the network density is not very high in some regions, a TE LSP that does not follow the current shortest path available (satisfying its constraints such as bandwidth) is highly undesirable.
Note: Of course, the ratio is equal to 0 in the case of North America, because all the TE LSPs have 0 reserved bandwidth and thus follow the shortest IS-IS path.
In light of these simulation results and considering the network topology and committed SLAs, Globenet decided not to delay any reoptimization should a more optimal path become available (such as following the restoration of a link). Consequently, each router has been configured to perform both timer-based and event-based reoptimization. The reoptimization timer has been set to 5 minutes. A reoptimization evaluation is triggered upon any link restoration in the network (signaled by IS-IS).
Because unstable links are not uncommon in some regions of the world, event-driven reoptimization must be used in conjunction with a dampening mechanism. The aim of such a dampening mechanism is to ensure that an unstable link does not frequently trigger the origination of new IS-IS LSPs. In addition to triggering successive IS-IS shortest path first (SPF) recomputations, this would also trigger multiple MPLS TE reoptimization actions (CSPF computation and TE LSP signaling in the network). Moreover, a too-aggressive reoptimization strategy in the absence of a dampening mechanism would lead to successive traffic disruptions. Consider the case of a link experiencing a state change once every second (a situation that could happen with an unstable DWDM laser, for instance). If a new IS-IS LSP is originated upon every single link state change without dampening, every router in the network would trigger a reoptimization evaluation of all its TE LSPs. Potentially, a nonnegligible number of TE LSPs would be rerouted onto the restored link (especially if the link in question is a high-speed link). One second later, all those TE LSPs would be rerouted onto their respective backup tunnels by MPLS TE Fast Reroute, followed by a reoptimization, and so on.
This shows the importance of using a dampening mechanism in conjunction with any event-driven reoptimization strategy. On a Cisco router, such a dampening mechanism could be configured at either the link or the IS-IS level. Globenet decided to use the IS-IS LSP origination dampening feature, which is discussed later in this chapter. Figure 5-43 shows the IS-IS LSP origination frequency when using an LSP origination dampening mechanism. Note that this frequency corresponds to the reoptimization evaluation frequency for each router in the network because a TE LSP reoptimization is triggered when a new IS-IS LSP reporting a link-up event is received.
Figure 5-43. Reoptimization Evaluation Frequency Upon a Flapping Link with IS-IS Dampening
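To give a feel for how dampening reduces the number of LSP originations (and therefore of reoptimization evaluations) on a flapping link, the sketch below models a generic exponential hold-down. It is an assumption made for illustration and not the actual algorithm or timer values of the IS-IS LSP origination dampening feature: the first change is advertised immediately, each further origination during a period of instability doubles the hold-down up to a maximum, and the hold-down resets after a quiet period.

```python
def dampened_originations(event_times, initial=1.0, maximum=8.0, quiet_reset=16.0):
    """Return the times at which a new LSP is originated (all times in seconds).

    event_times -- times of link state changes
    initial     -- hold-down applied after an origination following a quiet period
    maximum     -- upper bound of the hold-down
    quiet_reset -- gap between originations after which the hold-down resets
    """
    originations = []
    hold = initial
    next_allowed = float("-inf")
    for t in sorted(event_times):
        if originations and t <= originations[-1]:
            continue  # covered by an origination already scheduled at or after t
        fire = max(t, next_allowed)  # wait for the hold-down to expire if needed
        quiet = not originations or fire - originations[-1] >= quiet_reset
        hold = initial if quiet else min(hold * 2, maximum)
        originations.append(fire)
        next_allowed = fire + hold
    return originations

# A link changing state once per second for ten seconds, then once much later.
flapping = dampened_originations(range(10))
with_late_event = dampened_originations(list(range(10)) + [100])
```

Under these assumed timers, ten state changes produce five originations instead of ten, spaced further and further apart, and an isolated change long after the instability is advertised without delay.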

Traffic Engineering Scaling Aspects
Scaling aspects are always important criteria to consider when deploying any technology. As far as MPLS TE is concerned, the main parameter of interest is the total number of TE LSPs per headend and midpoint router, which affects the routers' memory consumption. (Note that the signaling processing overhead is usually negligible, especially with refresh reduction, which is what Globenet elected to use.) Therefore, Globenet estimated the total number of TE LSPs per midpoint and headend router on a per-region basis both in steady state and under network element failure conditions.
Because the total number of routers in a single mesh never exceeds 50, the maximum number of TE LSPs to manage is about 100 for a headend (which is negligible). Globenet estimated that the total number of TE LSPs per midpoint would be in the worst case no greater than 1000, even under failure conditions. Both numbers are far from reaching the scalability limit on most modern router platforms, including the routers deployed by Globenet.
Thus, Globenet felt that the TE design did not pose any scalability issues.
Use of Refresh Reduction
Because of the low degree of connectivity in some regions, the number of TE LSPs per midpoint could potentially be nonnegligible (although quite far from raising any concern). Hence, Globenet chose to activate refresh reduction in its network to reduce the signaling refresh overhead. Moreover, because some links may not be very reliable with a nonnegligible bit error rate, the reliable messaging feature delivered with refresh reduction (see [REFRESH-REDUCTION]) was also beneficial.
Monitoring TE LSPs
Globenet decided to use dynamic and automatic mechanisms whenever possible to optimize the bandwidth usage (with auto-bandwidth) and minimize the risk of configuration errors (with auto-mesh). That said, such mechanisms must be appropriately tuned. Therefore, a network monitoring tool is required to adequately adjust the related parameters.Globenet decided to gather various sets of monitoring data:Network element failures In addition to the SNMP generic traps originated upon link and node failures, Globenet decided to produce scripts collecting the MPLS TE SNMP traps sent when a TE LSP reroute is performed because of a failure or reoptimization (because a better path has been discovered). Note that some scripts perform event correlation to determine the root cause of each failure (link versus node). For the case of an inter-AS TE LSP, the set of data exclusively relies on SNMP variables (or on any variable available via show commands) collected on a regular basis on each router originating an inter-AS TE LSP. Such information is indeed very useful to monitor the network availability for the VPOPs with respect to Globenet's availability SLA with Africa Telecom. (The design of Globenet VPOPs is discussed in detail later in this section.)TE LSP resizing As discussed in the section "Dynamic Traffic Engineering LSP Bandwidth Adjustment," parameters must be adjusted to determine the sampling and resizing frequencies. A compromise must be found between optimal bandwidth usage provided by an accurate bandwidth reservation and network stability with reasonable MPLS TE signaling overhead and traffic shift in the network. Because gathering the relevant data for all the TE LSPs and links would be nontractable, Globenet decided to select for each region a set of representative TE LSPs and links to monitor.
For each selected TE LSP, the following set of variables is collected:
Counters such as the number of transmitted bytes over the TE tunnel interface, collected every 2 minutes
Bandwidth samples computed by auto-bandwidth
How often a TE LSP is resignaled
The number of TE LSPs that do not follow the shortest IS-IS path
This set of data helps Globenet determine whether the bandwidth sampling frequency is appropriate (compared to the actual traffic pattern). Then, the ratio of reserved bandwidth to actual traffic is computed to observe the accuracy of the bandwidth reservation, which is determined by the resizing frequency.
For each selected link, the ratio between the reserved bandwidth on the link and the aggregated traffic is computed (counters are collected every 10 minutes). This last parameter, in conjunction with the number of TE LSP resizes and the proportion of TE LSPs that follow the IS-IS shortest path, is used to determine whether the TE LSP resizing frequency should be adjusted. For instance, suppose that the ratio between reserved bandwidth and actual traffic is 1.5. Suppose also that 60 percent of the TE LSPs do not follow the IS-IS shortest path and that each TE LSP is on average resized once a day. This would indicate that the sampling and resizing frequencies could be increased without compromising the network stability, which would allow for more optimal bandwidth usage. Furthermore, a higher proportion of the TE LSPs would likely follow the IS-IS shortest path, thus resulting in shorter propagation delays. Conversely, if the reserved bandwidth/actual traffic ratio is 1.1, only 5 percent of the TE LSPs do not follow the IS-IS shortest path, and the number of TE LSP resizes is too high, the sampling and resizing frequencies should probably be decreased.
Note that examples of these graphs were provided in the section "Dynamic Traffic Engineering LSP Bandwidth Adjustment."
It is worth reemphasizing that such analysis is conducted on a small set of representative TE LSPs and links in each region because gathering and analyzing the data just discussed requires time and network resources.
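The tuning reasoning above can be illustrated with a minimal Python sketch. It is not Globenet's tooling: the function names, the assumption of a 64-bit byte counter, and every threshold (including the "too many resizes" limit) are hypothetical, with the ratio and percentage values simply taken from the worked example in the text. The "decrease" branch keys only on a tight ratio combined with a high resize rate.

```python
from dataclasses import dataclass

COUNTER64_MODULUS = 2 ** 64  # assumption: byte counters are 64 bits wide


def average_rate_mbps(prev_bytes, curr_bytes, interval_seconds=120):
    """Average rate between two byte-counter readings (2-minute polling by default)."""
    delta = (curr_bytes - prev_bytes) % COUNTER64_MODULUS  # tolerates one counter wrap
    return delta * 8 / interval_seconds / 1_000_000


@dataclass
class RegionSample:
    reserved_to_actual_ratio: float    # reserved bandwidth / measured traffic
    off_shortest_path_fraction: float  # share of TE LSPs not on the IS-IS shortest path
    resizes_per_lsp_per_day: float     # average number of resizes per TE LSP per day


# Hypothetical thresholds; the first three come from the example in the text,
# the last one is an arbitrary placeholder for "too many resizes".
LOOSE_RATIO = 1.5
TIGHT_RATIO = 1.1
HIGH_OFF_PATH = 0.60
MAX_RESIZES_PER_DAY = 12.0


def recommend(sample):
    """Return 'increase', 'decrease', or 'keep' for the sampling/resizing frequencies."""
    if (sample.reserved_to_actual_ratio >= LOOSE_RATIO
            and sample.off_shortest_path_fraction >= HIGH_OFF_PATH
            and sample.resizes_per_lsp_per_day <= 1.0):
        # Reservations lag the traffic and signaling is quiet: resize more often.
        return "increase"
    if (sample.reserved_to_actual_ratio <= TIGHT_RATIO
            and sample.resizes_per_lsp_per_day > MAX_RESIZES_PER_DAY):
        # Reservations are already accurate but signaling churn is high.
        return "decrease"
    return "keep"
```

In practice the thresholds would be set per region from the collected graphs rather than fixed constants; the sketch only shows how the three indicators combine into one decision.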
Last-Resort Unconstrained Option
Globenet's network has been dimensioned to survive any single failure without significant QoS degradation. In other words, upon any single-element failure in the network, such as a link or router failure, simulations have been done to ensure that any TE LSP can find an alternate path with the same bandwidth requirements. That said, a safe approach to cope with any unexpected multiple failures is to configure each TE LSP with a last-resort option in which the bandwidth constraint is relaxed. This guarantees that any TE LSP will get a path in any condition should the connectivity be preserved between the source and destination. In this case the TE LSPs simply follow the shortest IGP path.
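The last-resort behavior can be pictured with a small Python sketch of constraint-based path computation. It is an illustration only, not router code: the function names and the link table layout are hypothetical, and a real implementation would also account for affinities, preemption priorities, and per-class-type bandwidth pools. The first attempt prunes links that lack the requested bandwidth; if no path survives, the computation is retried with a bandwidth of zero, which yields the plain IGP shortest path.

```python
import heapq


def cspf(links, src, dst, bandwidth):
    """Shortest path by IGP metric over links with enough unreserved bandwidth.

    links maps a unidirectional (from_node, to_node) pair to a tuple of
    (igp_metric, unreserved_bandwidth). Returns a list of nodes, or None.
    """
    adjacency = {}
    for (a, b), (metric, unreserved) in links.items():
        if unreserved >= bandwidth:  # prune links that cannot carry the LSP
            adjacency.setdefault(a, []).append((b, metric))

    best = {src: 0}
    heap = [(0, src, [src])]
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return path
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, metric in adjacency.get(node, []):
            new_cost = cost + metric
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor, path + [neighbor]))
    return None


def place_lsp(links, src, dst, bandwidth):
    """Try the constrained path first, then the last-resort unconstrained one."""
    path = cspf(links, src, dst, bandwidth)
    if path is not None:
        return path, bandwidth
    # Last resort: relax the bandwidth constraint and follow the shortest IGP path.
    return cspf(links, src, dst, 0), 0
```

With a hypothetical three-node topology in which the low-metric path A-B-C has little unreserved bandwidth and the direct A-C link has more, a moderate request is placed on A-C, while a request larger than any link can carry falls back to A-B-C with zero reserved bandwidth. A path is returned as long as the source and destination remain connected.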
TE Design for ATM Pseudowires
Globenet uses ATM pseudowires in North America and EMEA to interconnect its ATM switches. It paid specific attention to the design of MPLS TE for optimum transport of this traffic.
As discussed, Globenet's network in North America relies on overprovisioning and does not require the deployment of call admission control techniques (by means of MPLS Traffic Engineering) in addition to QoS. Indeed, capacity has been provisioned (and, where necessary, the IS-IS metrics have been fine-tuned) to always limit the proportion of EF traffic (including the pseudowire traffic) below a desired ratio. Hence, in North America, the pseudowire traffic does not need any specific handling as far as MPLS TE is concerned. The pseudowire traffic is routed along the IGP path from the PE router to the next P router. Then it gets forwarded to the unconstrained TE LSP going toward the right egress P router along with the rest of the traffic.
However, the case of EMEA is vastly different, so MPLS DS-TE has been deployed to ensure QoS guarantees and network resource optimization. One option would have been for IGP to route the pseudowire traffic between P router and PE router and to carry such traffic onto the DS-TE LSP for EF traffic, in the exact same way as the rest of the EF traffic is routed. The downside of such an option would have been the constraint of using shared engineering rules for the pseudowire traffic as well as for the rest of the EF traffic. More specifically:
The pseudowire traffic would have been routed on the same TE LSPs as the rest of the EF traffic.
The bandwidth computed for the tunnel carrying the pseudowire traffic would have been dynamically adjusted by auto-bandwidth and based on the aggregate load measured across all the EF traffic.
This was not considered entirely satisfactory by the engineering team that is in charge of the ATM network and makes use of the ATM pseudowire service. They decided that they would rather have dedicated TE LSPs whose bandwidth is computed based on their own bandwidth requirement and constantly reserved through the core.
Consequently, Globenet decided to carry the pseudowire traffic between PE routers onto dedicated TE LSPs. The bandwidth of these TE LSPs is determined based on the known port utilization between ATM switches. For instance, if two ATM switches used to be interconnected by means of a leased line of capacity B Mbps, whose peak load was estimated at X percent, the corresponding TE LSP would be signaled with a bandwidth of
X percent * B * some margin considering the traffic growth
Furthermore, such a scheme allows for the configuration of a higher preemption priority for the TE tunnels carrying pseudowire traffic than for the DS-TE LSPs carrying the rest of the EF traffic.
Therefore, EMEA actually has three types of TE LSPs, as shown in Figure 5-44:
Figure 5-44. Three Types of TE LSPs in EMEA

Pseudowire QoS Design for ATM Trunking" section.
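The sizing rule given earlier for the dedicated pseudowire TE LSPs (X percent * B * growth margin) can be written as a short Python sketch. The function and parameter names are hypothetical, and the margin value is an engineering input that the text does not quantify.

```python
def pseudowire_lsp_bandwidth_mbps(line_capacity_mbps, peak_load_percent, growth_margin):
    """Bandwidth to signal for a dedicated pseudowire TE LSP.

    line_capacity_mbps: capacity B of the leased line being replaced
    peak_load_percent:  estimated peak load X on that line, from 0 to 100
    growth_margin:      multiplier of at least 1.0 allowing for traffic growth
                        (hypothetical input; no value is given in the text)
    """
    if line_capacity_mbps <= 0:
        raise ValueError("line capacity must be positive")
    if not 0 <= peak_load_percent <= 100:
        raise ValueError("peak load must be a percentage between 0 and 100")
    if growth_margin < 1.0:
        raise ValueError("growth margin is expected to be at least 1.0")
    return line_capacity_mbps * (peak_load_percent / 100.0) * growth_margin
```

For example, a 100-Mbps leased line with an estimated 40 percent peak load and an assumed margin of 1.2 would lead to a TE LSP signaled with roughly 48 Mbps.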