Definitive MPLS Network Designs

MPLS Traffic Engineering Design

As discussed in the "Quality of Service Design" section, to bound the delay, jitter, and loss to the levels required by telephony transit traffic, TK wanted to strictly enforce that the load of telephony traffic always be kept below 40 percent on any link and under any circumstances (including failure). Consequently, TK decided to deploy MPLS TE so that PSTN voice traffic could be constraint-based-routed across the MPC network and be subject to a call admission control limit of 40 percent on any link.

MPLS TE is deployed to carry only PSTN traffic. Therefore, all other traffic (such as Internet, Layer 3 MPLS VPN, and so forth) is label-switched across the MPC using the labels allocated by the LDP process and consequently follows the OSPF shortest path.

A full mesh of TE LSPs is set up between all the PE-PSTN routers (which connect the VoIP gateways, as illustrated in Figures 4-3 and 4-4). There are two TE LSPs between any two PE-PSTN routers residing in Level 1 POPs. There is a single TE LSP between any two PE-PSTN routers when at least one of them resides in a Level 2 POP (detailed reasoning for this design is provided later).

To differentiate between a Level 1 and Level 2 PE-PSTN, a naming convention for the routers was chosen in which the router's name begins with PE-PSTN1 for Level 1 and PE-PSTN2 for Level 2.

Setting the Maximum Reservable Bandwidth on Each MPC Link

To enable the TE design TK chose, each link in the MPC needed to be configured with a maximum reservable bandwidth value. This value indicates how much of the link bandwidth may be reserved for traffic engineering purposes. It can be configured to any value, regardless of the actual link speed. For example, an STM-1 link with 155 Mbps of total bandwidth may be configured with 310 Mbps of maximum reservable bandwidth. The router may then signal TE LSPs for up to 310 Mbps, which provides a bandwidth overbooking factor of 2. Conversely, the operator may choose to advertise a smaller value than the actual link speed to limit the amount of traffic carried on the link. This is the design elected by TK. Each link is configured with a maximum reservable bandwidth equal to 40 percent of the link speed. This guarantees that the total bandwidth of the TE LSPs established through a link for PSTN traffic never exceeds 40 percent of that link's bandwidth. For instance, an OC-192 link between two Level 1 POPs is configured with a reservable bandwidth equal to 0.4 * 10 Gbps = 4 Gbps. This configuration is shown in Example 4-7. It is used as a template for all OC-192 interfaces. (Similar templates exist for all the different link speeds in the MPC.)

Example 4-7. OC-192 Configuration Template


interface pos3/0
 ! Maximum reservable bandwidth in kbps: 4000000 kbps = 4 Gbps (40 percent of 10 Gbps)
 ip rsvp bandwidth 4000000
!
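The 40 percent rule behind this template can be sketched in a few lines of Python. This is only an illustration: the function names and the table of link speeds are assumptions made for this example, and the speeds are the rounded nominal values used in the text (155 Mbps for STM-1, 10 Gbps for OC-192), not exact line rates.

```python
# Nominal link speeds in kbps, rounded as in the text (assumed values for
# illustration; they are not exact SONET/SDH line rates).
NOMINAL_LINK_SPEED_KBPS = {
    "STM-1": 155_000,
    "OC-192": 10_000_000,
}


def max_reservable_kbps(link_speed_kbps: int, percent: int = 40) -> int:
    """Return the maximum reservable bandwidth, in kbps, for a link.

    Integer arithmetic is used so that the result is exact.
    """
    return link_speed_kbps * percent // 100


def rsvp_bandwidth_line(link_type: str, percent: int = 40) -> str:
    """Render the 'ip rsvp bandwidth' line of a template such as Example 4-7."""
    value = max_reservable_kbps(NOMINAL_LINK_SPEED_KBPS[link_type], percent)
    return f"ip rsvp bandwidth {value}"


# TK's rule: 40 percent of the link speed.
print(rsvp_bandwidth_line("OC-192"))
# The overbooking example from the text: 200 percent of an STM-1.
print(rsvp_bandwidth_line("STM-1", percent=200))
```

A percentage above 100 models overbooking, and a percentage below 100 models the conservative setting that TK elected.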

TE LSPs Bandwidth

One of the most challenging aspects of any MPLS TE design is obtaining a traffic matrix to appropriately configure the bandwidth of the TE LSPs. That said, in the case of the PSTN network, TK had very good knowledge of the existing public voice traffic matrix, which it acquired by means of various monitoring tools available on its telephony network during the past two decades. Because of this, several dimensioning rules have been applied to determine the initial size of the TE LSPs.

For inter-POP traffic, the traffic peak is multiplied by a factor of 0.9 to take into account the fact that the peaks do not occur simultaneously between each POP. Such dimensioning is considered conservative. TK observed that during the less-active periods the traffic could be as little as one-sixth to one-tenth of the peak and that each peak period rarely exceeded a few hours every day. Hence, the TE LSPs are sized based on 90 percent of the busiest hours.

Furthermore, the voice traffic during the weekends is generally significantly less than during weekday hours. Thus, during the weekend the observed PSTN traffic load is significantly less than the reserved bandwidth.

Although the PSTN voice traffic is relatively stable, the mobile voice traffic increases at a nonnegligible rate. The required bandwidth for the PSTN traffic can easily be derived from the number of calls that can be accepted by the VoIP gateways and by applying the inter-PSTN-POP traffic dimensioning rule just specified. However, the IP traffic generated by the mobile voice traffic must also be considered. Thus, TK decided to resize each TE LSP bandwidth once every two months. For each TE LSP, an external script collects the related SNMP data (number of bytes transmitted on each TE LSP) every hour. This allows for the collection of a very accurate traffic matrix and tracking of the traffic growth. Once every two months, each TE LSP is resized up if the observed peak value exceeds the configured bandwidth value by 5 percent for more than 5 percent of the samples. Similarly, each TE LSP is resized down if the observed peak value is 90 percent or less of the configured bandwidth for more than 95 percent of the samples.
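The resizing rule can be expressed as a small decision function. The sketch below is one possible reading of the rule, not TK's actual script: it assumes that "exceeds by 5 percent" means a sample above 105 percent of the configured bandwidth, and the function name and units are hypothetical. The text does not say how the new bandwidth is chosen, so the function returns only the direction of the change.

```python
def resize_decision(samples_kbps, configured_kbps):
    """Return 'up', 'down', or 'keep' for one TE LSP.

    samples_kbps: hourly load samples collected over the two-month period.
    configured_kbps: bandwidth currently configured on the TE LSP.
    Integer comparisons avoid floating-point rounding at the thresholds.
    """
    n = len(samples_kbps)
    if n == 0:
        return "keep"  # no data, no change (assumption)

    # Samples above 105 percent of the configured bandwidth.
    over = sum(1 for s in samples_kbps if s * 100 > configured_kbps * 105)
    # Samples at or below 90 percent of the configured bandwidth.
    low = sum(1 for s in samples_kbps if s * 100 <= configured_kbps * 90)

    if over * 100 > n * 5:    # more than 5 percent of the samples
        return "up"
    if low * 100 > n * 95:    # more than 95 percent of the samples
        return "down"
    return "keep"


# Hypothetical sample sets for a TE LSP configured at 1,000,000 kbps.
print(resize_decision([1_100_000] * 10 + [800_000] * 90, 1_000_000))
print(resize_decision([500_000] * 100, 1_000_000))
print(resize_decision([950_000] * 100, 1_000_000))
```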

Path Computation

A dynamic CSPF algorithm is used to compute the shortest path for each TE LSP satisfying its constraints. (This is limited to the bandwidth constraint, except for TE LSPs between PE-PSTN1 routers where both the bandwidth and the affinity constraints must be satisfied, as discussed later.) Note that because the MPC network contains a limited number of TE nodes, the CSPF computation time is negligible (on the order of a few milliseconds). The choice to run CSPF on the TE LSP headends was made (as opposed to an offline path computation approach) for its ability to cope more rapidly with network element failures.
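The principle of CSPF with a bandwidth constraint can be illustrated as follows: links that cannot accommodate the requested bandwidth are pruned, and a shortest-path computation then runs on what remains. The code below is a simplified sketch, not a router implementation. It treats links as bidirectional with a single available-bandwidth value, ignores affinities and tie-breaking rules, and uses a hypothetical four-node topology.

```python
import heapq


def cspf(links, source, destination, bandwidth):
    """Return (cost, path) for the shortest feasible path, or None.

    links: iterable of (node_a, node_b, igp_metric, available_bandwidth).
    """
    # Step 1: prune the links that do not satisfy the bandwidth constraint.
    graph = {}
    for a, b, metric, available in links:
        if available >= bandwidth:
            graph.setdefault(a, []).append((b, metric))
            graph.setdefault(b, []).append((a, metric))

    # Step 2: run Dijkstra's algorithm on the pruned topology.
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == destination:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, metric in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + metric, neighbor, path + [neighbor]))
    return None


# Hypothetical topology: the short path A-B-D has little bandwidth left.
LINKS = [
    ("A", "B", 1, 1000),
    ("B", "D", 1, 1000),
    ("A", "C", 2, 4000),
    ("C", "D", 2, 4000),
]

print(cspf(LINKS, "A", "D", 500))    # fits on the IGP shortest path
print(cspf(LINKS, "A", "D", 3000))   # must take the longer path
print(cspf(LINKS, "A", "D", 5000))   # no feasible path
```

The third call shows the case the following sections are concerned with: a TE LSP that is too large for every available path.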

TE LSPs Between PE-PSTN1 Routers

The voice traffic between major cities in Kingland is significantly higher than between smaller cities. Therefore, TK decided to adopt a slightly different design for the TE LSPs between the PE-PSTN1 routers in Level 1 POPs than for the TE LSPs between PE-PSTN1 routers and PE-PSTN2 routers in Level 2 POPs. Because the TE LSPs between PE-PSTN1 routers are larger than the other TE LSPs, the design involves splitting the traffic over two TE LSPs.

The rationale behind this is that as the ratio between LSP size and link maximum reservable bandwidth increases, the likelihood of not being able to find a path satisfying the bandwidth constraint also increases, especially in failure scenarios. Hence, to minimize that risk, TK decided to load-balance the traffic between each pair of PE-PSTN1 routers across multiple TE LSPs (two in this case). Moreover, these TE LSPs are configured with a higher preemption priority than the TE LSPs between PE-PSTN1 and PE-PSTN2 routers as well as the TE LSPs between PE-PSTN2 routers, because (even after a split) they are still significantly larger. This circumvents the well-known issue of bandwidth fragmentation that can occur when using a distributed CSPF for the TE LSP path computation. Indeed, with distributed CSPF, there is no synchronization between routers. Each router computes the paths only for the TE LSPs for which it is the headend. Consequently, in some cases, bandwidth fragmentation may occur whereby a larger TE LSP cannot be routed because of some other smaller TE LSPs that were previously routed. RSVP-TE defines a multipriority scheme in which a TE LSP of priority X can preempt a TE LSP of priority Y if X < Y (a lower number reflects a higher priority). This preemption scheme can be used to help solve the bandwidth fragmentation problem.

For the sake of illustration, consider the example shown in Figure 4-29 (where just a limited number of TE LSPs are represented for simplicity). The following characteristics can be observed:

Figure 4-29. Bandwidth Fragmentation Solved by a Multipriority Scheme

All the links are configured with a maximum reservable bandwidth of 4 Gbps (roughly 40 percent of STM-64).

A TE LSP T1 of 1.8 Gbps is established between PE-PSTN2-1 and PE-PSTN2-4. Another TE LSP T2 of 1.5 Gbps is established between PE-PSTN2-2 and PE-PSTN2-3.

The links cw1-sw1 and cw2-s1 have 2.8 Gbps and 2.9 Gbps of available bandwidth, respectively (because of other established TE LSPs across those links not represented in the figure).

Given the situation shown in Figure 4-29, no path could be found for a TE LSP of 3 Gbps between PE-PSTN1-5 and PE-PSTN1-6. In this situation the bandwidth is said to be "fragmented" because although the necessary bandwidth is available collectively across the multiple possible paths, it is not available on any one path. The solution is to displace T1 (the tunnel between PE-PSTN2-1 and PE-PSTN2-4 in Figure 4-29) to free up some bandwidth for T3 (the tunnel between PE-PSTN1-5 and PE-PSTN1-6), which could in turn be routed. Hence, in situations such as the one just described, T3 would preempt T1 and would in this case follow the path PE-PSTN1-5–cw2–c1–c2–s2–PE-PSTN1-6. After being preempted, the TE LSP T1 would in turn be rerouted onto a different path without any manual intervention.

This also illustrates why the PSTN traffic between two PE-PSTN1 routers is split onto two TE LSPs instead of one. Doing so limits their size and consequently increases the probability of finding a path for a TE LSP. (Indeed, smaller TE LSPs are less likely to provoke bandwidth fragmentation.) Because these LSPs are still significantly larger than the TE LSPs between PE-PSTN2 and the TE LSPs between PE-PSTN1 and PE-PSTN2, they are configured with a higher preemption priority to benefit from the preemption mechanism just described.

The resulting TE LSP placement is shown in Figure 4-30.

Figure 4-30. Situation After Preemption and Rerouting of a Lower-Priority TE LSP

Of course, such a multipriority scheme does not provide an absolute guarantee that bandwidth fragmentation will never occur, but it limits the risk of its occurrence.
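The preemption rule (priority X can preempt priority Y if X < Y) can be sketched for a single link as follows. This is a deliberately simplified model: it uses one priority value per TE LSP rather than the separate setup and hold priorities of RSVP-TE, it preempts the lowest-priority candidates first as an assumed policy, and the priority values 3 and 5 are hypothetical. The bandwidth figures, in Mbps, are borrowed from the example above.

```python
def can_preempt(new_priority, existing_priority):
    """A lower number reflects a higher priority."""
    return new_priority < existing_priority


def admit(max_reservable, reserved, new_lsp):
    """Try to admit new_lsp on one link.

    reserved: list of (name, bandwidth, priority) already on the link.
    new_lsp: (name, bandwidth, priority) of the TE LSP being signaled.
    Returns the list of preempted TE LSP names (possibly empty), or None
    if the new TE LSP cannot be admitted even with preemption.
    """
    _, bandwidth, priority = new_lsp
    free = max_reservable - sum(bw for _, bw, _ in reserved)

    # Candidates are the strictly lower-priority TE LSPs, lowest priority first.
    candidates = sorted(
        (lsp for lsp in reserved if can_preempt(priority, lsp[2])),
        key=lambda lsp: -lsp[2],
    )
    preempted = []
    for name, bw, _ in candidates:
        if free >= bandwidth:
            break
        free += bw
        preempted.append(name)
    return preempted if free >= bandwidth else None


# T3 (3 Gbps, higher priority) displaces T1 (1.8 Gbps, lower priority).
print(admit(4000, [("T1", 1800, 5)], ("T3", 3000, 3)))
# With equal priorities, no preemption is possible.
print(admit(4000, [("T1", 1800, 3)], ("T3", 3000, 3)))
```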

TK ran several CSPF simulations with a random TE LSP placement. These simulations showed an extremely low risk of bandwidth fragmentation, with such an approach combining the splitting of the large TE LSPs and a multipriority scheme.

Establishing two TE LSPs between a pair of PE-PSTN routers has some other interesting properties. Provided that those LSPs are diversely routed, the impact of a single failure can be limited to a smaller proportion of the traffic between two POPs and consequently two cities.

The second positive consequence is that establishing two TE LSPs can be used to achieve more even load distribution across links. In the TK design, MPLS TE ensures that no more than 40 percent of the link speed is used by the PSTN traffic on every link. In some circumstances, it is conceivable that some links carry 30 percent of the traffic whereas other links carry only 10 percent. Although such a situation meets the TK objectives, achieving more-optimal load balancing is always desirable. This can be done when traffic is split across multiple TE LSPs. The only downside of such a strategy is the increase in the number of TE LSPs in the network. In the case of TK, such an increase is perfectly acceptable because it concerns only the TE LSPs between PE-PSTN1 routers. Thus, the number of TE LSPs is increased by 12 * 11 = 132 additional LSPs.

The solution to achieve such load balancing is to apply the concept of affinities defined by MPLS TE. In a nutshell, the idea is to use a 32-bit mask to indicate up to 32 link properties and use them as input constraints to be satisfied by a TE LSP so as to achieve a particular objective. In the example of the MPC network, the design between the VoIP gateways and the P routers residing in the Level 1 POPs is highly symmetric. Each VoIP gateway is dual-attached to two PE-PSTN1 routers that are themselves dual-attached to two P routers in the Level 1 POP. Hence, the idea is to use a color scheme for the link between PE-PSTN1 and the P routers and for the link between the P routers in the Level 1 POPs. Doing so load-balances the TE LSPs between each pair of PE-PSTN1 routers. This concept is shown in Figure 4-31.

Figure 4-31. Three-Color Scheme for Load-Balancing TE LSP Between Level 1 POPs

Figure 4-31 shows that two TE LSPs (T1 and T2) are configured between PE-PSTN1-1 and PE-PSTN1-3. As just mentioned, the objective is to ensure that T1 and T2 are diversely routed when possible. Thus, three shades (light gray, medium gray, and dark gray) are used for the links between PE-PSTN and the P routers and the P routers of the same Level 1 POP. This ensures that T1 and T2 traverse a different P router to exit the source POP and to enter the destination POP. The OSPF metric of the links between the P routers has been computed such that two TE LSPs between a disjoint pair of P routers are always diversely routed end-to-end in steady state. Note that the affinity constraint is relaxed in case a PE-PSTN is incapable of finding a feasible path satisfying those constraints, which could occur in case of failure.
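The affinity check itself is a simple bitwise comparison: a link is eligible for a TE LSP when the link's attribute flags and the TE LSP's affinity value agree on every bit selected by the mask. The sketch below illustrates that comparison only. The assignment of bits to shades, the link names, and the choice of which shade each TE LSP excludes are all hypothetical; the text does not give TK's actual values.

```python
# Hypothetical bit assignment for the three shades of Figure 4-31.
LIGHT_GRAY = 0x1
MEDIUM_GRAY = 0x2
DARK_GRAY = 0x4


def link_matches(link_attributes, affinity, mask):
    """A link is eligible if it agrees with the affinity on all masked bits."""
    return (link_attributes & mask) == (affinity & mask)


def eligible_links(links, affinity, mask):
    """Return the names of the links that satisfy the affinity constraint."""
    return [name for name, attrs in links.items() if link_matches(attrs, affinity, mask)]


# Hypothetical links of one Level 1 POP; the core link carries no color.
LINKS = {
    "PE-P1": LIGHT_GRAY,
    "PE-P2": DARK_GRAY,
    "P1-P2": MEDIUM_GRAY,
    "P1-core": 0x0,
}

# Assumed constraints: T1 must avoid dark gray links, T2 must avoid light gray
# links, so the two TE LSPs leave the POP through different P routers.
print(eligible_links(LINKS, affinity=0x0, mask=DARK_GRAY))
print(eligible_links(LINKS, affinity=0x0, mask=LIGHT_GRAY))
```

Relaxing the affinity constraint after a failure corresponds to using a mask of 0, which makes every link eligible.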

TE LSPs Between PE-PSTN1 and PE-PSTN2 Routers or Between PE-PSTN2 Routers

The design of the TE LSPs between two PE-PSTNs that do not both reside in a Level 1 POP is quite straightforward. There is only one TE LSP between a pair of such PE-PSTNs (no load balancing is required), and the only constraint that must be satisfied is bandwidth (no coloring scheme).

Reoptimization of TE LSPs

Chapter 5). Thus, the only cases in which TE LSPs should be reoptimized are upon network element restoration, upon TE LSP resizing, or upon the addition of a link or node, none of which happens very frequently.

The MPC network is a national network with relatively short propagation delays (the propagation delay between two POPs never exceeds 15 ms, regardless of the path). Therefore, a TE LSP routed over a non-IGP shortest path does not experience significantly higher propagation delay compared to the OSPF shortest path. Thus, even when a TE LSP should be reoptimized (because a shorter path satisfying the constraints exists), the need for reoptimization should not be very critical. This is because the non-IGP shortest path offers propagation delays close to the IGP shortest path (a critical parameter for the voice traffic). Note that in some networks, the path followed by a TE LSP may experience significantly higher propagation delays than the IGP shortest path. However, this is not the case with the MPC national network.

Considering the various aspects mentioned here, TK decided to trigger a reoptimization once every 10 minutes. In this way, every headend router determines whether a more optimal (shorter) path can be found for each of its TE LSPs. If a more optimal path can be found, the TE LSP is reoptimized along the new path in a nondisruptive fashion using a make-before-break approach. Note that the CSPF computation for each TE LSP does not incur any CPU spikes considering the low number of TE LSPs per headend router. This also means that a TE LSP may follow a nonoptimal path for at most 10 minutes if a more optimal path exists because of the restoration or addition of a network element (such as a link or node).

MPLS Traffic Engineering Simulation

Before deploying MPLS Traffic Engineering, TK decided to conduct some CSPF simulations. Several objectives were set for these simulations:

Dimensioning of the network should be such that all the TE LSPs follow their OSPF shortest path (subject to the color constraints, if any) in steady state.

The average PSTN load on every link should be below 20 percent in the absence of failure.

The maximum number of TE LSPs per midpoint should be determined. Each TE LSP consumes some memory on each router it traverses. Hence, it's important to determine the maximum number of TE LSPs a node has to support both in steady state and under failure. (See Chapter 2 for details.) The maximum number of TE LSPs per midpoint is discussed further in the next section.

The longer path that is necessary to satisfy the bandwidth constraint under failure conditions affects propagation delay. This impact should be studied.

A path satisfying the TE LSP constraints should be found under any conditions (including the case of double failures).

The results of the CSPF simulations confirmed TK's expectations. During steady state, 100 percent of the TE LSPs follow the shortest path (or the shortest path satisfying the color constraints), and the maximum voice load on any link is below 20 percent. On the other hand, in the case of some SRLG failures, or node failure in a Level 1 POP, several TE LSPs are routed along a longer path. This still meets the objective of keeping the PSTN load at or below 40 percent on every link. The propagation delay along those longer paths still meets the voice delay requirements.

TE Scaling Aspects

When analyzing the scaling properties of MPLS TE, several important variables must be considered:

Total number of TE LSPs: There are a total of six Level 1 POPs and 20 Level 2 POPs, with two PE-PSTN routers per POP. This leads to a total of (51 * 52) + (12 * 11) = 2784 TE LSPs (the second term arises because there are two LSPs between each pair of PE-PSTN1s). Strictly speaking, the total number of TE LSPs is not the most important scalability criterion as compared to the number of TE LSPs each router would have to manage (as headend and midpoint router). However, the total number of TE LSPs is interesting from a management, monitoring, and provisioning point of view.

Number of TE LSPs per headend router: The maximum number of TE LSPs per headend router is (2 * 11) + 40 = 62 for the PE-PSTN1 routers and 51 for the PE-PSTN2 routers. This number can be considered very low; indeed, modern routers can easily handle a few thousand TE LSPs as headend.

Number of TE LSPs per midpoint router: This is important data to consider, because it can represent a nonnegligible proportion of the total number of TE LSPs in the network, especially in sparsely connected core networks. Hence, running a simulation to evaluate the worst-case scenario (the most loaded router in terms of the number of TE LSPs to support) both in steady state and under various failure scenarios is quite useful. In the case of the MPC network, an analysis showed that under a single failure scenario, in worst-case conditions, the most loaded router would have to handle 25 percent of the total number of TE LSPs, or roughly 700 TE LSPs. Again, this does not pose a problem, because most of the routers currently support tens of thousands of TE LSPs as midpoint.
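The figures in this list can be re-derived from the POP counts alone. The short script below is just a check of the arithmetic; the variable names are chosen for this illustration.

```python
LEVEL1_POPS = 6
LEVEL2_POPS = 20
PE_PSTN_PER_POP = 2

pe_pstn1 = LEVEL1_POPS * PE_PSTN_PER_POP      # PE-PSTN1 routers
pe_pstn2 = LEVEL2_POPS * PE_PSTN_PER_POP      # PE-PSTN2 routers
all_pe_pstn = pe_pstn1 + pe_pstn2

# A full mesh of unidirectional TE LSPs: one per ordered pair of routers.
full_mesh = all_pe_pstn * (all_pe_pstn - 1)
# One additional TE LSP per ordered pair of PE-PSTN1 routers.
additional = pe_pstn1 * (pe_pstn1 - 1)
total_lsps = full_mesh + additional

# Headend load: two TE LSPs toward each other PE-PSTN1, one toward each PE-PSTN2.
headend_pe_pstn1 = 2 * (pe_pstn1 - 1) + pe_pstn2
# A PE-PSTN2 router has a single TE LSP toward every other PE-PSTN router.
headend_pe_pstn2 = all_pe_pstn - 1

# Worst-case midpoint load reported by the analysis: 25 percent of the total.
midpoint_worst_case = total_lsps * 25 // 100

print(all_pe_pstn, full_mesh, additional, total_lsps)
print(headend_pe_pstn1, headend_pe_pstn2, midpoint_worst_case)
```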

In conclusion, TK felt that the MPC MPLS TE design did not pose any scalability concerns.

Use of Refresh Reduction

TK chose not to activate refresh reduction in its network, considering that the number of RSVP-TE sessions per midpoint router was not substantial.

Provisioning the Mesh of TE LSPs

TK developed a set of scripts to automate the provisioning of the TE LSPs between the PE-PSTN routers.

Monitoring

The monitoring of the MPLS TE network is, of course, of the utmost importance so as to adjust the TE design if necessary. TK decided to gather the following set of information for each TE LSP in the network:

Number of reroutes caused by network element failures: This provides information about the link and node availability. A script then performs event correlation to deduce the root cause of each failure because a single failure can affect multiple TE LSPs.

Number of reroutes caused by reoptimization: TK uses SNMP traps sent by the headend router to the network management system upon reoptimization.

PSTN load on every TE LSP: TK monitors the actual PSTN load carried over any TE tunnel by collecting counters such as the number of bytes transmitted over the TE tunnel interface every hour. This information is used to adjust the TE LSP bandwidth when needed.

Link utilization by the PSTN traffic versus bandwidth reservation: Such data is particularly interesting so as to determine the LSP sizing strategy, particularly in terms of statistical gain across the multiple POP-to-POP voice aggregates. Indeed, if it turns out that the sum of reserved bandwidth for the TE LSPs is always significantly above the actual PSTN traffic load on every link (the EF queue load), TK could readjust the formula used to compute the TE LSP bandwidth. The strategy adopted by TK consists of gathering the relevant SNMP values or variables (via scripts) for the link utilization, EF queue utilization, and total amount of bandwidth reservation for a few selected links.

Voice traffic pattern: The traffic pattern (traffic fluctuation) is a key element in any MPLS Traffic Engineering design. It helps you determine the adequate bandwidth for each LSP. The traffic's burstiness is highly relevant in this case. For example, suppose that a very flat traffic pattern exists. In this case, the bandwidth estimate is quite straightforward. Conversely, in the case of very bursty traffic, sizing of the TE LSPs based on the peak load might not be optimal. It might cause some TE LSPs to follow a longer path in case of failure. (At steady state, the MPC network is dimensioned such that most TE LSPs follow their shortest OSPF path.) To study the traffic pattern, TK decided to write some scripts that would gather the amount of traffic sent on a few selected TE LSPs at a high frequency (every 5 minutes). By combining the traffic pattern data with the proportion of TE LSPs that would not follow their OSPF shortest path, TK can potentially readjust its TE LSP bandwidth size computation formula.
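Turning two successive readings of a byte counter into an average load is the basic operation behind the load and traffic-pattern measurements above. The sketch below shows one way to do it, assuming a monotonically increasing counter that wraps at 2^64 (or 2^32 for older counters). The function name is hypothetical, and the SNMP polling itself is left out.

```python
def average_rate_bps(previous_bytes, current_bytes, interval_seconds, counter_bits=64):
    """Average rate in bits per second between two byte-counter readings.

    The modulo handles a single counter wrap between the two readings.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta_bytes = (current_bytes - previous_bytes) % (1 << counter_bits)
    return delta_bytes * 8 / interval_seconds


# Hourly poll: 450 GB transmitted in 3600 seconds is an average of 1 Gbps.
print(average_rate_bps(0, 450_000_000_000, 3600))

# 5-minute poll of a 32-bit counter that wrapped between the two readings.
print(average_rate_bps(2**32 - 100, 200, 300, counter_bits=32))
```

An hourly average hides short bursts, which is one reason to poll a few selected TE LSPs every 5 minutes when studying the traffic pattern.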

Last Resort Unconstrained Option

The MPC network is designed to survive any single failure. In other words, any TE LSP should be able to find an alternate path if a single network element fails. That said, a safe approach is to configure a last-resort option for each TE LSP whereby no constraint is specified to cope with any unexpected event, in particular multiple-failure cases.

On a Cisco router, this can be achieved by means of LSP attributes. For each TE LSP, an ordered list of constraints can be specified. The headend router tries to find a path satisfying the preferred set of constraints; if no path is found, the next preferred set of constraints is tried, and so on. Hence, a safe and recommended approach is to configure a last-resort option whereby the TE LSP is configured without any constraint (no affinity, 0 bandwidth, and so on). This guarantees that in any case the headend router can always find a path to the destination, provided that there is still some connectivity to the destination. In this case the TE LSP path is no different from the OSPF path.

On TE LSPs between two PE-PSTN1 routers that use color constraints, the last-resort unconstrained option is used after the backup option, which relaxes the color constraints but not the bandwidth constraints.
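The ordered fallback described in this section can be summarized as a loop over constraint sets, from the most preferred to the unconstrained last resort. The sketch below illustrates the logic only and is not router code. The `find_path` callable stands in for the CSPF computation, and the option names, bandwidth value, and affinity encoding are hypothetical.

```python
def first_feasible_path(find_path, constraint_sets):
    """Try each constraint set in order; return (option_name, path) or None."""
    for name, constraints in constraint_sets:
        path = find_path(**constraints)
        if path is not None:
            return name, path
    return None


# Ordered options for a TE LSP between two PE-PSTN1 routers (values assumed).
OPTIONS = [
    ("preferred", {"bandwidth": 1500, "affinity": (0x0, 0x4)}),  # bandwidth + color
    ("backup", {"bandwidth": 1500, "affinity": None}),           # bandwidth only
    ("last resort", {"bandwidth": 0, "affinity": None}),         # unconstrained
]


def stub_find_path(bandwidth, affinity):
    """Stand-in for CSPF after a multiple failure: only a zero-bandwidth,
    uncolored path can still be found in this made-up scenario."""
    if bandwidth == 0 and affinity is None:
        return ["PE-PSTN1-1", "P1", "P3", "PE-PSTN1-3"]
    return None


print(first_feasible_path(stub_find_path, OPTIONS))
```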