Definitive MPLS Network Designs

Jim Guichard; François Le Faucheur; Jean-Philippe Vasseur

Quality of Service Design


On the access links, TK supports three user-visible classes of service (CoSs) as part of its Layer 3 MPLS VPN service:

VPN Real-Time

VPN Premium

VPN Standard


Each of these is supported by a separate queue on the access links between CE routers and mPE routers. TK also supports a fourth queue on the access links for a user-hidden CoS that handles routing traffic between CE routers and mPE routers.

A single CoS is supported on the access for the Internet service. It is identical to the Standard CoS of the Layer 3 MPLS VPN service.

In the network's core, TK decided to schedule all the VPN traffic, regardless of its CoS, in a single queue also used for Internet traffic. The reasons are detailed in the "QoS Design in the Core Network" section (including the fact that thanks to its DWDM infrastructure, TK can fairly easily provision additional bandwidth to keep aggregate load low). This queue is called the default queue (DF). However, a separate queue optimized for real-time operations, the Expedited Forwarding queue (EF), is used to transport telephony transit traffic. Finally, a third queue, the Assured Forwarding 3 queue (AF3), is dedicated to the transport of control traffic that is essential to network operations. This comprises the internal routing traffic (OSPF and MP-BGP traffic), some management traffic (such as Telnet), the MPLS signaling traffic (LDP, RSVP-TE) and the telephony signaling traffic from the PSTN soft switches.

Table 4-4 details the mapping between each type of traffic, the DSCP values, the queues on the access links, the EXP/DSCP values in the core, and the queues in the core.

Table 4-4. Mapping of Classes, DSCP, EXP, and Queues

Class of Service              DSCP on Access               Queue in Access   EXP/DSCP in Core   Queue in Core
VPN Real-Time                 46 (EF)                      EF                EXP=5              DF
VPN Premium                   18 (AF21) in contract,       AF2               EXP=2              DF
                              20 (AF22) out of contract
VPN Standard and Internet     0                            DF                EXP=0              DF
VPN edge routing              48 (precedence 6)            AF3               -                  -
Telephony transit             32 (precedence 4)            -                 EXP=4              EF
Telephony transit signaling   24 (precedence 3)            -                 EXP=3              AF3
Core control (routing,        -                            -                 DSCP=48, EXP=6     AF3
  management, signaling)

For example, the VPN Real-Time CoS from the Layer 3 MPLS VPN service is marked with DSCP 46 on the access links between CE routers and mPE routers. It uses the EF queue on these links, but in the core it is marked with EXP 5 and scheduled into the DF queue.


Layer 3 MPLS VPN and Internet SLA


TK offers QoS commitments over the core (POP to POP) to all Layer 3 MPLS VPN and Internet customers.

In the core, a single level of commitment is provided to all classes of service of the Layer 3 MPLS VPN service as well as for the Internet traffic. These commitments are shown in Table 4-5. As explained in the "Edge QoS Engineering" section of Chapter 2, "Technology Primer: Quality of Service, Traffic Engineering, and Network Recovery," on lower-speed links even small queue occupancies have a significant impact on delay. The effect varies considerably and depends on many parameters, such as the actual link speed, the maximum packet size, the maximum load compared to the service rate for each queue, and so forth. Some of these are outside TK's control. Therefore, TK does not offer standard SLAs to Layer 3 MPLS VPN and Internet customers over the access links. Instead, it takes responsibility for configuring the DiffServ mechanisms for the three CoSs on the access link, but it leaves it to the customer to use these mechanisms appropriately to achieve the desired level of QoS.

However, at the request of some customers, TK has worked to establish specific custom SLAs for the access links. For example, Table 4-6 lists the specific commitments for each CoS on the access provided as part of such a custom SLA. This custom SLA is built on a number of assumptions: a maximum packet size of 66 bytes in the real-time queue, a limit of 33 percent of link speed for the Real-Time CoS, a maximum load of 50 percent for the premium traffic, the use of fragmentation and interleaving on lower-speed links, and short access links (so that propagation delay on the access links is negligible). To establish these delay figures, TK first ran simulations that computed the 99.9th-percentile delay for a perfect queuing system under these assumptions. Then TK factored in the expected deviations of the actual implementation from such a perfect queuing system, for example, the effect of the transmit buffer (known on Cisco devices as the transmit ring). This is further discussed in the later section "QoS Design on the Network Edge for Layer 3 MPLS VPN and Internet."

Table 4-6. Sample Custom Per-CoS VPN SLA on Access

CoS                           SLA Parameter   SLA Commitment
VPN Real-Time (in-contract)   One-way delay   Access speed-dependent:
                                              256-512 kbps: 30 ms
                                              1-2 Mbps: 15 ms
                                              34-155 Mbps: 5 ms
                              Loss            0.1%
VPN Premium (in-contract)     One-way delay   Access speed-dependent:
                                              64-128 kbps: 250 ms
                                              256-512 kbps: 125 ms
                                              1-2 Mbps: 60 ms
                                              34-155 Mbps: 25 ms
                              Loss            0.1%
VPN Standard                  Bandwidth       All bandwidth unused by the VPN Real-Time and
                                              VPN Premium classes can be used by the VPN
                                              Standard class


QoS Design in the Core Network


As described previously, TK opted to handle separately the telephony traffic, the Internet/Layer 3 MPLS VPN traffic, and the control traffic in the core of the network. This allows TK to isolate the three types of traffic from one another and to apply different capacity planning policies and different sets of QoS mechanisms to each.

For the telephony transit traffic, TK wanted to provide optimum delay and jitter even under failure situations, including catastrophic failure situations such as multiple simultaneous failures or complete POP failure. To that end, TK combined the following mechanisms:

Use of strict priority queuing for the EF queue to achieve optimum delay and jitter.

Use of MPLS Traffic Engineering to transport the telephony transit traffic over tunnels that are constraint-based routed and limited by configuration to keep the telephony transit traffic load under 40 percent on any link. TK deemed this sufficiently low to bound the delay, jitter, and loss through the MPC to the required levels for telephony transit traffic.

Separate capacity planning for the telephony traffic (which is well-known and closely monitored) to make sure that link capacity is such that

- In the absence of failure, the tunnels carrying the telephony traffic all follow their shortest path. The telephony traffic load is less than 20 percent of link capacity.

- Under all targeted failure situations, including catastrophic failure scenarios, all the tunnels should fit (using a path that may or may not be the shortest path) while satisfying the constraint of keeping the load under 40 percent of link capacity.


The details of the MPLS Traffic Engineering design for the telephony traffic can be found in the "MPLS Traffic Engineering Design" section.
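
As a minimal sketch of the mechanism (not TK's actual configuration; the tunnel number, destination address, and tunnel bandwidth are illustrative assumptions), a constraint-based tunnel for telephony transit and the corresponding 40 percent reservable-bandwidth cap on an STM-16 link could look like this in Cisco IOS:

mpls traffic-eng tunnels
!
interface Tunnel100
description Telephony transit TE tunnel (illustrative)
ip unnumbered Loopback0
tunnel mode mpls traffic-eng
tunnel destination 10.0.0.2
tunnel mpls traffic-eng bandwidth 100000
tunnel mpls traffic-eng path-option 1 dynamic
!
interface POS0/0
mpls traffic-eng tunnels
! Advertise only 40 percent of the STM-16 rate (roughly 995 Mbps) as reservable
ip rsvp bandwidth 995000

Because constraint-based routing places a tunnel only on links with sufficient unreserved bandwidth, capping the reservable bandwidth at 40 percent of each link enforces the load bound described previously.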

For Internet and Layer 3 MPLS VPN traffic, it is relatively easy and inexpensive for TK to provide additional capacity in the core when needed through the activation of additional wavelengths on its DWDM infrastructure. Also, the QoS targets are not as stringent as for telephony. For these reasons, TK elected to rely on only capacity planning to achieve appropriate QoS for that class of traffic. To that end, a capacity planning policy is followed to trigger provisioning of additional link capacity whenever one of the following is true:

The aggregate load across all traffic (including not just Layer 3 VPN and Internet traffic but also telephony transit) reaches 45 percent of the link capacity, in the absence of failure, as determined by the monitoring of interface counters.

The aggregate load across all traffic would reach 85 percent of the link capacity should one of the links, SRLGs, or nodes fail. This is determined by a centralized simulation tool that collects current network topology and traffic matrix information and assesses the theoretical load on all links resulting from failure situations.


The "Core QoS Engineering" section of Chapter 2 characterized the relationship between the maximum utilization at a large time scale and the experienced QoS levels. In accordance with this approach, TK determined that such a capacity planning policy ensures that the Layer 3 MPLS VPN and Internet POP-to-POP SLA specified previously can be met in normal situations as well as during expected failures.

QoS Models" section of Chapter 2, TK follows a 4/3/1 model because it deploys four queues on the access and three queues in the core and uses traffic engineering (TE) with a single class type. (In other words, TK uses regular MPLS Traffic Engineering but it does not use DiffServ-aware MPLS Traffic Engineering.)

Figure 4-21 illustrates the interfaces where TK applies the various QoS service policies in its network. In particular, it shows that TK applies a core egress policy to reflect the core QoS design presented previously on all the P router interfaces as well as on the core-facing interfaces of the mPE routers.


Figure 4-21. Telecom Kingland Service Policies for QoS


This core egress policy is detailed in Example 4-5 for the case of an STM-16 core link.

Example 4-5. Core QoS Egress Service Policy in Telecom Kingland for STM-16 Links



interface POS0/0
service-policy output Core-QoS-policy
!
class-map match-any class-Telephony
match mpls exp 4
!
class-map match-any class-Control
match dscp 48
match mpls exp 6
match mpls exp 3
!
policy-map Core-QoS-policy
class class-Telephony
priority percent 55
queue-limit 4092
class class-Control
bandwidth percent 2
queue-limit 417
class class-default
bandwidth percent 43
random-detect
random-detect exponential-weighting-constant 13
random-detect 1344 8958 1
queue-limit 17916

Telephony transit traffic is classified based on the EXP value of 4 in accordance with the QoS mapping listed in Table 4-4. It is serviced with strict priority and is policed at 55 percent of the link bandwidth. This rate sits well above the 40 percent maximum load enforced by the traffic engineering design, leaving headroom for the transient load surge that can occur when telephony transit traffic is rerouted onto backup tunnels after a failure; see the "Backup Tunnel Constraints" section for a detailed explanation of this load surge and its duration.


You can see that TK expects the telephony transit traffic never to actually be policed, in normal situations as well as during failures. The policing is configured only as a safety measure to ensure that the telephony transit traffic can never hog all the link bandwidth under a completely unpredicted combination of events, which would starve the rest of the traffic, including the control traffic, and possibly bring the network down.

The control traffic is classified based on the following:

DSCP value of 48 (which corresponds to precedence 6), because Cisco routers automatically set the DSCP to this value when generating routing packets (OSPF, BGP) as well as other essential control traffic (LDP, RSVP-TE, Telnet)

EXP value of 6, for routing packets that are MPLS encapsulated over the core, such as MP-BGP packets

EXP value of 3, for telephony transit signaling traffic (SS7)


The control traffic is granted 2 percent of the total link bandwidth, which represents roughly 50 Mbps on an STM-16 link.

The DF queue carrying the rest of the traffic is allocated all the remaining bandwidth, that is, 43 percent of the link bandwidth. As explained previously, TK ensures by capacity planning that the aggregate load across all traffic is kept below 85 percent, even during targeted failures. This means that the DF queue is expected to always operate with low queue occupancy, which is necessary to satisfy the tight POP-to-POP SLA commitments that TK offers to Internet and Layer 3 MPLS VPN traffic. Still, as a safety measure in case the DF queue fills up under special circumstances (such as a combination of multiple failures and exceptional traffic growth), TK activated Random Early Detection (RED) (see [RED]) in the DF queue. This facilitates smooth adjustment of the TCP traffic load (which is dominant in the DF queue) to the available capacity during these special circumstances. It also keeps the delay in the DF queue at reasonable levels and avoids global synchronization of TCP flows.

RED maintains a moving average of the queue occupancy. It also defines a minimum and maximum threshold and a maximum probability denominator, which together control the random discard of packets. TK followed recommendations for fine-tuning RED parameters on high-speed links. These recommendations allow high link utilization while minimizing queue occupancy and avoiding global synchronization of TCP flows. They resulted in TK's adopting the RED drop profile shown in Figure 4-22 and discussed here.


Figure 4-22. RED Drop Profile for the DF Queue in the Core


On Cisco routers, the moving average is computed as follows:

average = old_average * (1 - 2^-n) + current_queue_size * 2^-n

Hence n, the exponential weighting constant, controls how quickly the moving average tracks the current queue size. The objective is for the average queue size to filter out short-term variations in queue occupancy (thus avoiding drastic swings in the face of short-timescale traffic burstiness) while still reacting fast enough to the long-term queue buildup that signals congestion, so that random drop is triggered early enough.

If n is too high, the average queue occupancy tracks changes in the current queue size very slowly. During a traffic load increase, the queue could then fill up and revert to tail drop before random drop is activated; similarly, after the traffic load has decreased, random drop could keep dropping packets long after the congestion has disappeared.

Conversely, if n is too low, the average queue occupancy reacts very quickly to variations in current queue size. This could result in overreaction of RED and frequent unnecessary dropping of traffic.

TK configured n such that 2^n is the power of 2 closest to the product B * RTT, where

B = queue-bandwidth / (MTU * 8)

with MTU = 1500 bytes and RTT = 100 ms (the round-trip time assumption also used later for the pipe size). In the case of the DF queue, 43 percent of the link bandwidth is allocated to the queue, so on STM-16 links B = 89,583 packets per second. Hence, TK configured the exponential weighting constant to 13 for the DF queue on STM-16 links.
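
For reference, the arithmetic behind these values (assuming the nominal 2.5-Gbps STM-16 rate and a 1500-byte MTU) works out as follows:

B = (0.43 * 2,500,000,000) / (1500 * 8) = 89,583 packets per second

B * RTT = 89,583 * 0.1 = 8,958; the closest power of 2 is 2^13 = 8,192, hence n = 13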

The minimum threshold should be set high enough to maximize link utilization. It also should be set low enough to ensure that random drop kicks in early enough to start slowing down some TCP sources when needed. If it's set too low, packets will get dropped unnecessarily, and traffic will be prevented from using the link capacity. If it's set too high, the queue will fill up before random drops get a chance to slow down some sources.

The difference between the minimum threshold and the maximum threshold needs to be high enough to allow the random dropping behavior to avoid global synchronization of TCP sources. But if the maximum threshold is too high, random drop may not slow down enough sources, thus allowing the queue to fill up.

The maximum probability denominator controls the proportion of dropped packets. The drop probability grows linearly from 0 (no drop), when the average queue occupancy equals the minimum threshold, to 1 divided by "maximum probability denominator" (one packet is discarded every "maximum probability denominator" packets), when the average queue occupancy equals the maximum threshold.

TK selected the following settings:

The minimum and maximum thresholds are set to 15 percent and 100 percent, respectively, of the pipe size, where

pipe size = RTT * queue-bandwidth / (MTU * 8)

The maximum probability denominator is set to 1.


Thus, on STM-16 links, and assuming a 100-ms RTT, TK uses a minimum threshold of 1344 and a maximum threshold of 8958 for the DF queue.
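
These values follow directly from the pipe-size formula (again assuming the nominal 2.5-Gbps rate and a 1500-byte MTU):

pipe size = 0.1 * (0.43 * 2,500,000,000) / (1500 * 8) = 8,958 packets

minimum threshold = 0.15 * 8,958 = 1,344 packets (rounded)

These are the values configured in the random-detect 1344 8958 1 line of Example 4-5.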

Note

RED and Weighted Random Early Detection (WRED) operations depend on many parameters, such as the mix of TCP and non-TCP traffic, the flows' RTTs, and each flow's reaction to traffic drop. Therefore, fine-tuning these operations is a difficult task that depends on the actual environment. Thus, TK monitors RED/WRED operations in its network to assess whether parameter adjustments are necessary for best performance.

Finally, TK decided to limit the instantaneous queue size of each of the three queues. When the queue size reaches this limit, all subsequent packets are dropped. This ensures that buffers can never be hogged by a particular queue that receives an unexpected and excessive amount of traffic. In turn, this avoids buffer starvation on line cards and protects other queues and interfaces. Finally, it places a hard bound on delay and jitter through that hop.

For the EF queue, the queue limit is configured so that it corresponds to an absolute worst delay through that hop of 3 ms for the real-time traffic. On an STM-16 link where up to 55 percent of link bandwidth can be used by the EF queue, and assuming a packet size of 126 bytes (because TK uses G.711 codecs at a 10-ms sampling interval, which means a payload of 80 bytes plus an IP/UDP/RTP header of 40 bytes and a PPP header of 6 bytes), this means an EF queue limit of 4092 packets. For the control traffic queue, the queue limit is set so that up to 100 ms worth of traffic can be buffered. On an STM-16 link where 2 percent of the link bandwidth is allocated to this queue, this represents 417 packets (assuming a packet size of 1500 bytes). Because random early detection is used inside the DF queue, all packets get dropped as soon as the moving average for the queue size reaches the maximum threshold. However, because of the lag between the instantaneous queue size and its moving average, it is possible that the instantaneous queue fills up beyond the maximum threshold before the random drop is effective. To limit the instantaneous queue size without interfering with random early detection, TK configured the queue limit to be twice that of the maximum threshold.
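
The queue limits configured in Example 4-5 can be reconstructed as follows (assuming the nominal 2.5-Gbps STM-16 rate):

EF queue: 0.003 s * (0.55 * 2,500,000,000 bps) / 8 = 515,625 bytes; 515,625 / 126 bytes = 4,092 packets

Control queue: 0.1 s * (0.02 * 2,500,000,000 bps) / 8 = 625,000 bytes; 625,000 / 1,500 bytes = 417 packets (rounded)

DF queue: 2 * 8,958 = 17,916 packets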

Some of the routers in TK's network have a distributed architecture. These routers comprise multiple line cards that support the various interfaces and that are interconnected inside the router by an internal switch fabric. A packet that is received on a given line card and that needs to be forwarded to an interface attached to another line card transits through the switch fabric. The switch fabric on TK's routers is nonblocking so that its sustained throughput is higher than the combined throughput of all interfaces. Therefore, there will not be any sustained contention across ingress line cards needing to send their packets across the fabric. However, because of possible synchronization of traffic bursts across multiple line cards or in some scenarios of DoS attacks, it is conceivable that packets on an ingress line card may have to be buffered for short periods of time before being transmitted across the fabric. Thus, to ensure the highest possible QoS, TK also deployed a QoS policy on the ingress line cards to handle buffering toward the switch fabric. On the Cisco routers used by TK, this is referred to as activating a "To-Fab" QoS policy. The To-Fab QoS policy deployed by TK is very similar to the egress QoS policy discussed earlier. It ensures that the telephony traffic is handled with strict priority and that some percentage of the bandwidth is allocated to the control traffic. Note that this policy is effectively applied to each virtual output queue on every ingress line card.

Note

Assume that an ingress line card simply maintained a single queue toward the fabric for transmission of all traffic toward all egress line cards, in case traffic to a given egress line card had to be held. (For example, this might happen because the egress line card had already used up more than its fair share of fabric bandwidth.) In that case, all traffic behind would get held, even if it were destined for different egress line cards that are entitled to use fabric bandwidth. This is called head-of-line blocking. To prevent head-of-line blocking and avoid any wastage of fabric bandwidth, the Cisco routers used by TK maintain, on every ingress line card, a separate virtual queue for traffic going to each different egress line card. The To-Fab QoS policy applies to each of these virtual output queues.

Figure 4-23 illustrates the virtual output queues toward the switch fabric. Thus, it shows the application points for the To-Fab QoS policy applied by TK in its routers that have a distributed architecture. The figure also shows the application of the core egress QoS policies on the egress interfaces.


Figure 4-23. To-Fab QoS Policy Over Virtual Output Queues


Guaranteeing quality to the telephony transit traffic was the highest-priority objective for the core QoS design deployed by TK. This motivated the decision to handle the telephony transit completely separately in the core from any other traffic. However, more-critical voice applications and services are starting to be offered as native VoIP services. Also, the MPC link capacity keeps increasing to cope with data traffic that is growing at a much faster pace than the total voice and telephony traffic. Finally, the Class 4 switch replacement project is demonstrating daily that the MPC can reliably satisfy the demanding QoS requirements of telephony. For these reasons, TK will be investigating possible evolutions to the QoS core design. This may include handling the Real-Time VPN CoS jointly with the telephony transit traffic so that it also benefits from the same absolute quality of service as telephony. It may also include handling the premium traffic separately from the standard VPN and Internet traffic in the core. In that case the in-contract premium traffic would be given strong preferential treatment over the out-of-contract premium traffic through the use of WRED.


QoS Design on the Network Edge for Layer 3 MPLS VPN and Internet


As illustrated in Figure 4-21, enforcing the edge QoS design involves applying different QoS service policies at different points:

CE egress QoS policy: Responsible for performing detailed traffic classification (including custom classification), marking to the TK DSCP values, metering and policing the contracted rate for each CoS, and enforcing the PHBs for each CoS to manage link bandwidth from the CE router to the mPE router.

PE ingress QoS policy: Responsible for mapping DSCP values to EXP values and hence controlling the mapping of traffic classes into the core queues.

PE egress QoS policy: Responsible for metering and policing contracted rates and for enforcing the PHBs for each access CoS to manage the link bandwidth from the mPE router to the CE router.


CE Router Egress Policy


Consider a VPN site with Frame Relay access at 256 kbps. The user has requested that 33 percent of this access bandwidth be allocated to the Real-Time CoS (to accommodate three simultaneous voice calls using G.729 with 20-ms packetization time, each requiring 26.4 kbps). The user also wants 50 percent to go to the Premium CoS. Figure 4-24 illustrates the hierarchy between the physical interface bandwidth, aggregate access rate, and minimum bandwidth guaranteed to each CoS.
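
The 26.4-kbps per-call figure follows from the packetization parameters (the 6-byte Layer 2 overhead assumed here corresponds to Frame Relay framing, consistent with the 66-byte real-time packet size used elsewhere in this chapter):

G.729 payload at 20-ms packetization = 20 bytes per packet

packet size = 20 + 40 (IP/UDP/RTP) + 6 (Layer 2) = 66 bytes

call rate = 66 * 8 / 0.020 = 26.4 kbps; three calls = 79.2 kbps, which fits within 33 percent of 256 kbps (84.5 kbps)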


Figure 4-24. Hierarchy of Bandwidth Commitments

Note

The percentage of bandwidth that TK allocates to the Routing CoS depends on the access speeds and on whether the number of prefixes dynamically advertised on the access link is small, medium, or large. TK has precomputed a value for each combination so that the estimated time to advertise all the prefixes is on the order of a few seconds.

Example 4-6 presents the corresponding CE egress QoS policy template. Each component of this template is discussed next.

Example 4-6. CE Egress QoS Service Policy Template for a VPN Site with Three CoSs



interface serial0/0
tx-ring-limit 2
frame-relay traffic-shaping
!
interface serial0/0.1 point-to-point
ip address CE-PE-link-subnet CE-PE-link-subnet-mask
frame-relay interface-dlci 100
class map-class-CE-to-PE-256
!
!identifies Routing Traffic
access-list 110 permit tcp any eq bgp any
access-list 110 permit tcp any any eq bgp
!identifies SAA Traffic
access-list 111 permit ip any IP-address-of-SAA-shadow-router mask
!identifies Premium traffic
access-list 112 permit ip any host 10.10.20.1
!
class-map match-any class-RealTime
match ip dscp 40
match ip dscp 46
!
class-map match-all class-RealTime-without-SAA
match class-map class-RealTime
match not access-group 111
!
class-map match-any class-Premium
match ip dscp 24
match access-group 112
!
class-map match-all class-Premium-without-SAA
match class-map class-Premium
match not access-group 111
!
class-map match-any class-Routing
match access-group 110
!
policy-map police-RealTime-without-SAA
class class-RealTime-without-SAA
police cir percent 33 bc 20 ms conform-action set-dscp-transmit 46
exceed-action drop
!
policy-map police-Premium-without-SAA
class class-Premium-without-SAA
police cir percent 50 conform-action set-dscp-transmit 18
exceed-action set-dscp-transmit 20
!
policy-map CE-to-PE-QoS-policy
class class-RealTime
priority
service-policy police-RealTime-without-SAA
class class-Premium
bandwidth percent 50
random-detect dscp-based
random-detect exponential-weighting-constant 3
random-detect dscp 18 11 33 1
random-detect dscp 20 4 11 1
service-policy police-Premium-without-SAA
class class-Routing
bandwidth percent 4
set ip dscp 48
class class-default
bandwidth remaining percent 100
set ip dscp 0
!
map-class frame-relay map-class-CE-to-PE-256
frame-relay cir 256000
frame-relay mincir 256000
frame-relay bc 2560
frame-relay be 0
frame-relay fragment 320
service-policy output CE-to-PE-QoS-policy
!
rtr responder
!

The functional components of this CE egress policy and their respective ordering are illustrated in Figure 4-25.


Figure 4-25. CE Egress Policy Functional Components


The first component is the classifier, which identifies which packets belong to which CoS. You can see in Example 4-6 that the customer indicated that

The real-time traffic must be classified based on a premarked DSCP of 46 (EF) and 40 (precedence 5).

The premium traffic must be classified based on a premarked DSCP of 24 (precedence 3) and based on a destination IP address of 10.10.20.1.


The routing traffic is identified by matching the TCP port numbers that identify the BGP protocol.

The next component is a per-CoS policy composed of separate policing, marking, and scheduling (or a subset of those) for each CoS. TK enforces systematic policing on the Real-Time class to its contracted rate instead of conditional policing (which would drop traffic only if there were congestion). This delivers a service that is perceived by end users as highly predictable. The Real-Time class can always properly carry a given number of voice calls (and never more). This is opposed to a service in which the number of voice calls that can be properly carried varies depending on what happens in other classes. This would be the user perception if conditional policing were used. In Example 4-6, the burst tolerance configured in the real-time policer is set to 20 ms, which is large enough to accommodate the simultaneous burst of one packet from each of the three targeted simultaneous calls. (The packet size with G.729-20-ms calls is 66 bytes so that the maximum burst could be 3 * 66 = 198 bytes, which fits within 20 ms at a rate of 33 percent of 256 kbps.)
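
In numbers, the configured burst tolerance works out as follows:

bc = 0.020 s * (0.33 * 256,000 bps) / 8 = 211 bytes (rounded)

which indeed accommodates the 198-byte worst-case burst of three simultaneous voice packets.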

With respect to scheduling, the VPN Real-Time CoS is given strict priority over any other traffic to achieve optimum delay and jitter. The queues for the VPN Premium CoS and the Routing CoS are allocated a minimum bandwidth guarantee of 50 percent and 4 percent, respectively. The Standard CoS is allocated the remaining bandwidth. Note that these bandwidth settings are minimum guarantees that each queue gets in case of contention across the multiple queues; if any queue is not using its minimum guarantee, the other queues can use the leftover bandwidth and consequently exceed their own minimum guarantees. WRED is used in the VPN Premium queue to avoid global synchronization of TCP flows and to enforce selective dropping of out-of-contract premium traffic over in-contract premium traffic in case of congestion in the premium queue. For fine-tuning of the WRED profile to apply to the in-contract traffic, TK followed rules optimized for RED operations over lower speeds (as encountered in the access):

The exponential weighting constant n is such that

2^-n = 1 / B, where B = bandwidth / (MTU * 8)

and MTU = 1500 bytes

The minimum and maximum thresholds equal 100 percent and 300 percent of B, respectively.

The maximum drop probability is set to 1.


With an access rate of 256 kbps and 50 percent of bandwidth allocated to the premium queue, B = 11, n = 3, the minimum threshold = 11, and the maximum threshold = 33.
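
Worked out explicitly, using the rule just stated:

B = (0.50 * 256,000) / (1500 * 8) = 10.7, rounded to 11

2^-n = 1/B gives n = round(log2(11)) = 3

minimum threshold = 100 percent of B = 11; maximum threshold = 300 percent of B = 33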

For the WRED profile to apply to the out-of-contract traffic, TK applied more-aggressive minimum and maximum thresholds of 30 percent and 100 percent of B, respectively (hence, 4 and 11).

These WRED drop profiles for the premium queue are illustrated in Figure 4-26.


Figure 4-26. WRED Drop Profiles in the Premium Queue on Access

To bound the delay and jitter that the serialization of large data packets inflicts on real-time packets over lower-speed access links, TK activates fragmentation and interleaving, using FRF.12 (see [FRF.12]) in the case of Frame Relay access and the segmentation mechanism built into Multilink PPP (see [MLPPP]) (but used on a single link) in the case of PPP access links. In Example 4-6, you see that FRF.12 fragmentation is activated with a fragment size of 320 bytes (which represents 5 ms on a 512-kbps interface). Note that in the case of Frame Relay, the rate that is meaningful for computing the serialization time of a packet or fragment is the rate of the underlying physical interface (not the PVC CIR), because this rate dictates the serialization time. Although reducing the fragment size further and further would reduce the delay and jitter of the voice traffic accordingly, it would also increase the processing impact in similar proportions. Thus, tradeoffs are necessary. This is why TK selected a fragment size of 320 bytes in that case.

For implementation reasons, after the router scheduler has selected the next packet to be transmitted on the wire, this packet is handed over for actual transmission to the framing and transmission logic via a small buffer. The role of this buffer is to ensure that the transmission logic is supplied with an uninterrupted flow of packets to transmit (assuming that there are indeed packets to transmit). Therefore, no transmission cycle is wasted in accessing packets at transmission time; hence, line-rate transmission can be achieved. On Cisco routers this buffer is called the transmit ring (or Tx-ring for short). Because this buffer is a pure first-in, first-out (FIFO) buffer, a real-time packet just selected by the scheduler for transmission has to wait until the packets already in the Tx-ring buffer are transmitted before it is transmitted. Thus, the Tx-ring effectively adds an additional delay and jitter component that is equal to the time it takes to empty a full Tx-ring. Although Cisco routers automatically adjust the size of the Tx-ring buffer depending on the interface speed, the operator can further fine-tune it if needed. Reducing its size reduces the delay and jitter introduced by the Tx-ring buffer. However, Cisco recommends reducing the Tx-ring only when needed to satisfy specific delay/jitter requirements. Cisco also recommends never reducing it below two packets; otherwise, the router may no longer be able to achieve line rate transmission. As you can see in Example 4-6, TK elected to reconfigure its size to two packets. Because fragmentation is also used and limits the fragment size to 320 bytes (which represents 5 ms at a 512-kbps interface rate), the Tx-ring now introduces a maximum delay and jitter of only 10 ms.

As discussed in the "SLA Monitoring and Reporting" section that follows, TK uses Cisco SAA active measurement to monitor the QoS actually experienced in each CoS over the access links. This involves traffic samples being generated by an SAA shadow router in the POP toward the CE router and then being sent back to the SAA shadow router by the CE router. To perform measurement for each CoS, separate samples are generated for each CoS with the corresponding DSCP marking. To make sure these samples experience the same QoS as the real-time traffic and the in-contract premium traffic, TK needs to make sure the samples are not dropped by the Real-Time CoS policer or marked as out-of-contract by the Premium policer. This is why TK uses hierarchical policies with a parent policy applying the scheduling policy to all the traffic of a given CoS and with a child policy underneath to police only the subset of traffic that is not SAA traffic. Hierarchical policies are ideally suited to this sort of application because they allow the application of a service policy to a class that is itself part of a higher-level policy. This effectively allows for the definition of nested policies.

mPE Router Ingress Policy


Because TK manages the CE routers and thus can trust them to perform accurate classification, marking, and policing, the mPE routers do not need to perform those functions on input interfaces connecting the CE routers.

The default behavior of Cisco PE routers is to copy the 3-bit Precedence field of the IP header into the 3-bit EXP field of any MPLS label stack entry pushed onto a packet received from the CE router. Because this default behavior achieves exactly the EXP value mapping desired by TK and listed in Table 4-4, no explicit ingress QoS service policy needs to be applied on the mPE routers. An example of a design that does require an explicit ingress policy is discussed in the "PE Router Ingress Policy" section of Chapter 5.

mPE Router Egress Policy


TK applies an egress QoS policy on the mPE router to manage the link toward the CE router that is very similar to the CE router egress QoS policy. The main difference is that classification can be performed directly on the DSCP values because all the traffic has already been classified and marked by the ingress CE router.
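
For illustration, a minimal sketch of such an mPE egress policy follows. This is not TK's actual template: the class names are hypothetical, the DSCP values come from Table 4-4, the percentages simply mirror the sample CE policy of Example 4-6, and the metering/policing that this policy is also responsible for is omitted for brevity.

class-map match-any class-RealTime-PE
match ip dscp 46
!
class-map match-any class-Premium-PE
match ip dscp 18
match ip dscp 20
!
class-map match-any class-Routing-PE
match ip dscp 48
!
policy-map PE-to-CE-QoS-policy
class class-RealTime-PE
priority
class class-Premium-PE
bandwidth percent 50
random-detect dscp-based
class class-Routing-PE
bandwidth percent 4
class class-default
bandwidth remaining percent 100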


QoS Design on the Network Edge for Voice Trunking


The telephony soft switches used by TK are configured so that the media streams generated by the VoIP trunk gateways (the packets carrying the packetized voice) are all marked with DSCP value 32. The telephony signaling traffic to be transported over the packet backbone is marked with DSCP value 24.


QoS Design on the Network Edge for Layer 3 MPLS VPN CsC


TK supports the three user-visible CoSs (Real-Time, Premium, and Standard) as well as the Routing CoS to manage possible congestion on the access links of the Carrier's Carrier (CsC) service (the links between the CSC-mPE routers and the CSC-CE routers). In the core, the CsC traffic is handled in exactly the same way as the rest of the VPN traffic and benefits from the same SLA commitments.

Because TK manages the CSC-CE routers, it implements an egress service policy on the link toward the CSC-mPE router. It is similar to the service policy applied on regular VPN CE routers, but with a few adjustments to cope with the fact that all the end-user traffic is label-switched (as opposed to IP-routed) between the CSC-CE router and the CSC-mPE router:

Classification for the Real-Time CoS and the Premium CoS is performed based on the EXP value in the topmost entry of the MPLS label stack after label imposition (or label swapping in the case of hierarchical VPNs) by the CSC-CE router. Real-time traffic is identified by EXP value 5 and premium traffic by EXP value 2.

While the real-time traffic is policed (with dropping of the excess), the premium traffic is not policed on the CSC-CE router. TK currently uses a single EXP value (of 2) for the premium traffic in the MPC, and no EXP value is defined to identify the out-of-contract premium traffic. If in the future TK enhances the QoS design to support differentiated treatment of in-contract and out-of-contract premium traffic in the MPC, a second EXP value will be defined and could be used by policing on the CSC-CE router to mark out-of-contract traffic.

Customer-specific classification is not supported. It is up to the end customer to make sure the packets reach the CSC-CE router with the appropriate marking. Because TK relies on the default EXP marking behavior on the CSC-CE router, this means real-time packets must arrive with a DSCP value whose 3 precedence bits are set to 5 (or with an EXP value of 5 in the case of hierarchical VPN). Also, premium packets must arrive with a DSCP value whose 3 precedence bits are set to 2 (or with an EXP value of 2 in the case of hierarchical VPN).

The BGP routing traffic between the CSC-CE router and the CSC-mPE router is classified via the same IP access list as with regular VPN CE routers because the BGP traffic exchanged between the CSC-CE router and the CSC-mPE router is not encapsulated in MPLS.

The rest of the traffic has its EXP value remarked to 0.
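
As an illustration of these adjustments, a hypothetical fragment of the CSC-CE egress policy might look like the following sketch (the class names and the real-time policing percentage are assumptions; the EXP values and the reuse of access-list 110 for BGP follow from the list above):

class-map match-any class-RealTime-CsC
match mpls exp 5
!
class-map match-any class-Premium-CsC
match mpls exp 2
!
class-map match-any class-Routing-CsC
match access-group 110
!
policy-map CSC-CE-to-PE-QoS-policy
class class-RealTime-CsC
priority
! Contracted rate assumed to be 33 percent, as in the earlier CE example
police cir percent 33 conform-action transmit exceed-action drop
class class-Premium-CsC
! No policing: a single EXP value is used for premium traffic in the MPC
bandwidth percent 50
class class-Routing-CsC
bandwidth percent 4
class class-default
bandwidth remaining percent 100
! Remark the rest of the traffic to EXP 0
set mpls experimental 0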


Figure 4-27 illustrates the location where the CsC service policies are applied as well as marking in DSCP and EXP fields for packets belonging to the Real-Time CoS.


Figure 4-27. CsC QoS Service Policies


As with the regular Layer 3 VPN service, TK does not need to actually activate any input QoS service policy on the CSC-mPE routers because

TK manages the CSC-CE routers and can trust their marking.

The default behavior of the CSC-mPE router achieves the right EXP marking for packets transmitted toward the core (because it copies the EXP value from the topmost entry of the incoming label stack into the swapped label entry and any pushed label entry).


Again, because of the MPLS encapsulation of all end-user traffic, the egress service policy applied on the mPE router over the link to the CE router is specific to the CsC service. This policy is the mirror image of the egress QoS policy applied on the CSC-CE router and just described.


SLA Monitoring and Reporting


TK performs ongoing active measurement using Cisco Service Assurance Agent (SAA) to establish actual performance and to report to its Layer 3 MPLS VPN and Internet users against SLA performance commitments.

Note

Cisco SAA is an embedded performance-monitoring agent in Cisco IOS software that performs active measurement. This means that it generates synthetic packets mimicking various types of packets of interest (for example, short voice packets marked as belonging to the Real-Time CoS or longer TCP packets marked as belonging to the Premium CoS). It also measures the actual performance metrics experienced by these packets when transiting the network (from one SAA agent to another SAA agent), such as delay, jitter, and loss. A router can behave as an SAA generator or an SAA responder, which only responds to SAA probes sent by the generator. An SAA agent can generate probes and perform corresponding measurements at regular intervals. Measurement results can be collected via the command-line interface (CLI) or SNMP. An SAA agent can also generate events asynchronously when measured performance levels cross certain configured thresholds.

As illustrated in Figure 4-28, SAA shadow routers are deployed in every POP, while the SAA responder function is activated on CE routers.


Figure 4-28. Telecom Kingland SLA Measurement


Each SAA shadow router performs ongoing measurements between itself and every other SAA shadow router at 2-minute intervals. The corresponding traffic samples are generated with a DSCP value of 0 so that they get mapped to EXP 0 and get treated as Layer 3 MPLS VPN and Internet traffic in the core. These measurements are used to compute a POP-to-POP matrix of round-trip time, jitter, and loss.

SAA shadow routers also perform ongoing measurements between themselves and CE routers at 5-minute intervals. Separate measurements are performed for each user-visible CoS, each using a DSCP value of 46, 18, or 0 and a packet size of 64 bytes, 128 bytes, and 128 bytes, respectively. These measurements are used to compute a one-way delay (by dividing the round-trip time in half) and a loss for a given site.
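
For illustration, the configuration of one such per-CoS probe on an SAA shadow router might resemble the following sketch (classic IOS SAA syntax; the probe number, destination address, and port are hypothetical, and tos 184 is the ToS-byte encoding of DSCP 46 for the Real-Time CoS):

rtr 46
type jitter dest-ipaddr 192.168.1.1 dest-port 16384 num-packets 10
request-data-size 64
! ToS byte 184 = DSCP 46 shifted left by 2 bits
tos 184
! One measurement every 5 minutes
frequency 300
rtr schedule 46 life forever start-time now

The rtr responder command already present in the CE template of Example 4-6 enables the CE router to answer these probes.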

Actual performance values are computed in the following ways:

Ten sample packets are generated at every measurement interval.

All measurements in a given sample are averaged into a "sample value."

The sample values are averaged over the hour into an "hourly value."

The worst hourly value of the day is retained as the "daily value."

The daily values are averaged over the month into a "monthly value."


Based on these computed values, TK provides a number of SLA reports to its customers through a web interface that includes the following:

Real-time POP-to-POP report: This provides the POP-to-POP matrix of current hourly values for round-trip time, jitter, and loss.

Real-time site report: This provides, for a given VPN site and for each CoS, the current hourly values for one-way delay and loss.

Monthly POP-to-POP report: This provides the POP-to-POP matrix of monthly values for round-trip time, jitter, and loss for comparison against the POP-independent values committed in the VPN and Internet SLA.

Monthly site report: This provides, for a given VPN site and for each CoS, the monthly values for one-way delay and loss, as well as the number of bytes and packets transmitted in each direction and the site availability for the month.


To control the end-to-end quality of service experienced by telephony transit traffic, TK also performs separate end-to-end active measurement from VoIP gateway to VoIP gateway for delay, jitter, and loss.

RTP Control Protocol (RTCP) (see [RTP]) lets TK monitor the quality of service experienced by the voice media streams carried over the Real-time Transport Protocol (RTP) (see [RTP]) by performing ongoing measurement of statistics such as packet loss and jitter during a voice call. These statistics are collected by the telephony VoIP trunk gateways in TK's network and then recorded as part of the Call Detail Record (CDR) established for every phone call and collected by a central server for applications such as billing. TK developed an application that accesses the CDR QoS statistics on the server and analyzes them to confirm operation within the targeted QoS objectives.

