Quality of Service in MPLS Networks
QoS ensures that packets receive appropriate treatment as they travel through the network. This helps applications and end users have an experience that is in line with their requirements and expectations, and with the commitments contracted by the customer with the network operator.

This section first discusses traffic requirements and Service Level Agreements (SLAs). Then it covers the mechanisms available to enforce QoS in the network. It finishes by discussing QoS design approaches on the edge and in the core involving different combinations of these QoS mechanisms.
Traffic Requirements and Service Level Agreements
A number of relevant metrics characterize traffic requirements, as discussed in the following sections.
Application Requirements
As with other statistical multiplexing networks such as Frame Relay and ATM, when you're trying to characterize the effect that an MPLS/IP network has on a data stream, it is natural to consider the following metrics:

- Which packets arrive and which do not. This characterizes packet loss.
- Out of the packets that did arrive, how long did it take them to get there, and did this time vary? This characterizes packet delay and delay variation.
- When is such service available? This defines service availability.
The IP Performance Metrics Working Group (IPPM) of the Internet Engineering Task Force (IETF) has developed formal specifications for these metrics.

If a given packet arrives at its destination measurement point and does so within a "reasonable" time (meaning that it is still useful to the application), the packet is not lost (see [IPPM-LOSS]). Otherwise, it is considered lost. Loss in a network may be caused by packets being dropped from a device's egress queue when this queue builds up too much because of congestion and packets have to be discarded. We will revisit this topic in more detail later in this chapter.

The one-way delay (see [IPPM-OWDELAY]) is the amount of time between when the last bit of the packet is transmitted by the source measurement point and when the first bit of the packet is received at the destination measurement point.

The round-trip delay (see [IPPM-RTDELAY]), more commonly called the round-trip time (RTT), is the amount of time between when the last bit of the packet is transmitted by the source measurement point and when the first bit of the packet is received at the source measurement point, after it has been received and immediately sent back by the destination measurement point.

The delay variation between two packets (see [IPPM-DELVAR]) is the difference between the one-way delays experienced by these two packets.

In advanced high-speed routers, the switching delay is on the order of tens of microseconds and is therefore negligible. Thus, the one-way delay in a network is caused by three main components:

- Serialization delay at each hop. This is the time it takes to clock all the bits of the packet onto the wire. It is very significant on a low-speed link (187 milliseconds (ms) for a 1500-byte packet on a 64-kbps link) and entirely negligible at high speeds (1.2 microseconds for a 1500-byte packet on a 10-Gbps link), as illustrated in the sketch following this list. For a given link, this is clearly a fixed delay.
- Propagation delay end to end. This is the time it takes for the signal to physically propagate from one end of the link to the other. It is constrained by the speed of light on fiber (or the propagation speed of electrical signals on copper) and is about 5 ms per 1000 km. Again, for a given link, this is a fixed delay.
- Queuing delay at each hop. This is the time spent by the packet in an egress queue waiting for transmission of other packets before it can be sent on the wire. This delay varies with queue occupancy, which in turn depends on the packet arrival distribution and the queue service rate.
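To make the arithmetic concrete, here is a minimal sketch (ours, not from the text) that reproduces the serialization and propagation figures quoted above; the function names and example link parameters are illustrative choices.

```python
# Sketch of the fixed one-way delay components described above.

def serialization_delay(packet_bytes: int, link_bps: float) -> float:
    """Time to clock all bits of the packet onto the wire, in seconds."""
    return packet_bytes * 8 / link_bps

def propagation_delay(distance_km: float, km_per_ms: float = 200.0) -> float:
    """Signal propagation time in seconds (~5 ms per 1000 km on fiber)."""
    return distance_km / km_per_ms / 1000.0

# 1500-byte packet on a 64-kbps access link: ~187 ms
print(f"{serialization_delay(1500, 64e3) * 1e3:.1f} ms")
# Same packet on a 10-Gbps core link: ~1.2 microseconds
print(f"{serialization_delay(1500, 10e9) * 1e6:.1f} us")
# Propagation over 1000 km of fiber: ~5 ms
print(f"{propagation_delay(1000) * 1e3:.1f} ms")
```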
In the absence of routing change, because the serialization delay and propagation delay are fixed by physics for a given path, the delay variation in a network results exclusively from variation in the queuing delay at every hop. In the event of a routing change, the corresponding change of the traffic path is likely to result in a sudden variation in delay.

The availability characterizes the period during which the service is available for traffic transmission between the source and destination measurement points (usually expressed as a percentage of the available time over the total measurement period).

Although many applications using a given network may each potentially have their own specific QoS requirements, they can actually be grouped into a limited number of broad categories with similar QoS requirements. These categories are called classes of service. The number and definition of such classes of service is arbitrary and depends on the environment.

In the context of telephony, we'll call the delay between when a sound is made by a speaker and when that sound is heard by a listener the mouth-to-ear delay. Telephony users are very sensitive to this mouth-to-ear delay because it might impact conversational dynamics and result in echo. [G114] specifies that a mouth-to-ear delay below 150 ms results in very high-quality perception for the vast majority of telephony users. Hence, this is used as the design target for very high-quality voice over IP (VoIP) applications. Less-stringent design targets are also used in some environments where good or medium quality is acceptable.

Because the codec on the receiving VoIP gateway effectively needs to decode a constant rate of voice samples, a de-jitter buffer is used to compensate for the delay variation in the received stream. This buffer effectively turns the delay variation into a fixed delay. VoIP gateways commonly use an adaptive de-jitter buffer that dynamically adjusts its size to the delay variation currently observed. This means that the delay variation experienced by packets in the network directly contributes to the mouth-to-ear delay.

Therefore, assuming a delay budget of 40 ms for the telephony application itself (packetization time, voice activity detection, codec encoding, codec decoding, and so on), you see that the sum of the VoIP one-way delay target and the delay variation target for the network for high-quality telephony is 110 ms end to end (including both the core and access links).

Assuming random distribution of loss, a packet loss of 0.1 to 0.5 percent results in virtually undetectable, or very tolerable, service degradation and is often used as the target for high-quality VoIP services (see [SLA]).

For interactive mission-critical applications, an end-to-end RTT on the order of 300 to 400 ms is usually a sufficient target to ensure that an end user can work without being affected by network-induced delay. Delay variation is not really relevant. A loss ratio of about 0.5 to 1 percent may be targeted for such applications, resulting in sufficiently rare retransmissions.

For noninteractive mission-critical applications, the key QoS element is to maintain a low loss ratio (with a target in the range of 0.1 to 0.5 percent) because this is what drives the throughput via the TCP congestion avoidance mechanisms. Only loose commitments on delay are necessary for these applications, and delay variation is irrelevant.
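The end-to-end network budget above is simple arithmetic; this short sketch (ours, with variable names as assumptions) makes the derivation explicit:

```python
# Worked check of the VoIP delay budget discussed above.
# The numbers come from the text ([G114] target and the assumed
# 40-ms application budget); the variable names are ours.

mouth_to_ear_target_ms = 150   # [G114] very high quality target
application_budget_ms = 40     # packetization, codec encode/decode, etc.

# What remains for the network: one-way delay plus delay variation
# (the de-jitter buffer turns variation into fixed delay).
network_budget_ms = mouth_to_ear_target_ms - application_budget_ms
print(network_budget_ms)  # 110 ms end to end, core plus access
```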
Service Level Agreement
A Service Level Agreement (SLA) is a contractual arrangement between the operator and the customer formalizing the operator's commitments to address the customer-specific QoS traffic requirements.

SLAs offered by operators typically are made up of the following elements:

Traffic Conditioning Specification (TCS) This identifies, for each class of service (CoS) and each site, the subset of traffic eligible for the class of service commitment. For example, for a given site this may indicate the following:
- VoIP traffic from that site up to 1 Mbps is to receive the "Real-Time" class QoS commitments, and VoIP traffic beyond that rate is dropped.
- SAP traffic up to 500 kbps is to receive the QoS commitments of the "Mission-Critical" class, and SAP traffic beyond that rate is transmitted without commitment.
- The rest of the traffic is to receive the QoS commitments of the "Best-Effort" class.
Service Level Specification (SLS) This specifies, for each class of service, the QoS commitments (such as delay, delay variation, loss, and availability) granted to the eligible traffic.

SLS commitments are statistical, and the corresponding SLS reporting is based on active measurement: sample traffic is injected into the network between measurement points, recording the QoS metrics (delay/jitter/loss) actually experienced by these samples. The SLS must specify

- How a single occurrence of each metric is measured
- What series of samples the metric is measured on and at what frequencies the series are generated
- How statistics are derived based on the measured metric over the sample series
Multiple statistics can be defined, such as percentiles, median, and minimum, as done in [IPPM-OWDELAY]. However, because the SLS must be kept simple enough for easy communication between operator and customer, and because the tighter the commitment, the harder it is for the operator to meet it, SLSs offered today generally focus on an average of the measured metric over a period of time, such as a month.

The SLS must also indicate the span of the commitment. With unmanaged CE routers, this is from POP to POP. With managed CE routers, this is most commonly based on a point-to-cloud model, where the SLS commitments are expressed separately over CE-to-POP, POP-to-POP, and POP-to-CE. However, for some classes, such as VoIP, it may sometimes be based on a point-to-point model, where SLS commitments are expressed CE-to-CE. This provides a more accurate end-to-end QoS indicator for the end user than the concatenation of commitments over multiple segments. These two models are illustrated in Figure 2-1.
Figure 2-1. Point-to-Cloud and Point-to-Point SLS Models

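As a rough illustration of SLS reporting based on active measurement, here is a minimal sketch; the sample data, the 500-ms "reasonable time" threshold, and the function name are illustrative assumptions, not values from the text.

```python
# Sketch of deriving simple SLS statistics over a series of injected
# probe samples, as described above.

from statistics import mean

def monthly_sls_report(one_way_delays_ms: list[float | None],
                       loss_threshold_ms: float = 500.0) -> dict:
    """Derive simple SLS statistics over a series of probe samples.

    A sample of None (no arrival), or one arriving past the threshold,
    counts as lost, per the [IPPM-LOSS] notion of "reasonable" time.
    """
    received = [d for d in one_way_delays_ms
                if d is not None and d <= loss_threshold_ms]
    lost = len(one_way_delays_ms) - len(received)
    return {
        "avg_delay_ms": mean(received) if received else None,
        "loss_ratio": lost / len(one_way_delays_ms),
    }

print(monthly_sls_report([23.1, 24.8, None, 22.9, 610.0, 23.4]))
```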
QoS Mechanisms
The previous sections showed that the prime QoS metrics are delay, delay variation, and loss. We can also observe that the delay (excluding its components imposed by serialization and propagation), the delay variation, and the loss all result purely from egress queuing (in the absence of topology change). This explains why the QoS mechanisms we will now discuss are all fundamentally designed to contribute in different ways to reducing egress queue occupancy for traffic requiring high QoS.

Mechanisms that accelerate network recovery after topology changes, and hence reduce the loss and delay variation induced in such situations, are discussed in the "Core Network Availability" section.
The Fundamental QoS Versus Utilization Curve
Fundamental queuing theory (see [QUEUING1] and [QUEUING2]) teaches that, if you consider a queuing system, the queue occupancy at steady state depends on the actual arrival distribution and on the service pattern. But if you define the utilization as the ratio between the average arrival rate and the average service rate, you observe that, on high-speed links, regardless of those distributions, the following points are true:

- If the utilization is greater than 1, there is no steady state; the queue keeps growing (or packets keep getting dropped when the queue limit is reached), and QoS is extremely bad.
- If the utilization is sufficiently less than 1, the queue occupancy remains very small, and QoS is very good.
- As the utilization increases toward 1, queue occupancy increases and QoS degrades in a way that is dependent on the packet arrival distribution.
Therefore, the fundamental QoS versus utilization curve looks like Figure 2-2. This curve is a fictitious one intended to show the general characteristics. The exact shape depends on multiple parameters, including the actual arrival distribution, which is notoriously hard to characterize in IP networks.
Figure 2-2. The Fundamental QoS Versus Utilization Curve

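To illustrate the general shape of this curve, the following sketch evaluates average queuing delay under the M/M/1 model introduced in [QUEUING1] and [QUEUING2]. Treat it as one school of thought's approximation (Poisson arrivals, exponential service; see the "Core QoS Engineering" section), not an exact characterization of IP traffic; the link parameters are assumptions.

```python
# Sketch of the qualitative curve in Figure 2-2 using the M/M/1 model:
# average queuing delay grows without bound as utilization approaches 1.

def mm1_avg_queuing_delay(utilization: float, service_rate_pps: float) -> float:
    """Average time waiting in queue (excluding service) for M/M/1:
    W_q = rho / (mu * (1 - rho))."""
    if utilization >= 1.0:
        return float("inf")  # no steady state: the queue keeps growing
    return utilization / (service_rate_pps * (1.0 - utilization))

# 10-Gbps link, 1500-byte packets => service rate ~833k packets/s
mu = 10e9 / (1500 * 8)
for rho in (0.3, 0.6, 0.9, 0.99):
    print(f"rho={rho:.2f}: {mm1_avg_queuing_delay(rho, mu) * 1e6:.1f} us")
```

Note how, under this model, even 90 percent utilization on a 10-Gbps link yields only tens of microseconds of average queuing delay, consistent with the high achievable utilizations on very high-speed links discussed later.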
The fundamental data path mechanisms used in MPLS/IP networks are those of the Internet Engineering Task Force (IETF) DiffServ model and the extensions to MPLS for the support of DiffServ.

The control plane mechanisms used in MPLS/IP networks are IGP metric tuning, MPLS traffic engineering (MPLS TE), and DiffServ-aware MPLS traffic engineering (MPLS DS-TE), which are described later in this chapter in the sections "Traffic Engineering" and "DiffServ-Aware MPLS Traffic Engineering." Their characteristics of interest from a QoS perspective are briefly compared in Table 2-1.
Table 2-1. Control Plane Mechanisms Compared from a QoS Perspective

| | IGP Tuning | Traffic Engineering | DiffServ-Aware MPLS Traffic Engineering |
|---|---|---|---|
| Mode | Connectionless | Connection-oriented | Connection-oriented |
| Constraints | Optimizes on a single metric | Optimizes on one of multiple metrics. Satisfies multiple arbitrary constraints, including an aggregate bandwidth constraint. | Optimizes on one of multiple metrics. Satisfies multiple arbitrary constraints, including a per-class bandwidth constraint. |
| Admission Control | No | On an aggregate basis. Can be used to limit aggregate utilization. | On a per-class basis. Can be used to independently limit the utilization of each class. |
The IETF DiffServ Model and Mechanisms
The objective of the IETF DiffServ model is to achieve service differentiation in the network so that different applications, including real-time traffic, can be granted their required level of service while retaining high scalability for operations in the largest IP networks.

Scalability is achieved by

- Separating traffic into a small number of classes.
- Mapping the many application/customer flows into these classes of service on the edge of the network so that the functions that may require a lot of state information are kept away from the core. This mapping function is called traffic classification and conditioning. It can classify traffic based on many possible criteria, may compare it to traffic profiles, may adjust the traffic distribution (shape, drop excess), and finally may mark a field in the packet header (the Differentiated Services or DS field) to indicate the selected class of service. (A minimal classification sketch follows this list.)
- Having core routers that are aware of only the few classes of service conveyed in the DS field.
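As a rough sketch of the classification-and-marking step performed at the edge (mirroring the TCS example given earlier), the following code maps flows to classes and returns a DSCP to mark; the match criteria and port numbers are purely illustrative assumptions.

```python
# Sketch of edge traffic classification and marking. The EF and AF11
# codepoint values are standard; everything else here is assumed.

EF, AF11, BEST_EFFORT = 46, 10, 0  # standard DSCP values

def classify_and_mark(packet: dict) -> int:
    """Map a flow to a class of service and return the DSCP to mark."""
    if packet.get("udp_dst_port") in range(16384, 32768):  # assumed VoIP range
        return EF                      # "Real-Time" class
    if packet.get("tcp_dst_port") == 3200:                 # assumed SAP port
        return AF11                    # "Mission-Critical" class
    return BEST_EFFORT                 # everything else

print(classify_and_mark({"udp_dst_port": 16500}))  # 46 (EF)
```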
You can ensure appropriate service differentiation by doing the following:

- Providing a consistent treatment for each class of service at every hop (known as the per-hop behavior [PHB]) corresponding to its specific QoS requirements
- Allowing the service rate of each class to be controlled (by configuring the PHB) so that the utilization can be controlled separately for each class, allowing capacity planning to be performed on a per-class basis
These key elements of the DiffServ architecture are illustrated in Figure 2-3.
Figure 2-3. The DiffServ Architecture
The DiffServ model redefines the former IPv4 type of service octet as the Differentiated Services (DS) field, whose 6 most significant bits encode the Differentiated Services Codepoint (DSCP) (see [DS-FIELD] and [DIFF-TERM]). The remaining 2 bits are used for Explicit Congestion Notification (ECN; see [ECN]).

The DS field is used at the edge of the network by the traffic classification and conditioning function to encode the DSCP value. This value is used at every hop by DiffServ routers to select the PHB that is to be experienced by each packet they forward.

The 3 most significant bits of the superseded IPv4 type of service octet were used to represent the Precedence field, which was intended for use in a more limited but similar way to the DS field. The DS, DSCP, and Precedence fields are illustrated in Figure 2-4.
Figure 2-4. The DS Field and DSCP

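The bit layout in Figure 2-4 can be expressed directly in code. This minimal sketch (ours) extracts the DSCP, ECN, and historical Precedence fields from the 8-bit octet:

```python
# Sketch of the DS/ToS octet layout shown in Figure 2-4.

def parse_tos_octet(tos: int) -> tuple[int, int, int]:
    """Return (dscp, ecn, precedence) from the 8-bit DS/ToS octet."""
    dscp = tos >> 2          # 6 most significant bits
    ecn = tos & 0b11         # 2 least significant bits
    precedence = tos >> 5    # historical 3-bit Precedence field
    return dscp, ecn, precedence

# EF (DSCP 46) with no ECN: octet value 0xB8
print(parse_tos_octet(0xB8))  # (46, 0, 5)
```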
Figure 2-5. The DiffServ Traffic Classifier and Traffic Conditioner

Figure 2-6. Sample WRED Profiles for the AF1 Class

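A WRED profile of the kind shown in Figure 2-6 maps average queue depth to a drop probability, with a more aggressive profile for each higher drop precedence within the AF1 class. Here is a minimal sketch; the thresholds and maximum drop probabilities are illustrative assumptions, not recommended settings.

```python
# Sketch of the standard linear WRED drop-probability computation.

def wred_drop_probability(avg_queue: float, min_th: float,
                          max_th: float, max_p: float) -> float:
    """Linear WRED drop probability for a given average queue depth."""
    if avg_queue < min_th:
        return 0.0                     # below min threshold: never drop
    if avg_queue >= max_th:
        return 1.0                     # above max threshold: drop all
    return max_p * (avg_queue - min_th) / (max_th - min_th)

# Assumed profiles (min_th, max_th in packets, max_p): AF11 gentler than AF13
profiles = {"AF11": (30, 60, 0.1), "AF12": (20, 50, 0.2), "AF13": (10, 40, 0.3)}
for name, (mn, mx, p) in profiles.items():
    print(name, round(wred_drop_probability(35, mn, mx, p), 3))
```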
MPLS Support of DiffServ
The MPLS shim header includes a 3-bit field, the EXP field, originally reserved by [MPLS-STACK] for experimental use. This field is shown in Figure 2-7.

[MPLS-DIFF] defines two types of LSPs for transporting DiffServ traffic: E-LSPs, where the router determines the PHB from the EXP field of the received packet, and L-LSPs, where the PHB is inferred from the label itself (with the EXP field conveying only the drop precedence). Figures 2-8 and 2-9 give an example of each.
Figure 2-7. MPLS Header and Exp Field

Figure 2-8. E-LSP Example for EF and AF1

Figure 2-9. L-LSP Example for EF and AF1

As illustrated in Figure 2-10, at the boundary between the IP and MPLS network, the Label Edge Router (LER) first identifies the PHB to apply to an incoming IP packet and then selects the outgoing LSP based on the packet destination and, possibly, on the PHB. Finally, the LER sets the EXP field to indicate the PHB to be applied.
Figure 2-10. DiffServ Label Edge Router
[MPLS-DIFF] specifies the signaling extensions to LDP (see [LDP]) and RSVP-TE (see [RSVP-TE]) for setup, maintenance, and teardown of E-LSPs and L-LSPs. However, E-LSPs relying on a predefined mapping between EXP values and PHBs do not require the use of any of these signaling extensions, because by definition the necessary information is preconfigured.

Even though the way to convey the PHB to a router is different in an MPLS network compared to an IP network, the actual PHBs applied are strictly the same (Default, EF, AFij, and so on). They can be instantiated via the exact same packet scheduling and active queue management mechanisms. No MPLS-specific scheduling mechanism (such as per-label queuing) is involved in supporting DiffServ over MPLS. Consequently, a pure DiffServ service supported over an MPLS cloud is indistinguishable from the DiffServ service supported over an IP network. Note, however, that a DiffServ service over MPLS may be enhanced via additional MPLS mechanisms such as TE or DS-TE.

Production deployment of DiffServ over MPLS today uses E-LSPs with a preconfigured mapping between EXP values and PHBs. (The exception is label switching-controlled ATM MPLS networks, where only L-LSPs are applicable because the EXP field is invisible to ATM LSRs.) This allows for very simple deployment in the core with very smooth introduction, because no resignaling of LSPs is required when deploying DiffServ; it involves only reconfiguring the PHBs on routers so that they classify packets based on the EXP values in the MPLS header and apply the necessary PHB. L-LSPs may be used in the future, if and when more than eight PHBs are needed in the MPLS core.
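For a preconfigured E-LSP deployment such as the one described above, PHB selection at each LSR amounts to a static table lookup keyed on the 3-bit EXP value. A minimal sketch follows; the particular EXP-to-PHB mapping is an illustrative assumption, since each operator chooses its own.

```python
# Sketch of the preconfigured EXP-to-PHB mapping used with E-LSPs.
# This mapping is assumed for illustration; operators define their own.

EXP_TO_PHB = {
    5: "EF",       # real-time traffic
    4: "AF41",
    1: "AF11",     # mission-critical data
    0: "Default",  # best effort
}

def phb_for_label_entry(exp_bits: int) -> str:
    """Select the PHB from the EXP field of the top MPLS label entry."""
    return EXP_TO_PHB.get(exp_bits, "Default")

print(phb_for_label_entry(5))  # EF
```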
Combining Tools to Support SLA
Traffic Engineering" and "DiffServ-Aware MPLS Traffic Engineering" sections) may be combined to control traffic utilization and in turn control QoS. The following section examines how this is achieved.
Core QoS Engineering
To help understand the challenge involved in core QoS design, we must briefly discuss the time scale of traffic distribution. Operators closely monitor the traffic utilization on their links with a typical measurement period of about 5 to 10 minutes. This provides excellent information on traffic trends over the day, week, or month. However, to apply queuing theory or perform a simulation to predict queue occupancy (and therefore QoS), traffic distribution must be understood at a time scale on the order of the millisecond.

The challenge is that, as of today, the traffic characteristics of large traffic aggregates in an IP core at small time scales are not well understood. In other words, it is difficult to estimate the small time scale distribution of packets that have to be queued on a link simply by knowing the long-term utilization. Two main schools of thought can be identified:

- One suggests that traffic tends to smooth with aggregation, so traffic aggregates can be considered smooth and Markovian, and M/M/1 queuing theory applies. See [QUEUING1] or [QUEUING2] for an introduction to the M/M/1 model and associated formulas.
- The other suggests that traffic does not smooth with aggregation and that traffic aggregates are bursty and self-similar. In that case, limited theoretical results are available to characterize queue occupancy.
Depending on their assumptions about aggregate traffic characteristics, papers on the subject reach different conclusions about the maximum large time scale utilization that can be achieved on core links while maintaining specific levels of QoS (say, for VoIP); see, for example, [TRAFFIC1], [TRAFFIC2], [TRAFFIC3], and [TRAFFIC4]. However, it is more and more widely accepted that very high peak utilizations at a large time scale may be achieved on very high-speed links while maintaining very good delay/jitter/loss characteristics (more than 90 to 95 percent on OC-48 and OC-192). On lower-speed links, the maximum utilization that can be achieved while offering very good QoS is significantly lower, suggesting that enforcing an aggregate maximum utilization across all traffic to ensure high QoS for just a subset of the traffic (such as VoIP) may involve more significant bandwidth wastage.

To determine the maximum utilization to enforce in a given network to achieve a given level of QoS, network operators use a combination of theoretical analysis, simulation, heuristics based on real-life observation, and safety margin.

In light of this relationship between the utilization measured at a large time scale and QoS performance levels, we see that selecting a QoS design for the core is mainly a trade-off between

- Capital expenditure involved in link/port capacity increase
- Operational expenditure involved in deploying QoS mechanisms (engineering, configuration, monitoring, fine-tuning, and so on)
- The level of QoS performance targeted
In other words, it is a trade-off between throwing bandwidth at the QoS problem and throwing mechanisms at the QoS problem.

Where provisioning extra link capacity doesn't involve significant capital expenditure or lead time and only very high-speed links are used, it is generally not worth enforcing a different maximum utilization per class, so aggregate capacity planning is the most effective approach. Operators in such environments rely exclusively on capacity planning for both normal operations and failure situations. They may compensate for the inherent shortcomings of capacity planning (such as unexpected traffic growth or unanticipated failure) through an additional safety margin. A typical capacity planning policy is to maintain an aggregate maximum utilization below 40 to 50 percent in the normal case and below 70 percent in failure situations (a minimal sketch of such a check follows). Alternatively, to mitigate the capacity planning shortcomings, some operators resort to MPLS Traffic Engineering to remove local congestion by redistributing the traffic load, avoiding an excessive safety margin that could result in significant overengineering and bandwidth wastage.

Other operators believe that although capacity planning is the key tool, DiffServ can ideally address its shortcomings. In the face of an unplanned situation or event, the capacity planning rules may be breached for some period of time. DiffServ can easily be used to ensure that the resulting degradation affects only some classes (for example, best effort) and has no noticeable impact on important classes (such as VoIP).

Finally, some other networks may use lower-speed links (for example, DS-3 or OC-3) in their core, so that the maximum utilization that can be achieved for different classes is significantly different, meaning that aggregate capacity planning would result in significant capacity wastage. Also, in some parts of the world, link bandwidth increases may represent significant capital expenditure or long lead times. In these cases, extensive use of DiffServ even in the absence of failure is likely to be the most cost-effective approach, with capacity planning performed on a per-class basis. Where fine or very fine optimization of link utilization is sought, TE or DS-TE can complement DiffServ and capacity planning through their constraint-based routing and admission-control capabilities so that traffic load is optimally redistributed on links.
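The aggregate capacity planning policy quoted above is easy to automate as a periodic check. Here is a minimal sketch, assuming the operator already has measured normal-case peaks and simulated worst-case failure utilizations per link; the link names, data, and exact thresholds are illustrative.

```python
# Sketch of an aggregate capacity planning check against the policy
# discussed above (below ~50% normally, below 70% under failure).

NORMAL_MAX, FAILURE_MAX = 0.50, 0.70

def links_needing_upgrade(links: dict[str, tuple[float, float]]) -> list[str]:
    """links maps name -> (normal-case peak utilization, worst-case
    failure utilization), both measured/simulated at a large time scale."""
    return [name for name, (normal, failure) in links.items()
            if normal > NORMAL_MAX or failure > FAILURE_MAX]

links = {"pop1-pop2": (0.35, 0.62), "pop2-pop3": (0.48, 0.81)}
print(links_needing_upgrade(links))  # ['pop2-pop3']
```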
Edge QoS Engineering
The network edge is where the boundary between the customer and the network operator lies. Hence, when DiffServ is used in the network, all the traffic classification and conditioning functions necessary to reflect the SLA traffic conditioning specification must be deployed on the edge (on the PE router and/or on the CE router).

Because they are dedicated to a single customer and run right to his or her premises, access links generally cannot be easily provisioned to speeds sufficient to ensure that congestion will never occur. Thus, the access links are often the most stringent bottlenecks in the end-to-end path. For this reason, finer-grained QoS is usually supported on the access links, with a higher number of classes of service. Multiple PHBs/queues corresponding to the offered classes of service are typically activated on CE routers and PE routers on the access links.

Also, because the serialization delay on low-speed access links can be high for long packets, fragmentation and interleaving mechanisms may be used to allow packets from real-time flows, such as VoIP, to be transmitted without waiting for complete transmission of a long packet.

Because of the significant serialization delays, and because small queue occupancy has a bigger impact on QoS metrics on low-speed links, the QoS metrics provided in the SLS on the access are dependent on the access speed. They also involve (for the classes of service with high QoS commitments) strict traffic conditioning such as policing, shaping, or noncommitment for traffic exceeding an agreed-upon traffic profile.
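The fragment size used by fragmentation and interleaving follows directly from the serialization delay reasoning above: pick the largest fragment whose serialization delay fits the interleaving budget. A minimal sketch, assuming a 10-ms budget (a common rule of thumb, not a value from the text):

```python
# Sketch of sizing fragments so a real-time packet never waits longer
# than the interleaving budget behind a fragment already in flight.

def max_fragment_bytes(link_bps: float, budget_s: float = 0.010) -> int:
    """Largest fragment whose serialization delay fits the budget."""
    return int(link_bps * budget_s / 8)

# On a 128-kbps access link, fragments of at most 160 bytes keep the
# wait behind an in-flight fragment to ~10 ms.
print(max_fragment_bytes(128e3))  # 160
```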
QoS Models
In summary, the design models discussed for MPLS networks can be referred to as N/M/P, where

- N is the number of queues on access
- M is the number of queues in the core
- P is the number of TE/DS-TE class types, where
  - 0 means that MPLS TE is not used
  - 1 means that MPLS TE is used
  - 2 (or more) means that DS-TE is used with two (or more) class types
The design studies later in this book illustrate the most common QoS models. The operator described in Chapter 3, "Interexchange Carrier Design Study," uses the 1/1/0 QoS model (or the 3/1/0 model if the customer selects the corresponding access option). Telecom Kingland, described in Chapter 4, "National Telco Design Study," deployed the 4/3/1 model. Globenet, described in Chapter 5, "Global Service Provider Design Study," relied on the 5/3/0 model in North America and on the 5/5/2 model in other parts of the world. Eurobank, described in Chapter 6, "Large Enterprise Network Design Study," selected the 1/6/0, 3/6/0, and 6/6/0 models (depending on their access links).