Using iproute2
The iproute2 package ships with most Linux distributions, often under the
name iproute. You can also obtain it from , its official home. This package
includes several commands, two of which are covered here: ip and tc.
Using ip
The iproute2 command for manipulating routing tables and rules is ip. This
program relies on several of the suboptions of IP: Advanced Router in the
kernel configuration, as described earlier. The program is used as follows:

    ip command [list | add | del] selector action
You can specify any of several commands. One of the most important of these
is rule. You can use this command to add (add), delete (del), or display
information on (list) specific routing rules. You specify a rule with the
selector, which itself is composed of several items:

    [from addr] [to addr] [tos TOS] [dev device-name] [pref number]
The from and to elements allow you to specify IP addresses, tos lets you
specify a TOS value (which is a number, such as 4; this requires a kernel
option that's described shortly), dev specifies the network device name
(such as eth0), and pref signifies a preference number. These items
collectively tell Linux how to identify packets to which a given rule
applies. The ip rule command links these to an action, which has several
components:

    [table table-id] [nat address] [prohibit | reject | unreachable]
The table-id is a number identifying a particular routing table, nat lets
you specify a new source address for the packet, and prohibit, reject, and
unreachable are codes to indicate various methods of completely rejecting
the packet.

Putting this all together, you might enter an ip command that resembles the
following:

    # ip rule add from 172.20.24.128 dev eth0 table 2
This rule tells the system to use routing table 2 for all traffic from
172.20.24.128 on eth0. What, though, is routing table 2? An ordinary Linux
installation uses the route command to create the routing table, and there's
precisely one routing table on such a system. The advanced routing features
allow you to use multiple routing tables, which you set up with the ip route
command. You can then quickly switch between different routing tables for
handling different types of traffic, using other routing tools. The ip route
command is more complex than the normal route, but its features are mostly a
superset of those of the normal route command. Thus, you can use ip route
much as you would route, as described in Chapter 2. One extension is
particularly important, though: You can specify the routing table number
with the table table-id option. For instance, you might use the following
command to add a route to routing table 2:

    # ip route add 10.201.0.0/16 dev eth1 table 2
Aside from the leading ip and the trailing table 2, this command works just
like an equivalent route command. Specifically, it tells the system to pass
all data for the 10.201.0.0/16 network over eth1 without sending it to
another router. (In this case, eth1 should have an address on the
10.201.0.0/16 network.)
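
The rule and the route work as a pair: the route populates table 2, and the rule sends matching traffic to that table. The following is a minimal dry-run sketch of that pairing, assuming the addresses, device names, and table number from the examples above. The helper name policy_route_cmds is hypothetical and not part of iproute2. The script only prints the commands, because running them requires root privileges; the last two printed commands are read-only checks you might run afterward.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the ip commands that send traffic
# from one source address through a dedicated routing table.
# policy_route_cmds is a hypothetical helper name, not part of iproute2.
policy_route_cmds() {
    src=$1      # source address to match, e.g. 172.20.24.128
    in_dev=$2   # device the traffic arrives on, e.g. eth0
    table=$3    # routing table number, e.g. 2
    net=$4      # destination network stored in the table, e.g. 10.201.0.0/16
    out_dev=$5  # device that reaches that network, e.g. eth1

    # Populate the table first, then add the rule that selects it.
    echo "ip route add $net dev $out_dev table $table"
    echo "ip rule add from $src dev $in_dev table $table"
    # Read-only commands for inspecting the result.
    echo "ip rule list"
    echo "ip route show table $table"
}

policy_route_cmds 172.20.24.128 eth0 2 10.201.0.0/16 eth1
```

Printing the commands first makes it easy to review them, or to pipe them to a root shell once they look right.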
Using tc
The tc utility relies on the QoS and/or Fair Queueing kernel configuration
options. You can use it to manage outgoing network bandwidth, in order to
prevent one class of traffic from monopolizing the available bandwidth. For
instance, suppose your organization has two subnets, each corresponding to
an office with a dozen users. If a user from one of these offices begins
running a very bandwidth-intensive task, this action may degrade network
performance for users in the other office. You can use tc to provide a
partial fix by guaranteeing a certain amount of bandwidth for each subnet.

NOTE

It's important to remember that a TCP/IP
router (or any computer on a TCP/IP network) can only control its outgoing traffic. Thus, tc can only
adjust outgoing bandwidth. This works in a router because a sender will slow
its transmission of TCP packets when it sees that your router is saturated,
even if that saturation is created through a QoS policy. (This won't
work for UDP packets, though.)
The basic syntax of tc is as follows:

    tc [options] object command

Each of the parameters has certain possible values:

options
  This can be -statistics (or -s), -details (or -d), or -raw (or -r).

object
  This can be qdisc, class, or filter. The qdisc sets the queueing
  discipline, the rule by which queued packets are scheduled. The class
  defines a set of packets that fit a category (such as one of the two
  offices). The filter brings these together to generate a filter rule.

command
  The command is a set of parameters that define precisely what tc does
  with the object. What goes into a command is quite varied and
  object-specific.

To use tc, you generate a series of rules that together define the networks
to which the computer is connected and how the available bandwidth should be
allocated among these networks. For instance, suppose you want to implement
a 50/50 split of 100Mbps of outgoing bandwidth between two offices. The
Internet at large is on eth0, and both offices are on eth1, although one
uses the 192.168.1.0/24 IP address subnet and the other uses 192.168.2.0/24.
To begin the process, use tc to initialize a queueing discipline on eth1:

    # tc qdisc add dev eth1 root handle 10: cbq bandwidth 100Mbit \
        avpkt 1000
This command can be broken down into several parts:

add dev eth1
  This tells the system that you're adding a queueing discipline for eth1.

root
  Some disciplines arrange themselves in virtual trees that branch off of a
  "root." This parameter tells tc that you're creating a new root for the
  tree.

handle 10:
  This parameter defines a label (handle) for the discipline.

cbq
  You must tell the system which queueing method to use. The Class-Based
  Queueing (CBQ) method is a common one. This entry should correspond to the
  name of a specific option in the QoS and/or Fair Queueing kernel
  configuration menu.

bandwidth 100Mbit
  You must tell the system how much bandwidth is available on the network.
  In the case of a router with differing bandwidth on its separate ports,
  this will normally be the lesser bandwidth value; you don't want to
  overschedule the bandwidth that's actually available.

avpkt 1000
  Network packets vary in size, but to schedule bandwidth use, the system
  must have some idea of what the average packet size will be. One thousand
  is a reasonable first guess, but it might be higher or lower on particular
  networks.

Now it's time to define classes for the network as a whole and for each of
the subnets whose bandwidth you want to guarantee. You can do so with
commands like the following:

    # tc class add dev eth1 parent 10:0 classid 10:1 cbq \
        bandwidth 100Mbit rate 100Mbit allot 1514 weight 10Mbit \
        prio 8 maxburst 20 avpkt 1000
This command is very much like the previous one, but it sets up a class for
the network as a whole. Note that it sets up the class to use the entire
100Mbps available bandwidth, because this particular class corresponds to
the root; subsequent commands subdivide this bandwidth. This command has a
few extra parameters and other differences, compared to the previous tc
command:

class
  Rather than qdisc, this command uses class to define the class.

parent 10:0
  You specify the parent (the root of the tree) with this parameter. Note
  that you add 0 to the handle specified with the previous command.

classid 10:1
  This is the identifier for this particular class.

allot 1514
  This is the MTU value (plus a few bytes overhead) for the network.

weight 10Mbit
  This is a tuning parameter, and may need to be adjusted for your network.

prio 8
  This is a priority number. With CBQ, classes with lower priority numbers
  are served first.

The rules for the individual subnets look very much like the last one:

    # tc class add dev eth1 parent 10:1 classid 10:100 cbq \
        bandwidth 100Mbit rate 50Mbit allot 1514 weight 5Mbit \
        prio 5 maxburst 20 avpkt 1000 bounded
    # tc class add dev eth1 parent 10:1 classid 10:200 cbq \
        bandwidth 100Mbit rate 50Mbit allot 1514 weight 5Mbit \
        prio 5 maxburst 20 avpkt 1000 bounded
These commands are nearly identical; they differ only in their classid
settings. Both refer to the root class as a parent, and both set up a 50Mbps
bandwidth allotment. (You can create an asymmetrical allotment if you like,
say, 60Mbps and 40Mbps.) The bounded option tells Linux not to give more
than the allotted bandwidth to a network class under any circumstances. This
is often inefficient, because if one office isn't using its full allotment,
the other can't use the unused amount. Omitting the bounded option gives
Linux the flexibility to let one office "borrow" bandwidth if the other
isn't using it, while enforcing a 50/50 split if both want bandwidth.

Now it's necessary to associate a queueing discipline with each of the two
classes:

    # tc qdisc add dev eth1 parent 10:100 sfq quantum 1514b \
        perturb 15
    # tc qdisc add dev eth1 parent 10:200 sfq quantum 1514b \
        perturb 15
These commands are similar to the original queueing discipline assignment.
They tell Linux to use the Stochastic Fairness Queueing (SFQ) discipline to
schedule traffic within each office's subnet. SFQ is popular for this
purpose because it requires little CPU power, but other disciplines can be
used if desired.

The commands to this point haven't provided a means for the kernel to
differentiate traffic from the two offices (192.168.1.0/24 and
192.168.2.0/24). The final two commands accomplish this goal:

    # tc filter add dev eth1 parent 10:0 protocol ip prio 100 u32 \
        match ip dst 192.168.1.0/24 flowid 10:100
    # tc filter add dev eth1 parent 10:0 protocol ip prio 100 u32 \
        match ip dst 192.168.2.0/24 flowid 10:200
These commands are similar to the preceding ones, but they set up a filter
rule to move traffic destined for (dst) each of the two networks through the
appropriate classes. Each rule is given an equal priority, and is matched
using the u32 algorithm, which works on IP address blocks.

The preceding rules control the flow of data from the Internet to the local
systems. To be complete, you must create a similar set of rules that control
data passing in the opposite direction. These rules would resemble the
preceding ones, but they would refer to eth0 (the external interface) rather
than eth1 (the internal interface), and the final two filter commands would
use src rather than dst to indicate that they match traffic by its local
source address rather than by its local destination address.
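
Because the rules for the two directions differ only in the interface and in the dst or src keyword, both sets can be generated from one template. The following is a dry-run sketch under several assumptions: the helper names cbq_split_cmds and subnet_cmds are hypothetical, the external interface is assumed to have the same 100Mbit of bandwidth as the internal one, and weight is set to one tenth of the rate, which matches the 10Mbit and 5Mbit values in the commands above but is otherwise a guess. The script prints the commands rather than running them, since they need root privileges and a kernel with CBQ support. It emits each subnet's class, qdisc, and filter together rather than grouping by object type; each class is still created before the qdisc and filter that refer to it.

```shell
#!/bin/sh
# Dry-run sketch: print the tc commands for a two-office CBQ split on one
# interface. Both helper names are hypothetical, not part of tc.

# Print the class, qdisc, and filter commands for one subnet.
subnet_cmds() {
    dev=$1; field=$2; total=$3; id=$4; net=$5; rate=$6
    cmd="tc class add dev $dev parent 10:1 classid 10:$id cbq"
    cmd="$cmd bandwidth ${total}Mbit rate ${rate}Mbit allot 1514"
    # weight = rate / 10 is an assumption based on the values in the text.
    cmd="$cmd weight $((rate / 10))Mbit prio 5 maxburst 20 avpkt 1000 bounded"
    echo "$cmd"
    echo "tc qdisc add dev $dev parent 10:$id sfq quantum 1514b perturb 15"
    cmd="tc filter add dev $dev parent 10:0 protocol ip prio 100 u32"
    echo "$cmd match ip $field $net flowid 10:$id"
}

# Arguments: device, dst or src, total Mbit, Mbit for office 1, Mbit for office 2.
cbq_split_cmds() {
    dev=$1; field=$2; total=$3; rate1=$4; rate2=$5
    echo "tc qdisc add dev $dev root handle 10: cbq bandwidth ${total}Mbit avpkt 1000"
    cmd="tc class add dev $dev parent 10:0 classid 10:1 cbq"
    cmd="$cmd bandwidth ${total}Mbit rate ${total}Mbit allot 1514"
    echo "$cmd weight $((total / 10))Mbit prio 8 maxburst 20 avpkt 1000"
    subnet_cmds "$dev" "$field" "$total" 100 192.168.1.0/24 "$rate1"
    subnet_cmds "$dev" "$field" "$total" 200 192.168.2.0/24 "$rate2"
}

# Traffic toward the offices leaves on eth1, so match on destination.
cbq_split_cmds eth1 dst 100 50 50
# Traffic from the offices leaves on eth0, so match on source.
cbq_split_cmds eth0 src 100 50 50
```

Changing the last two arguments (for instance, to 60 and 40) produces the asymmetrical allotment mentioned earlier.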