January 7, 2013

As told to Paul E. McKenney and Dave Täht by the bufferbloat list.

Introduction

Bufferbloat severely degrades Internet response times on the internet edge, particularly for low-bandwidth but latency sensitive traffic such as voice-over-IP (VOIP), gaming, or web browsing. Although unintelligible VOIP connections and sluggish web-page loads are perhaps the most familiar symptom of bufferbloat, it also impedes other important types of traffic, including TCP connection establishment, DNS lookups, DHCP packets, ARP packets, and routing packets. Because timely delivery of these packets is critical to network operation for all types of traffic, bufferbloat affects everyone. The Internet is literally drowning in its own buffers.

Fortunately, Kathleen Nichols and Van Jacobson have provided an important weapon in the fight against bufferbloat, namely the CoDel queueing algorithm, as noted on page 9 of the Internet Society's Bandwidth Management Technology Roundtable Series. And just as fortunately, Eric Dumazet's and Dave Täht's Codel implementation appeared in version 3.5 of the Linux kernel as net/sched/sch_codel.c. However, in his IETF presentation, Van Jacobson recommended the use of FQ-CoDel, which combines stochastic fairness queueing (SFQ) with CoDel:

“FQ-CoDel provides great isolation... If you've got low-rate videoconferencing and low rate web traffic they never get dropped. A lot of the issues with iw10 go away, because all that other traffic sees is the front of the queue and you don't know how big its window is and you don't care because you are not affected by it. And: FQ-CoDel increases utilization across your entire networking fabric especially for bidirectional traffic... If we're sticking code into boxes to deploy CoDel, don't do that. Deploy FQ-CoDel. It's just an across the board win.” – Van Jacobson

Eric and Dave were ahead of this game as well, and their FQ-CoDel implementation also appeared in v3.5 as net/sched/sch_fq_codel.c. Of course, Alexey Kuznetsov implemented SFQ itself as net/sched/sch_sfq.c back in the 1990s.

So how does FQ-CoDel differ from SFQ on the one hand and from pure CoDel on the other? The remainder of this article addresses this question as follows:

  1. SFQ Overview.
  2. FQ-CoDel Overview.
  3. CoDel Overview.
  4. Configuring FQ-CoDel.
  5. Effectiveness of FQ-CoDel.
  6. Remaining Challenges.
  7. Summary and Advice.

This is of course followed by the Answers to Quick Quizzes.

SFQ Overview

My purpose in inventing SFQ back in the late 80s was straightforward: With high probability, isolate “hog” sessions from other sessions so that the “hog” sessions bear the brunt of any packet dropping that might be required. To this end, an example SFQ data structure might look as follows:

SFQ.png

The horizontal row of boxes labeled A, B, C, and D represent a hash table, where each hash bucket contains a queue. Each incoming packet is enqueued based on a hash of its “quintuple”, namely its source address, source port, destination address, destination port, and IP protocol (e.g., 6 for TCP or 17 for UDP). The default number of hash buckets in the Linux kernel implementation is 128, but the figure above shows only four buckets for clarity. As shown in the diagram, each hash bucket is a queue that can hold a number of packets (denoted by empty boxes) in doubly linked lists. In the Linux kernel implementation, a given queue can hold at most 127 packets.

Each non-empty bucket is linked into a doubly linked list, which in this example contains buckets A, C, and D. This list is traversed when dequeueing packets. In this example, the next bucket to dequeue from is D, indicated by the dot-dashed arrow, and the next bucket after that is A.

Each non-empty bucket is also linked into a doubly linked list containing all other buckets with the same number of packets. These lists are indicated by the dashed arrows. These lists are anchored by the array shown on the left-hand side of the diagram. In this example, the buckets with one packet are A and D. The other list contains only C, which is the sole bucket having three packets.

There is also an index into this array that tracks the buckets with the most packets. In the diagram, this index is represented by the arrow pointing to array element 3. This index is used to find queues to steal packets from when the SFQ overflows. This approach means that (with high probability) packets will be dropped from “hog” sessions. These dropped packets can then be expected to cause the “hog” sessions to respond by decreasing their offered load, for example, due to TCP's end-to-end congestion control. This is the major purpose of the SFQ: To preferentially cause “hog” sessions to decrease their offered load, while allowing low-bandwidth sessions to continue undisturbed. This will in theory result in fair allocation of packet transmissions at network bottlenecks, at least for some probabilistic definition of “fair”.

There clearly will be some list maintenance required as packets are enqueued and dequeued, and readers interested in that sort of detail are referred to the SFQ paper.

Of course, it is possible that a low-bandwidth session will, though sheer bad luck, happen to hash to the same bucket as a “hog” session. In order to prevent this from becoming permanent bad luck, SFQ allows the hash function to be periodically perturbed, in essence periodically reshuffling the sessions. This can be quite effective, but unfortunately interacts poorly with many end-to-end congestion-control schemes because the rehashing often results in packet drops or packet reordering, either of which can cause the corresponding session to unnecessarily decrease offered load. Nevertheless, SFQ works well enough that it is often configured as a “leaf” packet scheduler in the Linux kernel.

Quick Quiz 1: But mightn't tricky protocol designers split their “hog” sessions over multiple TCP sessions? Wouldn't that defeat SFQ's attempt to fairly allocate bottleneck link packet transmissions?
Answer

CoDel Overview

CoDel is described in the LWN article, the ACM Queue paper, the CACM article, and Van Jacobson's IETF presentation. The basic idea is to control queue length, maintaining sufficient queueing to keep the outgoing link busy, but avoiding building up the queue beyond that point. This is done by preferentially dropping packets that remain in the queue for “too long” As such, FQ-CoDel is the first of a new class of active queue management (AQM) algorithms based on delay rather than on queue length. A key advantage of delay-based AQM algorithms over their queue-length predecessors such as random early detection (RED) is that the former require much less configuration and can be used with multiple queues, as will be seen in the next section.

When each new packet arrives, it is marked with its arrival time. Later, when it is that packet's turn to be dequeued, CoDel computes its sojourn time (the current time minus the arrival time). If the sojourn time for packets being dequeued exceeds the target time for a time period of at least interval, a packet will be dropped in order to signal the source endpoint to reduce its send rate. If the sojourn still remains above the target time, additional packet drops will occur on a schedule computed from an inverse-square-root control law until either (1) the queue becomes empty or (2) a packet is encountered with a sojourn time that is less than the target time. This target time, is normally set to about five milliseconds, and the interval is normally set to about 100 milliseconds. This approach has proven to be quite effective in a wide variety of situations.

This process is illustrated in the following (not to scale) diagram:

CoDelDrop.png

Here time increases from left to right, and the curve gives the sojourn time of the packet at the head of the CoDel queue as a function of time. As you can see, the sojourn time rises significantly above the target, requiring CoDel to react so as to bring the sojourn back below target at the right-hand end of the diagram.

As noted earlier, CoDel reacts by dropping or ECN-marking packets, and the second and subsequent vertical dot-dashed lines correspond to single dropped (or ECN-marked) packets. This means that even during the time that the sojourn time is greater than the target time, most packets are being transmitted rather than being dropped. The reason for this is that it can take on the order of 100 milliseconds for the fact of the packet drop to reach the traffic source. CoDel's design therefore must allow for this delay, which it does by scheduling packet-drops at an interval that is sufficiently large to allow the traffic source time to react. If the sojourn time remains above the target for an extended time period, CoDel drops at progressively decreasing intervals of time until a proper estimate of the round-trip time is obtained and the flow is brought under control.

However, one drawback of CoDel is that it controls only a single queue. As a result, packets from low-bandwidth sessions (such as VOIP sessions) can be delayed by packets from high-bandwidth upload sessions, for example, to dropbox or the scp command.

FQ-CoDel Overview

It would be better to allow the low-bandwidth time-sensitive VOIP packets to jump ahead of the hogs, but not to the extent that the hog stream is in any danger of starvation—or even in danger of significant throughput degradation. One way to do this is to combine CoDel with SFQ, resulting in FQ-CoDel. This combining requires significant rework of SFQ, but of course Eric Dumazet was up to the job.

Quick Quiz 2: What does the FQ-CoDel acronym expand to?
Answer

A rough schematic of FQ-CoDel is shown below:

FQ-CoDel.png

The most significant attribute of SFQ remains, namely that packets are hashed into multiple buckets based on their quintuple. However, each FQ-CoDel bucket contains a CoDel-managed queue instead of SFQ's FIFO queue. The group of sessions that hash to a given FQ-CoDel bucket is called a flow. In addition, the FQ-CoDel source code uses “flow” to denote the per-bucket data structure, including the queue.

Perhaps the next most significant change in that there are now two lists linking the flows together instead of just one. The first list contains flows A and D, namely the flows that with high probability contain packets from low-bandwidth time-sensitive sessions. The next flow to be dequeued from is indicated by the dash-dotted green arrow referencing bucket D. The second list contains all other non-empty flows, in this case only flow C, which with high probability contains “hog” sessions.

Quick Quiz 3: But mightn't flow C instead just contain a bunch of packets from a number of unlucky VOIP sessions? Wouldn't that be needlessly inflicting dropouts on the hapless VOIP users?
Answer

Quick Quiz 4: Suppose that a number of “hog” sessions are passing through a given instance of FQ-CoDel. Given the stochastic nature of FQ-CoDel, what guarantees fair treatment of the “hogs” with respect to each other?
Answer

FQ-CoDel operates by preferentially dequeueing from the low-bandwidth flows on the new_flows list. If a given flow has too much traffic for too long, it is presumed to contain a “hog” session, and is thus moved to the old_flows list. If there are no new_flows flows, FQ-CoDel dequeues from the first flow on the old_flows list.

The resulting migration of flows is shown more completely and precisely in the following state diagram:

flowstate.png

All flows are initially empty. When a packet arrives at an empty flow, that flow is classified as low bandwidth, and is thus added to the new_flows list in the implementation. Of course, this means that a new “hog” session will initially be misclassified. However, a “hog” session is likely to persist for some time, so the fraction of time that it spends misclassified is usually insignificant.

A flow on the new_flows list is guaranteed to eventually either become empty or exceed the quantum. If it exceeds its quantum, it is moved to the end of the old_flows list.

Quick Quiz 5: Why couldn't packets arrive at just the right rate so that the flow never emptied, but at the same time, the packets being dequeued were always recent arrivals? Couldn't that result in starvation of the old_flows?
Answer

Interestingly enough, if a new_flows flow becomes empty, it is also moved to the end of the old_flows list. This is extremely important, as it prevents a constant drizzle of packets from low-bandwidth sessions from starving the old_flows list. Sooner or later, all flows on the new_flows list would move to the old_flows list, allowing the old_flows list to be serviced.

Once the new_flows list empties, the flow at the head of the old_flows list will be dequeued from until it either one of its packets exceeds its quantum (in which case it is moved to the end of the old_flows list), or until it empties, in which case it is removed from the old_flows list, returning it to its initial state, namely the box labeled “Empty&rdquo in the above diagram.

Another important FQ-CoDel change is that it drops packets from the head of the queue, rather than the traditional drop from the tail, a tradition that SFQ adhered to. To see the benefit of dropping from the head rather than the tail, keep in mind that for many transport protocols (including TCP), a dropped packet signals the sender to reduce its offered load. Clearly, the faster this signal reaches the sender the better.

Similarly, with VOIP, the contents of a single dropped packet is easily extrapolated given the previous and subsequent packets. In general, it is more important to maintain VOIP delay and jitter below 10 milliseconds than it is to guarantee 100% reliable VOIP packet delivery.

If we drop from the tail of a long queue, this signal must propagate through the queue as well as traversing the network to the receiver and then (via some sort of acknowledgement) back to the sender. Furthermore, with a naive tail drop strategy, TCP global synchronization can occur.

In contrast, if we drop from the head of a long queue, the signal need not propagate through the queue itself, but needs only traverse the network.

This faster propagation enables the transport protocols to more quickly adjust their offered load, resulting in faster reduction in queue length, which in turn results in faster reduction in network round-trip time, which finally improves overall network responsiveness, as illustrated in the following diagram.

headtaildrop.png

In addition, dropping from the head instead of the tail means that older packets are preferentially dropped, which is helpful in cases where faster propagation of newer information is more important than slower propagation of older information.

Another difference between SFQ and FQ-CoDel is that the array on the left-hand side of the diagram is simply an array of ints in FQ-CoDel, as opposed to SFQ's array of list headers. This change was necessary because FQ-CoDel does its accounting in bytes rather than packets, which allows the benefits of byte queue limits (BQL) to be brought to bear. But because there is an extremely large number of possible packet sizes, blindly using the SFQ approach would have resulted in a truly huge array. For example, assume an MTU of 512 bytes with a limit of 127 packets per bucket. If the SFQ approach were used, with a separate array entry per possible bucket size in bytes, the array would need more than 65,000 entries, which is clearly overkill. In addition, because transmission of a 1,500-byte packet would require that the queue be moved 1,500 entries down the array, breaking SFQ's guarantee that all operations be O(1).

Instead, for FQ-CoDel, the left-hand array has one entry per bucket, where each entry contains the current count of bytes for the corresponding bucket.

When it is necessary to drop a packet due to filling the queue completely, FQ-CoDel scans this array looking for the largest entry. Because the array has only 1024 entries comprising 4096 contiguous bytes, the caches of modern microprocessors make short work of scanning this array.

Yes, there is some overhead, but then again one of the strengths of CoDel is that it manages the queue length dynamically—overrunning the size of the queue is reasonably infrequent.

Finally, FQ-CoDel does not perturb the hash function at runtime. Instead, a hash function is selected randomly from a set of about 4 billion possible hash functions when the queue discipline is initialized on a given interface.

The overall effect is that FQ-CoDel delivers low latency and high reliability on the one hand and high bandwidth— with a queue length accurately computed for the available bandwidth—on the other.

Quick Quiz 6: But is FQ-CoDel fair?
Answer

Configuring FQ-CoDel

Because FQ-CoDel makes use of a number of Linux-kernel networking features, it is usually not sufficient to simply enable it via the CONFIG_NET_SCH_FQ_CODEL kernel parameter. In addition, your Ethernet network driver must be instrumented to support packet scheduling, for example as shown in this patch for the SMSC911x Ethernet driver. This instrumentation is provided via calls to netdev_completed_queue(), netdev_sent_queue, and netdev_reset_queue. Note that you must build your kernel with the CONFIG_BQL kernel parameter enabled, because otherwise these three functions are no-ops. In addition, some FQ-CoDel configurations also require that the CONFIG_NET_SCH_HTB kernel parameter be enabled.

Quick Quiz 7: What if my network driver does not yet have the needed calls to netdev_completed_queue(), netdev_sent_queue, and netdev_reset_queue?
Answer

In addition, it is necessary to configure FQ-CoDel using the tc command. An excellent howto is available on the OpenWRT web site, which you should read carefully. That said, if “tc -s qdisc show dev eth0” does not show fq_codel in its output, you do not have FQ-CoDel properly configured for eth0.

In short, to use FQ-CoDel, you must:

  1. Ensure that your network driver has been modified to support packet scheduling.
  2. Build your kernel with both the CONFIG_NET_SCH_FQ_CODEL and CONFIG_BQL kernel parameters, and perhaps also the CONFIG_NET_SCH_HTB kernel parameter.
  3. Use the tc command to configure FQ-CoDel on the desired networking devices, as described in the OpenWRT howto.

Of course, if you deploy FQ-CoDel in production, you will want to make sure that it is started automatically at boot and during network-device hotplug operations. An example setup may be found here.

Effectiveness of FQ-CoDel

To demonstrate the effectiveness of FQ-CoDel, Dave Täht and David Woodhouse ran a test concurrently running four TCP uploads, four additional TCP downloads, along with four low-bandwidth workloads, three of which used UDP while the fourth used ICMP ping packets. The graphs below show the throughputs of the TCP streams and the latencies of the low-bandwidth workloads. The graph to the right uses FQ-CoDel, while that to the left does not.

data.2012.11.23a/plots/nofq.svg data.2012.11.23a/plots/fq.svg

Here, “BE” is best-effort (no marking), “BK” is bulk (class selector 1 (CS1) marking), “EF” is expedited forwarding, and “CS5” is class selector 5 (which is higher precedence/priority than CS1).

As you can see, FQ-CoDel is extremely effective, improving the low-bandwidth latency by roughly a factor of four, with no noticeable degradation in throughput for the uploads and downloads. Note also that without FQ-CoDel, the latency is closely related to the throughput, as can be seen by the step-up behavior when first the downloads and then the uploads start. In contrast, the FQ-CoDel latency is not affected much by the throughput, as is desired.

The jumps in throughput near the beginnings and ends of the tests are likely due to streams starting and finishing early. The smaller per-session spikes in throughput during the tests require a bit more explanation. The key point is that both CoDel and FQ-CoDel exert control by dropping packets. These packet drops can force individual sessions to sharply reduce their offered load momentarily, in accordance with TCP's end-to-end congestion control. The sessions recover quickly and sometimes also overshoot when slow-starting, resulting in the spikes. Note that the overall average throughput, indicated by the black trace, does not vary much, so the aggregate bandwidth is quite steady.

Remaining Challenges

Although FQ-CoDel is quite effective, there is still ample room for improvement.

One pressing problem is that of low-bandwidth links. To see this, consider a 1 Mbit/s link, which requires more than 12 milliseconds to transmit a 1536-byte packet. Unfortunately, this time is more than double FQ-CoDel's typical target of 5 milliseconds, which in turn prevents FQ-CoDel from distinguishing between low-bandwidth and “hog” sessions. This problem might be addressed by reducing the MTU or by increasing FQ-CoDel's quantum to (say) 30 milliseconds, for example, by using the target argument to the fq_codel discipline (see Dan Siemon's script for sample usage). However, both of these conflict with FQ-CoDel's creators' desire that FQ-CoDel remain parameterless, requiring no configuration. But perhaps a compromise can be reached where FQ-CoDel automatically configures itself based on the expected bandwidth of the network device.

Quick Quiz 8: But what if FQ-CoDel is configured on a high-bandwidth device such as 100 Mbit/s Ethernet, which then feeds into a low-bandwidth ADSL line? In that case, shouldn't FQ-CoDel configure itself to the ADSL line's bandwidth instead of that of Ethernet?
Answer

An even sterner challenge is posed by WiFi, which offers widely varying bandwidths depending on who else is using it and the pattern of traffic. Furthermore, most WiFi devices have lots of internal queueing that these devices use to optimize bandwidth by aggregating short packets destined for the same device, which makes FQ-CoDel's head dropping less effective. Although FQ-CoDel can still help when used with WiFi, optimally addressing bufferbloat in the presence of WiFi is still largely an unsolved problem.

In addition, although FQ-CoDel works extremely well near endpoints, ISPs and core routers may need to use other approaches, especially if they are using shared hardware to handle both leased-line and Internet traffic. Other proposals have been put forward to handle these sorts of situations, however, it is quite possible that FQ-CoDel can work well in non-endpoint network locations, for example, by modifying FQ-CoDel's hashing to use something other than the quintuple.

Finally, high-speed network devices, for example, 40 Gbit/s Ethernet, often use multiple transmit queues to reduce contention among CPUs for the device registers. The interaction of FQ-CoDel with multiple transmit queues is the subject of ongoing work.

Despite all these challenges, FQ-CoDel as it is implemented today is extremely useful in the fight against bufferbloat, and needs to be deployed rapidly and widely.

Quick Quiz 9: So, what happens if someone comes up with a type of traffic that it does not handle very well? Trust me, this will happen sooner or later.
Answer

Summary and Advice

FQ-CoDel combines the best of CoDel and SFQ, making a few needed changes along the way. Testing thus far has shown that it works extremely well for current Internet traffic. Therefore, FQ-CoDel is an important weapon in the fight against bufferbloat in today's Internet.

So, when should you use FQ-CoDel? As we have seen, it is most effective when you have time-sensitive low-bandwidth traffic competing with high-bandwidth sessions. Of course, given TCP connection establishment, DNS, DHCP, ARP, and routing packets, everyone has some low-bandwidth traffic. FQ CoDel can be very effective at preventing the high-bandwidth traffic from degrading the response time of the low-bandwidth traffic. However, in the short term, the direction of the high-bandwidth flows is important. For example, installing FQ-CoDel on your home computer would keep large uploads from unduly interfering with your VOIP sessions. However, it would not do much to help prevent such interference from large downloads: The offending queues would be building up at your ISP rather than on your home computer. This situation underscores the importance of applying FQ-CoDel or similar algorithms throughout Internet.

In addition, although FQ-CoDel will allocate bandwidth fairly among competing high-bandwidth sessions, it is not likely to improve the aggregate throughput. At some point, bigger pipes are required.

Again, FQ-CoDel is an important weapon in the fight against bufferbloat, and I very much look forward to its rapid deployment!

To Probe Further

  1. Jim Gettys's blog
  2. CeroWrt (OpenWRT-based project addressing bufferbloat)
  3. CoDel wiki
  4. Best Practices for Benchmarking CoDel and FQ-CoDel
  5. Battling Bufferbloat: An experimental comparison of four approaches to queue management in Linux by Toke Høiland-Jørgensen (Master module project for Roskilde University).

Acknowledgments

We all owe thanks to Kathleen Nichols and Van Jacobson for CoDel, and to Eric Dumazet and Dave Täht for FQ-CoDel and for the Linux-kernel implementation of both CoDel and FQ-CoDel. I am grateful to Dave Siemon for his (see script demonstrating FQ-CoDel, and to Jim Gettys, Dave Täht, Kathleen Nichols, Bob Briscoe, Andrew McGregor, Toke Høiland-Jørgensen, David P. Reed, Rick Jones, David Lang, Michael Richardson, Greg White, Jonathan Morten, Eric Dumazet, and Richard Brown for many insightful discussions. Dave Täht in particular encouraged me to start this article in the first place, and provided much-needed encouragement along the way. We all are indebted to Richard Brown for helping to make this article human-readable. Finally, I thank Jim Wasko for his support of this effort.

Legal Statement

This work represents the view of the author and does not necessarily represent the view of IBM.

Linux is a registered trademark of Linus Torvalds.

Other company, product, and service names may be trademarks or service marks of others.

Answers to Quick Quizzes

Quick Quiz 1: But mightn't tricky protocol designers split their “hog” sessions over multiple TCP sessions? Wouldn't that defeat SFQ's attempt to fairly allocate bottleneck link packet transmissions?

Answer: Indeed it might, because the separate TCP sessions would probably occupy different buckets, each getting a separate share of the bandwidth. However, web browsers already use multiple TCP sessions to attain better performance, and SFQ nevertheless seems to work reasonably well. If this sort of thing becomes pathological, which it might, given that multiple TCP sessions defeats slow-start, there are ways to deal with it. And there will no doubt be ways of abusing the resulting modified SFQ. Hey, I never promised you that life would be easy! ;-)

Back to Quick Quiz 1.

Quick Quiz 2: What does the FQ-CoDel acronym expand to?

Answer: There are some differences of opinion on this. The comment header in net/sched/sch_fq_codel.c says “Fair Queue CoDel” (presumably by analogy to SFQ's expansion of “Stochastic Fairness Queueing”), and “CoDel” is generally agreed to expand to “controlled delay”. However, some prefer “Flow Queue Controlled Delay” and still others prefer to prepend a silent and invisible "S", expanding to “Stochastic Flow Queue Controlled Delay” or “Smart Flow Queue Controlled Delay”. No doubt additional expansions will appear in the fullness of time.

In the meantime, this article focuses on FQ-CoDel concepts, implementation, and performance, leaving naming debates to others.

Back to Quick Quiz 2.

Quick Quiz 3: But mightn't flow C instead just contain a bunch of packets from a number of unlucky VOIP sessions? Wouldn't that be needlessly inflicting dropouts on the hapless VOIP users?

Answer: Indeed it might. However, given that FQ-CoDel by default uses no fewer than 1024 hash buckets, the probabilty that (say) 100 VOIP sessions will all hash to the same bucket is something like ten to the power of minus 300. Thus, the probability that at least one of the VOIP sessions will hash to some other flow is very high indeed.

But what is the probability that each of the 100 VOIP sessions will get its own flow? This is given by (1023!/(924!*1024^99)) or about 0.007, which although much more highly probable than ten to the power of minus 300, is still not all that probable.

Fortunately, the probability rises sharply if we are willing to accept a few collisions. For example, there is about an 86% probability that no more than two of the 100 VOIP sessions will be involved in any given collision, and about a 99% probability that no more than three of the VOIP sessions will be involved in any given collision. These last two results were computed using Monte Carlo simulations: Oddly enough, the mathematics for VOIP-session collision exactly matches that of hardware cache overflow.

The alert and historically savvy reader might note that the original SFQ failed to implement session segregation. There are two reasons for this: (1) I didn't think of it back then, and (2) It might not have been a winning strategy for the low-clock-rate 68000 CPUs that I was using at the time.

Back to Quick Quiz 3.

Quick Quiz 4: Suppose that a number of “hog” sessions are passing through a given instance of FQ-CoDel. Given the stochastic nature of FQ-CoDel, what guarantees fair treatment of the “hogs” with respect to each other?

Answer: Unfairness among “hogs” is indeed possible, for example, if two “hogs” hash to the same flow, they will receive less bandwidth than “hogs” having their own flow. Of course, the probability of excessive collisions between “hog” sessions is just as low as that for VOIP sessions.

Nevertheless, SFQ addresses this by allowing the hash function to be periodically perturbed. Providing a similar perturbation capability for FQ-CoDel is ongoing work.

Back to Quick Quiz 4.

Quick Quiz 5: Why couldn't packets arrive at just the right rate so that the flow never emptied, but at the same time, the packets being dequeued were always recent arrivals? Couldn't that result in starvation of the old_flows?

Answer: No, it cannot. The reason is that FQ-CoDel keeps dequeueing from the same flow until that flow either becomes empty or exceeds the quantum.

Back to Quick Quiz 5.

Quick Quiz 6: But is FQ-CoDel fair?

Answer: Given the many different meanings of “fairness” in networking, you can make a case for pretty much any answer you wish. Andrew McGregor argues that FQ-CoDel is weighted delay jitter fair, in other words, individual sessions are only permitted to inflict limited amounts of jitter onto other sessions. Although theoretical analysis of FQ-CoDel is at best in its infancy, I hope that future analysis provides many interesting insights into the principles of its operation.

Back to Quick Quiz 6.

Quick Quiz 7: What if my network driver does not yet have the needed calls to netdev_completed_queue(), netdev_sent_queue, and netdev_reset_queue?

Answer: First, check to see if there is a recent patch that adds these functions. If not, and if you are willing and able to do some hacking, feel free to try adding them to your driver, testing the result and submitting the patch upstream. Finally, there is an ongoing effort to add these functions, spearheaded by Dave Täht, but help is always welcome!

Back to Quick Quiz 7.

Quick Quiz 8: But what if FQ-CoDel is configured on a high-bandwidth device such as 100 Mbit/s Ethernet, which then feeds into a low-bandwidth ADSL line? In that case, shouldn't FQ-CoDel configure itself to the ADSL line's bandwidth instead of that of Ethernet?

Answer: Indeed, that could be a problem. Worse yet, suppose that the system is simultaneously communicating not only with systems across the ADSL line, but also with local systems connected to the Ethernet.

One way to solve this problem is to install FQ-CoDel on the Ethernet hubs/switches and on the ADSL modem. This would allow systems connected to Ethernet to use FQ-CoDel with the standard 5-millisecond target, while the ADSL modem could use a larger target matched to the available ADSL bandwidth. in time.

Of course, getting AQM mechanism (such as FQ-CoDel) installed on all networking infrastructure will take some time. Furthermore, more complex topologies will likely pose additional challenges. Then again, nothing is perfect, and we must never allow imagined perfection to crowd out real improvement.

Back to Quick Quiz 8.

Quick Quiz 9: So, what happens if someone comes up with a type of traffic that it does not handle very well? Trust me, this will happen sooner or later.

Answer: When it happens, it will be dealt with—and even now, FQ-CoDel workers are looking at other AQM schemes (for example, CeroWrt) to see if FQ-CoDel can be further improved. However, FQ-CoDel works well as is, so we can expect to see it deployed widely, which means that we should soon reap the benefits of improved VOIP sessions with minimal impact on bulk-data downloads.

Back to Quick Quiz 9.