Getting SQM running right on Jim Reisert's network with CeroWrt

-- Dave Taht

This was the end result of a bit of tuning of CeroWrt's Smart Queue Management (SQM) system for a cable modem. The SQM system (which works on any Linux-derived system) uses HTB + fq_codel underneath: HTB rate-limits traffic to just below the modem's speed so that the queue builds where we can manage it, fq_codel gives low latency to competing streams, and the CoDel AQM keeps overall queue lengths short.
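
For the curious, here is a minimal sketch of the kind of tc configuration SQM builds underneath, showing egress shaping only (the real sqm-scripts also handle ingress shaping via an IFB device, classification, and much more; the interface name and rate here are placeholders):

```
IFACE=ge00   # CeroWrt's WAN interface; substitute your own
# Rate-limit egress to just below the modem's upload speed...
tc qdisc replace dev $IFACE root handle 1: htb default 11
tc class add dev $IFACE parent 1: classid 1:1 htb rate 4400kbit
tc class add dev $IFACE parent 1:1 classid 1:11 htb rate 4400kbit ceil 4400kbit
# ...so the queue builds here, where fq_codel can manage it.
tc qdisc add dev $IFACE parent 1:11 fq_codel
```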

During a videoconference and screen-sharing session over Skype, we saturated the network with a Realtime Response Under Load (rrul) test for 5 minutes.
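
(The rrul test is part of the netperf-wrapper tool and runs against a netperf server; a typical invocation looks something like the following, with the server hostname as a placeholder:)

```
# Run the Realtime Response Under Load (rrul) test for 300 seconds.
netperf-wrapper -H netperf.example.com -l 300 rrul
```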

Download and upload speeds remained high, latency remained low, and there was no observable effect on the video conference. It was *perfect*.

It took 4 tries (and 5 minutes) to get a setting that worked well, though! After installing the latest CeroWrt and leaving SQM off, Jim allowed me in to run the rrul test remotely. This was how his cable connection behaved, with the usual 1-2 seconds worth of induced latency common to (and bedeviling!) current cable deployments (note that the upload and download figures for these tests are reversed, as I was running rrul from a remote server, not from within his network as is normally done):

While it was awesome to be able to run this test over native IPv6, 1.2 seconds of latency left something to be desired. (The latency problem has nothing to do with IPv6 or IPv4; it is bufferbloat in the modem and CMTS.)

The early spike of extra bandwidth here is due to PowerBoost kicking in for 10 seconds, but even as it begins to kick in, latencies are already skyrocketing.

So, taking a guess at the bandwidth from the averages (the black line * 4) on the up/down graphs, we tried setting CeroWrt's Smart Queue Management (SQM) system to 38 Mbit down and 8 up. (Well, actually I looked at the graphs and goofed: 7*4 = 28, not 38.) Note also that the black lines do not correctly add in the bandwidth used up by the TCP ACKs in the opposite direction. On some systems you need to factor in ~1/40th of the bandwidth used in the opposite direction for a more correct estimate.
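
To make that concrete (these numbers are read off the graphs, so treat them as approximate): each of the four download flows averaged roughly 7 Mbit, so the total was 7 * 4 = 28 Mbit, not 38. And a 28 Mbit download generates roughly 28/40 = 0.7 Mbit of ACK traffic flowing upstream, which should be budgeted out of the upload setting as well.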

A little better, but still including a side jaunt to the moon!

Taking another guess, we tried 24 Mbit down and 6 up.

Much better! But given the increase in latency and where the average sat, it was apparent that 6 Mbit up was still too much, so we knocked that down to 4400 kbit, and got this:

A total increase in observable latency of only 10 milliseconds over the 65 ms baseline (vs 1.2 seconds! Roughly a 110x improvement in induced latency...), with good sharing between streams and good throughput. And thus we declared victory, and then talked for an hour, running various other tests, while the videoconference continued to rock.
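
For reference, on CeroWrt these values live in the SQM GUI, but under the hood they are uci options, so the winning settings could be applied from a shell roughly like this (a sketch in the sqm-scripts style; the section and option names are assumptions and may differ by version):

```
# Hypothetical sqm-scripts-style settings; rates are in kbit/s.
uci set sqm.@queue[0].download=24000
uci set sqm.@queue[0].upload=4400
uci commit sqm
/etc/init.d/sqm restart
```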

Notes:

These tests were run on CeroWrt against a Comcast connection with IPv6 enabled, using a Motorola SB6141 cable modem running firmware SB_KOMODO-1.0.6.10-SCM00-NOSH. OpenWrt's qos-scripts use similar techniques to CeroWrt's SQM system, but are not IPv6-compatible; neither are most versions of wondershaper. It is unknown to what extent other smart queue management systems (Gentoo, IPFire, Streamboost, Gargoyle) handle IPv6 at present. (CeroWrt gets the same good results with any combination of IPv4 and IPv6.)

Update 2014-05-17

I reran the plots to clean up a plotting bug that we'd had in an earlier version. Since this test series was first run, the netperf-wrapper tool has gained the ability to compare multiple test runs. We were well aware that by disabling PowerBoost, as we currently do to get consistent latency, we were leaving some bandwidth on the floor. How much was something of an unknown.
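
That comparison feature puts several saved runs on one plot; the exact flags vary by version, but an invocation looks something like this (filenames are placeholders):

```
# Plot two saved rrul data files against each other.
netperf-wrapper -i before.json.gz -i after.json.gz -p totals -o compare.png
```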

Now we know. The PowerBoost algorithm is fairly well documented, and we do think that with some tweaks and fixes to the HTB rate limiter to allow for more burstiness, we can keep latencies reliably low and get closer to the full bandwidth available from a cable modem, all the time.
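
The HTB knobs in question are its burst and cburst parameters, which let a class briefly exceed its configured rate before being clamped back down. A purely illustrative tweak (interface, class ids, and sizes are guesses, not tested values) might look like:

```
tc class change dev ge00 parent 1:1 classid 1:11 htb \
    rate 24mbit ceil 24mbit burst 30k cburst 30k
```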

(But we have no funding, and we're focused on fixing wireless next.)

For now, losing that initial bit of bandwidth, in exchange for always getting good latency, seems like the bigger win.