Difference between revisions of "BLT"
Gigs Taggart (talk | contribs) (→BLT1) |
Alexa Linden (talk | contribs) |
Latest revision as of 15:53, 21 May 2012
Bandwidth and Latency Testing (BLT) Protocol
Latency, packet loss, and low-bandwidth bugs are a major problem in Second Life. One huge barrier to fixing them is the lack of a testing protocol for reproducing bugs that rarely show up for someone who lives in San Francisco, or even in America. The BLT Protocol aims to create a standardized set of test conditions that can be referred to in reproductions and tested against during QA.
This meta-bug contains all bugs that might be related to latency, bandwidth, or packet loss: MISC-506
Reproducing a bug
If you confirm a bug under one of the test scenarios listed below, the following format should be used in the reproduction in a Jira comment:
Incoming Latency:
Outgoing Latency:
Packet Loss Parameters:
Followed by the step-by-step reproduction of the bug, under the BLT conditions listed.
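The header above is easy to generate consistently. A minimal sketch, assuming a hypothetical helper named blt_header (not part of the BLT protocol) that takes the three values as arguments:

```shell
#!/bin/sh
# Hypothetical helper (not part of the BLT protocol): prints the three
# header lines for a Jira reproduction comment.
# $1 = incoming latency, $2 = outgoing latency, $3 = packet loss parameters
blt_header() {
    printf 'Incoming Latency: %s\n' "$1"
    printf 'Outgoing Latency: %s\n' "$2"
    printf 'Packet Loss Parameters: %s\n' "$3"
}

# Example: a symmetric 500ms scenario with no added loss.
blt_header 500ms 500ms none
```

Paste the output at the top of the Jira comment, then add the step-by-step reproduction underneath.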
Rules
- A partial fix should be noted as such in the Jira comments, along with which scenarios still fail.
- High percentage reproductions of a bug justify placing that bug on the upcoming triage schedule, if it is not already imported into private Jira.
Test Setup
Testing is done with the Linux kernel traffic shaping and WAN emulation modules.
http://linux-net.osdl.org/index.php/Netem
Note that you will be missing the "ifb" module on Ubuntu boxes unless you have 7.10. They just screwed up and forgot to compile it for several versions; it's a standard kernel module that should be compiled on most distros.
For scenarios that require incoming packet adjustments:
modprobe ifb
ip link set dev ifb0 up
tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: \
   protocol ip u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0
Replace eth0 with your interface. This is required for any scenario that mentions "ifb0".
NOTE: wherever it says "add dev" replace that with "change dev" if you are already running one scenario.
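The add/change choice can be made mechanically. A minimal sketch, assuming a hypothetical helper named blt_verb that inspects the text printed by `tc qdisc show dev <interface>` and assumes that text contains the word "netem" whenever a scenario is already running:

```shell
#!/bin/sh
# Hypothetical helper: pick "add" or "change" from the output of
# `tc qdisc show dev <iface>`, passed in as $1.
# Assumption: a running scenario shows up as a line containing "netem".
blt_verb() {
    case "$1" in
        *netem*) echo change ;;
        *)       echo add ;;
    esac
}

# Intended use (needs root and a real interface, so shown as a comment):
#   tc qdisc "$(blt_verb "$(tc qdisc show dev eth0)")" dev eth0 root netem delay 300ms
```

The helper only looks at text, so it can be checked without touching any interface.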
Alternate Test Setups
These alternate test setups may also be used to reproduce bugs, but the results may not correspond exactly to netem setups:
One alternative for those not inclined to set up a Linux box is m0n0wall, a nice distribution that includes web administration of DummyNet. The nifty thing about m0n0wall is that it can be run in two configurations that are potentially easier for Windows developers:
- As a VMware image, which can be run with the free (as in beer) VMware Player.
- On a cheap (~$250), low-power, low-profile Soekris box, for those who are going to do a lot of testing and don't want to bog down their main machine. Not to mention, the Soekris boxes are fun toys for many other uses.
BLT Scenarios
NOTE: wherever it says "change dev" below replace that with "add dev" if it is your first scenario. Change eth0 to reflect whatever interface you use.
- Asymmetric Latency
tc qdisc change dev eth0 root netem delay 300ms
This adds 300ms of latency on top of whatever was already there. The latency is asymmetric: it applies only to outgoing packets.
- Symmetric Latency
tc qdisc change dev eth0 root netem delay 500ms
tc qdisc change dev ifb0 root netem delay 500ms
This is good for reproducing a bug when all else fails. Try this one first, then try the lesser scenarios. If the lesser scenarios still reproduce it, mention them instead.
150ms symmetric is similar to some far-flung international users' connections. Note that due to the bandwidth*delay product, this scenario also constrains TCP bandwidth in nearly all cases. MS Vista, with dynamic window sizing up to 8 MB, and Linux, with generally better window settings, may perform better in this scenario than other operating systems (especially older versions of Windows, which had terrible window settings), so beware.
Symmetric 75ms latency is similar to Europe (or SF to East Coast); symmetric 40ms latency is similar to East Coast to TX.
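The bandwidth*delay constraint mentioned above can be estimated with simple arithmetic: a single TCP connection cannot move more than one window of data per round trip. A rough sketch, where the function name max_tcp_bps and the 65535-byte window (no window scaling) are illustrative assumptions rather than measured Second Life values:

```shell
#!/bin/sh
# Upper bound on single-connection TCP throughput, in bits per second:
#   window size / round-trip time
# $1 = receive window in bytes, $2 = round-trip time in milliseconds.
max_tcp_bps() {
    echo $(( $1 * 8 * 1000 / $2 ))
}

# Assumed example: a 65535-byte window with 150ms added in each
# direction, i.e. at least 300ms round trip.
max_tcp_bps 65535 300
```

Under those assumptions the ceiling is roughly 1.7 Mbit/s no matter how fast the underlying link is, which is why operating systems with larger or dynamically sized windows fare better.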
- Variable Latency
tc qdisc change dev eth0 root netem delay 200ms 200ms 50%
tc qdisc change dev ifb0 root netem delay 200ms 200ms 50%
This adds a variable delay of 200ms ± 200ms (0-400ms) in each direction, with each packet's delay depending 50% on that of the previous packet.
- Emulated Satellite internet
tc qdisc change dev eth0 root netem delay 350ms 50ms 50% loss 0.3% 33.33%
tc qdisc change dev ifb0 root netem delay 350ms 50ms 50% loss 0.3% 33.33%
In reality, satellite internet isn't THIS bad, because providers use TCP acceleration, which converts TCP into a sort of UDP-esque stream to avoid the bandwidth-delay product problems that severely constrain bandwidth in this scenario. The latency and loss figures are, however, accurate.
- Packet Loss
Low bursty packet loss
tc qdisc change dev eth0 root netem loss 0.3% 33.33%
tc qdisc change dev ifb0 root netem loss 0.3% 33.33%
High bursty packet loss
tc qdisc change dev eth0 root netem loss 3.0% 33.33%
tc qdisc change dev ifb0 root netem loss 3.0% 33.33%
Insane packet loss
tc qdisc change dev eth0 root netem loss 15.0% 33.33%
tc qdisc change dev ifb0 root netem loss 15.0% 33.33%
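Every symmetric scenario above is the same netem parameters applied once to the outgoing interface and once to ifb0. A small dry-run sketch, assuming a hypothetical helper named blt_symmetric that only prints the commands so they can be reviewed before being run as root:

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints (does not run) the pair of tc
# commands for a symmetric scenario.
# $1 = add|change, $2 = outgoing interface, remaining args = netem parameters.
blt_symmetric() {
    verb=$1
    iface=$2
    shift 2
    for dev in "$iface" ifb0; do
        echo "tc qdisc $verb dev $dev root netem $*"
    done
}

# Example: the high bursty packet loss scenario.
blt_symmetric change eth0 loss 3.0% 33.33%
```

Once the printed commands look right, they can be piped to a root shell to apply them.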