This issue is another optimization (number 6) for congestion control, commented on at #40130 (comment 2823615):
> Note that also 20MB was just my test file value. It is probably better to make the POST extremely large (i.e. as large as the largest possible sbws download currently is, or larger), and then just have a number of megabytes to cut it off at, after SS goes from 1 to 0. This number of megabytes can be calculated similarly to how sbws chooses a file size for GET.
We have several open issues about changing the size and duration of the downloads, so instead of just implementing the same algorithm that the download code uses, we might prefer to think about a better algorithm for uploads.
I know it's a bit risky to change the algorithm while also changing the code, but in this case the uploads already behave a bit differently.
This is the current algorithm that sbws uses to choose the size of the downloads:
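Roughly, and only as a sketch (the names and the exact scaling rule are illustrative, based on the figures mentioned in this thread: 16KB initial size, 1GiB maximum, ~6 second target duration):

```python
# Rough sketch of sbws-style download size selection; names and the
# exact scaling rule are illustrative, not sbws's actual code.
INITIAL_SIZE = 16 * 1024      # 16KB: what sbws initially tries
MAX_SIZE = 1024 ** 3          # 1GiB: the current maximum
TARGET_DURATION = 6           # seconds: the download target duration

def next_size(last_size, last_duration):
    """Scale the next attempt so it should take ~TARGET_DURATION seconds,
    using the bandwidth observed in the previous attempt (bytes/second)."""
    observed_bw = last_size / last_duration
    return min(int(observed_bw * TARGET_DURATION), MAX_SIZE)
```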
At onbasca#58 it was suggested to start measuring with a bigger size than 16KB, which is what sbws initially tries, even though the maximum is 1GiB.
We can choose the initial size to try to upload (or download) based on the consensus weight that the relay reports.
What would be missing then is the target duration of the download/upload (to multiply the consensus bandwidth by that duration). See the paragraphs below.
Also, instead of trying several downloads/uploads, we can now monitor the speed every X bytes (though not every X seconds, as commented in the issue).
At onbasca#65 teor was also proposing to initially request more bytes than just 16KiB, concretely 1MiB. We could choose this size (or the 1.5MB that @mikeperry suggested) in case the relay's consensus bandwidth is 0 (or 1?).
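A minimal sketch of that initial-size choice, assuming the consensus bandwidth is in bytes/second (the function name is made up):

```python
FALLBACK_SIZE = 1024 ** 2     # 1MiB (onbasca#65), or 1.5MB per @mikeperry

def initial_size(consensus_bw, target_duration):
    """Pick the first size to try from the relay's consensus bandwidth
    (assumed here to be in bytes/second); fall back for unmeasured relays."""
    if consensus_bw <= 1:     # 0 (or 1?) means no useful weight yet
        return FALLBACK_SIZE
    return consensus_bw * target_duration
```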
At onbasca#64 teor proposed a target duration of 8 seconds and a maximum of 16.
At onbasca#85 teor proposed an 11 second download, due to the possible removal of the tor self-test. We're not going to remove the self-test for now, but maybe we should think about why the (mysterious) 11 seconds (I don't remember, and legacy/trac#22453 (moved) is long).
Because we are measuring bandwidth and not throughput, I think it would be fine to take the maximum speed we have seen in every callback during the whole upload. I mean, we don't need to calculate when the speed "stabilizes", which I imagine would be something equivalent to calculating when the "acceleration" of the speed tends to 0.
A possible algorithm for sbws uploads with the current changes at !141 (closed) is sketched below:
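This is only a sketch under the assumptions discussed above (SS=0 marks the end of slow start, a speed callback fires every chunk, and we keep the maximum speed seen); `initial_size()` is the sketch from above, and `upload_in_chunks()` is an assumed helper, not an existing sbws function:

```python
def measure_upload(consensus_bw, chunk_size, target_duration):
    """Hypothetical upload measurement: once SS goes from 1 to 0, upload
    a large body, receive a speed callback every chunk_size bytes, and
    keep the maximum speed seen (no need to wait for it to stabilize)."""
    total_size = initial_size(consensus_bw, target_duration)
    max_speed = 0.0
    # upload_in_chunks() is an assumed helper that starts measuring after
    # the SS=1 -> SS=0 transition and yields the speed of each chunk.
    for chunk_speed in upload_in_chunks(total_size, chunk_size):
        max_speed = max(max_speed, chunk_speed)
    return max_speed
```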
@mikeperry, @gk: what should be the target duration of the upload, and what do you think about the rest?
@mikeperry commented on IRC that after slow start the changes in speed should be smaller, and the maximum bandwidth could be measuring a buffering burst, so an average should be better for balancing.
He also commented that we could create consensus parameters for this, such as (see the sketch below):

- chunk size
- duration of the upload (after slow start, I guess)
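For illustration only (these parameter names don't exist; they are placeholders for whatever would actually be defined):

```python
# Hypothetical consensus parameter names, purely for illustration; the
# actual names and defaults would have to be defined elsewhere.
DEFAULTS = {
    "upload_chunk_size": 1024 ** 2,   # bytes per chunk
    "upload_duration": 6,             # seconds, after slow start
}

def consensus_param(params, name):
    """Read one of the (made-up) parameters from the consensus params
    mapping, falling back to the hard-coded default."""
    return params.get(name, DEFAULTS[name])
```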
I am inclined to say "let's at least talk about it at the upcoming meeting in .ie". Here are my additional notes from IRC:
12:43 < GeKo> changing that algorithm on top of all the 1.6 changes makes me a bit nervous
12:43 < GeKo> and, recall sbws is effectively in maintenance mode...
12:45 < GeKo> so i fear we get some fun things to debug which could be avoidable if we stick to something closer to what we have in sbws for now
12:45 < GeKo> (modulo the bare minium changes we actually want/need fo the optimization)
12:49 < GeKo> *for
> I am inclined to say "let's at least talk about it at the upcoming meeting in .ie".
Indeed, we can wait ~2 weeks to create this patch.
> changing that algorithm on top of all the 1.6 changes makes me a bit nervous
I understand this.
> so i fear we get some fun things to debug which could be avoidable if we stick to something closer to what we have in sbws for now
Alright, at #40130 (closed) we have already changed from several download attempts of different sizes with a target duration of 6 seconds to one upload of 1.5MiB (once SS=0 is received). What is missing in the previous downloads activity diagram is that later on sbws calculates the average of the successful attempts.
We can keep the target duration of 6 seconds and, instead of doing several upload attempts with different sizes, take the average of the bandwidth calculated for each chunk. If we calculated the chunk size in a similar way to how the data size is calculated in every download attempt, we'd need to take into account the bandwidth of the previous chunk, but I think this might be unneeded complexity; we could take chunks of ~1MiB, or something like the total target size to download (weight * target duration) divided by 10.
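A sketch of that simpler variant, with hypothetical names and the consensus bandwidth again assumed to be in bytes/second:

```python
ONE_MIB = 1024 ** 2           # the fixed-chunk alternative discussed above

def chunk_size_for(consensus_bw, target_duration=6, parts=10):
    """One option for the chunk size: a tenth of the total target size
    (consensus bandwidth times the target duration). The other option
    discussed above is a fixed ~1MiB chunk (ONE_MIB)."""
    return (consensus_bw * target_duration) // parts

def average_bandwidth(chunk_speeds):
    """Average the per-chunk bandwidths, instead of doing several upload
    attempts of different sizes (and instead of taking the maximum)."""
    return sum(chunk_speeds) / len(chunk_speeds)
```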