This issue is another optimization (number 6) for congestion control, commented on at #40130 (comment 2823615):
> Note that also 20MB was just my test file value. It is probably better to make the POST extremely large (i.e. as large as the largest possible sbws download currently is, or larger), and then just have a number of megabytes to cut it off at, after SS goes from 1 to 0. This number of megabytes can be calculated similarly to how sbws chooses a file size for GET.
We have several open issues about changing the size and duration of the downloads, so instead of just implementing the same algorithm that the download code uses, we might prefer to think about a better algorithm for uploads.
I know it's a bit risky to change the algorithm while also changing the code, but in this case the uploads already behave a bit differently.
This is the current algorithm that sbws uses to choose the size of the downloads:
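Roughly, and only as a sketch (the names and the exact scaling rule are illustrative, based on the figures mentioned in this thread: 16KB initial size, 1GiB maximum, ~6 second target duration):

```python
# Rough sketch of sbws-style download size selection; names and the
# exact scaling rule are illustrative, not sbws's actual code.
INITIAL_SIZE = 16 * 1024      # 16KB: what sbws initially tries
MAX_SIZE = 1024 ** 3          # 1GiB: the current maximum
TARGET_DURATION = 6           # seconds: the download target duration

def next_size(last_size, last_duration):
    """Scale the next attempt so it should take ~TARGET_DURATION seconds,
    using the bandwidth observed in the previous attempt (bytes/second)."""
    observed_bw = last_size / last_duration
    return min(int(observed_bw * TARGET_DURATION), MAX_SIZE)
```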
At onbasca#58 it was suggested to start measuring with a bigger size than 16KB, which is what sbws initially tries, even though the maximum is 1GiB.
We can choose the initial size to try to upload (or download) based on the consensus weight that the relay reports.
What would be missing then is the target duration of the download/upload (to multiply the consensus bandwidth by that duration). See the paragraphs below.
Also, instead of trying several downloads/uploads, we can now monitor the speed every X bytes (though not every X seconds, as commented in the issue).
At onbasca#65 teor was also proposing to initially request more bytes than just 16KiB, concretely 1MiB. We could choose this size (or the 1.5MB that @mikeperry suggested) in case the relay's consensus bandwidth is 0 (or 1?).
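A minimal sketch of that initial-size choice, assuming the consensus bandwidth is in bytes/second (the function name is made up):

```python
FALLBACK_SIZE = 1024 ** 2     # 1MiB (onbasca#65), or 1.5MB per @mikeperry

def initial_size(consensus_bw, target_duration):
    """Pick the first size to try from the relay's consensus bandwidth
    (assumed here to be in bytes/second); fall back for unmeasured relays."""
    if consensus_bw <= 1:     # 0 (or 1?) means no useful weight yet
        return FALLBACK_SIZE
    return consensus_bw * target_duration
```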
At onbasca#64 teor proposed a target duration of 8 seconds and a maximum of 16.
At onbasca#85 teor proposed an 11 second download, due to the possible removal of the tor self-test. We're not going to remove the self-test for now, but maybe we should think about why the (mysterious) 11 seconds (I don't remember, and legacy/trac#22453 (moved) is long).
Because we are measuring bandwidth and not throughput, I think it would be fine to take the maximum speed we have seen in every callback during the whole upload. I mean, we don't need to calculate when the speed "stabilizes", which I imagine would be something equivalent to calculating when the "acceleration" of the speed tends to 0.
A possible algorithm for sbws uploads with the current changes at !141 (closed) is sketched below:
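This is only a sketch under the assumptions discussed above (SS=0 marks the end of slow start, a speed callback fires every chunk, and we keep the maximum speed seen); `initial_size()` is the sketch from above, and `upload_in_chunks()` is an assumed helper, not an existing sbws function:

```python
def measure_upload(consensus_bw, chunk_size, target_duration):
    """Hypothetical upload measurement: once SS goes from 1 to 0, upload
    a large body, receive a speed callback every chunk_size bytes, and
    keep the maximum speed seen (no need to wait for it to stabilize)."""
    total_size = initial_size(consensus_bw, target_duration)
    max_speed = 0.0
    # upload_in_chunks() is an assumed helper that starts measuring after
    # the SS=1 -> SS=0 transition and yields the speed of each chunk.
    for chunk_speed in upload_in_chunks(total_size, chunk_size):
        max_speed = max(max_speed, chunk_speed)
    return max_speed
```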
@mikeperry, @gk: what should be the target duration of the upload, and what do you think about the rest?
@mikeperry commented on IRC that after slow start the changes in speed should be smaller, and the maximum bandwidth could be measuring a buffering burst, so an average should be better for balancing.
He also commented that we could create consensus parameters for this, such as (see the sketch below):

- chunk size
- duration of the upload (after slow start, I guess)
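For illustration only (these parameter names don't exist; they are placeholders for whatever would actually be defined):

```python
# Hypothetical consensus parameter names, purely for illustration; the
# actual names and defaults would have to be defined elsewhere.
DEFAULTS = {
    "upload_chunk_size": 1024 ** 2,   # bytes per chunk
    "upload_duration": 6,             # seconds, after slow start
}

def consensus_param(params, name):
    """Read one of the (made-up) parameters from the consensus params
    mapping, falling back to the hard-coded default."""
    return params.get(name, DEFAULTS[name])
```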
I am inclined to say "let's at least talk about it at the upcoming meeting in .ie". Here are my additional notes from IRC:
12:43 < GeKo> changing that algorithm on top of all the 1.6 changes makes me a bit nervous
12:43 < GeKo> and, recall sbws is effectively in maintenance mode...
12:45 < GeKo> so i fear we get some fun things to debug which could be avoidable if we stick to something closer to what we have in sbws for now
12:45 < GeKo> (modulo the bare minium changes we actually want/need fo the optimization)
12:49 < GeKo> *for
> I am inclined to say "let's at least talk about it at the upcoming meeting in .ie".
Indeed, we can wait ~2 weeks to create this patch.
> changing that algorithm on top of all the 1.6 changes makes me a bit nervous
I understand this.
> so i fear we get some fun things to debug which could be avoidable if we stick to something closer to what we have in sbws for now
Alright, at #40130 (closed) we have already changed from several download attempts of different sizes with a target duration of 6 seconds to one upload of 1.5MiB (once SS=0 is received). What is missing in the previous downloads activity diagram is that later on sbws calculates the average of the successful attempts.
We can keep the target duration of 6 seconds and, instead of doing several upload attempts with different sizes, take the average of the bandwidth calculated for each chunk. If we calculated the chunk size in a similar way to how the data size is calculated in every download attempt, we'd need to take into account the bandwidth of the previous chunk, but I think this might be unneeded complexity; we could take chunks of ~1MiB, or something like the total target size to download (weight * target duration) divided by 10.
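A sketch of that simpler variant, with hypothetical names and the consensus bandwidth again assumed to be in bytes/second:

```python
ONE_MIB = 1024 ** 2           # the fixed-chunk alternative discussed above

def chunk_size_for(consensus_bw, target_duration=6, parts=10):
    """One option for the chunk size: a tenth of the total target size
    (consensus bandwidth times the target duration). The other option
    discussed above is a fixed ~1MiB chunk (ONE_MIB)."""
    return (consensus_bw * target_duration) // parts

def average_bandwidth(chunk_speeds):
    """Average the per-chunk bandwidths, instead of doing several upload
    attempts of different sizes (and instead of taking the maximum)."""
    return sum(chunk_speeds) / len(chunk_speeds)
```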