arti-bench: do not allocate individual receive buffers for every receiver
When we're running a huge number of arti-bench timing cases in parallel, it's wasteful to allocate (e.g.) 10MiB for each test case just for the received buffer in run_timing. It would probably be better to allocate at most 4-16k in run_timing, and then to just read in chunks.
We should still compare the received data against our receive input, checking each chunk as it arrives.
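
A rough sketch of the idea, not the actual run_timing code: the helper name stream_matches, the CHUNK_LEN constant, and the use of the blocking std::io::Read trait are all assumptions made for illustration (the real code may well read from an async stream instead). It reads into one small fixed buffer and compares each chunk against the corresponding slice of the expected data, so the full payload is never buffered on the receiving side.

```rust
use std::io::{self, Read};

/// Hypothetical chunk size; the ticket suggests somewhere in the 4-16k range.
const CHUNK_LEN: usize = 16 * 1024;

/// Read `reader` to EOF in fixed-size chunks, checking each chunk against
/// the matching slice of `expected`.
///
/// Returns Ok(true) only if the stream yields exactly the bytes in
/// `expected`: no mismatch, no missing bytes, and no trailing extra bytes.
fn stream_matches<R: Read>(mut reader: R, expected: &[u8]) -> io::Result<bool> {
    let mut buf = [0u8; CHUNK_LEN];
    // Number of bytes of `expected` that have been verified so far.
    let mut offset = 0usize;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // EOF
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        let end = offset + n;
        // More data than expected, or a chunk that differs: stop early.
        if end > expected.len() || buf[..n] != expected[offset..end] {
            return Ok(false);
        }
        offset = end;
    }
    // A short read overall means some expected bytes never arrived.
    Ok(offset == expected.len())
}
```

With this shape, each parallel test case needs only CHUNK_LEN bytes of receive buffer, and the expected data can be shared between cases rather than copied per receiver.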
Found while doing #87.