OnionSproutsBot is almost ready to be deployed in production. It will be maintained by the anti-censorship team. Can we have a place to deploy it?
Should we get a new user in the gettor VM, or deploy it in another VM? The rest of gettor is going to be integrated into rdsys in the near future, so OnionSproutsBot will be the last piece of gettor with a life of its own. It might make sense to create a new VM for it so you can retire the existing gettor VM (running buster) once we move gettor to rdsys.
It is a Python program with few dependencies; I guess we can deploy it in a virtualenv, but if you have better ideas I'm happy to hear them.
I forgot to mention that OnionSproutsBot will not require any open external ports, Apache, or an IP address or public domain pointing to it. It connects to the Telegram API and doesn't expose any directly visible service; it just needs internet access.
Sidenote: I would say there is a minimum space requirement of around 5-10 GB of storage, all the way up to ~140 GB (for every single binary in the latest version, in every language and for every available platform) to mitigate a hypothetical attack vector. Most of that space will not be used 99.99% of the time.
It would make sense to make a new VM if gettor will be deprecated eventually. Can it talk to rdsys over the network securely? (i.e. does it need to be on polyanthum?)
This bot doesn't need to talk to rdsys or any other polyanthum service. It is pretty much autonomous: it only talks to the Telegram API and fetches the Tor Browser downloads JSON and tarballs (to update the binaries it provides when there is a new version).
onionsprouts-01? gettor-02? gettor-telegram-01?
I don't have a strong preference; reading the RFC, I guess any of those would make sense, so go with whatever is clearest in your naming convention.
core memory (standard is 8GB)
8GB is plenty; I'm pretty sure we can live with 4GB.
virtual CPU cores (standard is 2)
1 should be enough; it's async Python, so the main process is going to be single-threaded.
SSD disk space (standard is 10G system, 20G /srv)
That should be enough.
HDD disk space (standard is "none", although we also have SAS-only VM if you don't need fast storage)
none is good
any networking limitations / requirements (e.g. open ports, etc)
No open ports or public IP address needed; being able to make outgoing connections to port 443 will be enough.
I designed the bot so that the files it downloads are stored in the temporary directory of the operating system it's running on. In this case, it would be better if as much space as possible went towards /tmp instead. I can make accommodations if needed, but any urgent problems should go away with a simple reboot, without further adjustments, on any mainstream Linux system (presumably Debian), and the thing I described at the beginning of the thread regarding a maximum of 120 GB isn't as urgent.
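For illustration, a minimal sketch of what "store downloads in the OS temporary directory" could look like; the subdirectory name and the helper itself are assumptions for this example, not the bot's actual code:

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen

def download_to_tmp(url: str) -> Path:
    """Fetch a package into the OS temporary directory (/tmp on Debian)."""
    target_dir = Path(tempfile.gettempdir()) / "onionsprouts"  # assumed name
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / url.rsplit("/", 1)[-1]
    with urlopen(url) as response, open(target, "wb") as out:
        while chunk := response.read(64 * 1024):  # stream in 64 KiB chunks
            out.write(chunk)
    return target
```

Keeping everything under the system temp directory means a reboot clears any leftovers even if the bot never gets around to deleting them.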
We should not need 120GB if we delete each file after it is uploaded to Telegram. Tor Browser binaries are ~80MB, so in 20GB you could have ~250 downloads going at the same time; that should be enough, but maybe I'm missing something here.
I'm curious to know why this service would need several hundred copies of the same binary: is it because each one needs to be encrypted for individual recipients?
@lavamind Well, it's not the same binary, and as things are right now there's no E2EE; more on that later. The bot is able to provide downloads for every locale and every operating system (Linux 32-bit, Linux 64-bit, etc.). The files are stored in a temporary directory right now so that they can be removed later (the bot doesn't do this itself yet, but implementing it is trivial). However, someone could hypothetically request every single binary that the bot is able to provide, particularly shortly after an upgrade.
There's no E2EE; that may change in the future and it's something I would like to work on, but not now. Not using it (with the downside that Telegram can supposedly know what you requested, although if you try to initiate an E2EE chat it will know that as well) allows people to spread awareness about the bot within the platform easily by forwarding messages.
I'm wondering, then, if the bot really needs to maintain a directory archive separate from dist.torproject.org. For example, we could probably just mount https://dist.torproject.org/torbrowser/ read-only inside that machine so that it wouldn't have to deal with adding/removing binaries at all, and new versions would be available immediately as they get released. What do you think?
Specifically, I was thinking we could use https://packages.debian.org/bullseye/httpdirfs to mount the directory listing as a filesystem. It even supports caching downloaded files, so if a file was retrieved previously it doesn't have to be fetched from the remote server again and again.
I'd also add, for the record, that a complete Tor Browser release, all locales and operating systems, is currently around 33G. So unless you want users to be able to request old or alpha versions, we shouldn't need storage space much beyond this.
I could make it so that it would prioritize looking in a specific directory before it goes on to download a file online, and use the online option as a fallback. However, my implementation here is tightly knit to that API to the core. I had actually been thinking about this problem because of a similar implementation for Signal that will end up requiring the files to be stored. I barely have time as it is, so all of these new details for the project, which I have been sporadically working on over the pandemic, make me think that the development progress is around ~35%-ish instead of the 75-80% I had in the back of my head (but this is about the third time this has happened and the final product keeps getting better, so that's fine). I'll open a new issue for this on my repo and ask for more details there.
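A rough sketch of that "local directory first, download as a fallback" idea, assuming a hypothetical mount point and reusing the download_to_tmp helper from the earlier sketch:

```python
from pathlib import Path

# Assumed read-only mount point for dist.torproject.org (e.g. via httpdirfs).
DIST_MIRROR = Path("/srv/dist-mirror/torbrowser")

def locate_package(relative_path: str) -> Path:
    """Prefer the locally mounted copy; fall back to downloading it."""
    local = DIST_MIRROR / relative_path
    if local.is_file():
        return local  # served straight from the mount, nothing to clean up
    # Fallback: fetch into the OS temp dir, as in the earlier sketch.
    return download_to_tmp(
        f"https://dist.torproject.org/torbrowser/{relative_path}"
    )
```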
I'd also add, for the record, that a complete Tor Browser release, all locales and operating systems, is currently around 33G.
Thanks for correcting me, that makes much more sense; I'm not sure where my calculations went wrong. An earlier version added a timestamp to the local copy, so it's very likely that I downloaded multiple copies of everything.
Actually, I'll disclose this: there's an edge case where all of the available binaries may be requested shortly after an update. Ensuring the integrity of binaries was on my list anyway, but if someone does that, countering it gets tricky, even though I intend for the bot to work with as few resources as possible. I have had a couple of designs for a "priority queue" in the back of my head that could counter this, but since I have nothing to show for it at the moment, I am acting on the (most likely incorrect) assumption that it's impossible to make this work.
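One possible shape for such a priority queue, purely as a sketch; the scoring rule and the serve coroutine are made up for illustration:

```python
import asyncio

# Sketch: serve light requests first, so one user asking for every single
# platform and locale cannot starve everyone else.
queue: asyncio.PriorityQueue = asyncio.PriorityQueue()

async def enqueue(requests_so_far: int, package: str) -> None:
    # Lower value = higher priority: users with many recent requests sink back.
    await queue.put((requests_so_far, package))

async def worker(serve) -> None:
    # `serve` is a hypothetical coroutine that downloads and uploads a package.
    while True:
        _, package = await queue.get()
        try:
            await serve(package)
        finally:
            queue.task_done()
```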
This question will not have an impact on anything, I'm just asking out of curiosity: in the event of an emergency, can the space be temporarily bumped up? (I am already planning to implement safety nets so that nobody will have to intervene, since leaving the bot running in the background with close to no maintenance whatsoever is a primary goal of my project, but still.)
The anti-censorship people will also be able to delete the files if needed, but we are not around 24/7; at some point we should add it to our monitoring system so we are alerted if something fails.
the development progress is around ~35%...-ish, instead of the 75%-80% that I had in the back of my head
I'm moving this request to Backlog for now. Please let me know once you think the development is far enough along to warrant working on a production deployment.
Hi, just to clarify: I initially thought that this feature was something I was obliged to do, but after further investigation I have come to realize that this approach is far from productive right now. Apart from one minor bug that I have to fix, it's ready to go.
I wouldn't like to be a burden on your limited resources, unless there's a way for someone to dynamically adjust the partition hosting the files, using something like LVM, in the event of an emergency or increased demand.
If the bot runs out of computational resources under heavy demand, async will take care of it and simply respond to requests whenever it has the capacity to do so (asyncio multiplexes many requests on a single thread, so it degrades gracefully rather than crashing). The bot can handle connection disruptions, and if a download/upload gets completely interrupted, it will carry on later. The only plausible time the hypothetical risk of running out of disk space could materialize is around a browser update, and after a file is uploaded once, it can be safely removed because it will have been cached on Telegram's servers. I have stress-tested the bot on a gigabit connection under strained resources, and tested it with medium-sized groups of people with 10 GB of available storage. It's fine; the worst that has happened is that it responds more slowly, or an error pops up and the user has to try again (or later).
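A minimal sketch of the kind of cap that keeps resource use bounded regardless of demand; the limit and the serve coroutine are assumptions for illustration:

```python
import asyncio

# A cap on simultaneous transfers: extra requests simply wait for a free slot
# instead of exhausting disk space or bandwidth. The limit of 5 is arbitrary.
MAX_CONCURRENT_TRANSFERS = asyncio.Semaphore(5)

async def handle_request(package: str, serve) -> None:
    # `serve` is a hypothetical coroutine doing the actual download/upload.
    async with MAX_CONCURRENT_TRANSFERS:
        await serve(package)
```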
As far as I have been able to tell, there's no urgent cause for concern whatsoever; I am just very worried about a theoretical attack vector, because the design of my bot has some very minor flaws underneath that I willingly chose to ignore to get the job done. I will probably not be sure whether that would actually be a problem, let alone one that calls the sustainability of the project into question, until this gets stress-tested. I doubt it, but I am not 100% confident. If the space has no other use right now, I would propose just putting the bot on it, monitoring how much space it actually consumes in a real-world scenario, and reducing the allocation accordingly.
In a bad scenario, the bot will have to be temporarily brought down and the storage space emptied out manually. In the worst-case scenario, the SQLite database storing the IDs for cached versions of the uploaded files may have to be modified (or just completely nuked for convenience) and the bot will have to start over as if there were a new update to the browser.
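For context, a cache like that can be as small as a single table mapping packages to the file IDs Telegram returns after an upload; this schema is illustrative, not the bot's actual one:

```python
import sqlite3

# Illustrative schema: once a package has been uploaded, Telegram returns a
# file_id that can be re-sent without re-uploading the binary.
con = sqlite3.connect("file_cache.db")
con.execute(
    """CREATE TABLE IF NOT EXISTS uploads (
           package TEXT PRIMARY KEY,  -- e.g. a Tor Browser tarball name
           version TEXT NOT NULL,
           file_id TEXT NOT NULL      -- Telegram's ID for the cached upload
       )"""
)
con.commit()

# "Nuking" the cache after a new release just means clearing this table,
# after which the bot re-uploads packages on demand:
# con.execute("DELETE FROM uploads")
```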
There's only one line that I have temporarily toggled so that temporary files don't get removed after they are uploaded; I did that because I was looking for a way to experiment with some UI for making signature verification easier for the user, but I can fix it within a couple of minutes:
Jeez, I just had to fix the indentation; writing this entire block of text took me longer than making files delete themselves after an upload. So the risk from me trying to boost efficiency by taking on as many concurrent uploads as possible is minimal. That risk now depends on how many threads are available, the available disk space, and whether any of the available downloads was updated recently. I am absolutely not worried about it; it's not worth discussing any further unless it somehow becomes an actual problem later.
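The fix amounts to something along these lines, sketched here with a hypothetical upload coroutine rather than the bot's real code:

```python
from pathlib import Path

async def send_package(chat_id: int, path: Path, upload) -> None:
    # `upload` is a hypothetical coroutine that sends the file to Telegram.
    try:
        await upload(chat_id, path)
    finally:
        # Remove the local copy afterwards: once an upload has succeeded,
        # Telegram keeps its own cached copy (the file_id), and on failure
        # the package can simply be downloaded again later.
        path.unlink(missing_ok=True)
```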
@n0toose great, thanks for the changes. How much space do you think we should give it now? Will we be fine with the TPA standard 20GB? Should we add a bit more just in case: 40? 100?
I propose that we just proceed with 100GB on the iSCSI HDD storage cluster. It won't be as fast as SSD storage, but if I understand correctly it should be fine considering the workload of this service, and at least we have plenty of disk space there.
We started the installation process but we hit some snags with the installer. It hasn't been tested extensively with the new iSCSI storage backend and still needs a few tweaks. Hopefully next week we'll have the machine deployed.
@meskio I've deployed telegram-bot-01.torproject.org. It has a telegrambot role account from which I assume a Python virtualenv will be set up and the daemon will run. Please refer to the procedure for deploying systemd user services to set that up.
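For reference, a systemd user service under the telegrambot role account could look roughly like the sketch below; the unit name, virtualenv path, and module name are assumptions, not the actual deployment:

```ini
# ~telegrambot/.config/systemd/user/onionsproutsbot.service (hypothetical)
[Unit]
Description=OnionSproutsBot Telegram bot

[Service]
# Assumed virtualenv location and entry point
ExecStart=/srv/telegrambot/venv/bin/python -m onionsproutsbot
Restart=on-failure

[Install]
WantedBy=default.target
```

Lingering would presumably also need to be enabled for the role account (loginctl enable-linger telegrambot) so the service starts at boot without an interactive login.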