Research issues
https://gitlab.torproject.org/tpo/web/research/-/issues

---

List page template for research ideas
https://gitlab.torproject.org/tpo/web/research/-/issues/31590
2021-10-07 · irl

Need to add a layout for research ideas. We also need some text to add to research ideas, like: "if you want to get started with this, then these are things we can/cannot help you with, and here is how to contact us".

---

Build a torbib
https://gitlab.torproject.org/tpo/web/research/-/issues/29465
2022-01-31 · irl

This is a subset of anonbib that would then be maintained by the Tor research team as part of research-web.

---

Add research idea for Linux TCP Initial Sequence Numbers may aid correlation
https://gitlab.torproject.org/tpo/web/research/-/issues/16659
2021-10-07 · Trac

TCP sequence numbers seem to be one more way to leak the host clock on GNU/Linux systems. It's the last major vector in the literature that's not yet addressed. [1] The kernel embeds the system time in microseconds in TCP connections. Some opinions say the TCP ISNs are salted hashes and can't be abused, but my impression from Steve Murdoch's papers is that it's feasible and was already carried out in his tests. [2][3]
There is no sysctl option to disable it, so it must be patched upstream. [4][5]
Nick has done exceptional work to get OpenSSL upstream to throw out mandatory timestamping in the protocol. Tails and Whonix disable TCP timestamps via kernel sysctl. TCP timestamps are a different vector from the TCP ISNs discussed here; it would be great if the upstream kernel disabled this as well, so that all distros have it.
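To make the leak concrete, here is a minimal self-contained Python sketch of the mechanism the issue describes: a per-connection secret hash plus a microsecond clock term. This is a toy simplification, not the kernel's actual `secure_seq.c` logic; the hash function, secret, and addresses are all illustrative.

```python
import hashlib

USEC_PER_SEC = 1_000_000

def isn(secret, four_tuple, host_clock_usec):
    """Toy model of Linux ISN generation: a secret keyed hash of the
    connection 4-tuple plus the host clock in microseconds, mod 2**32."""
    digest = hashlib.sha256(secret + repr(four_tuple).encode()).digest()
    offset = int.from_bytes(digest[:4], "big")
    return (offset + host_clock_usec) % 2**32

def estimate_clock_rate(samples):
    """Given (observer_time_usec, isn) pairs for the SAME 4-tuple, the
    secret hash offset cancels in the difference, so ISN deltas track the
    remote host's clock directly."""
    (t0, s0), (t1, s1) = samples[0], samples[-1]
    return ((s1 - s0) % 2**32) / (t1 - t0)

# Demo: the target's clock runs 50 ppm fast relative to the observer's.
secret = b"per-boot kernel secret"
conn = ("10.0.0.1", 443, "192.0.2.7", 51234)
skew = 1.00005
samples = [(t, isn(secret, conn, int(t * skew)))
           for t in range(0, 10 * USEC_PER_SEC, USEC_PER_SEC)]
print(f"estimated host clock rate: {estimate_clock_rate(samples):.6f}")
```

The recovered rate reveals the target's clock skew, which is the kind of clock fingerprint Murdoch's papers use for correlation. Note that the hash offset only cancels when probes reuse the same 4-tuple; ISNs from different connections cannot be differenced this way in this model.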
[1] https://www.cl.cam.ac.uk/~sjm217/papers/ccs06hotornot.pdf
[2] http://caia.swin.edu.au/talks/CAIA-TALK-080728A.pdf
[3] http://www.cl.cam.ac.uk/~sjm217/papers/ih05coverttcp.pdf
[4] https://stackoverflow.com/a/12232126
[5] http://lxr.free-electrons.com/source/net/core/secure_seq.c?v=3.16
**Trac**:
**Username**: source

---

Add research idea to Run some onion services to observe crawling trends
https://gitlab.torproject.org/tpo/web/research/-/issues/16520
2021-10-07 · Roger Dingledine

We know some research groups that are doing full crawling of onion services. We also know that Ahmia et al. are doing it. I keep hearing these days about big security companies selling "onion intelligence" or the like.

What are the characteristics of these crawls? Are many of them one level deep, or k levels deep, or full crawls? Do they obey robots.txt? Do they identify themselves by their user agent? Do they visit URLs that are embedded in HTML comments that humans would never find? Do they de-obfuscate URLs and visit those? Do they get suckered by web tarpits that produce infinite pages? Are the crawling trends going up quickly or slowly?
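If we ran such instrumented services, their access logs would answer these questions almost mechanically. A hedged sketch of the log-analysis side in Python; the marker paths (`/robots.txt` disallowing `/trap/`, `/comment-only/` links placed only inside HTML comments, `/trap/` tarpit pages) are hypothetical conventions for the instrumented site, not anything that exists today:

```python
def classify_crawl(requests):
    """Summarize one crawler's behavior from the access log of an
    instrumented onion service.  `requests` is a list of
    (path, user_agent) tuples for a single client.  Marker paths:
      /robots.txt       - disallows /trap/
      /comment-only/... - linked only from inside HTML comments
      /trap/...         - web tarpit producing infinite pages
    """
    paths = [p for p, _ in requests]
    hit_trap = any(p.startswith("/trap/") for p in paths)
    return {
        "fetched_robots": "/robots.txt" in paths,
        # None means we can't tell (robots.txt was never fetched)
        "obeys_robots": (not hit_trap) if "/robots.txt" in paths else None,
        "reads_html_comments": any(p.startswith("/comment-only/") for p in paths),
        "max_depth": max((p.strip("/").count("/") + 1
                          for p in paths if p != "/"), default=0),
        "user_agents": sorted({ua for _, ua in requests if ua}),
    }

# Example log: a crawler that ignores robots.txt and parses HTML comments.
log = [("/", "examplebot/1.0"),
       ("/robots.txt", "examplebot/1.0"),
       ("/page1/", "examplebot/1.0"),
       ("/comment-only/secret", "examplebot/1.0"),
       ("/trap/0/1/2", "examplebot/1.0")]
print(classify_crawl(log))
```

Per-crawler summaries like this, aggregated over time, would also show whether the crawling trends are going up quickly or slowly.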
We should consider running a couple of onion services with various characteristics, and monitor their usage to see if we learn anything.

---

Add research idea for bandwidth related anonymity set reduction
https://gitlab.torproject.org/tpo/web/research/-/issues/6473
2021-10-07 · proper

Attack:
* The target hosts a hidden service.
* A linguist determines that the target is living in country X.
* Or it's a blog about things in country X.
* Thus, the assumption that the target's hidden service is running in country X has a high probability of being true.
* Easy to research (example): the fastest A Mbps line is only available in very few parts of the country, maybe only in one city. Most people have B Mbps, and a few still have an old contract with the slow C Mbps.
* The adversary buys lots of servers in different countries, installs Tor on those servers, and uses Tor as a client.
* The adversary can now build lots of circuits from geographically diverse places and probe the server by connecting to its hidden service. The adversary can now accumulate how much down/upload speed the hidden service can provide.
* Thus, the adversary now knows something more about his target, and if A Mbps is only available in a few places, he has narrowed down the set of suspects.
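The probing step reduces to simple arithmetic. A sketch under the hypothetical A/B/C market from the list above (the tier names and rates are invented for illustration): a measured throughput is roughly a lower bound on the line rate, so each observation prunes the set of plausible tiers.

```python
def mbps(nbytes, seconds):
    """Observed throughput in megabits per second."""
    return nbytes * 8 / seconds / 1e6

def plausible_tiers(measured_mbps, tiers, tolerance=0.15):
    """Tiers (name -> line rate in Mbps) still consistent with an observed
    throughput.  A line can always deliver less than its rated speed, so a
    measurement only rules out tiers slower than (roughly) what we saw."""
    return sorted(name for name, rate in tiers.items()
                  if rate >= measured_mbps * (1 - tolerance))

# Hypothetical national market: rare fast tier A, common B, legacy C.
TIERS = {"A (100 Mbps, one city)": 100.0,
         "B (16 Mbps, common)": 16.0,
         "C (2 Mbps, legacy)": 2.0}

# Best download the adversary ever got from the hidden service: 12 MB in 8 s.
observed = mbps(12_000_000, 8.0)
print(observed, plausible_tiers(observed, TIERS))
```

Observing 12 Mbps already rules out tier C; repeated probes that ever approach 100 Mbps would shrink the suspect set to the one city where tier A is sold.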
Another, unrelated open question:
* Preliminary consideration: unless stream isolation is used, exit relays can correlate different activity from one user.
* Can exit nodes differentiate "this is the user who keeps on reading some.site with an A Mbps line" vs. "this is the user who keeps reading some.site with a C Mbps line"?

---

Write tool to automate web queries to Tor; and use Stem to track stream/circ allocation and results
https://gitlab.torproject.org/tpo/web/research/-/issues/5830
2022-01-31 · Roger Dingledine

As part of legacy/trac#5752 we need to know how many circuits we're making now, how many we're discarding early because a stream didn't work, etc.
This is a two-part project: first is a tool to automatically make a series of requests to Tor, in a repeatable way, and second is a Tor controller script, probably using Stem, that watches stream and circuit events (and maybe more), and tracks which streams get allocated to which circuits, how many total circuits are made, how quickly results return, and other statistics. Then we would change the underlying Tor, replay the same set of requests, and know what circuit behaviors to expect.
I expect we'll also discover that we don't export enough info via the control protocol to make good conclusions; in that case we'll also want to modify Tor to export this info.
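For the controller half, the accounting can be kept separate from the Stem wiring so it is testable without a running Tor. A sketch: the class name and summary fields are my own invention; the Stem calls used (`Controller.from_port`, `authenticate`, `add_event_listener`, `EventType.CIRC`/`STREAM`) are real Stem API, but the wiring below is an untested assumption about how this tool might be put together.

```python
from collections import Counter, defaultdict

class StreamCircuitTracker:
    """Tracks which streams get attached to which circuits.  The accounting
    is plain Python, so it can be driven either by Stem event objects or by
    synthetic tuples in tests."""

    def __init__(self):
        self.circuits_built = 0
        self.streams_per_circuit = defaultdict(set)   # circ_id -> stream ids
        self.stream_status = Counter()                # status -> count

    def on_circ(self, circ_id, status):
        if status == "BUILT":
            self.circuits_built += 1

    def on_stream(self, stream_id, status, circ_id=None):
        self.stream_status[status] += 1
        if circ_id and status == "SENTCONNECT":       # stream attached here
            self.streams_per_circuit[circ_id].add(stream_id)

    def summary(self):
        return {"circuits_built": self.circuits_built,
                "circuits_used": len(self.streams_per_circuit),
                "streams_attached": sum(len(s) for s
                                        in self.streams_per_circuit.values()),
                "stream_status_counts": dict(self.stream_status)}

def run_with_stem(tracker, control_port=9051):
    """Wire the tracker to a running Tor via Stem (not called by default;
    requires Tor with ControlPort enabled)."""
    from stem.control import Controller, EventType
    with Controller.from_port(port=control_port) as ctl:
        ctl.authenticate()
        ctl.add_event_listener(
            lambda e: tracker.on_circ(e.id, str(e.status)), EventType.CIRC)
        ctl.add_event_listener(
            lambda e: tracker.on_stream(e.id, str(e.status), e.circ_id),
            EventType.STREAM)
        input("replay the request series, then press enter... ")
    print(tracker.summary())
```

Running the same request-replay tool before and after a Tor change and diffing the two summaries would give exactly the before/after circuit-behavior comparison the issue asks for.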