gathering the node data associated with the circuit,
storing this information.
This should really move to some dedicated backend, which also controls requesting new circuits, or other tor circuit related operations.
We would also want it to track information about what bridge is in use, as part of this. Right now gConnectionPane in "about:preferences" is having to do a similar but separate approach to gTorCircuitPanel to determine whether a bridge is in use, and which one.
Besides separating concerns, it could also address some problems we have currently:
We gather circuit node data but we never free it, or know when it is safe to free it. This is because the tor process itself does not store the circuit data after it has already expired, but in the browser we still need that information to tell the user what circuit was used to show the current page. We might be able to address this by integrating the circuit node information with the firefox request process by storing it within some relevant object, so it only sticks around for as long as it is in use. I don't know much about the firefox process, but maybe something like the page's loadInfo?
We don't have a clean way to determine whether the current page is establishing the initial circuit, or requesting a new circuit, or will never get a circuit (like a data: uri). We do some guess-work to get around this, but it also means we cannot implement designs like !587 (comment 2888415) or !587 (comment 2892297). Again, if we had an integrated backend, it could track the exact state.
gTorCircuitPanel is not shared between windows. As a result, a web page split across two windows can be missing information if the other window hasn't picked up the circuit event. E.g. if you open a web page in a new tab and open the same page in a new window, then the new window will be missing the circuit display.
gConnectionPane currently only collects circuit data while "about:preferences" is loaded. So it will not know which bridge is in use until it receives a circuit event.
A separate backend would also help with the "bridge currently in use" case.
I've just remembered I had to implement another event watcher in browser/components/torpreferences/content/connectionPane.js because I could not access the data the circuit display already collected.
We will have to rework it for Arti for sure, which will directly give us the data of the bridge in use, hopefully.
At that point, having a unified backend would be a great thing.
Last week I had some downtime because I was waiting for my 115 MRs to be reviewed.
So, I took the occasion to finally rework Torbutton and maybe demolish that commit.
One of the steps I found is removing direct calls to the controller (#41844 (closed)), for two reasons:
I can modernize/refactor the control port authentication without worrying about its consumers (except for a few I can easily control)
adding a middle layer that adds some abstraction will be useful for having multiple backends, like Arti (#41843 (closed)).
We didn't have many direct consumers. Basically, only the onion authentication, the bridge section of the control panel, and the circuit display.
So, I took this issue to address it in TorMonitorService, which has a persistent connection to the control port and already listens for several events.
My first attempt won't be the definitive solution.
But at least I hope it contributes to getting a broader knowledge of the tor-related patches and making them clearer.
So, for now I'm taking the issue, doing some changes, but probably we'll have to leave it open for further work.
The strategy I came up with is to collect some data for circuits and streams.
For circuits, add an entry to a map CircuitID: Node fingerprints on the BUILT event, and then remove it on the circuit's CLOSED event.
For streams, create a map StreamID: {CircuitID, SOCKS Credentials}, add entries on the SUCCEEDED event (we can also use SENTCONNECT, if we think it's better) and remove them on the CLOSED event.
As a proof of concept, I've created a third map (SOCKS Credential: StreamID), for lookups on the stream map without going through all of it. It might turn out not to be needed, though.
(Again, it's a duplication of data to do direct lookups. Worth it, or premature optimization?).
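The three maps described above could be sketched roughly as follows. This is only an illustration: the class and method names are hypothetical, the events are assumed to arrive as already-parsed plain objects, and the credentials are joined into a single string key purely for convenience.

```javascript
// Hypothetical sketch; none of these names come from the actual Tor Browser code.
class CircuitStreamTracker {
  constructor() {
    this.circuits = new Map(); // CircuitID -> array of node fingerprints
    this.streams = new Map(); // StreamID -> { circuitId, credentials }
    this.credentialsToStream = new Map(); // "username:password" -> StreamID
  }

  // Assumed event shape: { id, status, fingerprints }.
  onCircuitEvent({ id, status, fingerprints }) {
    if (status === "BUILT") {
      this.circuits.set(id, fingerprints);
    } else if (status === "CLOSED") {
      this.circuits.delete(id);
    }
  }

  // Assumed event shape: { id, status, circuitId, username, password }.
  onStreamEvent({ id, status, circuitId, username, password }) {
    if (status === "SUCCEEDED") {
      const credentials = `${username}:${password}`;
      this.streams.set(id, { circuitId, credentials });
      this.credentialsToStream.set(credentials, id);
    } else if (status === "CLOSED") {
      const stream = this.streams.get(id);
      this.streams.delete(id);
      // Only drop the reverse entry if it still points at this stream.
      if (stream && this.credentialsToStream.get(stream.credentials) === id) {
        this.credentialsToStream.delete(stream.credentials);
      }
    }
  }

  // Direct lookup: credentials -> stream -> circuit -> node fingerprints.
  nodesForCredentials(username, password) {
    const streamId = this.credentialsToStream.get(`${username}:${password}`);
    const stream = this.streams.get(streamId);
    return stream ? this.circuits.get(stream.circuitId) ?? null : null;
  }
}
```

The third map only saves a scan over `streams`; dropping it would mean iterating the stream map on each lookup instead.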
We would also want it to track information about what bridge is in use, as part of this. Right now gConnectionPane in "about:preferences" is having to do a similar but separate approach to gTorCircuitPanel to determine whether a bridge is in use, and which one.
gConnectionPane currently only collects circuit data while "about:preferences" is loaded. So it will not know which bridge is in use until it receives a circuit event.
This seems to be the easiest part, but it has some caveats.
We constantly get data about the circuits that are built.
Theoretically, we can just look at the first node and it should be our bridge.
However, when we set bridge lines, Tor checks them, to decide which one to use, and it opens a lot of circuits to do so.
We have several possibilities:
update the current first node only after seeing stream events, with the side effect of (possibly) delaying the update in the settings (this is the current strategy)
with a persistent backend this is unlikely to happen; it is probably more likely while you're bootstrapping and looking at the bridge cards at the same time
least complex solution: we keep only the ID of the currently connected bridge
display the bridge that appears in the highest number of circuits as the currently connected one
empirically, I've seen that bridges that are only tested and never used appear in only one circuit
we already collect the data with my proposal, but checking all the circuits every time a new one is added seems silly. We could rather add a map NodeID: Number of circuits for each bridge, and maybe a string with the highest key.
as a proof of concept, I've tested this, and it seems to work. I'm not sure the higher complexity is worth it.
do a mix of the two: rely on circuit events only until we see a stream event.
this is the most complex option. We should do it only if the delay of option 1 is unbearable and option 2 produces results we're not satisfied with (but why use it for the initial stage, at that point?)
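For the second option, the per-bridge counter could look something like the sketch below. The names are hypothetical, and instead of caching "a string with the highest key" it simply scans the counters, on the assumption that the number of configured bridges is small.

```javascript
// Hypothetical sketch of the "bridge seen in the most circuits" heuristic.
class BridgeUsageCounter {
  constructor() {
    this.counts = new Map(); // first-hop fingerprint -> number of open circuits
    this.firstHopByCircuit = new Map(); // CircuitID -> first-hop fingerprint
  }

  onCircuitBuilt(circuitId, fingerprints) {
    const firstHop = fingerprints[0];
    this.firstHopByCircuit.set(circuitId, firstHop);
    this.counts.set(firstHop, (this.counts.get(firstHop) ?? 0) + 1);
  }

  onCircuitClosed(circuitId) {
    const firstHop = this.firstHopByCircuit.get(circuitId);
    if (firstHop === undefined) {
      return; // We never saw this circuit being built.
    }
    this.firstHopByCircuit.delete(circuitId);
    const remaining = this.counts.get(firstHop) - 1;
    if (remaining > 0) {
      this.counts.set(firstHop, remaining);
    } else {
      this.counts.delete(firstHop);
    }
  }

  // The fingerprint that appears as first hop in the most open circuits,
  // or null if no circuit is known.
  get currentBridge() {
    let best = null;
    let bestCount = 0;
    for (const [fingerprint, count] of this.counts) {
      if (count > bestCount) {
        best = fingerprint;
        bestCount = count;
      }
    }
    return best;
  }
}
```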
It's worth mentioning that tor keeps a secondary bridge, and we can detect it by looking at the circuits.
Should we ever display it?
Probably something to discuss with UX + AC (and maybe even net team).
We gather circuit node data but we never free it, or know when it is safe to free it. This is because the tor process itself does not store the circuit data after it has already expired, but in the browser we still need that information to tell the user what circuit was used to show the current page.
This problem (aka #40982 (closed)) is a hard one.
Streams are closed quickly, and we need at least one with a certain pair of credentials to get the related circuit.
We could completely skip the streams, and only update a map Credentials: CircuitID when we see streams.
Then, when a circuit is closed, we can iterate and remove all the unneeded circuits.
Circuits generally live longer than streams, but a user could be interested in seeing the circuit much later than the circuit has been closed, for any reason.
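A rough sketch of that stream-less variant, with hypothetical names: the stream event is only used to record which circuit a pair of credentials ended up on, and closing a circuit prunes every entry that pointed at it.

```javascript
// Hypothetical sketch: keep only Credentials -> CircuitID, pruned on circuit close.
class CredentialCircuitCache {
  constructor() {
    this.circuits = new Map(); // CircuitID -> array of node fingerprints
    this.circuitByCredentials = new Map(); // credentials key -> CircuitID
  }

  onCircuitBuilt(circuitId, fingerprints) {
    this.circuits.set(circuitId, fingerprints);
  }

  // Called for any stream event that carries both credentials and a circuit.
  onStreamSeen(credentialsKey, circuitId) {
    this.circuitByCredentials.set(credentialsKey, circuitId);
  }

  onCircuitClosed(circuitId) {
    this.circuits.delete(circuitId);
    // Deleting from a Map while iterating over it is well-defined in JavaScript.
    for (const [key, id] of this.circuitByCredentials) {
      if (id === circuitId) {
        this.circuitByCredentials.delete(key);
      }
    }
  }

  nodesFor(credentialsKey) {
    const circuitId = this.circuitByCredentials.get(credentialsKey);
    return this.circuits.get(circuitId) ?? null;
  }
}
```

This keeps memory bounded by the set of open circuits, at the cost described above: once the circuit is closed, the data for an already loaded page is gone.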
I've seen the circuit display has a WeakMap<MozBrowser, BrowserCircuitData> to the last used circuit.
Maybe we could change it to WeakMap<MozBrowser, Map<Document URI, BrowserCircuitData>>, instead, to show the (stale) data that has been used to actually load the page?
Populating/updating this map should be very quick: either we have the data already, or we cannot update it. No communication on the control port should be involved at this stage.
Then, the lifetime of each browser will also determine when the data will be cleaned.
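The nested structure could be sketched like this. The helper names are hypothetical, and any object can stand in for the browser element here; in the real code the key would be the `<xul:browser>`.

```javascript
// Hypothetical sketch of WeakMap<browser, Map<document URI, circuit data>>.
const circuitsByBrowser = new WeakMap();

function recordCircuit(browser, documentURI, circuitData) {
  let perDocument = circuitsByBrowser.get(browser);
  if (!perDocument) {
    perDocument = new Map();
    circuitsByBrowser.set(browser, perDocument);
  }
  perDocument.set(documentURI, circuitData);
}

// Returns the (possibly stale) data recorded when the document was loaded,
// or null if nothing was recorded for this browser and URI.
function circuitFor(browser, documentURI) {
  return circuitsByBrowser.get(browser)?.get(documentURI) ?? null;
}
```

Because the outer map is weak, the inner map becomes collectable together with its browser, but nothing bounds the number of URIs recorded for a long-lived browser.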
Do we risk accumulating too much data?
Or, can we add custom data to History only for the privileged side?
gTorCircuitPanel is not shared between windows. As a result, a web page split across two windows can be missing information if the other window hasn't picked up the circuit event. E.g. if you open a web page in a new tab and open the same page in a new window, then the new window will be missing the circuit display.
The WeakMap risks not solving this problem, if it runs in chrome JS.
History data, on the other hand, is usually kept also when moving a browser.
I will have to investigate more.
We might be able to address this by integrating the circuit node information with the firefox request process by storing it within some relevant object, so it only sticks around for as long as it is in use. I don't know much about the firefox process, but maybe something like the page's loadInfo?
Does it live long enough? Or is it deleted after the loading has ended?
A good thing about this object is that we can access it directly from the domain isolator.
We register a filter function with ProtocolProxyService.registerChannelFilter. This function takes the channel (which includes a loadInfo) and the original proxy data that we change.
Anyway, I think this (changes at the Firefox internal level) is more involved than what I'd like to do now, so I'd leave it for a second revision.
On the other hand, changing the domain isolator without changing Firefox's interface is doable also for this revision.
We don't have a clean way to determine whether the current page is establishing the initial circuit, or requesting a new circuit, or will never get a circuit (like a data: uri). We do some guess-work to get around this, but it also means we cannot implement designs like
when you request a new circuit for a website, if it is slow enough to reload then there is a period of time where the old circuit is lost, but information about the new circuit is unknown. So we have an empty circuit:
I'm a little concerned that users may be alarmed, or the empty state could flash. Could we try a more explicit working state instead?
The old code was kind of hacked on top, rather than integrated. Basically, it just harvested all the circuit data it could find, and we try and map the currently shown <xul:browser> to one of these circuits. And if none is found, the page has no circuit, but there is no information to discern whether it will have one, whether it used to have one, or whether a circuit is not applicable for the page (e.g. a "data:" URI like data:text/plain,hello).
So, right now I could only guess when we are requesting a new circuit for the currently shown page. But there could be certain conditions that break this.
Actually, would it be possible to restrict the animation to when the circuit is initially established, so it doesn’t re-animate between pages of the same site?
That would also provide some nice feedback that the circuit is reloading after requesting a new circuit.
if we had an integrated backend, it could track the exact state.
Side note
I've tried to have a look at what events I get from the control port.
I couldn't get a circuit to be created after I started a request.
From what I understand, tor continuously creates spare circuits, to be ready to create a stream on it.
I should tweak the torrc to try to make it less responsive and/or try harder at creating circuits. At that point, a log parser might be useful, too.
I think we could reframe the question in this way: «can we get the circuit information in a more direct way?».
I believe we will always pass through the SOCKS credential.
So, the correct question becomes: «is there a way to get the SOCKS credentials directly?».
It is very easy for the "New circuit for this site" part of the problem.
I refactored the domain isolator a few months ago, but I held it back from 12.5 until 13.0 (it's been merged on the various 102.x-13.0 branches).
At the moment (my patches on 13.0), when the user wants a new circuit, the circuit display calls TorDomainIsolator.newCircuitForBrowser, and this function doesn't return anything.
We could make it return the new sets of credentials.
For the other cases, we need to find a way to make the domain isolator communicate with the circuit display.
One way is channel.loadInfo.innerWindowID.
It'd be nice if we could inject the SOCKS credential to the window, by knowing its ID, but I'm not sure it's possible.
Another way would be an observer.
Or the domain isolator could also keep a map windowID: SOCKS credential, and the circuit display could poll it at the right moment (when can we empty that map? After a certain window ID has been polled?).
Maybe that could solve the catch-all case, which right now doesn't work in some situations (e.g., a local file containing a remote resource, such as an HTML file loading an image or a script).
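The polling variant could be as small as the sketch below (hypothetical names). It picks one possible answer to the "when can we empty that map?" question by removing the entry as soon as it has been polled; whether that is the right moment is exactly the open point above.

```javascript
// Hypothetical sketch: the domain isolator records credentials per window ID,
// and the circuit display takes (and thereby removes) them when it needs them.
class PendingCredentials {
  constructor() {
    this.byWindowId = new Map(); // innerWindowID -> { username, password }
  }

  record(innerWindowId, credentials) {
    this.byWindowId.set(innerWindowId, credentials);
  }

  // Returns the credentials for the window and forgets them, or null.
  take(innerWindowId) {
    const credentials = this.byWindowId.get(innerWindowId) ?? null;
    this.byWindowId.delete(innerWindowId);
    return credentials;
  }
}
```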
I've seen the circuit display has a WeakMap<MozBrowser, BrowserCircuitData> to the last used circuit. Maybe we could change it to WeakMap<MozBrowser, Map<Document URI, BrowserCircuitData>>, instead, to show the (stale) data that has been used to actually load the page?
Is the idea to store the URI to search for the circuit instead of using credentials? I think the main issue with this is that when you open the web page in another tab, it will not share the same MozBrowser so you won't be able to look up the circuit.
Populating/updating this map should be very quick: either we have the data already, or we cannot update it. No communication on the control port should be involved at this stage.
Then, the lifetime of each browser will also determine when the data will be cleaned. Do we risk accumulating too much data?
Or, can we add custom data to History only for the privileged side?
As part of nsISHEntry? I think this has the same issue with opening the same page in another tab, it won't share the history to look up.
We might be able to address this by integrating the circuit node information with the firefox request process by storing it within some relevant object, so it only sticks around for as long as it is in use. I don't know much about the firefox process, but maybe something like the page's loadInfo?
Does it live long enough? Or is it deleted after the loading has ended?
I'm not sure, I just chose any of the internal structures that looks like it might stick around as long as the page is loaded and cached. I'm not sure where loadInfo is the right place to look, but overall we would ideally have:
When a page is loaded (with network activity) we want to be able to determine its circuit.
The circuit remains in memory and accessible for as long as the page is cached, but not longer.
We know definitively when a page is being loaded for the first time or reloaded with a new circuit, so that we know to ignore the cached circuit and display some loading state in the UI.
I was already adding a comment while you commented, too.
When a page is loaded (with network activity) we want to be able to determine its circuit.
I think some integration is actually possible.
The Domain Isolator can even access the circuit display (but for some reason it breaks it, i.e., the button isn't shown anymore, I will have to investigate more).
The first request has channel.loadInfo.innerWindowID === 0, which is a little bit annoying.
The rest of the requests have a real innerWindowID, which can be used to detect that a local file (or something else) includes some remote elements and needs to get the circuit display shown.
I still haven't figured out how I can get the channelId, but we always have one.
Being able to use it when we don't have a window id would be great.
Then, I've tried to create a topic on TorMonitorService that notifies observers when a stream goes to SUCCEEDED.
In this way we can push both the credentials and the circuit (the node fingerprints, for now) to the circuit display.
We don't even have to store streams and credentials anymore, if the observer also manages the circuit cache.
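The push model described above could be sketched as follows. Everything here is a stand-in: the small registry only mimics the role an observer service would play in the browser, and the topic name and the shape of the notification subject are made up for the example.

```javascript
// Hypothetical topic name, not an existing one.
const STREAM_SUCCEEDED_TOPIC = "tor-stream-succeeded";

// Minimal stand-in for an observer service.
class ObserverRegistry {
  constructor() {
    this.observers = new Map(); // topic -> Set of observers
  }

  addObserver(observer, topic) {
    if (!this.observers.has(topic)) {
      this.observers.set(topic, new Set());
    }
    this.observers.get(topic).add(observer);
  }

  notifyObservers(subject, topic) {
    for (const observer of this.observers.get(topic) ?? []) {
      observer.observe(subject, topic);
    }
  }
}

// The observer (e.g. the circuit display) owns the cache, so the monitor
// service does not need to keep streams or credentials around itself.
class CircuitDisplayCache {
  constructor() {
    this.nodesByCredentials = new Map(); // "username:password" -> fingerprints
  }

  observe(subject, topic) {
    if (topic !== STREAM_SUCCEEDED_TOPIC) {
      return;
    }
    const { username, password, fingerprints } = subject;
    this.nodesByCredentials.set(`${username}:${password}`, fingerprints);
  }
}
```

On a SUCCEEDED stream, the monitor side would look up the fingerprints of the stream's circuit and call `notifyObservers({ username, password, fingerprints }, STREAM_SUCCEEDED_TOPIC)`.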
I'm still trying to figure out the various pieces, so I haven't thought where we should collect the node data.
As long as it doesn't use tor-control-port.js directly I could be happy for now, and come back to a better implementation later.
We know definitively when a page is being loaded for the first time or reloaded with a new circuit, so that we know to ignore the cached circuit and display some loading state in the UI.
We can key the cache on the credentials, and show the animation whenever we have a miss.
Do you think this could work?
The circuit remains in memory and accessible for as long as the page is cached, but not longer.
For that I don't have a solution, yet.
I think the main issue with this is that when you open the web page in another tab, it will not share the same MozBrowser so you won't be able to look up the circuit.
I've tested earlier, and from what I could tell, MozBrowser is moved between different windows.
Of course, we'd need to store the WeakMap on the "process scope", not on the "chrome scope".
In other cases, if you open a link in a new tab or something like this, you also do a network request, and at that point, we'll have fresh data from the controller, unless the page is shown from (memory) cache.
I still haven't figured out how I can get the channelId, but we always have one. Being able to use it when we don't have a window id would be great.
Well, the channel also contains the URI (and originalURI), I forgot about that in my previous comment.
If it matches the requested URI and the user context ID, it should be safe to assume we can use these SOCKS credentials to match the circuit for that tab, even though we didn't get an innerWindowID.
We know definitively when a page is being loaded for the first time or reloaded with a new circuit, so that we know to ignore the cached circuit and display some loading state in the UI.
We can key the cache on the credentials, and show the animation whenever we have a miss. Do you think this could work?
I think we want something direct to be able to improve on what we do now. For example, we want to know the difference between waiting for an initial load and a new circuit request.
If domain-isolator handles all of this, then it would also know when a circuit is being re-requested, so it could just flag the stored circuit as such.
It seems that moving all of this circuit data and SOCKS credentials out of gTorCircuitDisplay and into domain-isolator might be a good first step. And gTorCircuitDisplay is able to request circuit data from domain-isolator by sending in the browser and register itself to be notified of any new circuit states.
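As a rough illustration of that split, the isolator-side store could track an explicit state per entry and notify registered listeners on every change. All names and state labels below are hypothetical, and the sketch keys entries on a credentials string for simplicity, whereas the proposal above has the display pass in the browser.

```javascript
// Hypothetical sketch of a circuit state store owned by the domain isolator.
class CircuitStateStore {
  constructor() {
    this.entries = new Map(); // credentials key -> { state, nodes }
    this.listeners = new Set();
  }

  // Returns a function that unregisters the listener.
  subscribe(listener) {
    this.listeners.add(listener);
    return () => this.listeners.delete(listener);
  }

  _set(key, entry) {
    this.entries.set(key, entry);
    for (const listener of this.listeners) {
      listener(key, entry);
    }
  }

  // First time credentials are generated for a site: initial load.
  credentialsGenerated(key) {
    this._set(key, { state: "initial", nodes: null });
  }

  // The user asked for a new circuit: old entry is dropped, new one is flagged.
  newCircuitRequested(oldKey, newKey) {
    this.entries.delete(oldKey);
    this._set(newKey, { state: "renewing", nodes: null });
  }

  // The circuit for these credentials is now known.
  circuitKnown(key, nodes) {
    this._set(key, { state: "ready", nodes });
  }

  get(key) {
    return this.entries.get(key) ?? { state: "unknown", nodes: null };
  }
}
```

With something like this, the display could tell "waiting for an initial load" apart from "waiting after a new circuit request" without guessing.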
It seems that moving all of this circuit data and SOCKS credentials out of gTorCircuitDisplay and into domain-isolator might be a good first step. And gTorCircuitDisplay is able to request circuit data from domain-isolator by sending in the browser and register itself to be notified of any new circuit states.
That could definitely work.
I will try to do something.
If domain-isolator handles all of this, then it would also know when a circuit is being re-requested, so it could just flag the stored circuit as such.
Yes. #40982 (closed) maybe isn't clear about that, but we have maps to clear also there (the association FPD: random nonce and User context ID: nonce).
Deleting an entry when the related circuit has been closed sounds like an idea.
(Moreover, the user context ID is taken into account, but there isn't any way to refresh it, but that's another story).
For example, we want to know the difference between waiting for an initial load and a new circuit request.
I'm sorry, but the difference isn't clear to me.
From the backend point of view they seem the same to me: you don't find a nonce, so you generate one.
From what I understand, you're referring to this case:
@donuts there's a UX question here which might change what we want in the back end. Consider this set up:
Open "example.org/page1" in one tab and "example.org/page2" in another tab. Both are loaded over circuit A.
Reload "example.org/page1" with a new circuit: circuit B.
Switch to the "example.org/page2" tab.
In this tab, do we want to show circuit A or circuit B? Note that:
The top document for /page2 was not reloaded, so the visible page was loaded with circuit A.
Any ongoing or future connections for /page2 will use circuit B. E.g. if the page wants to do some network requests. If the page is static, then circuit B will only be used in this tab when moving to another page or reloading.
Right now, we show circuit B. Is this generally what you expect? To show the active circuit rather than the historic circuit?
So, if you load a page, go do something else for a certain amount of time (let's say one hour), then come back, the old circuit won't exist anymore.
So, we should speak of last known rather than active.
@pierov yes, there is that extra detail. For the example, I'm assuming we're not waiting so the circuit is still active.
I think it only takes about 10 minutes for the circuit to expire, and for busy websites that like to do a lot of networking (tested with amazon dot com) you can actually see the circuit display switch whilst seemingly doing nothing on the page :)
But that is a similar case: do we show the circuit that loaded the top document, or do we show the circuit that was last used within the page?
If we decide we can say that a circuit has expired and not display anything, caching becomes very easy: just delete all the entries when we get the circuit close signal.