Over at tpo/community/outreach#28531 (moved), we're working on a censorship snapshot that summarises where and how Tor is blocked. The idea is that Tor Browser will use this snapshot to help the user figure out what circumvention method works. The question is: how does Tor Browser get this snapshot? It's not a good idea to hard-code it into Tor Browser because it will take a long time to update it. It's probably better for Tor Browser to fetch the snapshot each time it starts.
To guarantee that Tor Browser can always get the snapshot, even in the face of censorship, we should serve it over a domain-fronted endpoint, like we do for moat. There are several ways in which we could serve the snapshot:
Polyanthum, the host that runs BridgeDB and rdsys, runs an Apache reverse proxy. We could serve the snapshot directly over Apache, which is probably the simplest solution.
We could also teach rdsys how to serve the snapshot. Technically, the snapshot is a resource and rdsys is our resource distribution system.
Whatever solution we go with, we should make sure that we can easily and quickly update the snapshot if we learn of a new censorship event.
We decided to implement this as a new API endpoint for moat, which is a part of BridgeDB at the moment. I'm going to move this ticket back to the BridgeDB repository.
Here's a quick dump of implementation thoughts. Architecturally, there's a bit of conflation of Moat as a transport to make web requests to bridges.torproject.org, and the functionality of BridgeDB's Moat distribution endpoint.
Right now Moat-the-transport (a meek tunnel to bridges.torproject.org) is connected directly to BridgeDB's Moat distributor. However, we can generalize Moat to connect to the Apache2 web server frontend, which is responsible for redirecting and serving not only BridgeDB requests, but also bridgestrap, wolpertinger, and rdsys.
In order to make this censorship snapshot (or, location-specific recommendations) available over Moat, here are some options:
Modify BridgeDB to respond to this request over a third API endpoint, let's call it /snapshot for now, alongside the fetch and check endpoints. This would then be accessible by making a request through Moat to https://bridges.torproject.org/moat/snapshot.
Both of these are fairly simple. My gut tells me that (2) is a better option architecturally since this functionality will be separate from that of BridgeDB and we'll be splitting BridgeDB up in favor of rdsys in the future anyway. @meskio I'm curious about your thoughts here.
I agree with you that (2) looks like the better option. We won't need to reimplement this service when BridgeDB gets deprecated. And if we ever decide this service should be part of rdsys (or any other future service), having it written in Go should help to integrate it.
Actually we might not need to write a service for those two endpoints. We can configure Apache to serve the static file we place on the server. And I believe it is possible to configure Apache to answer requests with the client IP address (or the one from the X-Forwarded-For header), but we'll need to ask the TPA people as my knowledge of Apache is pretty limited.
Anyway, writing a service that does those two things is pretty trivial.
So currently we are wanting 3 APIs (NOTE edited to match latest discussion):
/* Get Circumvention Settings
 *
 * Return a JSON array of TorSettings objects in the following format, based on
 * the detected location of the query (using X-Forwarded-For HTTP header?). We
 * return an Array of these objects to provide end-users with alternatives
 * (sorted by the order in which we would prefer users to use them, with the
 * most preferred configuration appearing first, least preferred last).
 */

/* Request Params */
{
  /* array of strings
   * optional param to specify which transports the client suggests (to avoid
   * returning useless bridges); valid params: obfs4, meek_azure, snowflake, etc. */
  types: ["val1", "val2", ...],
  /* string: 2-letter country code
   * optional; the client explicitly requests settings for a particular country.
   * If this param is not provided, moat determines the client's country code
   * via geoip lookup using the provided X-Forwarded-For header */
  country: "val",
}

/* Returns */
{
  /* only provided if a bridge is required */
  bridges: {
    /* array of strings
     * specifies the raw bridge strings to use to connect; optional param only
     * provided for "bridgedb" source */
    bridge_strings: ["val1", "val2", ...],
  },
  /* only provided if a specific firewall config is required (in practice we
   * probably won't use this, but include it in the spec just in case) */
  firewall: {
    /* int array: values from 1 to 65535
     * contains list of ports allowed to connect through */
    allowed_ports: [val1, val2, ...],
  }
}

/* Get Circumvention Map
 *
 * Download a copy of the censorship circumvention map for manual use; return a
 * JSON object with country-codes as keys mapping to instances of the above Tor
 * Settings object arrays. For example:
 */

/* Request Params */
{}

/* Returns */
{
  cn: [{bridges: {...}}, {bridges: {...}, firewall: {...}}, ...],
  fr: [...],
  us: [...],
  ...
}

/* Get BuiltIn Bridges
 *
 * Download a copy of the available builtin bridges
 * ( https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Default-Bridges ).
 * We should provide this API so we can update builtin bridges w/o shipping new
 * Tor Browser releases, and also so that 3rd party applications can maintain a
 * list of working bridges.
 */

/* Request Params */
{
  /* array of strings
   * optional param: which transport types the client wants bridges for;
   * if not provided, return the entire list of builtin bridges */
  types: ["val1", "val2", ...],
}

/* Returns */
{
  /* array of strings
   * an array of all of the builtin bridge strings of the requested types */
  bridge_strings: ["val1", "val2", ...],
}
TODO: we need to specify how we tell users to fall back to earlier configurations in the list (to prevent the problem of all users using an expensive PT when they shouldn't technically need to); this could just be a bridgedb set counter for each configuration object.
This specification should be a good starting point, and we can tweak it as we run into implementation issues.
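As a rough client-side illustration of the request params in the proposal above, here is a minimal Python sketch. The helper name `build_settings_request` is hypothetical, and lowercasing the country code is an assumption based on the lowercase keys (cn, fr, us) in the map example; nothing here is the actual Tor Browser or BridgeDB code.

```python
import json
import re


def build_settings_request(types=None, country=None):
    """Build the JSON body for the proposed settings call.

    Both params are optional in the proposal, so each is omitted when unset.
    """
    body = {}
    if types:
        body["types"] = list(types)
    if country is not None:
        # The proposal asks for a 2-letter country code.
        if not re.fullmatch(r"[A-Za-z]{2}", country):
            raise ValueError("country must be a 2-letter code")
        # Assumption: codes are sent lowercase, as in the map example.
        body["country"] = country.lower()
    return json.dumps(body, sort_keys=True)


print(build_settings_request(["obfs4", "snowflake"], "CN"))
```

Leaving both arguments out yields an empty object, which matches the case where the server is expected to fall back to a geoip lookup.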
Great, thanks for making a proposal. I think I get the idea, and it makes sense. I just have a few questions about it:
I wonder whether it would be useful to provide the user's country location in the Circumvention Settings (or another API call). Might it be useful to provide some feedback to the user on what country TBB believes they are in?
I see the bridges entry in the Settings might contain bridge lines.
I'm a bit lost on the conversation about CAPTCHAs versus no CAPTCHAs; I think I missed some things from the last meeting on the topic. Do we provide that without asking for any CAPTCHA? Did we plan to have a different pool of bridges for this?
Do we still want to provide bridge lines if the bridge is not the first item in the list? I'm not sure how I feel about providing bridges all around when they might not be used if vanilla bridges will work fine, but it might not be an issue.
This implies that the Settings JSON format will not be the same as the Map country entries and will require a different parser. I don't think that is going to be much of an issue, but it bothers my OCD a bit.
What is the proxy entry for? Is there any case where a country will require a proxy configuration to connect to internet? Are we using any kind of socks/https proxies as circumvention mechanism in TBB?
I guess the Map will not contain countries that don't need circumvention, will it? For the Settings, will we reply with an empty list if we believe no circumvention mechanism is needed?
In the previously proposed Circumvention Map format we had other things besides bridges, like the reachability of different websites or the dirauths. I understand that for this use case TBB might not care about it, but I wonder if we might want to keep the country entries in the Map as a JSON object where the bridge list is just one entry, so we can extend it in the future for other use cases. Or maybe we should not overengineer it for use cases we don't really know.
I don't think I understand your comment:
TODO: we need to specify how we tell users to fall back to earlier configurations in the list (to prevent the problem of all users using an expensive PT when they shouldn't technically need to); this could just be a bridgedb set counter for each configuration object.
AFAIK in the Circumvention Settings we are providing a list of PTs, but from bridgedb we don't know what is actually being used. We can guess it will be the first one, but if this is the case I'm not sure what "earlier configuration in the list" means.
I wonder whether it would be useful to provide the user's country location in the Circumvention Settings (or another API call). Might it be useful to provide some feedback to the user on what country TBB believes they are in?
Our original design was two API calls, first to find geolocation and then second to request settings for a country-code. Later we figured there was not much point, since BridgeDB can figure out the caller's location through the X-Forwarded-For header, so we reduced it to a single request. We were also somewhat hesitant to directly expose 'hey, you are in country X' to users since it may confuse/scare them (ping @duncan ).
Maybe the 'Get Circumvention Settings' call can have an optional arg, the country code, which Moat will use if provided; otherwise it does the geoip lookup based on the X-Forwarded-For header?
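The "explicit country wins, otherwise geoip" rule could be sketched on the server side roughly as follows. This is only an illustration in Python: `resolve_country` and the `geoip_lookup` callable are hypothetical names, and taking the first address in X-Forwarded-For is an assumption about how the fronting proxy fills in that header.

```python
def resolve_country(params, headers, geoip_lookup):
    """Pick the country code for a settings request.

    params:       decoded request params (may contain "country")
    headers:      request headers as a dict
    geoip_lookup: hypothetical callable mapping an IP string to a
                  2-letter country code, or None when unknown
    """
    country = params.get("country")
    if country:
        # The client asked for a specific country: no lookup needed.
        return country.lower()
    # Assumption: the first entry in X-Forwarded-For is the client address.
    forwarded = headers.get("X-Forwarded-For", "")
    client_ip = forwarded.split(",")[0].strip()
    if not client_ip:
        return None
    return geoip_lookup(client_ip)


# Toy lookup table standing in for a real geoip database.
lookup = {"203.0.113.7": "tm"}.get
print(resolve_country({}, {"X-Forwarded-For": "203.0.113.7"}, lookup))
```

A `None` result is the case where the server cannot determine a country and the client would need to ask the user.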
I'm a bit lost on the conversation about CAPTCHAs versus no CAPTCHAs; I think I missed some things from the last meeting on the topic. Do we provide that without asking for any CAPTCHA? Did we plan to have a different pool of bridges for this?
I believe the current plan is to not use CAPTCHAs for this request, and to keep CAPTCHAs for the existing fetch/check APIs so we can do some science to determine if the CAPTCHA-protected bridges are enumerated at a different rate from the unprotected ones ( ping @cohosh ).
Do we still want to provide bridge lines if the bridge is not the first item in the list? I'm not sure how I feel about providing bridges all around when they might not be used if vanilla bridges will work fine, but it might not be an issue.
I'm not sure I follow, but I don't see any problem also providing 'vanilla' bridge lines as well.
This implies that the Settings JSON format will not be the same as the Map country entries and will require a different parser. I don't think that is going to be much of an issue, but it bothers my OCD a bit.
Yeah I see your point. The difference is kind of a side-effect of the current about:preferences#tor design and the legacy settings backend implementation. We could side-step this problem entirely by just always returning bridge-lines, even in the case where a tor-browser user has them already in their builtin list.
What is the proxy entry for? Is there any case where a country will require a proxy configuration to connect to internet? Are we using any kind of socks/https proxies as circumvention mechanism in TBB?
I'm open to dropping this; it's only included because we have it on the about:preferences#tor page. My logic was that we're better off having it and never using it than suddenly needing it and not having it. Though now that I think about it, if the user does require proxy settings and doesn't have them, then they wouldn't be able to access moat in the first place.
So yeah, drop this (and I think the same argument could apply for Firewall's allowed_ports as well).
I guess the Map will not contain countries that don't need circumvention, will it? For the Settings, will we reply with an empty list if we believe no circumvention mechanism is needed?
That sounds right.
In the previously proposed Circumvention Map format we had other things besides bridges, like the reachability of different websites or the dirauths. I understand that for this use case TBB might not care about it, but I wonder if we might want to keep the country entries in the Map as a JSON object where the bridge list is just one entry, so we can extend it in the future for other use cases. Or maybe we should not overengineer it for use cases we don't really know.
I think for now we just leave it with what we know we need; we can always add more JSON entries w/o breaking backwards compatibility.
I don't think I understand your comment:
TODO: we need to specify how we tell users to fall back to earlier configurations in the list (to prevent the problem of all users using an expensive PT when they shouldn't technically need to); this could just be a bridgedb set counter for each configuration object.
AFAIK in the Circumvention Settings we are providing a list of PTs, but from bridgedb we don't know what is actually being used. We can guess it will be the first one, but if this is the case I'm not sure what "earlier configuration in the list" means.
So from discussion with @arma last week, we want to avoid a situation where a large block of users switch to an expensive PT and then keep using it (whether because of a bug or a temporary situation). We want to provide an automatic way out to different alternatives. I propose that when a user requests circumvention settings, we return an array of possible configurations, sorted in the order in which we would prefer they use them (e.g. snowflake, then vanilla, then obfs4, then meek-azure).
We need to specify how users move from the more 'hardcore' (and expensive) configurations out to cheaper methods. I was suggesting that along with each set of bridges, we could also define a counter that specifies how often users use a configuration before moving to earlier entries in the list.
Maybe users which auto-connect just always request valid configs, and then try to use each config in the list every time they bootstrap?
I was a bit confused because in your proposal for the Circumvention Settings it says:
We return an Array of these objects to provide end-users with alternatives
But after reading your answers I guess your proposed JSON for now provides a single bridge type on each request (or none if not needed), and that bridge will be the first one in the list of bridge types for the required country in the Circumvention Map. Is that correct?
In the new version of the API it looks like the Circumvention Settings only includes bridge_strings, is that correct? Can we express any kind of bridge type as a bridge_string? I'm not sure, for example, whether we already have any kind of definition for snowflake, but I guess we can do that.
With the update we have lost the content of each bridge entry in Circumvention Map. I don't remember what was there besides the type.
I wonder whether it would be useful to provide the user's country location in the Circumvention Settings (or another API call). Might it be useful to provide some feedback to the user on what country TBB believes they are in?
Our original design was two API calls, first to find geolocation and then second to request settings for a country-code. Later we figured there was not much point, since BridgeDB can figure out the caller's location through the X-Forwarded-For header, so we reduced it to a single request. We were also somewhat hesitant to directly expose 'hey, you are in country X' to users since it may confuse/scare them (ping @duncan ).
I see the point that it could be scary for users to see that we know where they are. My reasoning is that we might not have detected the country correctly, and it would be useful for the user to be able to notice that and change it. But I will leave that to @duncan's opinion.
Maybe the 'Get Circumvention Settings' call can have an optional arg, the country code, which Moat will use if provided; otherwise it does the geoip lookup based on the X-Forwarded-For header?
That will not be hard to do. I thought TBB would use the Circumvention Map when the user provides the country, and would not request the Circumvention Settings in that case. I'm OK with anything that will be useful for you.
In the previously proposed Circumvention Map format we had other things besides bridges, like the reachability of different websites or the dirauths. I understand that for this use case TBB might not care about it, but I wonder if we might want to keep the country entries in the Map as a JSON object where the bridge list is just one entry, so we can extend it in the future for other use cases. Or maybe we should not overengineer it for use cases we don't really know.
I think for now we just leave it with what we know we need; we can always add more JSON entries w/o breaking backwards compatibility.
Right now each country entry is an array, I don't see an easy way to extend it keeping the backward compatibility. What I was saying is that this will be easier to extend:
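A hypothetical illustration of the difference between the two shapes, in Python: with a bare array per country there is nowhere to hang extra per-country data, while an object with a `settings` member leaves room for sibling keys. The `dirauths_reachable` key and the `country_settings` helper are invented for this sketch and are not part of any proposal in this thread.

```python
def country_settings(entry):
    """Return the list of settings for one country entry, in either shape."""
    if isinstance(entry, list):
        # Array form: the entry *is* the list of settings.
        return entry
    # Object form: the list lives under "settings"; other keys can be added
    # later and are simply ignored by parsers that don't know them.
    return entry.get("settings", [])


array_form = {"cn": [{"bridges": {"type": "obfs4"}}]}
object_form = {
    "cn": {
        "settings": [{"bridges": {"type": "obfs4"}}],
        "dirauths_reachable": False,  # hypothetical extra field
    }
}

print(country_settings(array_form["cn"]) == country_settings(object_form["cn"]))
```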
So from discussion with @arma last week, we want to avoid a situation where a large block of users switch to an expensive PT and then keep using it (whether because of a bug or a temporary situation). We want to provide an automatic way out to different alternatives. I propose that when a user requests circumvention settings, we return an array of possible configurations, sorted in the order in which we would prefer they use them (e.g. snowflake, then vanilla, then obfs4, then meek-azure).
We need to specify how users move from the more 'hardcore' (and expensive) configurations out to cheaper methods. I was suggesting that along with each set of bridges, we could also define a counter that specifies how often users use a configuration before moving to earlier entries in the list.
Maybe users which auto-connect just always request valid configs, and then try to use each config in the list every time they bootstrap?
How is bridgedb going to know how often users use a configuration? We could count how many times we have provided a certain type of bridge as a first option in the Circumvention Settings, but bridgedb (currently) doesn't have any idea of the load that each kind of bridge network has or can sustain.
I'm not sure I have any good proposal on how to make it clever. For now I would keep it simple and just make the Circumvention Settings give bridges using the sorting of the Circumvention Map, so we always give the type that is first for this country (or provide a list sorted in the same order as the Circumvention Map). So if we detect a problem we can at least manually change the Circumvention Map, which will be provided to bridgedb as a static file. And let's see from the usage whether we need it to be more clever or not.
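The "keep it simple" approach, where the settings answer is just the country's map entry in map order, could look roughly like this in Python. The function name `settings_for` is hypothetical, and the `type` key inside `bridges` is assumed from the examples elsewhere in this thread.

```python
def settings_for(cmap, country, transports=None):
    """Return the settings for a country in Circumvention Map order.

    cmap:       dict mapping country codes to ordered lists of settings
    transports: optional list of transport types the client supports;
                when given, entries of other types are dropped while the
                map's ordering is preserved
    """
    entries = cmap.get(country, [])
    if not transports:
        return list(entries)
    return [
        entry for entry in entries
        if entry.get("bridges", {}).get("type") in transports
    ]


cmap = {
    "tm": [
        {"bridges": {"type": "obfs4"}},
        {"bridges": {"type": "snowflake"}},
    ],
}
print(settings_for(cmap, "tm", ["snowflake"]))
```

Because the ordering lives only in the map, editing that static file is enough to change what every client is offered first.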
In the previously proposed Circumvention Map format we had other things besides bridges, like the reachability of different websites or the dirauths. I understand that for this use case TBB might not care about it, but I wonder if we might want to keep the country entries in the Map as a JSON object where the bridge list is just one entry, so we can extend it in the future for other use cases. Or maybe we should not overengineer it for use cases we don't really know.
I think for now we just leave it with what we know we need; we can always add more JSON entries w/o breaking backwards compatibility.
Right now each country entry is an array, I don't see an easy way to extend it keeping the backward compatibility. What I was saying is that this will be easier to extend:
Let's not worry about backwards compatibility. The current state of the censorship map is not used by anything. If the format isn't useful to us for this, we should have no qualms about changing it.
I'm a bit lost on the conversation about CAPTCHAs versus no CAPTCHAs; I think I missed some things from the last meeting on the topic. Do we provide that without asking for any CAPTCHA? Did we plan to have a different pool of bridges for this?
I believe the current plan is to not use CAPTCHAs for this request, and to keep CAPTCHAs for the existing fetch/check APIs so we can do some science to determine if the CAPTCHA-protected bridges are enumerated at a different rate from the unprotected ones ( ping @cohosh ).
We've been having a few conversations about whether or not to keep using CAPTCHAs and I believe the solution was to "do science" and separate out a different pool of bridges that we'll serve without CAPTCHAs over moat. I think we are not yet sure of the details, but I'd suggest serving the bridges without a CAPTCHA by default and then if the user still can't connect (and/or if their country code matches a place we know these bridges are enumerated), then we fall back to giving CAPTCHAs.
Not sure if this matches well with what @richard has planned. We can also tackle this later as a next step to this work.
An "easy" option would be to have a new pool for the Circumvention Settings requests, as TBB will get bridge lines directly when asking for Circumvention Settings (without going through the existing moat bridge request).
And have another pool for normal moat requests with CAPTCHAs. So if a user asks manually for a bridge, they will get a CAPTCHA and hopefully the bridge will have less chance of being blocked. And if the anti-censorship mechanism is configured automatically, there will be no CAPTCHAs.
Our original design was two API calls, first to find geolocation and then second to request settings for a country-code. Later we figured there was not much point, since BridgeDB can figure out the caller's location through the X-Forwarded-For header, so we reduced it to a single request. We were also somewhat hesitant to directly expose 'hey, you are in country X' to users since it may confuse/scare them (ping @duncan ).
I see the point that it could be scary for users to see that we know where they are. My reasoning is that we might not have detected the country correctly, and it would be useful for the user to be able to notice that and change it. But I will leave that to @duncan's opinion.
Noted, I'll have a think about this when I'm updating the UI/flows. What degree of accuracy can we reasonably expect here? I'm assuming the use of VPNs/other proxies would interfere, correct?
Yes, if people use VPN/proxy we'll see their country as the one of the exit of the VPN/proxy. But this is ok, as the anticensorship measures they will need are for the place of the VPN/proxy and not for their actual location.
The problem that worries me here is that we'll use a database that matches IP addresses to countries, but those databases are not perfect and sometimes IPs are reported as being in one country when the person is actually somewhere else. I have no idea how often we might hit that, and maybe we don't need to worry about it. We could ignore this problem for now and see if it appears in reality.
After a short talk on IRC with @cohosh we decided to implement it as an rdsys distributor. We'll not be able to provide actual bridge lines in this service until we finish the bridgedb move to use rdsys (rdsys#12 (closed)).
But I will try to have an implementation soonish providing dummy bridges until we can provide proper ones.
I would say don't worry about putting any effort into providing intermediate 'mock' APIs before implementing the real thing in rdsys; if we get to the point with the UX where we need to make these calls, I can just as easily do the mocks locally within tor browser.
Don't worry about the mock; providing the default bridges will be easy to do, and most of my work will be focused on the functionality that we actually need. I would like to be sure that what we produce in our API is what TBB expects and that it makes sense.
They accept the parameters you proposed and the output is pretty close to the proposal (I made a few changes). I can't provide real distributed bridges until I create a proper pool for it (rdsys#60 (closed)). Because of that they provide built-in bridge lines for both built-in and distributed needs. There are no built-in vanilla bridges, so it will not provide bridge lines for vanilla bridges.
What do you think of the JSON format? Anything I'm missing there? Anything that can be improved?
In the circumvention map (rdsys#64 (closed)) for now I'm doing the following priority order:
distributed vanilla bridges
built-in obfs4 bridges
distributed obfs4 bridges
snowflake
Might it make more sense to prioritize always obfs4? Do we want to prioritize distributed on top of built-in?
Ok we need a slight tweak to the circumvention/settings API. Can we have the returned object contain an array of settings configurations, in the order of the user's preferred transports? For instance, if we make a settings call with the following request object:
{
  transports: ["obfs4", "snowflake"],
  country: "zz"
}
could we get back (for example) an object in this or similar format and contain an array of configurations:
{
  settings: [
    {
      bridges: {
        source: "builtin", // tor-browser currently has built-in, bridgedb, and user-provided
        type: "obfs4",
      }
    },
    {
      bridges: {
        source: "bridgedb", // or some other bucket/source for bridges not behind the captcha?
        type: "obfs4",
        bridge_strings: ["obfs4 blah blah blah", "obfs4 other blah blah blah", ...],
      }
    },
    {
      bridges: {
        source: "builtin",
        type: "snowflake"
      },
      // eg if in a country tor only works on certain ports somehow
      firewall: {
        allowed_ports: [80, 443],
      },
      // this format allows us to add more members to our settings objects in the future if necessary
    }
  ]
}
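A client consuming a response of this shape might walk the `settings` array like this. It is a minimal Python sketch that assumes the wire format is strict JSON with quoted keys (the example above is written loosely) and that the field names stay as proposed; `parse_settings` is a hypothetical helper.

```python
import json


def parse_settings(text):
    """Extract (source, type, bridge_strings) for each configuration, in order."""
    data = json.loads(text)
    configs = []
    for cfg in data.get("settings", []):
        bridges = cfg.get("bridges", {})
        configs.append((
            bridges.get("source"),
            bridges.get("type"),
            # Built-in entries carry no bridge_strings in the proposal.
            bridges.get("bridge_strings", []),
        ))
    return configs


response = """
{"settings": [
  {"bridges": {"source": "builtin", "type": "obfs4"}},
  {"bridges": {"source": "bridgedb", "type": "obfs4",
               "bridge_strings": ["obfs4 blah blah blah"]}}
]}
"""
for source, kind, lines in parse_settings(response):
    print(source, kind, len(lines))
```

Unknown members such as `firewall` are ignored here, which is what lets the format grow later without breaking older parsers.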
I hope you don't mind me butting in. I'm working on the OnionShare desktop S96 changes to interact with this censorship circumvention API - see some early work here . These methods will be used as part of mimicking Tor Browser's 'Attempt automatic connection' flow in OnionShare (work for which has begun here).
I had some questions:
There is talk here about being able to provide the country code as a request parameter to the /settings endpoint (e.g, to override the geo-ip lookup from the X-Forwarded-For header). That will be super useful for me to test the functionality with returned json data, since I'm not currently in a country that is likely to require censorship settings, and therefore right now I'm just getting {} back from the endpoint. Also we'll then be able to write our unit tests if there is such a request param.
Is the domain fronting available to reach this API in a safe way? If so can anyone tell me how to use it? I dug around in Gitlab and there's talk here and there (e.g #16650 (closed)) about the API being domain fronted but I couldn't find working params. It's possible it's just not done yet. If so, sorry again if I'm jumping the gun :) Separate teams, same goals..
Thanks for any help and for the work done on this so far!
I hope you don't mind me butting in. I'm working on the OnionShare desktop S96 changes to interact with this censorship circumvention API - see some early work here . These methods will be used as part of mimicking Tor Browser's 'Attempt automatic connection' flow in OnionShare (work for which has begun here).
Amazing, thanks for joining the conversation.
There is talk here about being able to provide the country code as a request parameter to the /settings endpoint (e.g, to override the geo-ip lookup from the X-Forwarded-For header). That will be super useful for me to test the functionality with returned json data, since I'm not currently in a country that is likely to require censorship settings, and therefore right now I'm just getting {} back from the endpoint. Also we'll then be able to write our unit tests if there is such a request param.
country, as you already figured out, is for providing a single country to look up.
transports lets you tell it which types of transport you support, so it will not return configurations for the ones you don't support. If you don't provide it, it will reply with the full list of supported transports for this country.
Take into account that the bridge lines provided by /settings are always builtin bridges; I haven't connected it yet to the actual bridgedb to distribute proper unique bridges. This will still take one or two months.
Is the domain fronting available to reach this API in a safe way? If so can anyone tell me how to use it? I dug around in Gitlab and there's talk here and there (e.g #16650 (closed)) about the API being domain fronted but I couldn't find working params. It's possible it's just not done yet. If so, sorry again if I'm jumping the gun :) Separate teams, same goals..
404 when this country needs transports but none of the ones provided in the request will work:
$ curl -d '{"country": "cn", "transports": ["obfs4"]}' https://bridges.torproject.org/moat/circumvention/settings
{
  "errors": [
    {
      "code": 404,
      "detail": "No provided transport is available for this country"
    }
  ]
}
406 when the server can't determine the country of the IP address
Oops, I didn't explain it correctly. I know it's weird, but it has some historical reasons. The HTTP status code is always 200 (unless there is a server error, which will produce a 500), but the reply will be a JSON document with an "errors" array, each entry carrying an integer "code".
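Since the HTTP status stays 200, a client has to look inside the body to notice a failure. A minimal Python sketch of the curl call and the check follows; the helper names are hypothetical, the request is only built and not sent, and a plain `urllib` request like this would not go through the domain-fronted meek tunnel.

```python
import json
import urllib.request

SETTINGS_URL = "https://bridges.torproject.org/moat/circumvention/settings"


def make_request(country, transports):
    """Build (but do not send) a POST equivalent to the curl example."""
    body = json.dumps({"country": country, "transports": transports})
    # Passing data makes urllib issue a POST, like `curl -d`.
    return urllib.request.Request(SETTINGS_URL, data=body.encode("utf-8"))


def first_error(body_text):
    """Return (code, detail) from the JSON "errors" array, or None if absent."""
    data = json.loads(body_text)
    errors = data.get("errors") or []
    if not errors:
        return None
    return errors[0]["code"], errors[0].get("detail", "")


reply = '{"errors": [{"code": 404, "detail": "No provided transport is available for this country"}]}'
print(first_error(reply))
```

Checking `first_error` before reading any settings avoids mistaking an error reply for "no circumvention needed".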
@meskio @richard hello again! I hope it's ok to continue to ask some more questions here about the Censorship API as we continue to implement it in OnionShare.
We have been making great progress with using the new Censorship Circumvention API routes for obtaining bridges based on country, built-in bridges, etc.
We even have a PoC that, upon native Tor connection failure, uses Meek domain-fronting to fetch bridges from https://bridges.torproject.org/moat/circumvention/settings (currently hardcoded to country:tm) and reconnects successfully using those bridges. The PoC lacks the user-initiated consent to do this (that's also being worked on) but it was a great milestone to get the real core functionality of all this working end to end.
We have also implemented a feature in OnionShare that now uses your /moat/circumvention/builtin to automatically update OnionShare's knowledge of the 'built-in' bridges if the user chooses to use those (or if the response from /moat/circumvention/settings recommends to use built-in bridges). Previously, like Tor Browser, we relied upon hardcoded copies of the built-in bridges, which go out of date.
So it's all coming together nicely.
I have two questions:
We are implementing the scenario whereby 'Allow us to access your location in order to obtain the right bridges' fails to determine the location, and the user is offered a fallback to specify their country.
I understand that if the 'country' POST param is not sent to /moat/circumvention/settings, your API performs a GeoIP lookup of the requestor's IP address in order to return settings for that country.
I was wondering what 'error' condition we should expect if, hypothetically, the geoip check fails to determine the country for the user? Should we expect an empty JSON response, similar to that of a country that has no censorship recommendations (like when I hit /moat/circumvention/settings from my Australian connection and get {} back)? Or should we look for an 'error' key in the output?
This will help guide us on how to detect when to fallback to 'please specify the country yourself' behavior.
/moat/circumvention/settings might return bridges with different types, for a country. For example specifying the country:tm currently returns both obfs4 and snowflake bridges.
Are we to expect any order of precedence between different bridge types or does the Tor Project think it isn't important? And if the country 'tm' returns obfs4 and snowflake bridges, should we attempt connecting with the obfs4 bridges and retry with the snowflake bridges if the obfs4 bridges don't make it through? Or is it ok to just try one of them and not the other?
I'm interested really to know what Tor Browser will be doing in this scenario, e.g if you'll be 'falling back' through a 'cascading' set of different bridge types or not depending on what works. Wherever possible, we obviously want to match that behavior in OnionShare so it's a consistent experience across apps.
I was wondering what 'error' condition we should expect if, hypothetically, the geoip check fails to determine the country for the user? Should we expect an empty JSON response, similar to that of a country that has no censorship recommendations (like when I hit /moat/circumvention/settings from my Australian connection and get {} back)? Or should we look for an 'error' key in the output?
In case of an error you will get a json with the format I described here: #40025 (comment 2755309)
For not being able to determine the location the json will be like:
{"errors":[{"code":406,"detail":"Could not find country code for circumvention settings"}]}
This will help guide us on how to detect when to fall back to the 'please specify the country yourself' behavior.
Sounds good. Take into account that even if we don't have any circumvention settings for your country, that doesn't mean you don't need any: you might be on a network that blocks access to Tor (my local uni does), or our circumvention settings might be wrong. Will OnionShare ask the user for manual settings and/or recommend default bridges if it gets a {} from the settings?
Are we to expect any order of precedence between different bridge types or does the Tor Project think it isn't important? And if the country 'tm' returns obfs4 and snowflake bridges, should we attempt connecting with the obfs4 bridges and retry with the snowflake bridges if the obfs4 bridges don't make it through? Or is it ok to just try one of them and not the other?
The circumvention settings will return every known working circumvention mechanism for the country. They will be sorted by what we think will be best to use, so I think it makes sense to use the first option. I have some doubts about falling back to the next one without asking the user, as trying a lot of PTs might be easily noticeable if your ISP/government represses you for using Tor. But I guess if the user has agreed to 'automatically connect to Tor' or something like that, it should be fine. I'm not sure how TB is going to treat this.
@duncan @richard what is the plan if the first circumvention setting fails in TB? Will TB just try the next one?
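To make the ordering question concrete, here is a minimal Python sketch of one way a client could consume the sorted settings: try the first entry, and only move on to later entries if the user has agreed to automatic connection. The entry shape follows the responses discussed in this ticket; `connect_with_settings`, `try_connect`, and `allow_fallback` are hypothetical names for illustration, not part of any Tor Browser or OnionShare API.

```python
from typing import Callable, Iterable, Optional


def connect_with_settings(
    settings: Iterable[dict],
    try_connect: Callable[[dict], bool],
    allow_fallback: bool,
) -> Optional[dict]:
    """Return the first settings entry that connects, or None.

    `settings` is assumed to be ordered best-first, as described above.
    `try_connect` is a caller-supplied (hypothetical) function that attempts
    to bootstrap Tor with one entry and reports success.
    """
    for index, entry in enumerate(settings):
        # Without user consent, stop after the first (preferred) option so
        # we do not probe the network with several pluggable transports.
        if index > 0 and not allow_fallback:
            break
        if try_connect(entry):
            return entry
    return None
```

With `allow_fallback=False` the function never tries more than one transport, which matches the more cautious behaviour described above; whether Tor Browser cascades is still an open question in this thread.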
Thanks @meskio ! I had forgotten you'd already told me about the 406 error.. thanks for the refresher.
Take into account that even if we don't have any circumvention settings for your country, that doesn't mean you don't need any: you might be on a network that blocks access to Tor (my local uni does), or our circumvention settings might be wrong. Will OnionShare ask the user for manual settings and/or recommend default bridges if it gets a {} from the settings?
Yeah, in our scenario, we are only beginning this whole process of hitting the API if we already failed to connect to Tor.
We hit the Tor API using Meek domain fronting, so hopefully we are already addressing the most common scenario, in which the Tor API itself is blocked (domain fronting should make it hard to block without blocking the CDN front).
If we still get a {} or any other HTTP error (other than the 406, where we'll let the user choose their country), we'll probably give up, and open our Tor Settings tab, which lets the user manually enter bridges, or choose built-in bridges, or request bridges from BridgeDB (which also uses Meek domain-fronting), etc.
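The decision flow described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the error format is the `{"errors": [...]}` JSON quoted earlier in this ticket, and the sketch looks for the 406 code inside the JSON body because it is not stated here whether the HTTP status line carries it too. `NextStep` and `decide_next_step` are hypothetical names, not OnionShare code.

```python
import json
from enum import Enum


class NextStep(Enum):
    USE_SETTINGS = "use_settings"            # try the returned settings
    ASK_COUNTRY = "ask_country"              # let the user pick a country
    OPEN_TOR_SETTINGS = "open_tor_settings"  # give up, show manual settings


def decide_next_step(http_status: int, body: str) -> NextStep:
    """Map a /moat/circumvention/settings response to the next UI step."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return NextStep.OPEN_TOR_SETTINGS

    # Error responses look like {"errors": [{"code": 406, "detail": "..."}]}.
    if isinstance(data, dict) and isinstance(data.get("errors"), list):
        codes = {e.get("code") for e in data["errors"] if isinstance(e, dict)}
        if 406 in codes:
            # GeoIP lookup failed: fall back to a manual country choice.
            return NextStep.ASK_COUNTRY
        return NextStep.OPEN_TOR_SETTINGS

    # An empty {} means no recommendations are known for this country.
    if http_status != 200 or not data:
        return NextStep.OPEN_TOR_SETTINGS
    return NextStep.USE_SETTINGS
```

Treating `{}` the same as a hard error mirrors the plan above of opening the Tor Settings tab; a client that prefers to suggest default bridges in that case would branch differently.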
I found some confusing things when debugging autobootstrap. The following conversation from today clarifies the APIs described above:
<richard> meskio: sorry, what's the expected behaviour of the circumvention_settings and circumvention_map calls?
<richard> should they always be returning bridge_strings?
<meskio> richard: the circumvention map will never return bridgestrings, just the static json map
<meskio> the settings will return bridge_strings if the circumvention map claims that those will be useful in the country
<richard> so if we need to set a bridge to circumvent censorship, circumvention_settings will always return them?
<richard> but the map just says which bucket said brdiges woudl come from
<richard> ?
<meskio> exactly
<meskio> that can be changed if needed, but that is the currently behaviour
<richard> ok
<richard> so othe question i have then
<richard> for by we are currently getting back:
"[ { \"bridges\": { \"type\": \"obfs4\", \"source\": \"builtin\", \"bridge_strings\": [ \"obfs4 38.229.1.78:80 C8CBDB2464FC9804A69531437BCF2BE31FDD2EE4 cert=Hmyfd2ev46gGY7NoVxA9ngrPF2zCZtzskRTzoWXbxNkzeVnGFPWmrTtILRyqCTjHR+s9dg iat-mode=1\", \"obfs4 146.57.248.225:22 10A6CD36A537FCE513A322361547444B393989F0 cert=K1gDtDAIcUfeLqbstggjIw2rtgIKqdIhUlHp82XRqNSq/mtAjp1BIC9vHKJ2FAEpGssTPw iat-mode=0\", \"obfs4 37.218.245.14:38224 D9A82D2F9C2F65A18407B1D2B764F130847F8B5D cert=bjRaMrr1BRiAW8IE9U5z27fQaYgOhX1UCmOpg2pFpoMvo6ZgQMzLsaTzzQNTlm7hNcb+Sg iat-mode=0\" ] } }, { \"bridges\": { \"type\": \"vanilla\", \"source\": \"bridgedb\" } }, { \"bridges\": { \"type\": \"obfs4\", \"source\": \"bridgedb\", \"bridge_strings\": [ \"obfs4 38.229.1.78:80 C8CBDB2464FC9804A69531437BCF2BE31FDD2EE4 cert=Hmyfd2ev46gGY7NoVxA9ngrPF2zCZtzskRTzoWXbxNkzeVnGFPWmrTtILRyqCTjHR+s9dg iat-mode=1\", \"obfs4 146.57.248.225:22 10A6CD36A537FCE513A322361547444B393989F0 cert=K1gDtDAIcUfeLqbstggjIw2rtgIKqdIhUlHp82XRqNSq/mtAjp1BIC9vHKJ2FAEpGssTPw iat-mode=0\", \"obfs4 37.218.245.14:38224 D9A82D2F9C2F65A18407B1D2B764F130847F8B5D cert=bjRaMrr1BRiAW8IE9U5z27fQaYgOhX1UCmOpg2pFpoMvo6ZgQMzLsaTzzQNTlm7hNcb+Sg iat-mode=0\" ] } }, { \"bridges\": { \"type\": \"snowflake\", \"source\": \"builtin\", \"bridge_strings\": [ \"snowflake 0.0.3.0:1 2B280B23E1107BB62ABFC40DDCC8824814F80A72\" ] } }]"
<richard> 'by' the country
<richard> what are we meant to do with the second entry?
<meskio> the current test implmenetation doesn't have any vanilla bridges
<meskio> that is why is not providing any bridgelines
<richard> and what does it mean for us that the 3rd entry has 'bridgedb' as a source?
<richard> ok that makes sense
<meskio> but once it goes into production it will provide bridge_strings for vanilla bridges if needed
<meskio> or we will remove the vanilla bridges from the circumvention map
<meskio> there is two kind of sources:
<meskio> builtin: means that is the builtin bridges that come with TB
<richard> mmhm
<meskio> bridgedb: means that those bridges come from bridgedb
<meskio> anyway for both cases bridge_lines are provided
<richard> ook thanks
<meskio> no prob
<richard> i was a bit worried looking at the debugger and seeing no bridge lines
<meskio> I should document it somewhere
<meskio> sorry, I should have removed the vanilla option from the testing deployment as we don't have bridges for
<richard> ok long-term
<richard> we have a lot of builtin obfs4 bridges
<richard> like 13 or 14 or something like now in tor-browser
<richard> will circumvention_settings always return the same set of working builtins?
<richard> if a country is accessible via obs4 for example
<richard> currently we get back 3 or 4 bridges from the test instnace
<richard> but once it's returning 'real' data, will we request circumvention_settings for by, get a list of obfs4, shoudl we then overwrite the existing list of builtin obfs4 bridges from the returned list?
<richard> for the country 'by' I mean
<richard> or shoudl we be treating the returned builtin obfs4, and our larger list of builtin obfs4 as separate entities
<meskio> mmm, not sure the implications of 'rewriting the list of builtin'
<meskio> is probably best to use the ones provided by the settings, as they might be more up to date than the TB ones
<meskio> do you think 3 bridges is too few? should we provide the full list there?
<meskio> AFAIK tor will just connect to the first one if it works and if not it will try the following one
<meskio> hopefully builtin bridges are stable enough that it should never fail the 3 of them
<richard> yeah, so this is primarily a Tor Browser implementation/abstraction problem
<richard> i guess the problem i want to avoid, is getting into an unrecoverable bad state
<richard> and tor-browser having no way of using 'builtin' bridges, but a separate set of 'builtin' bridges from the ones that are shipped with tor browser
<richard> which means we either need a second subset of builtin bridges enumeration, or we need to overwrite our builtins with the new ones
<richard> or option 3
<richard> i just ignore the provided bridge strings when the are builtin
<meskio> if you want to update the TB builtin bridges you should use the /builtin endpoint of the moat API
<meskio> no the circumvention settings
<richard> and just use our own list after querying the circumvention_builtin api
<meskio> I agree the settings bridges should not rewrite the TBs builtin bridges
<richard> yes yes exactly
<richard> ok
<richard> i think the api as-is is good, especially for our 3rd party partners
<richard> who don't have this legacy problem
<richard> they can just ask for bridges and use the ones as received
<richard> but tor-browser can be a bit smarter about it given that it also ships witha list of builtin bridges and needs to keep a consistent UI between autobootstrap and manual settings
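As a rough illustration of "option 3" from the conversation above, the following Python sketch resolves a settings entry to bridge lines: for `builtin` entries it ignores the returned `bridge_strings` and uses the client's own shipped list for that type, and for `bridgedb` entries it uses the lines as received. The field names come from the response quoted in the chat; `bridge_lines_for` and `shipped_builtins` are hypothetical names, and this is not how Tor Browser is actually implemented.

```python
from typing import Dict, List


def bridge_lines_for(entry: dict, shipped_builtins: Dict[str, List[str]]) -> List[str]:
    """Pick the bridge lines to use for one circumvention settings entry.

    `shipped_builtins` is an assumed mapping from transport type (e.g.
    "obfs4") to the builtin bridge lines the client already ships or has
    refreshed through the separate builtin endpoint.
    """
    bridges = entry.get("bridges", {})
    if bridges.get("source") == "builtin":
        # Keep a single notion of 'builtin': use our own list, not the
        # subset returned by the settings call.
        return list(shipped_builtins.get(bridges.get("type"), []))
    # For bridgedb-sourced entries, use the lines exactly as received.
    # Entries without bridge_strings (like the vanilla one above) yield [].
    return list(bridges.get("bridge_strings", []))
```

A third-party client without a shipped builtin list could skip the first branch entirely and always use `bridge_strings` as received, which is the simpler path suggested at the end of the conversation.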
@meskio: We're planning on using these APIs in the next alpha release (11.5a5) scheduled for March 15th 2022. Will they be ready for public consumption by then? Should I open another ticket to track?
@meskio is there anything we (OnionShare) need to know about it now being 'in production'? We kind of have been treating the existing URLs as already in production. Has anything changed such as URLs etc that we need to know about, or is everything the same in terms of consuming the API?