FAQs:
How to read the analyze_completed table?
- status_check column
status_check_value | Explanation In Code | In Simpler form |
---|---|---|
0 | Tor Blocked | Tor is discriminated |
1 | Site is blocked on tor and non-tor browsers | Tor is not discriminated. Both are blocked server-side |
2 | Tor is not blocked, rather non-tor browser is blocked | Tor is not discriminated :) |
Proceed for further tests.. |
- dom_analyze column
dom_analyze_value | Explanation In Code | In Simpler form |
---|---|---|
0 | Tor most probably Errors!! | Tor is discriminated |
1 | Resembles same | Tor is not discriminated |
2 | Survived Checklist but still doubt (Further modules might help) | Tor is not discriminated |
3 | Doubtful case!! checking for keywords... Tor Blocked : checklist!! | Tor is discriminated |
4 | Equal | Tor is not discriminated |
- captcha_checker column
captcha_checker_value | Explanation In Code | In Simpler form |
---|---|---|
0 | Same... | Tor is not discriminated |
1 | Captcha Present | Tor is discriminated |
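
The three tables above can be folded into a small lookup, which may help when reading a row programmatically. This is only a sketch: the column names and values come from the tables, but `DISCRIMINATED` and `interpret_row` are hypothetical names, and a row is assumed to be available as a plain dict.

```python
# Hypothetical lookup built from the value tables above.
# True means "Tor is discriminated", False means "Tor is not discriminated".
DISCRIMINATED = {
    "status_check": {0: True, 1: False, 2: False},
    "dom_analyze": {0: True, 1: False, 2: False, 3: True, 4: False},
    "captcha_checker": {0: False, 1: True},
}


def interpret_row(row):
    """Return {column: True/False/None} for one row of the table.

    None marks a missing column or a value the tables do not list.
    """
    verdicts = {}
    for column, mapping in DISCRIMINATED.items():
        verdicts[column] = mapping.get(row.get(column))
    return verdicts


example = {"status_check": 2, "dom_analyze": 3, "captcha_checker": 0}
print(interpret_row(example))
```

For the example row, only `dom_analyze` reports Tor as discriminated; the other two columns do not.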
Why have separate modules when the Consensus Lite Module is essentially similar to the DOM Analysis?
While creating the modules, I first decided to compare the structure of the HTML returned by the control node with that returned by the Tor node, and to draw conclusions from the comparison. For demo purposes this worked. Check it out here.
After tinkering further, I concluded that many websites serve content based on geo-location, so the same website fetched from two different countries may not look similar. Therefore, I added VPNs/proxies/web proxies in that same region to further reduce the scope for error, because they give us a hint about how a website might appear in that specific country/region. Of course, that only helps if our proxy isn't blocked :P
So there are cases where we don't want to call Consensus Lite, which makes the application a bit more optimized: for example, when the Tor and control nodes return exactly the same DOM node count, or at least a similar structure. Wikipedia is one such example. We don't need many more measurements here, because we assume our control node (non-Tor node) to be in an Autonomous System (AS) that doesn't censor or modify requests.
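
A minimal sketch of that shortcut, assuming the DOM node count can be approximated by counting opening tags with Python's standard `html.parser`. The function names and the `tolerance` threshold are hypothetical and not taken from the project:

```python
from html.parser import HTMLParser


class _TagCounter(HTMLParser):
    """Counts opening tags as a rough proxy for the DOM node count."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        self.count += 1


def count_dom_nodes(html):
    counter = _TagCounter()
    counter.feed(html)
    counter.close()
    return counter.count


def needs_consensus_lite(control_html, tor_html, tolerance=0.1):
    """Return True when the two pages differ enough to need a further check.

    `tolerance` is an assumed relative difference, not a project value.
    """
    control = count_dom_nodes(control_html)
    tor = count_dom_nodes(tor_html)
    if control == tor:
        return False  # identical counts: skip the extra measurement
    return abs(control - tor) / max(control, tor) > tolerance
```

When both nodes return pages with the same or nearly the same node count, `needs_consensus_lite` returns `False` and the additional requests can be skipped.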
Why call the Consensus Lite Module when the DOMs are very different (we already know that we got a different page)? Why not call it when the DOMs are similar instead?
Yes, that is correct: we already know we got a different page, but we don't know whether that page is a block page or simply the version of the site served in that particular country. Take Dominos (which might redirect you to a different URL) or similar websites that are mapped by geo-location. If our control (non-Tor) node is somewhere in Europe (Germany) and the Tor exit node is in the US, the two pages could differ considerably, or even be totally different. So the pages may be similar or not similar at all, but that alone doesn't tell us whether the site is blocking the Tor node!
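
The reasoning above can be sketched as a three-way comparison. This is an illustration under stated assumptions, not the project's implementation: it assumes DOM node counts are already available for the control node, the Tor exit node, and a proxy in the exit node's region, and the function names, labels, and `tolerance` value are all hypothetical.

```python
def relative_difference(a, b):
    """Relative gap between two node counts, 0.0 when both are zero."""
    largest = max(a, b)
    return 0.0 if largest == 0 else abs(a - b) / largest


def classify_tor_page(control_nodes, tor_nodes, regional_nodes, tolerance=0.1):
    """Classify the Tor response using a same-region proxy as reference.

    All three arguments are DOM node counts; `tolerance` is an assumed
    threshold for "similar enough".
    """
    if relative_difference(control_nodes, tor_nodes) <= tolerance:
        return "same page"  # nothing more to measure
    if relative_difference(regional_nodes, tor_nodes) <= tolerance:
        return "regional variant"  # differs from control, matches the region
    return "possibly blocked"  # differs from both references
```

A Tor page that differs from the control page but matches the regional reference is most likely a geo-located variant rather than a block page.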