Update GSoC 2021, authored by hackhard
Currently, the **Captcha Monitoring** project tracks how often webpages fronted by a CDN (e.g. Cloudflare, Akamai, Amazon CloudFront) return CAPTCHAs to Tor clients. Details can be found [here](https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/wikis/GSoC-2020-Home).
This is the wiki for the GSoC 2021 branch, which will track the Alexa Top 500 websites to give a detailed picture of which of these websites block Tor clients or return CAPTCHAs to them. The project does this by fetching webpages over a period of time from both Tor clients/browsers and non-Tor browsers and comparing the results. The collected results will then be used to answer the different metrics and to build an understanding of how websites block Tor and how that affects Internet freedom.
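As a rough illustration of how results collected over time could be turned into per-site metrics, here is a minimal sketch. It is not the project's code: the `summarize` function, the `(site, outcome)` tuple layout, and the outcome labels are assumptions made for this example only.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def summarize(observations: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Turn (site, outcome) pairs into per-site outcome fractions.

    Each pair is assumed to be the outcome of one Tor fetch, already
    compared against a non-Tor fetch of the same page.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for site, outcome in observations:
        counts[site][outcome] += 1
    return {
        site: {outcome: n / sum(c.values()) for outcome, n in c.items()}
        for site, c in counts.items()
    }


# Hypothetical observations gathered over several runs.
observations = [
    ("a.example", "OK"),
    ("a.example", "Captcha"),
    ("a.example", "Captcha"),
    ("a.example", "OK"),
    ("b.example", "Blocking"),
]
rates = summarize(observations)
print(rates["a.example"])
```

Keeping the raw observations and deriving the fractions afterwards makes it possible to recompute the metrics for any time window.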
@@ -46,12 +45,20 @@ Here we can see that the status codes are `200`, but still there is a clear diff
More details can be found [here](https://hackhard.github.io/my-blog//My-Approach-29-03).
## Nearby Goals and Roadmap:
- [ ] Check with a handful of websites:
  - [ ] Check if the website blocks Tor fully _(response errors or difference in status codes)_.
    - [ ] Add a script to continue checking or return `Blocking`.
  - [ ] Check if the website returns a `Captcha`.
    - [ ] Add a script to continue checking or return `Captcha`.
  - [ ] Check if the website redirects to another error page (without blocking it).
    - [ ] Add a script to continue checking or return `Error page`.
  - [ ] Check if the website allows anonymous visiting and works fine.
    - [ ] Run tests.
  - [ ] Check if the website allows limited functionality.
- [ ] Repeat the above with a random website.
  - [ ] If the above works well, run it for a larger number of websites.
  - [ ] Otherwise, check for the problem.
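
The checks in the roadmap above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Captcha Monitor's actual implementation: the `Response` class, the `classify` function, the marker strings, and the `Inconclusive` and `OK` labels are all hypothetical, and the "limited functionality" case is not covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    """Hypothetical summary of one fetch; not a CAPTCHA-Monitor type."""
    status: Optional[int]  # None when the request failed outright
    final_url: str         # URL after following redirects
    body: str


# Assumed marker strings; real CAPTCHA pages differ between CDNs.
CAPTCHA_MARKERS = ("captcha", "cf-challenge")


def has_captcha(body: str) -> bool:
    lowered = body.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)


def classify(tor: Response, control: Response) -> str:
    """Compare a Tor fetch against a non-Tor fetch of the same URL."""
    # If the non-Tor fetch fails too, Tor is not the cause.
    if control.status is None or control.status >= 400:
        return "Inconclusive"
    # Response error or differing status code: treat as full blocking.
    if tor.status is None or tor.status != control.status:
        return "Blocking"
    # Same status code, but only the Tor response contains a CAPTCHA.
    if has_captcha(tor.body) and not has_captcha(control.body):
        return "Captcha"
    # Same status code, but Tor ended up on a different page.
    if tor.final_url != control.final_url:
        return "Error page"
    return "OK"


normal = Response(200, "https://example.com/", "<h1>Welcome</h1>")
challenged = Response(200, "https://example.com/", "Please solve the CAPTCHA")
print(classify(challenged, normal))
```

The checks run in the same order as the roadmap items, so a response that matches an earlier check is labelled before the later ones are tried.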
## Posts :
Meanwhile, I'll be updating a blog with further details about the project, from the difficulties faced to the different approaches being taken, to help with understanding, documentation, and easy contribution.