The **Cloudflare CAPTCHA Monitoring** project tracks how often Cloudflare-fronted webpages return CAPTCHAs to Tor clients. It does this by fetching webpages via both Tor and mainstream web browsers and comparing the results. The tests are repeated periodically to reveal patterns over time. The collected metadata, metrics, and results are analyzed and displayed on a dashboard to help understand how Cloudflare manipulates internet traffic and affects people's access to the internet.

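The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `FetchResult` record, the `CAPTCHA_MARKERS` strings, and both function names are assumptions made for this example, and the real detection logic may differ.

```python
from dataclasses import dataclass

# Assumed text markers for a CAPTCHA interstitial (hypothetical; the real
# project may use a different detection method).
CAPTCHA_MARKERS = ("captcha", "attention required")


@dataclass
class FetchResult:
    """One fetch of a webpage (hypothetical record layout)."""
    url: str
    via_tor: bool
    status: int
    body: str


def looks_like_captcha(result: FetchResult) -> bool:
    """Return True if the response body contains any assumed CAPTCHA marker."""
    text = result.body.lower()
    return any(marker in text for marker in CAPTCHA_MARKERS)


def captcha_rate(results) -> float:
    """Fraction of fetches that look like a CAPTCHA page (0.0 if empty)."""
    results = list(results)
    if not results:
        return 0.0
    return sum(looks_like_captcha(r) for r in results) / len(results)


def compare_captcha_rates(results) -> dict:
    """Compute the CAPTCHA rate separately for Tor and non-Tor fetches."""
    results = list(results)
    return {
        "tor": captcha_rate(r for r in results if r.via_tor),
        "direct": captcha_rate(r for r in results if not r.via_tor),
    }
```

Running `compare_captcha_rates` on each periodic batch of fetches would give one data point per batch, which is the kind of series a dashboard could plot over time.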
## Interesting places to visit

- [dashboard.captcha.wtf](https://dashboard.captcha.wtf) or [captchaufjq5m2i73up537pldaxnbp6rzcbdrzc7y5rlwtx3mwigznad.onion](http://captchaufjq5m2i73up537pldaxnbp6rzcbdrzc7y5rlwtx3mwigznad.onion/)
- [api.captcha.wtf](https://api.captcha.wtf/)

## Code

The codebase consists of four separate repositories, one for each of the project's four components.

- The code for the core of the project, which performs the measurements, is located in [this GitLab repository](https://gitlab.torproject.org/woswos/CAPTCHA-Monitor).
- The code for the dashboard that visualizes the results is located in [this GitLab repository](https://gitlab.torproject.org/woswos/CAPTCHA-Monitor-Dashboard).
- The code for the API is located in [this GitLab repository](https://gitlab.torproject.org/woswos/CAPTCHA-Monitor-API).
- The code for the websites fronted by Cloudflare is located in [this GitLab repository](https://gitlab.torproject.org/woswos/CAPTCHA-Monitor-Web).

## Documentation

- Core CAPTCHA Monitor code documentation: [Read the Docs page](https://captcha-monitor.readthedocs.io) (not up to date at the moment)
- Interactive API documentation: [api.captcha.wtf](https://api.captcha.wtf/)

## Dataset

You can view various visualizations of the collected data on the [dashboard](https://dashboard.captcha.wtf). If you prefer to access the raw data and conduct your own research, you can use the [API](https://api.captcha.wtf/) to fetch the data.

If you want a copy of the whole database, I would be very happy to share it; please [contact](https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/wikis/home#contact) me.

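Fetching raw data from the API could look roughly like the sketch below. Only the base URL comes from this page; the endpoint path and the query parameter used in the usage note are placeholders, not real routes, so check the interactive documentation at [api.captcha.wtf](https://api.captcha.wtf/) for the actual ones. The helper names `build_url` and `fetch_json` are also made up for this example.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Base URL taken from this page; everything below it is an assumption.
API_BASE = "https://api.captcha.wtf/"


def build_url(path, params=None, base=API_BASE):
    """Join the base URL with an endpoint path and optional query parameters."""
    url = urljoin(base, path.lstrip("/"))
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_json(path, params=None, timeout=30):
    """Perform a GET request and decode the response body as JSON."""
    with urlopen(build_url(path, params), timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_json("/hypothetical/measurements", {"limit": 10})` would request `https://api.captcha.wtf/hypothetical/measurements?limit=10`; that path and the `limit` parameter are placeholders for whatever the API documentation actually lists.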
## Detailed description

By design, Cloudflare alters the traffic between web servers and internet users. It modifies this traffic to protect Cloudflare-fronted web servers from attacks by users with malicious intentions. Although on the surface this looks like a good-faith practice to protect servers, it does more harm than good to millions of users. Cloudflare decides whether to block a user based on multiple factors, such as the visitor's IP address, the resources requested, the request payload and frequency, and customer-defined firewall rules ([source](https://web.archive.org/web/20200328165212/https://support.cloudflare.com/hc/en-us/articles/205177068-How-does-Cloudflare-work)). Cloudflare does not share the specifics of its decision-making mechanism, since the mechanism keeps changing over time and is not open source. However, this doesn't stop us from experimenting with the algorithm and understanding how it decides whether to block users.

... | ... | @@ -100,15 +106,15 @@ Please consider taking a look at the CAPTCHA Monitor Project's [Kanban board](ht |

- [ ] Split the codebase into more modular pieces that can be chained, create a pipeline
  - [X] CAPTCHA Monitor
  - [ ] A tool for organizing/compacting the data
  - [X] API
  - [X] Dashboard
- [ ] Brainstorm about new metrics to collect [In progress]
- [ ] Find more third-party websites to track [In progress]
- [ ] Submit a report to the Tor Research Safety Board
- [ ] Brainstorm the integration with OONI people
- [ ] Brainstorm the integration with Tor Metrics people [In progress]
- [X] Create an API for people to fetch data easily: [api.captcha.wtf](https://api.captcha.wtf/)
- [ ] Add a new endpoint to the API for performing measurements on user-provided websites

## Development

GeKo and arma are the mentors of this project, and I (woswos) am currently its only developer. I am developing it as part of the [Google Summer of Code program](https://summerofcode.withgoogle.com/projects/#5323380741373952).

... | ... | |