The **Cloudflare CAPTCHA Monitoring** project aims to track how often Cloudflare-fronted webpages return CAPTCHAs to Tor clients. It does this by fetching webpages via both Tor and mainstream web browsers and comparing the results. The tests are repeated periodically to reveal patterns over time. Collected metadata, metrics, and results are analyzed and displayed on a dashboard to show how Cloudflare manipulates internet traffic and affects people's access to the internet.
The **CAPTCHA Monitoring** project aims to track how often webpages fronted by CDNs (e.g., Cloudflare, Akamai, Amazon CloudFront) return CAPTCHAs to Tor clients. It does this by fetching webpages via both Tor and mainstream web browsers and comparing the results. The tests are repeated periodically to reveal patterns over time. Collected metadata, metrics, and results are analyzed and displayed on a dashboard to show how CDN providers manipulate internet traffic and affect people's access to the internet.
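
The comparison step described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the `FetchResult` record, the `CAPTCHA_MARKERS` strings, and the category labels are all hypothetical names chosen for illustration, and real CAPTCHA detection would likely need to be more robust than a substring check.

```python
from dataclasses import dataclass

# Hypothetical marker strings (an assumption for this sketch); the project's
# real detection logic is not described in this page.
CAPTCHA_MARKERS = ("captcha", "challenge-form")


@dataclass
class FetchResult:
    client: str   # e.g. "tor" or "firefox"
    status: int   # HTTP status code of the response
    body: str     # response body as text


def has_captcha(result: FetchResult) -> bool:
    """Return True if any marker string appears in the response body."""
    body = result.body.lower()
    return any(marker in body for marker in CAPTCHA_MARKERS)


def compare(tor: FetchResult, control: FetchResult) -> str:
    """Classify one URL by comparing the Tor fetch with a control fetch."""
    tor_captcha = has_captcha(tor)
    control_captcha = has_captcha(control)
    if tor_captcha and not control_captcha:
        return "tor_only_captcha"
    if tor_captcha and control_captcha:
        return "captcha_for_both"
    if control_captcha:
        return "control_only_captcha"
    return "no_captcha"
```

Running `compare` on the same URL at regular intervals, and storing the label together with a timestamp, would give the kind of time series that a dashboard can plot.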
## Interesting places to visit
- [dashboard.captcha.wtf](https://dashboard.captcha.wtf) or [captchaufjq5m2i73up537pldaxnbp6rzcbdrzc7y5rlwtx3mwigznad.onion](http://captchaufjq5m2i73up537pldaxnbp6rzcbdrzc7y5rlwtx3mwigznad.onion/)
... | ... | @@ -21,6 +21,7 @@ You can view various visualizations of the collected data on the [dashboard](htt |
If you want a copy of the whole database, I would be very happy to share it; please [contact](https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/wikis/home#contact) me.
## Detailed description
_(Keep in mind that the project was focused only on Cloudflare initially, and later expanded to tracking other CDN providers as well.)_
By design, Cloudflare alters the traffic between web servers and internet users. It modifies this traffic to protect Cloudflare-fronted web servers from attacks by malicious users. Although this looks like a good-faith practice for protecting servers, it does more harm than good for millions of users. Cloudflare decides whether or not to block a user based on multiple factors, such as the visitor's IP address, the resources requested, the request payload and frequency, and customer-defined firewall rules ([Source](https://web.archive.org/web/20200328165212/https://support.cloudflare.com/hc/en-us/articles/205177068-How-does-Cloudflare-work)). Cloudflare does not share the specifics of its decision-making mechanism, since the mechanism keeps changing over time and is not open source. However, this doesn't stop us from experimenting with the algorithm and understanding how it decides whether to block users.
Cloudflare mentions that IP-address-based rules take the highest precedence, followed by Firewall Rules, Zone (URL) Lockdown, User Agent Blocking, and Web Application Firewall ([Source](https://web.archive.org/web/20200328143759/https://support.cloudflare.com/hc/en-us/articles/115002059131-Understanding-your-site-protection-options)). Thus, Cloudflare clearly states in its documentation that it considers the user's IP address and the web browser's User Agent when deciding whether to block a user. Unfortunately, Cloudflare's algorithms raise every red flag when these two parameters (IP address and user agent) match those of a typical Tor user. This is easy for Cloudflare to do because Tor Browser follows a one-fingerprint-for-all philosophy and the list of Tor exit nodes is publicly available. Cloudflare's CTO himself explains in trac ticket:18361#comment:23 that they fetch the list of Tor exit nodes and assign a reputation to the nodes in order to block certain users.
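
To make the precedence order above concrete, here is a minimal sketch of a "first matching layer wins" evaluation. It rests on an assumption: that "highest precedence" means a verdict from an earlier layer overrides later layers. Cloudflare's actual implementation is not public, and the layer names, the `first_decision` function, and the verdict strings are hypothetical.

```python
# Layer order as listed in the Cloudflare documentation cited above.
# The identifiers themselves are hypothetical labels for this sketch.
PRECEDENCE = [
    "ip_rules",
    "firewall_rules",
    "zone_lockdown",
    "user_agent_blocking",
    "waf",
]


def first_decision(verdicts):
    """Return (layer, verdict) for the highest-precedence layer with a verdict.

    `verdicts` maps a layer name to "block", "challenge", "allow", or None
    (None means the layer has no matching rule). Assumption: the first layer
    with a verdict decides the outcome; unmatched requests are allowed.
    """
    for layer in PRECEDENCE:
        verdict = verdicts.get(layer)
        if verdict is not None:
            return layer, verdict
    return None, "allow"
```

Under this assumption, a request from a known Tor exit address that matches an IP-based rule would be challenged by that rule before its User Agent is ever considered, for example `first_decision({"user_agent_blocking": "block", "ip_rules": "challenge"})` yields the `ip_rules` verdict.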