Home · Changes

Update home authored Aug 28, 2020 by Barkin Simsek
Showing with 2 additions and 1 deletion
home.md
View page @ 368122f3
-The **Cloudflare CAPTCHA Monitoring** project aims to track how often Cloudflare-fronted webpages return CAPTCHAs to Tor clients. It does this by fetching webpages via both Tor and mainstream web browsers and comparing the results. The tests are repeated periodically to find patterns over time. Collected metadata, metrics, and results are analyzed and displayed on a dashboard to understand how Cloudflare manipulates internet traffic and affects people's access to the internet.
+The **CAPTCHA Monitoring** project aims to track how often CDN-fronted webpages (e.g., Cloudflare, Akamai, Amazon CloudFront) return CAPTCHAs to Tor clients. It does this by fetching webpages via both Tor and mainstream web browsers and comparing the results. The tests are repeated periodically to find patterns over time. Collected metadata, metrics, and results are analyzed and displayed on a dashboard to understand how CDN providers manipulate internet traffic and affect people's access to the internet.
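The comparison step described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the `FetchResult` type, the helper names, and especially the `CAPTCHA_MARKERS` strings are assumptions made for illustration, and the sketch works on already-fetched pages rather than performing any network requests.

```python
from dataclasses import dataclass

# Hypothetical marker strings (an assumption for this sketch); the real
# project's CAPTCHA detection logic is not shown on this page.
CAPTCHA_MARKERS = ("captcha", "attention required")


@dataclass
class FetchResult:
    """One fetch of one URL, either over Tor or over a regular connection."""
    url: str
    via_tor: bool
    status_code: int
    body: str


def looks_like_captcha(result):
    """Return True if the page body contains any assumed CAPTCHA marker."""
    body = result.body.lower()
    return any(marker in body for marker in CAPTCHA_MARKERS)


def captcha_rate(results, via_tor):
    """Fraction of fetches on the given path (Tor or not) that hit a CAPTCHA."""
    selected = [r for r in results if r.via_tor == via_tor]
    if not selected:
        return 0.0
    return sum(looks_like_captcha(r) for r in selected) / len(selected)


def tor_only_captchas(results):
    """URLs that returned a CAPTCHA over Tor but not over the control fetch."""
    tor = {r.url for r in results if r.via_tor and looks_like_captcha(r)}
    control = {r.url for r in results if not r.via_tor and looks_like_captcha(r)}
    return sorted(tor - control)
```

Repeating such a comparison on a schedule and storing each `FetchResult` with a timestamp would give the kind of time series the dashboard visualizes.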
## Interesting places to visit
- [dashboard.captcha.wtf](https://dashboard.captcha.wtf) or [captchaufjq5m2i73up537pldaxnbp6rzcbdrzc7y5rlwtx3mwigznad.onion](http://captchaufjq5m2i73up537pldaxnbp6rzcbdrzc7y5rlwtx3mwigznad.onion/)
......@@ -21,6 +21,7 @@ You can view various visualizations of the collected data on the [dashboard](htt
If you want a copy of the whole database, I would be very happy to share it; please [contact](https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/wikis/home#contact) me.
## Detailed description
+_(Keep in mind that the project initially focused only on Cloudflare and later expanded to track other CDN providers as well.)_
By design, Cloudflare alters the traffic between web servers and internet users. It modifies this traffic to protect Cloudflare-fronted web servers from attacks by malicious users. Although this looks like a good-faith practice to protect servers, it does more harm than good to millions of users. Cloudflare decides whether to block a user based on multiple factors, such as the visitor's IP address, the resources requested, the request payload and frequency, and customer-defined firewall rules ([Source](https://web.archive.org/web/20200328165212/https://support.cloudflare.com/hc/en-us/articles/205177068-How-does-Cloudflare-work)). They don't share the specifics of their decision-making mechanism, since it keeps changing over time and is not open-source. However, this doesn't stop us from experimenting with the algorithm and understanding how it decides whether to block users.
Cloudflare mentions that IP address-based rules have the highest priority, followed by Firewall Rules, Zone (URL) Lockdown, User Agent Blocking, and Web Application Firewall ([Source](https://web.archive.org/web/20200328143759/https://support.cloudflare.com/hc/en-us/articles/115002059131-Understanding-your-site-protection-options)). Thus, Cloudflare clearly states in their documentation that they consider the user's IP address and their web browser's User Agent when deciding whether to block a user. Unfortunately, Cloudflare's algorithms raise every red flag when these two parameters (IP address and user agent) match those of a typical Tor user. This is easy for Cloudflare to do because Tor Browser follows a one-fingerprint-for-all philosophy, and the list of Tor exit nodes is publicly available. In trac ticket:18361#comment:23, the Cloudflare CTO himself explains that they fetch the list of Tor exit nodes and assign a reputation to the nodes in order to block certain users.
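To make the reasoning above concrete, here is a small sketch of the documented precedence order and of why the two parameters are enough to single out a Tor user. It is an illustration under stated assumptions only: the layer names, function names, IP addresses, and user agent string are all hypothetical, and nothing here reflects Cloudflare's actual implementation, which is not public.

```python
# Protection layers in the precedence order described above
# (highest priority first). The identifiers are illustrative names.
PRECEDENCE = (
    "ip_rules",
    "firewall_rules",
    "zone_lockdown",
    "user_agent_blocking",
    "web_application_firewall",
)


def first_applicable(triggered):
    """Return the highest-priority layer among those triggered, or None."""
    for layer in PRECEDENCE:
        if layer in triggered:
            return layer
    return None


def matches_tor_profile(ip, user_agent, exit_ips, shared_user_agent):
    """True if the request comes from a known exit IP with the shared user agent.

    Because the exit list is public and Tor Browser users share one
    fingerprint, both checks are simple lookups.
    """
    return ip in exit_ips and user_agent == shared_user_agent


# Hypothetical inputs: documentation-range IPs and a placeholder user agent.
EXIT_IPS = {"192.0.2.10", "198.51.100.7"}
SHARED_UA = "ExampleTorBrowserUA/1.0"

if __name__ == "__main__":
    print(first_applicable({"user_agent_blocking", "ip_rules"}))
    print(matches_tor_profile("192.0.2.10", SHARED_UA, EXIT_IPS, SHARED_UA))
```

The point of the sketch is that an IP-based rule, once triggered, takes effect before any lower-priority layer is considered, so an exit node's reputation alone can decide the outcome.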
......