Understand and document BridgeDB bot scraping attempts
We are aware of automated attempts to enumerate bridges in BridgeDB, but we lack a rigorous understanding of the problem.
We have detected bot requests to BridgeDB's web interface and deployed some defences: forbidding requests with headers that are commonly associated with bots, and handing out fake bridges to suspected bots (legacy/trac#31252 (moved)).
We also suspect that these bots are solving our CAPTCHAs more accurately than human users (legacy/trac#24607 (moved)).
After a recent campaign to get more volunteer bridges, we set up an experiment to test the reachability of a subset of these new bridges from a probe site in Beijing. All new bridges in our sample turned out to be blocked, most of them from the very start of the experiment (legacy/trac#31701 (moved)).
This ticket is for documenting bot behaviour and brainstorming ways to detect and analyze the automated scraping of BridgeDB by censor-owned bots.
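
As a starting point for the brainstorming, here is a minimal sketch of one possible detection heuristic. It is not BridgeDB code: the `Request` record, the `flag_suspects` function, and both thresholds are hypothetical, and it assumes we had sanitised per-client logs of CAPTCHA outcomes. The idea is to flag clients that combine a high request volume (an assumed signal) with a near-perfect CAPTCHA solve rate (the suspicion from legacy/trac#24607).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Request:
    """One hypothetical, already-sanitised log record (not a BridgeDB type)."""
    client: str           # assumed identifier, e.g. a hashed subnet, not a raw IP
    captcha_solved: bool  # whether this request's CAPTCHA answer was correct


def flag_suspects(requests, min_requests=20, min_solve_rate=0.95):
    """Return {client: (request_count, solve_rate)} for suspicious clients.

    A client is flagged only if it made at least `min_requests` requests AND
    solved at least `min_solve_rate` of its CAPTCHAs. Both thresholds are
    illustrative guesses, not measured values.
    """
    totals = defaultdict(int)
    solved = defaultdict(int)
    for r in requests:
        totals[r.client] += 1
        if r.captcha_solved:
            solved[r.client] += 1

    suspects = {}
    for client, n in totals.items():
        rate = solved[client] / n
        if n >= min_requests and rate >= min_solve_rate:
            suspects[client] = (n, rate)
    return suspects


# Usage with made-up data: one bot-like client and two human-like clients.
sample = (
    [Request("bot", True)] * 30
    + [Request("casual", True)] * 3 + [Request("casual", False)] * 2
    + [Request("busy", True)] * 15 + [Request("busy", False)] * 10
)
suspects = flag_suspects(sample)
```

With this made-up sample only `bot` is flagged; `busy` makes many requests but fails too many CAPTCHAs, and `casual` makes too few requests. Real thresholds would have to come from measurements, and a scraper that spreads its requests over many client identifiers would evade this check, so it is at best one signal among several.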