Proxy vs Relay countries
The code at https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/blob/master/src/captchamonitor/core/analyzer.py#L104-108 matches proxy countries to relay countries. After reviewing it a few times, I saw that the proxy list we are using (https://spys.me/proxy.txt) is diverse, but it does not cover the same set of countries as the relays. The list changes often, but each time I checked, it had no proxies for countries such as Maldives, Sweden, Austria, and Romania.
So I think a better approach would be to check against all proxies instead of only those in the same country. That would give us a more diverse pool to fall back on when some proxies are bad or have poor connectivity. I would also like to try the html_similarity module (https://github.com/matiskay/html-similarity) for the comparison.
So, I plan to add another module and remove the country check from analyzer.py.
Steps to be taken:
- Remove the country or location filter for proxy and relay
- Add the html_similarity module
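To make the two steps concrete, here is a rough sketch of what the comparison could look like without the country filter. It is not the current analyzer.py code: `best_proxy_match`, the `proxy_pages` dict, and the proxy identifiers are hypothetical names, and `tag_sequence_similarity` is only a standard-library stand-in so the snippet runs without extra dependencies.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _TagCollector(HTMLParser):
    """Collects opening tag names in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence_similarity(html_a, html_b):
    """Stand-in metric (assumption): ratio of matching opening-tag sequences."""
    sequences = []
    for html in (html_a, html_b):
        collector = _TagCollector()
        collector.feed(html)
        sequences.append(collector.tags)
    return SequenceMatcher(None, sequences[0], sequences[1]).ratio()


def best_proxy_match(relay_html, proxy_pages, similarity_fn=tag_sequence_similarity):
    """Compare one relay fetch against every proxy fetch, ignoring country.

    proxy_pages maps a (hypothetical) proxy identifier to the HTML it returned.
    Returns (proxy_id, score) for the most similar page, or (None, 0.0).
    """
    best_id, best_score = None, 0.0
    for proxy_id, proxy_html in proxy_pages.items():
        if not proxy_html:  # skip proxies that failed or returned nothing
            continue
        score = similarity_fn(relay_html, proxy_html)
        if score > best_score:
            best_id, best_score = proxy_id, score
    return best_id, best_score
```

If html_similarity is installed, its `similarity` function (`from html_similarity import similarity`) could be passed as `similarity_fn` in place of the stand-in. Any threshold on the resulting score would still need to be chosen from real measurements.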
cc: @woswos