Automatic check of ListOfServicesBlockingTor

The wiki has a ListOfServicesBlockingTor page, where users are meant to report services that block Tor. It has two drawbacks:

  1. It can contain obsolete information.
  2. It is missing many websites that actually block Tor.

We need a better approach. We need an automated service, doing the following:

  1. It has a database.
  2. It has a table of services (domain/IP and port) blocking Tor.
  3. It recrawls these services on a regular schedule (every day, hour, minute, or second) and checks whether Tor is still blocked. When it crawls, it connects either using TLS or using netcat, then checks the downloaded text for signs of blocking and for similarity to the text of the previous crawl.
  4. It has a table of services not blocking Tor and new services to be checked.
  5. It has a web spider that discovers new services (services not present in either table) and adds them to the table of services to be checked.
  6. It has a form to add a service manually with free-form description in markdown.
  7. It generates a daily report and posts it to the wiki.
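
A rough sketch of how the storage and the recheck step (items 1 to 4) might fit together, not a finished design. Everything named here is an assumption for illustration: the `services` table layout, the `BLOCK_MARKERS` phrases, and the `record_crawl` helper. For brevity it folds the two tables into one table with a `status` column, and it leaves out the actual fetch over Tor, taking the downloaded text as an argument.

```python
import difflib
import sqlite3

# Hypothetical marker phrases; a real list would have to be curated by hand.
BLOCK_MARKERS = ("access denied", "captcha", "unusual traffic", "forbidden")

# Assumed schema: one table with a status column instead of two tables.
SCHEMA = """
CREATE TABLE services (
    host        TEXT NOT NULL,
    port        INTEGER NOT NULL,
    status      TEXT NOT NULL
                CHECK (status IN ('blocking', 'not_blocking', 'unchecked')),
    last_text   TEXT,   -- text downloaded by the previous crawl
    description TEXT,   -- free-form markdown from the manual form
    PRIMARY KEY (host, port)
);
"""


def looks_blocked(text):
    """Return True if the text contains any of the marker phrases."""
    lowered = text.lower()
    return any(marker in lowered for marker in BLOCK_MARKERS)


def similarity(old, new):
    """Similarity ratio in [0, 1] between two crawl texts."""
    return difflib.SequenceMatcher(None, old, new).ratio()


def record_crawl(db, host, port, text):
    """Store one crawl result.

    Returns (status, similarity to the previous crawl); the similarity
    is None when there is no previous text to compare against.
    """
    row = db.execute(
        "SELECT last_text FROM services WHERE host = ? AND port = ?",
        (host, port),
    ).fetchone()
    previous = row[0] if row is not None else None
    sim = similarity(previous, text) if previous is not None else None
    status = "blocking" if looks_blocked(text) else "not_blocking"
    if row is None:
        db.execute(
            "INSERT INTO services (host, port, status, last_text) "
            "VALUES (?, ?, ?, ?)",
            (host, port, status, text),
        )
    else:
        db.execute(
            "UPDATE services SET status = ?, last_text = ? "
            "WHERE host = ? AND port = ?",
            (status, text, host, port),
        )
    return status, sim
```

A caller would open the database with `sqlite3.connect(...)`, run `executescript(SCHEMA)` once, and then call `record_crawl` after each fetch. A low similarity between crawls while the status stays the same may be a hint that the block page changed and the marker list needs updating.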
Edited by meskio