Rename URL table to Domain

Barkin Simsek requested to merge domain into master

I decided to rename the URL table to Domain and add a url field to the fetcher queue. The URL table was originally meant to store only domain names and their related information (HTTP, HTTPS, IPv4, IPv6 support, etc.). At some point, I lost track of this and started feeding the domain names from that table into the fetchers as if they were complete URLs, for example, asking the Tor Browser fetcher to fetch torproject.org. Instead, we should ask the Tor Browser fetcher to fetch https://torproject.org, http://torproject.org, https://check.torproject.org, etc.

To achieve this, I added a new url field to the fetch queues. Now, the job scheduler can decide which exact URL to fetch instead of a generic domain name without any protocol prefix.
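
As a rough illustration of the idea (not the code in this merge request), the scheduler could expand a domain record into explicit URLs before queuing jobs. The `Domain` and `FetchJob` classes, their fields, and `build_fetch_jobs` are hypothetical names made up for this sketch; the real models and scheduler may look different.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Domain:
    # Hypothetical stand-in for a row in the Domain table:
    # a bare domain name plus protocol support flags.
    name: str
    supports_http: bool = True
    supports_https: bool = True


@dataclass
class FetchJob:
    # Hypothetical stand-in for a fetch queue entry carrying a full URL.
    fetcher: str
    url: str


def build_fetch_jobs(domain: Domain, fetcher: str) -> List[FetchJob]:
    """Turn one domain into one job per supported protocol."""
    jobs = []
    if domain.supports_https:
        jobs.append(FetchJob(fetcher=fetcher, url=f"https://{domain.name}"))
    if domain.supports_http:
        jobs.append(FetchJob(fetcher=fetcher, url=f"http://{domain.name}"))
    return jobs
```

With this shape, a fetcher only ever sees a complete URL such as https://torproject.org, and the decision about which protocol to test stays in the scheduler.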

I also renamed update_websites to update_domains to keep the naming consistent.

Edited by Barkin Simsek
