Are redirects followed?
In main/docs/operations.md, I found this extremely interesting:
3330 unique two-label .onion domains were configured from 26937 unique sites. 13956 of those unique sites have the same Onion-Location configuration as Twitter, which likely means that they copied some of their HTML attributes.
I wondered if these sites were clones/phishing attempts of Twitter.
So, I tried to open a couple and I got redirected to https://x.com/someusername.
I had already seen this pattern: some people use a subdomain to redirect to their Twitter/X account. So, is the scraper following these redirects and then attributing the Onion-Location it finds at the destination to the original URL?
If so, personally I think these cases should be ignored instead (or considered only if there's also an explicit Onion-Location header).
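To make the suggestion concrete, here is a rough sketch of the rule I have in mind. I don't know how the scraper actually records responses, so the `Hop` structure, the `onion_location_for` function, and the example URLs are all hypothetical. The idea is just that an Onion-Location header only counts if it was sent by the host that was originally requested, not by whatever site the redirect lands on.

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class Hop:
    """One response in a redirect chain (hypothetical structure)."""
    url: str
    status: int
    headers: dict = field(default_factory=dict)


def onion_location_for(original_url: str, hops: list[Hop]) -> Optional[str]:
    """Return the Onion-Location to attribute to original_url, or None.

    Only responses served by the originally requested host are considered,
    so a header picked up after a cross-site redirect is ignored.
    """
    origin_host = urlsplit(original_url).hostname
    for hop in hops:
        if urlsplit(hop.url).hostname != origin_host:
            # This response came from a different site; skip it.
            continue
        # HTTP header names are case-insensitive.
        for name, value in hop.headers.items():
            if name.lower() == "onion-location":
                return value
    return None


# A subdomain that only redirects to a profile on another site (placeholder URLs).
redirect_chain = [
    Hop("https://twitter.example.org/", 301,
        {"Location": "https://x.com/someusername"}),
    Hop("https://x.com/someusername", 200,
        {"Onion-Location": "http://placeholder.onion/someusername"}),
]

# A site that sends the header itself.
direct_chain = [
    Hop("https://example.org/", 200,
        {"onion-location": "http://placeholder.onion/"}),
]

redirect_result = onion_location_for("https://twitter.example.org/", redirect_chain)
direct_result = onion_location_for("https://example.org/", direct_chain)
```

With this rule the redirecting subdomain gets no Onion-Location attributed to it, while a site that sends the header itself still does. The same check could be applied to a `<meta http-equiv>` tag by looking only at the HTML served from the original host.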