Verified commit c1085aee, authored by anarcat

files for the 2020 crawl are already present, in the same location

parent 43867e29
+4 −4
@@ -633,14 +633,14 @@ the crawl. Another crawl was performed back in 2019, so the known full
archives of Trac are as follows:

 * [june 2019 ticket crawl](https://archive.fart.website/archivebot/viewer/job/5vytc): 6h30, 29892 files, 1.9 GiB
 * [june 2019 full crawl](https://archive.fart.website/archivebot/viewer/job/bpu6j): 5 days, 7h30, 732488 files, 105.4 GiB
 * [june 2020 ticket crawl](https://archive.fart.website/archivebot/viewer/job/c4xu3): 4h30, 33582 files, 1.9 GiB
 * [june 2020 full crawl]() (TBD, still processing, should appear [in
   the viewer shortly](https://archive.fart.website/archivebot/viewer/?q=trac.torproject.org))
 * [june 2019 and 2020 full crawls](https://archive.fart.website/archivebot/viewer/job/bpu6j): 5 days, 7h30, 732488 files,
   105.4 GiB; TBD for the 2020 crawl

This information can be extracted back again from the `*-meta.warc.gz`
(text) files in the above URLs. This was done as part of [ticket
40003](https://gitlab.torproject.org/tpo/tpa/services/-/issues/40003).
40003](https://gitlab.torproject.org/tpo/tpa/services/-/issues/40003). There have also been other, independent crawls of Trac,
which are partly visible [in the viewer](https://archive.fart.website/archivebot/viewer/?q=trac.torproject.org).
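
As a rough illustration of how the `*-meta.warc.gz` files could be
inspected, here is a minimal sketch using only the Python standard
library. It assumes the files are ordinary gzip-compressed WARC files;
the function names (`iter_warc_records`, `summarize_meta_warc`) are
hypothetical, and the sketch only lists the records. Picking the crawl
duration, file count and size out of the record payloads is left out,
since the exact layout of those payloads is not described here.

```python
import gzip
import io


def iter_warc_records(stream):
    """Yield (headers, payload) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"unexpected line: {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # an empty line ends the header block
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Content-Length gives the payload size in bytes.
        length = int(headers.get("content-length", 0))
        payload = stream.read(length)
        yield headers, payload


def summarize_meta_warc(path):
    """Print type, payload size and target URI of each record (hypothetical helper)."""
    # gzip.open reads multi-member gzip files as one continuous stream.
    with gzip.open(path, "rb") as stream:
        for headers, payload in iter_warc_records(stream):
            print(headers.get("warc-type"), len(payload),
                  headers.get("warc-target-uri"))
```

Calling `summarize_meta_warc()` on a downloaded meta file should give an
overview of what it contains, which is a reasonable first step before
searching the payloads for the crawl statistics.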

### History