Fix either spec or code regarding full path of sanitized webstats files
This issue came up while discussing the webstats tarballs that I created the other day: what file structure should these tarballs have internally?
It turns out we already specified this file structure in Section 5.4 of the Protocol of CollecTor's File Structure:
"'webstats' contains compressed log files structured and named according to the 'Tor web server logs' specification, section 4.3."
And Section 4.3 of the referenced specification says:
"Sanitized log files may additionally be sorted into directories by virtual host and date as in: /YYYY/MM/__access.log_YYYYMMDD[.xz]"
So, I'd say this is sufficiently specified.
However, the current structure of CollecTor's out/ directory is different, as implemented here:

    this.storagePath = Paths.get(
        WEBSTATS,
        this.desc.getVirtualHost(),
        this.desc.getLogDate().format(yearPattern), // year
        this.desc.getLogDate().format(monthPattern), // month
        this.desc.getLogDate().format(dayPattern), // day
        name).toString();
Note the day component, which does not exist in the specification.
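To make the difference concrete, here is a minimal sketch that builds both layouts side by side. It is not CollecTor code: the class and method names are hypothetical, the "webstats" value and the "yyyy", "MM", and "dd" patterns are assumptions about WEBSTATS, yearPattern, monthPattern, and dayPattern, the file name is made up, and the spec layout is assumed to be virtual host, then year, then month, going by the "sorted into directories by virtual host and date" wording.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, only for comparing the two directory layouts.
class WebstatsPaths {

  // Assumed patterns; the real yearPattern/monthPattern/dayPattern are not shown in this ticket.
  private static final DateTimeFormatter YEAR = DateTimeFormatter.ofPattern("yyyy");
  private static final DateTimeFormatter MONTH = DateTimeFormatter.ofPattern("MM");
  private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("dd");

  // Assumed value of the WEBSTATS constant.
  private static final String WEBSTATS = "webstats";

  /** Layout as read from the specification: virtual host, year, month, file. */
  static Path specLayout(String virtualHost, LocalDate logDate, String name) {
    return Paths.get(WEBSTATS, virtualHost,
        logDate.format(YEAR), logDate.format(MONTH), name);
  }

  /** Layout as produced by the current code: an extra day directory. */
  static Path currentLayout(String virtualHost, LocalDate logDate, String name) {
    return Paths.get(WEBSTATS, virtualHost,
        logDate.format(YEAR), logDate.format(MONTH), logDate.format(DAY), name);
  }

  public static void main(String[] args) {
    LocalDate date = LocalDate.of(2018, 3, 15);
    // Made-up file name, for illustration only.
    String name = "example.org_host1_access.log_20180315.xz";
    System.out.println("spec:    " + specLayout("example.org", date, name));
    System.out.println("current: " + currentLayout("example.org", date, name));
  }
}
```

The two results differ by exactly one path element, the day directory, so whichever side we change, the fix should amount to adding or removing that single component.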
So, we'll have to fix either the specification or the code. I don't feel strongly about which one we change. But let's make a decision really soon, before I start reprocessing archives due to legacy/trac#25522. Therefore, I'm setting the priority to High.