Closed (moved)
Opened Mar 16, 2018 by Karsten Loesing (@karsten)

Add support for webstats tarballs

I started creating tarballs containing .xz-compressed webstats files. When I attempt to feed them into DescriptorReader, it fails with an exception like the following:

Cannot parse descriptor file ’in/webstats-2016-01.tar’.
[raw binary content of the .xz file omitted]
        at org.torproject.descriptor.impl.DescriptorParserImpl.detectTypeAndParseDescriptors(DescriptorParserImpl.java:136)
        at org.torproject.descriptor.impl.DescriptorParserImpl.parseDescriptors(DescriptorParserImpl.java:33)
        at org.torproject.descriptor.impl.DescriptorReaderImpl$DescriptorReaderRunnable.readTarball(DescriptorReaderImpl.java:325)
        at org.torproject.descriptor.impl.DescriptorReaderImpl$DescriptorReaderRunnable.readTarballs(DescriptorReaderImpl.java:276)
        at org.torproject.descriptor.impl.DescriptorReaderImpl$DescriptorReaderRunnable.run(DescriptorReaderImpl.java:162)
        at java.lang.Thread.run(Thread.java:745)

The tarballs I created contain files as follows:

$ tar tf webstats-2016-01.tar
[...]
webstats-2016-01/torproject.org/2016/01/25/torproject.org_aroides.torproject.org_access.log_20160125.xz
webstats-2016-01/torproject.org/2016/01/25/torproject.org_archeotrichon.torproject.org_access.log_20160125.xz

When I extract the tarballs first and then read the extracted files with DescriptorReader, everything works just fine.

I think that the issue is that DescriptorParserImpl#detectTypeAndParseDescriptors() looks at descriptorFile rather than fileName to obtain the file name. The effect is that it learns the tarball file name, rather than the file name of the contained log file:

-    if (descriptorFile.getName().contains(LogDescriptorImpl.MARKER)
+    if (fileName.contains(LogDescriptorImpl.MARKER)

The above is untested and probably insufficient. It's just supposed to start the bug hunting. Priority is medium, because we can just extract tarballs for now. But it's a bug, and it may confuse users as soon as we provide these tarballs and no working code to process them.
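
To illustrate the suspected cause, here is a minimal, self-contained sketch that does not use metrics-lib itself. The class name WebLogNameCheck, the method looksLikeWebLog, and the marker value "_access.log_" are assumptions made for illustration; the actual value of LogDescriptorImpl.MARKER is not stated in this ticket. The sketch only shows why a check against the tarball's file name fails while the same check against the contained file's name succeeds.

```java
import java.io.File;

final class WebLogNameCheck {

  // Assumption: stand-in for LogDescriptorImpl.MARKER, chosen to match the
  // file names listed in the tarball above. The real constant may differ.
  static final String MARKER = "_access.log_";

  // Hypothetical helper: decides whether a name looks like a web server log.
  static boolean looksLikeWebLog(String name) {
    return name.contains(MARKER);
  }

  public static void main(String[] args) {
    // The tarball handed to the reader.
    File descriptorFile = new File("in/webstats-2016-01.tar");
    // The name of one file contained in that tarball.
    String fileName = "webstats-2016-01/torproject.org/2016/01/25/"
        + "torproject.org_aroides.torproject.org_access.log_20160125.xz";

    // Checking the tarball name misses the marker ...
    System.out.println(looksLikeWebLog(descriptorFile.getName()));
    // ... while checking the contained file's name finds it.
    System.out.println(looksLikeWebLog(fileName));
  }
}
```

Under these assumptions the first check is false and the second is true, which matches the observation that extracted files parse fine while files read from the tarball do not.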

This is also related to #22695 (moved).

Assigning to iwakeh who said they'd like to grab it.

Milestone: metrics-lib 2.3.0
Reference: legacy/trac#25523