Trac · Issues · #25317

Closed (moved)
Created Feb 21, 2018 by iwakeh

Enable webstats to process large (> 2G) logfiles

Quote from #25161 (moved), comment 12: Looking at the stack trace and the input log files, I noticed that two log files are larger than 2G when decompressed:

3.2G in/webstats/archeotrichon.torproject.org/dist.torproject.org-access.log-20160531
584K in/webstats/archeotrichon.torproject.org/dist.torproject.org-access.log-20160531.xz
2.1G in/webstats/archeotrichon.torproject.org/dist.torproject.org-access.log-20160601
404K in/webstats/archeotrichon.torproject.org/dist.torproject.org-access.log-20160601.xz

I just ran another bulk import with just those two files as import and ran into the same exception.

It seems like we shouldn't attempt to decompress these files into a byte[] in FileType.decompress, because Java arrays can hold at most 2^31 − 1 (roughly 2.1 billion) elements: https://en.wikipedia.org/wiki/Criticism_of_Java#Large_arrays. Maybe we should work with streams there, not byte[].
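A minimal sketch of the stream-based approach suggested above. The class and method names here (StreamDecompress, countLines) are hypothetical, not part of the actual FileType API, and the example uses the JDK's GZIPInputStream for illustration; the real webstats files are xz-compressed and would need a decompressor stream such as the one from the org.tukaani xz library instead. The point is that the decompressed data is consumed incrementally and never materialized as a single byte[], so file size is not capped at 2 GiB:

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.zip.*;

public class StreamDecompress {

    // Process a compressed log file line by line instead of decompressing
    // it into one byte[]. Only a small buffer is ever held in memory, so
    // this works for decompressed sizes well beyond 2 GiB.
    public static long countLines(InputStream compressed) throws IOException {
        long lines = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new GZIPInputStream(compressed),
                        StandardCharsets.UTF_8))) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Build a small gzip-compressed sample in memory for demonstration.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(new GZIPOutputStream(buf),
                StandardCharsets.UTF_8)) {
            w.write("line one\nline two\nline three\n");
        }
        long n = countLines(new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(n); // prints 3
    }
}
```

The same pattern would let the webstats importer parse each log line as it is decompressed, rather than calling FileType.decompress up front.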
