Trac · Issues · #2921
Closed (moved)
Created Apr 15, 2011 by Karsten Loesing (@karsten)

Improve bulk import of relay descriptors into metrics database

We currently have two ways to import relay descriptors into the metrics database:

  • JDBC import: We have a Java importer that connects to the metrics database via JDBC. We use a few tweaks, like committing batches of up to 500 rows, but importing months of data is still time-consuming.

  • psql \copy: The Java importer can be configured to parse relay descriptor files and write out files for psql's \copy command. The disadvantage is that \copy does not handle duplicate rows well, so we have to de-duplicate the bulk import files before loading them.
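
The batched-commit pattern mentioned above can be sketched as follows. This is a minimal illustration, not the actual importer: the table and column names are hypothetical, and the real schema differs.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class DescriptorImporter {

    static final int BATCH_SIZE = 500;

    // Flush and commit once per BATCH_SIZE rows instead of per row.
    static boolean shouldFlush(int rowsAdded) {
        return rowsAdded % BATCH_SIZE == 0;
    }

    // rows: parsed descriptor fields, e.g. [descriptor_id, published].
    static void importDescriptors(Connection conn, List<String[]> rows)
            throws Exception {
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(
                // Hypothetical table/columns for illustration only.
                "INSERT INTO descriptor (descriptor_id, published) "
                + "VALUES (?, ?)")) {
            int count = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
                if (shouldFlush(++count)) {
                    ps.executeBatch();
                    conn.commit();
                }
            }
            ps.executeBatch(); // flush the final partial batch
            conn.commit();
        }
    }
}
```

Committing per batch rather than per row avoids one round trip and one transaction per descriptor, which is where most of the JDBC import time goes.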

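One common workaround for \copy's poor duplicate handling is to load the raw bulk file into a staging table and then insert only the rows that are new, pushing the de-duplication into the database instead of a pre-processing step. A sketch, assuming a hypothetical descriptor table keyed on descriptor_id:

```sql
-- Staging table matching the bulk file layout (names are hypothetical).
CREATE TEMPORARY TABLE descriptor_staging (LIKE descriptor);

-- In psql: load the raw bulk file, duplicates and all.
\copy descriptor_staging FROM 'descriptors.csv' WITH CSV

-- Keep only rows not already present in the target table.
-- DISTINCT ON picks an arbitrary row per key unless an ORDER BY is added.
INSERT INTO descriptor
SELECT DISTINCT ON (descriptor_id) *
FROM descriptor_staging s
WHERE NOT EXISTS (
  SELECT 1 FROM descriptor d WHERE d.descriptor_id = s.descriptor_id
);
```

This keeps \copy's raw loading speed while handling both duplicates within the bulk file and rows already in the database.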
I wonder whether there are better approaches than these two, or improvements to how we implement them. It would be good to compare the performance of both approaches, and of any improvements to them, for 1, 12, and 24 months of data.
