Issue #28799 · Closed (moved)
Created Dec 10, 2018 by Karsten Loesing (@karsten)

Use readr's read_csv() to speed up drawing graphs

Let's use R.cache to speed up drawing graphs. I already prepared a patch that I'm going to post here as soon as I have a ticket number. From the commit message:

Over two years ago, in commit 1f90b72 from October 2016, we made our user graphs faster by no longer reading the large .csv file on demand. Instead we read it once as part of the daily update, saved it to disk as an .RData file using R's save() function, and loaded it back into memory using R's load() function when drawing a graph.

This approach worked okay. It just had two disadvantages:

  1. We had to write a small amount of R code for each graph type, which is why we only did it for graphs with large .csv files.
  2. Running these small R scripts as part of the daily update made it harder to move away from Ant towards a Java-only execution model.
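
For illustration, the earlier save()/load() approach can be sketched in base R roughly as follows. The file names, column names, and sample data are hypothetical placeholders; the actual per-graph scripts are not shown in this ticket.

```r
# Hypothetical file names; the real paths are not given in the ticket.
csv_file <- file.path(tempdir(), "clients.csv")
rdata_file <- file.path(tempdir(), "clients.RData")

# Stand-in for the large .csv file produced by the daily update.
write.csv(data.frame(date = c("2018-12-09", "2018-12-10"),
                     users = c(100, 120)),
          csv_file, row.names = FALSE)

# Daily update step: parse the .csv once and save the result to disk.
data <- read.csv(csv_file, stringsAsFactors = FALSE)
save(data, file = rdata_file)
rm(data)

# Graph-drawing step: restore the data frame without parsing the .csv again.
load(rdata_file)  # recreates the variable `data`
```

A script like this is needed once per graph type, which is the first disadvantage listed above.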

The new approach implemented in this commit uses R.cache, which caches data for use by concurrent Rserve clients. The first time we read a .csv file we save it to the cache, and all subsequent times we just load it back from the cache. We're using the file name and last modified time as the key in the cache to avoid using stale data. We're also clearing the cache on startup to avoid running out of disk space.
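
A minimal sketch of this caching scheme, assuming R.cache's loadCache(), saveCache(), clearCache(), and setCacheRootPath() functions. The helper name read_csv_cached and the cache directory are hypothetical, and the actual patch may be structured differently.

```r
library(R.cache)

# Assumption: a throwaway cache directory; the real location is not stated.
cache_dir <- file.path(tempdir(), "csv-cache")
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
setCacheRootPath(cache_dir)

# On startup, drop old entries so the cache does not grow without bound.
clearCache(recursive = TRUE, prompt = FALSE)

# Hypothetical helper: read a .csv file through the cache.
read_csv_cached <- function(path) {
  # Key on file name plus last-modified time, so a rewritten file gets a
  # new key and its stale cache entry is never returned.
  key <- list(path, file.info(path)$mtime)
  data <- loadCache(key = key)
  if (is.null(data)) {
    # Cache miss: parse the file once, then store the result.
    data <- read.csv(path, stringsAsFactors = FALSE)
    saveCache(data, key = key)
  }
  data
}
```

Only the first call for a given file and modification time pays the parsing cost; later calls, including those from other Rserve clients sharing the same cache directory, load the stored object.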

One somewhat unwanted side effect is that drawing the first graph from a new .csv file may take a few more seconds as compared to drawing subsequent graphs. This seems acceptable, though.

Requires installing the R.cache package from CRAN, which is available on Debian as r-cran-r.cache.

Edit: Turns out that we don't want R.cache but readr's read_csv() instead. See comments below.
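
The comments that led to this change are not reproduced here, so the following is only a minimal sketch of reading a .csv file with readr's read_csv(). The file name and column names are hypothetical, and declaring col_types up front is an assumption about how one might use it, not a description of the final patch.

```r
library(readr)

# Hypothetical sample file standing in for one of the large .csv files.
csv_file <- file.path(tempdir(), "example.csv")
writeLines(c("date,country,users",
             "2018-12-09,de,100",
             "2018-12-10,,120"),
           csv_file)

# Declaring column types up front avoids type guessing on every read.
data <- read_csv(csv_file, col_types = cols(
  date = col_date(format = ""),
  country = col_character(),
  users = col_double()))
```

With this approach the .csv file is parsed directly each time a graph is drawn, so no per-graph preprocessing script and no separate cache are involved.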
