Decide what graph to display when there's no data to graph

added component::metrics/website in Legacy / Trac owner::iwakeh in Legacy / Trac priority::high in Legacy / Trac resolution::fixed in Legacy / Trac severity::normal in Legacy / Trac status::closed in Legacy / Trac type::enhancement in Legacy / Trac labels

This issue isn't new, and the current situation is not as bad. Setting priority to low.

Trac:
Priority: Medium to Low

For avoiding the 500 server error when trying to access the png (as reported in detail in legacy/trac#25468 (moved)) an empty graph/png stating 'no data available for this parameter choice' should be generated. Currently, the corresponding csv only contains the header. Maybe, also add a comment here too in case no data is available. It should be avoided to call R if there is no data available.

Replying to iwakeh:

For avoiding the 500 server error when trying to access the png (as reported in detail in legacy/trac#25468 (moved)) an empty graph/png stating 'no data available for this parameter choice' should be generated.

Sounds good. We'll also need such a static file in the PDF format. I'll create placeholders using R/ggplot2 in a bit. We just need to write the code to put them in.

Currently, the corresponding csv only contains the header. Maybe, also add a comment here too in case no data is available.

Sounds good, too. Want to suggest a text? I can put it in then.

It should be avoided to call R if there is no data available.

Well, we need to call R to find out whether there's data available for the requested period of time. Or what did you mean?

Trac:

Trac:
no-data-available.pdf

Replying to karsten:

Replying to iwakeh:

For avoiding the 500 server error when trying to access the png (as reported in detail in legacy/trac#25468 (moved)) an empty graph/png stating 'no data available for this parameter choice' should be generated.

Sounds good. We'll also need such a static file in the PDF format. I'll create placeholders using R/ggplot2 in a bit. We just need to write the code to put them in.

Attached. I generated these two files with the following code:

library(ggplot2)
library(dplyr)
library(tidyr)

data.frame(x = c(0, NA), y = c(0, NA)) %>%
  filter(x != 0) %>%
  ggplot(aes(x = x, y = y)) + 
  scale_x_continuous("") + 
  scale_y_continuous("") + 
  labs(caption = "The Tor Project - https://metrics.torproject.org/") + 
  theme(plot.margin = margin(5.5, 11, 5.5, 5.5),
    plot.title = element_text(hjust = 0.5, margin = margin(b = 11), size = 36),
    panel.background = element_rect(fill = NA)) + 
  ggtitle("\n\n\n\nNo Data Available")
ggsave(filename = "no-data-available.png", width = 8, height = 5, dpi = 150)
ggsave(filename = "no-data-available.pdf", width = 8, height = 5, dpi = 150)

I can also write the code that returns these files, unless you'd like to do that.

Replying to karsten:

Replying to iwakeh:

For avoiding the 500 server error when trying to access the png (as reported in detail in legacy/trac#25468 (moved)) an empty graph/png stating 'no data available for this parameter choice' should be generated.

Sounds good. We'll also need such a static file in the PDF format. I'll create placeholders using R/ggplot2 in a bit. We just need to write the code to put them in.

Currently, the corresponding csv only contains the header. Maybe, also add a comment here too in case no data is available.

Sounds good, too. Want to suggest a text? I can put it in then.

It should be avoided to call R if there is no data available.

Well, we need to call R to find out whether there's data available for the requested period of time. Or what did you mean?

The java code currently returns the SC_INTERNAL_SERVER_ERROR. This is decided by the properties of the RObject received. As RObject is our code it should always be available in order to avoid the null check. In case, the R code doesn't give a result or fails the RObject should 'know', i.e., a method needs to be added, e.g., boolean error() for any unforeseeable error w/o any file returned (triggers error message) otherwise (in case of error()==false) it returns the wanted file or the 'no-data' version.

Regarding the text:

Unfortunately there is no data available for the chosen parameters.

Is fine for all and we should include the parameters chosen into the response, e.g., in the pdf, csv, png.

Huh, I'm afraid I don't follow. Do you mind writing this patch?

I'm also not clear where you'd want to include parameters in the pnd or pdf. Do you mean we should generate a custom "No Data Available" graph that displays the chosen parameters? If so, why? (Maybe this will be clearer with a patch, too.)

The use case I'm thinking of: Some people retrieve only the pdf/png/csv using the links provided. If such a file is downloaded a URL in there with all parameters and the no-data-info makes clear what happened. On the web page we wouldn't need to display a graph w/o dots just the text, b/c the parameters are visible still.

Other than that code might be easier to discuss. I'll provide a patch.

Trac:
Status: new to accepted
Owner: metrics-team to iwakeh

I see. It's a possible use case. But I'd say let's focus on our main use case first, which is users looking at the website. And if somebody links to our PDF/PNG/CSV directly, and these return "No Data Available", it should be up to them to also provide details to find out why that is the case.

"No Data Available" should be consistent. This should follow the Style Guide too.

Relay Search currently does not follow the Style Guide: https://metrics.torproject.org/rs/img/no-data-available.png

Trac:

Style guide compliant no data available image

Ah, you mean the image should be consistent across Tor Metrics pages? Good point.

But does it have to be purple? Or can we use the same image in some style-guide-compatible gray? And can we have it in 728 x 455?

Trac:

Style guide compliant no data available image - grey and resized

https://styleguide.torproject.org/visuals/#colors has the approved colors. I could also make SVG, PDF, etc. versions if that is useful. If we do include a new image to be used, we should also update Relay Search at the same time, I can make a patch for that quite easily and then both could be deployed together.

The white version of the Tor logo comes from https://github.com/TheTorProject/tor-media/tree/master/Tor%20Logo if you'd like to have a go at making your own version.

Yes, I think it looks better in gray.

Do you mind cutting off a few pixels at the top to make it 728 by 455 (not 728 by 479)? That's the dimensions we're using for all graphs.

And can you provide a PDF file with the same dimensions?

If you want to grab this ticket and write the code, feel free to do it.

Trac:
no-data-available.tar.xz

XCF, PNG and PDF, resized to the correct dimensions this time

New PNG and PDF look good! Thanks!

Next step: write the code to include these files if no graph can be generated.

I'll write some code.

Trac:
Owner: iwakeh to karsten

Hmm, ok, had this down on the list, b/c of the low prio. I can add some suggestions for the ideas in comments 5 to 7 to the branch, when reviewing.

Oh! No, I can give it back. I didn't know you were still planning to write this code.

Please find temp commit fe160e2 in my task-25264 branch with the code I wrote so far.

Trac:
Owner: karsten to iwakeh
Status: accepted to assigned

Trac:
Status: assigned to accepted

I used the attached pictures and pdf.

Please review this patch branch, which contains one commit fixing the usage of StringBuilder.append and a second one implementing the wanted functionality. The implementation actually catches R errors and makes the respective no-data file available instead. Java can only guess if an R problem occurred. Thus, every call for R-graph generation is now wrapped in an R function robust_call that creates the no-data objects in case of an error. (String function = ... was moved down in order to avoid a checkstyle complaint).

Trac:
Status: accepted to needs_review

Looks good. Merged and deployed. Closing. Thanks!

Trac:
Status: needs_review to closed
Resolution: N/A to fixed

Hmm, looks like we broke the CSV links. Any link that I open now says "No data available for the given parameters." The code looks right to me. Any idea what's wrong?

Trac:
Resolution: fixed to N/A
Status: closed to reopened
Priority: Low to High

These files don't seem to be on the server. Could there be a reason that they don't get written? Permissions (R runs as metrics) and space seem fine.

Errr, no, they only contain the "No data message".

There are errors in rserve.log, unfortunately w/o timestamp. It is not clear from the log, if R was restarted successfully and when ... Maybe, try to restart making sure all R servers are down first?

I just restarted Rserve. No change, I think. Hmm.

Trac:
Status: reopened to accepted

Please review this patch commit.

That was a strange error situation. The call list created in java wasn't proper, but worked for graphs and failed for csvs. That's why there was only no-data in csv files, because there was an R error. Now, the behavior for graphs is the same as before, but csv files always contain their column headers, i.e., the no-data message will not appear. It only shows up when there is an error.

Trac:
Status: accepted to needs_review

Patch commit looks good. Merged and deployed. This solves the issue. Good bug hunt! Closing. Thanks!

Trac:
Resolution: N/A to fixed
Status: needs_review to closed

Could you remove the cached csvs from the server? metrics.torproject.org/userstats-relay-country.csv?start=2018-01-04&end=2018-04-04&country=all&events=off is still 'no-data'.

Oops. Done!

closed

mentioned in issue legacy/trac#25468 (moved)

mentioned in issue legacy/trac#25707 (moved)

mentioned in issue legacy/trac#26278 (moved)

mentioned in issue legacy/trac#28304 (moved)

moved from legacy/trac#25264 (moved)

added Feature label and removed 1 deleted label

removed 1 deleted label

Decide what graph to display when there's no data to graph

Designs

Child items ...

Activity