Consider replacing materialized views, views and/or functions with tables generated by other script(s)
The initial table schemas in this repository try to follow the Tor descriptors as closely as possible, and are created by descriptorParser.
To be able to aggregate data, perform calculations and create views for Grafana, we started adding views and functions to this repo. Because the views and functions were taking more than 0.5 seconds, we started creating materialized views (#11 (closed), !13 (merged), !24 (merged), !39, !41 (merged)), which copy the original data and are therefore faster. However, materialized views need to be refreshed (once a day is fine in this case), and because a refresh doesn't just add the new data, they need to be rebuilt again and again. When we added the materialized view refresh command to descriptorParser (), it increased the time descriptorParser needs to run every hour.
It's possible to refresh materialized views with CONCURRENTLY, without blocking concurrent queries (!13 (merged)). However, a refresh broke a whole descriptorParser run when the indexes were buggy (!24 (merged)).
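For reference, `REFRESH MATERIALIZED VIEW CONCURRENTLY` only works when the materialized view has a unique index, which is also why buggy indexes make the refresh fail. A sketch with hypothetical view/column names (not our actual views):

```sql
-- Hypothetical view name; CONCURRENTLY requires a unique index
-- that covers all rows of the materialized view.
CREATE UNIQUE INDEX IF NOT EXISTS bandwidth_daily_uidx
    ON bandwidth_daily (date, fingerprint);

-- Rebuilds the view without taking an exclusive lock, so concurrent
-- SELECTs (e.g. from Grafana) keep working during the refresh.
REFRESH MATERIALIZED VIEW CONCURRENTLY bandwidth_daily;
```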
I think we should revert refreshing the materialized views from descriptorParser. But instead of running the refreshes with triggers, which I don't currently know how to monitor for errors without affecting descriptorParser (the postgres log?, tpo/tpa/team#41224 (closed)?), we could create a simple script, launched by a systemd service, that just runs daily, can run while descriptorParser is running, and doesn't block it.
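A sketch of what that could look like as a systemd service plus timer (unit names, user and paths are made up):

```ini
# refresh-matviews.service (hypothetical name)
[Unit]
Description=Refresh metrics materialized views
After=postgresql.service

[Service]
Type=oneshot
User=metrics
ExecStart=/usr/local/bin/refresh-matviews

# refresh-matviews.timer
[Unit]
Description=Daily materialized view refresh

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

A `Type=oneshot` service driven by a timer fails visibly in `systemctl status` / the journal, which would also answer the monitoring question above.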
Another possibility, which @hiro and I have talked about but I almost forgot, is to generate the materialized views (or even tables) from a python/rust script, instead of doing it with SQL code. In some cases, creating the materialized views (or tables) that way might be faster, and it would definitely be easier for programmers.
I thought I needed a DB user with write permissions to create the views from a script that isn't in this repo, but if I understand it correctly, I could add a script in https://gitlab.torproject.org/tpo/network-health/metrics/metrics-bin/-/tree/main/metricsdb, and it would be run the way nsa and tor_fusion currently are.
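A minimal Python sketch of such a refresh script (the view names and the helper function are hypothetical; a real version would execute each statement through a driver such as psycopg, one transaction per view so a single broken index doesn't abort the rest):

```python
# Hypothetical sketch: keep the list of materialized views and the
# refresh SQL in one small script, outside of descriptorParser.

MATERIALIZED_VIEWS = ["server_status_summary", "bandwidth_daily"]  # placeholders


def refresh_statements(views, concurrently=True):
    """Return one REFRESH statement per materialized view."""
    keyword = " CONCURRENTLY" if concurrently else ""
    return [f"REFRESH MATERIALIZED VIEW{keyword} {v};" for v in views]


if __name__ == "__main__":
    # A real script would run each statement with e.g. psycopg,
    # logging failures per view instead of printing.
    for stmt in refresh_statements(MATERIALIZED_VIEWS):
        print(stmt)
```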
To keep the current DB table schema as close to the descriptors as possible and, more importantly, to avoid accidentally modifying/deleting the "original" tables created from the descriptors, I think we should use a different database. If we do this, we'd just need to add this other database to Grafana with read-only permissions.
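A sketch of the read-only setup (database and role names are placeholders, not decided yet):

```sql
-- Hypothetical names: a separate database for derived tables,
-- plus a read-only role for Grafana.
CREATE DATABASE metrics_derived;
CREATE ROLE grafana_ro LOGIN;
GRANT CONNECT ON DATABASE metrics_derived TO grafana_ro;

-- Run inside metrics_derived:
GRANT USAGE ON SCHEMA public TO grafana_ro;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO grafana_ro;
-- Make future tables readable too, without re-granting each time.
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO grafana_ro;
```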
For the deployment of this script(s), trying to keep the code self-contained, we could create a podman container that gets built and deployed by CI (I'm doing this for anon_ticket, and I think it's an idea in tpa) and gets launched by a systemd service. This would remove the need to add a script that launches a script in a different repo, metrics-bin.
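For instance, a Containerfile along these lines (base image, dependency and paths are placeholders), built and pushed by CI and then run by the systemd unit via podman:

```dockerfile
# Hypothetical Containerfile for the refresh script.
FROM docker.io/library/python:3.12-slim

# Placeholder dependency; whichever postgres driver we end up using.
RUN pip install --no-cache-dir "psycopg[binary]"

COPY refresh_matviews.py /app/refresh_matviews.py
ENTRYPOINT ["python", "/app/refresh_matviews.py"]
```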
In any case, we could take this path step by step, and try refreshing the views concurrently in a separate process first.