VM for metrics data processing
I need a VM for processing historical collector data. I have been trying to do this on metricsdb-01, but it slows down both the analysis and current ingestion.
The plan is to parse historical data from the collector, from the beginning of time (Tor-speaking, of course). My idea is to ingest the data and export it in a tabular format, probably Parquet, stored on object storage and partitioned per month. Additionally, some of this data would be aggregated into tables that are copied over to metricsdb-01 to back the graphs on the metrics website. Some data about relays would also be copied over to metricsdb-01. The overall goal is to keep metricsdb-01 from growing unbounded and to leverage object storage for historical analysis.
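A minimal sketch of the per-month partitioning step, assuming a hypothetical record shape with a `published` timestamp field (the real descriptor schema may differ). In practice each group would be written as a Parquet file (e.g. with pyarrow) under a `month=` prefix on object storage; the stdlib-only grouping below just shows the partitioning logic:

```python
from collections import defaultdict
from datetime import datetime

def partition_key(published: str) -> str:
    """Derive a per-month partition key (e.g. '2014-07') from a
    timestamp such as '2014-07-25 13:40:00'."""
    return datetime.strptime(published, "%Y-%m-%d %H:%M:%S").strftime("%Y-%m")

def partition_by_month(records):
    """Group parsed records by month; each group maps to one
    object-storage partition (e.g. month=2014-07/part-0.parquet)."""
    groups = defaultdict(list)
    for rec in records:
        groups[partition_key(rec["published"])].append(rec)
    return groups

# Hypothetical sample records, only for illustration.
records = [
    {"fingerprint": "A" * 40, "published": "2014-07-25 13:40:00"},
    {"fingerprint": "B" * 40, "published": "2014-08-01 00:10:00"},
]
for month, group in sorted(partition_by_month(records).items()):
    print(month, len(group))  # → 2014-07 1, then 2014-08 1
```

Partitioning per month keeps individual Parquet files at a manageable size and lets historical queries prune by date range instead of scanning everything.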
What is needed:
- 16 CPUs
- 128 GB RAM
- 1 TB SSD disk (2 TB if possible)
- Local PostgreSQL
- A role/user that can write remotely to PostgreSQL on metricsdb-01
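For the remote-write requirement, this is a sketch of the statements an admin on metricsdb-01 might run; the role and table names are placeholders, not final:

```python
# Placeholder names for illustration only.
ROLE = "historical_writer"
TABLES = ["aggregated_stats", "relays"]

def role_setup_sql(role, tables):
    """Return CREATE ROLE / GRANT statements giving a dedicated role
    write access to the target tables on metricsdb-01."""
    stmts = [f"CREATE ROLE {role} LOGIN PASSWORD 'change-me';"]
    stmts += [f"GRANT INSERT, UPDATE ON {t} TO {role};" for t in tables]
    return stmts

for stmt in role_setup_sql(ROLE, TABLES):
    print(stmt)
```

Remote connections from the new VM would additionally need a matching `pg_hba.conf` entry on metricsdb-01 and a `listen_addresses` setting that accepts it.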