Create additional table for `extra_info_descriptor` bandwidth histories
The {read,write}-history columns in the `extra_info_descriptor` table contain JSON, which can already be queried with PostgreSQL.

Example `read_history` row:
```json
{
  "line": "read-history 2023-06-17 21:04:08 (86400 s) 6024585216,3048128512,4642625536,2614033408,2343061504",
  "historyEndMillis": 1687035848000,
  "intervalLength": 86400,
  "bandwidthValues": {
    "1686690248000": 6024585216,
    "1686776648000": 3048128512,
    "1686863048000": 4642625536,
    "1686949448000": 2614033408,
    "1687035848000": 2343061504
  }
}
```
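Querying the nested `bandwidthValues` directly is possible but verbose; a minimal sketch of the kind of expansion needed (the table and column names `extra_info_descriptor`/`read_history` are assumptions about the existing schema, and `read_history` is assumed to be `jsonb` — add a `::jsonb` cast if it is stored as `json` or text):

```sql
-- Expand bandwidthValues into one row per (timestamp, bytes) pair.
-- The JSON keys are epoch milliseconds, hence the division by 1000.
SELECT fingerprint,
       to_timestamp(kv.key::bigint / 1000) AS ts,
       kv.value::bigint                    AS read_bytes
FROM extra_info_descriptor,
     jsonb_each_text(read_history -> 'bandwidthValues') AS kv;
```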
For some queries, like the ones we're working on in tpo/network-health/team#313, it's still somewhat complex to write queries against the JSON, and I ended up creating my own local DB with a table just for the bandwidth values. This is why I thought this additional table might be helpful for other projects too.
It could also live in a different database, be created outside of this project's code from the existing tables (instead of from the descriptors), or be created in a different programming language.
For tpo/network-health/team#313, the schema I created from the read/write-history `bandwidthValues` is:
```sql
CREATE TABLE public.bw_history (
    fingerprint character varying(40) NOT NULL,
    ts timestamp without time zone NOT NULL,
    read_bytes bigint NOT NULL,
    write_bytes bigint
);

ALTER TABLE ONLY public.bw_history
    ADD CONSTRAINT bw_history_pkey PRIMARY KEY (fingerprint, ts, read_bytes);
```
Example row:

```
               fingerprint                |         ts          | read_bytes | write_bytes
------------------------------------------+---------------------+------------+-------------
 9AB08DD5D3F20BA7DDC693061B65DAF4E6B1F901 | 2023-09-15 16:02:10 |  270259200 |   259977216
```
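One way the table could be populated from the existing JSON columns (a sketch; the source table and column names `extra_info_descriptor`/`read_history` are assumptions, and `read_history` is assumed to be `jsonb`):

```sql
-- Expand each descriptor's bandwidthValues into one bw_history row
-- per interval; the JSON keys are epoch milliseconds.
INSERT INTO public.bw_history (fingerprint, ts, read_bytes)
SELECT fingerprint,
       to_timestamp(kv.key::bigint / 1000),
       kv.value::bigint
FROM extra_info_descriptor,
     jsonb_each_text(read_history -> 'bandwidthValues') AS kv
ON CONFLICT DO NOTHING;  -- histories in consecutive descriptors overlap
```

`write_bytes` would need a second pass over the write-history column, e.g. an `UPDATE ... FROM` joined on `(fingerprint, ts)`.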
But maybe other projects also need the `historyEndMillis` or `intervalLength` fields.
This issue might be related to #34.
Update: there are already bandwidth histories in the Victoria Metrics database (https://metrics-db.torproject.org). The question now is how to filter bandwidth histories by fingerprints obtained from the PostgreSQL database (metricsdb-01.torproject.org).
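Since VictoriaMetrics exposes a Prometheus-compatible querying API, one possible approach is to export the fingerprints from PostgreSQL and fold them into a single regex label matcher. A minimal sketch — the metric and label names (`bandwidth_read_bytes`, `fingerprint`) and the second fingerprint are made up; the real series names would have to be looked up first:

```shell
# Fingerprints, e.g. from: psql -At -c "SELECT DISTINCT fingerprint FROM ..."
# (the second fingerprint below is a placeholder, not a real relay)
fps='9AB08DD5D3F20BA7DDC693061B65DAF4E6B1F901
000A10D43011EA4928A35F610405F92B4433B4DC'

# Join the lines into one regex alternation for a label matcher.
selector=$(printf '%s' "$fps" | paste -sd'|' -)

# Hypothetical metric/label names, to be replaced with the real ones:
echo "bandwidth_read_bytes{fingerprint=~\"${selector}\"}"
```

The resulting expression could then be sent to the VictoriaMetrics `/api/v1/query` endpoint, e.g. with `curl --data-urlencode`.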