Don't insert the same analysis result multiple times
cc: @hackhard
I didn't catch this during code review, but I just noticed that the exact same analysis result is inserted into the database twice, with different fetch_completed_ids: https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/blob/master/src/captchamonitor/core/analyzer.py#L102-118
# Non tor from the FetchCompleted
analyzer_val_nt = AnalyzeCompleted(
    captcha_checker=self.captcha_checker_value,
    status_check=self.status_check_value,
    dom_analyze=self.dom_analyze_value,
    fetch_completed_id=non_tor.id,
)
# Tor from the FetchCompleted
analyzer_val_t = AnalyzeCompleted(
    captcha_checker=self.captcha_checker_value,
    status_check=self.status_check_value,
    dom_analyze=self.dom_analyze_value,
    fetch_completed_id=tor.id,
)
self.__db_session.add(analyzer_val_nt)
self.__db_session.add(analyzer_val_t)
self.__db_session.commit()
A better, less wasteful approach is to make the fetch_completed_id field an array and store both IDs in a single row.
Basically, the same situation is discussed here: https://stackoverflow.com/questions/40104502/how-to-create-a-field-with-a-list-of-foreign-keys-in-sqlalchemy
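To make the idea concrete, here is a rough sketch of storing one analysis row that points at both fetches. It is not the project's code: it uses the standard-library sqlite3 module instead of the SQLAlchemy models, the table and column names (analyze_completed, fetch_completed_ids) and the insert_analysis helper are made up for illustration, and a JSON-encoded list stands in for a real array column.

```python
import json
import sqlite3


def insert_analysis(conn, captcha_checker, status_check, dom_analyze, fetch_completed_ids):
    """Insert ONE analysis row that references every related fetch ID.

    Hypothetical helper: the IDs are stored as a JSON list in a text column,
    which stands in for an array-typed column.
    """
    cur = conn.execute(
        "INSERT INTO analyze_completed "
        "(captcha_checker, status_check, dom_analyze, fetch_completed_ids) "
        "VALUES (?, ?, ?, ?)",
        (captcha_checker, status_check, dom_analyze, json.dumps(list(fetch_completed_ids))),
    )
    conn.commit()
    return cur.lastrowid


# In-memory database with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE analyze_completed ("
    "id INTEGER PRIMARY KEY, "
    "captcha_checker INTEGER, status_check INTEGER, dom_analyze INTEGER, "
    "fetch_completed_ids TEXT NOT NULL)"
)

# One insert covers both the non-Tor fetch (id 11) and the Tor fetch (id 12).
row_id = insert_analysis(
    conn, captcha_checker=1, status_check=0, dom_analyze=1, fetch_completed_ids=[11, 12]
)

rows = conn.execute("SELECT fetch_completed_ids FROM analyze_completed").fetchall()
stored_ids = json.loads(rows[0][0])
```

One trade-off to keep in mind: IDs held in an array or list column are not checked by a foreign key constraint, which is why the linked Stack Overflow question also covers using an association table instead.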
I will implement the approach described above, but I wanted to keep a record of the explanation here.
Edited by Barkin Simsek