CAPTCHA-Monitor issues
https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues
Updated: 2021-06-07T17:20:43Z

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/6
Create an API for running the system on the user-provided websites
Updated: 2021-06-07T17:20:43Z
Author: Barkin Simsek
This part of the API aims to run certain tests on user-provided websites, so that people are not limited to the types of websites provided by the CAPTCHA Monitor.

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/17
Update CAPTCHA detection to do more than just searching for a string
Updated: 2021-07-17T05:05:48Z
Author: Barkin Simsek

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/41
Dashboard restructuring
Updated: 2021-07-17T15:40:01Z
Author: Barkin Simsek
Hi,
At this point, the infrastructure part of the project is working stably, and there is enough data (60K+ measurements) to build useful graphs. I have started working on a design document to restructure the dashboard and its graphs, and it is open to everyone's contributions and feedback. Please tell me if there is something you don't like or something that could have been designed better. Thank you in advance!
The document is available on the wiki page: https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/wikis/Dashboard-Graphs
Possibly helpful links:
* https://trac.torproject.org/projects/tor/wiki/doc/UX/MetricsRedesign
* https://metrics.torproject.org/reproducible-metrics.html
* https://material.io/design/communication/data-visualization.html#dashboards
Assignee: Barkin Simsek

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/42
Indicate/specify sample size for graphed data
Updated: 2021-05-20T06:17:54Z
Author: Matthew Finkel
Graphs like `Percentage of exit nodes received CAPTCHA by Tor Browser version` show a percentage, but it's difficult to know how many samples were taken and used in calculating the result. Adding this information will help with reasoning about changes in the graph and anomalies.

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/43
Automate adding new versions of Tor Browser to the system
Updated: 2021-04-01T19:44:21Z
Author: Barkin Simsek

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/44
Add dynamic binning
Updated: 2021-05-20T06:18:14Z
Author: Barkin Simsek

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/46
Add bootstrapping
Updated: 2021-05-20T06:18:21Z
Author: Barkin Simsek

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/56
Create a new web API
Updated: 2021-06-07T17:21:18Z
Author: Barkin Simsek
Check flask-swagger
Assignee: Barkin Simsek

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/65
Check if the website allows anonymous browsing
Updated: 2021-08-25T16:43:25Z
Author: hackhard

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/71
Rewrite try-except clauses that try multiple times with tenacity library
Updated: 2021-06-03T13:49:24Z
Author: Barkin Simsek
Assignee: Barkin Simsek

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/72
Add a class for parsing individual relay descriptors
Updated: 2021-06-03T17:58:22Z
Author: Barkin Simsek
Assignee: Barkin Simsek

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/73
Extend Collector class to download server descriptors as well
Updated: 2021-06-06T15:52:52Z
Author: Barkin Simsek
Assignee: Barkin Simsek

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/79
Validate created HAR files before pushing to the database
Updated: 2021-06-20T19:06:28Z
Author: Barkin Simsek
- https://haralyzer.readthedocs.io/en/latest/index.html
- http://www.softwareishard.com/har/viewer/

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/80
Figure out why stem controller.close() causes warning messages
Updated: 2021-06-22T15:10:52Z
Author: Barkin Simsek

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/82
Prioritize domains that were updated in an older time while updating the list
Updated: 2021-06-29T10:49:41Z
Author: Barkin Simsek

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/83
Don't insert the same analysis result multiple times
Updated: 2021-07-19T15:22:38Z
Author: Barkin Simsek
cc: @hackhard
I didn't catch this during code review but I just realized it. The same exact analysis result is inserted into the database twice with different `fetch_completed_id`s: https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/blob/master/src/captchamonitor/core/analyzer.py#L102-118
```
# Non tor from the FetchCompleted
analyzer_val_nt = AnalyzeCompleted(
    captcha_checker=self.captcha_checker_value,
    status_check=self.status_check_value,
    dom_analyze=self.dom_analyze_value,
    fetch_completed_id=non_tor.id,
)
# Tor from the FetchCompleted
analyzer_val_t = AnalyzeCompleted(
    captcha_checker=self.captcha_checker_value,
    status_check=self.status_check_value,
    dom_analyze=self.dom_analyze_value,
    fetch_completed_id=tor.id,
)
self.__db_session.add(analyzer_val_nt)
self.__db_session.add(analyzer_val_t)
self.__db_session.commit()
```
A better, less wasteful approach is to make the `fetch_completed_id` field an array and place multiple IDs there.
Basically, the same situation is discussed here: https://stackoverflow.com/questions/40104502/how-to-create-a-field-with-a-list-of-foreign-keys-in-sqlalchemy
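As a rough illustration of the idea, here is a minimal plain-Python sketch in which one analysis result references both fetches. It is not the project's SQLAlchemy model: the `AnalyzeCompleted` dataclass, the `fetch_completed_ids` field name, the `build_result()` helper, the integer field types, and the sample IDs are all assumptions made for the example. The real model would need a database-level equivalent, such as an array column or an association table.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AnalyzeCompleted:
    # Plain-Python stand-in for the database model; field types are assumed.
    captcha_checker: Optional[int]
    status_check: Optional[int]
    dom_analyze: Optional[int]
    # One result references every fetch it was derived from.
    fetch_completed_ids: List[int] = field(default_factory=list)


def build_result(captcha_checker, status_check, dom_analyze, non_tor_id, tor_id):
    """Create a single analysis result that covers both fetches (hypothetical helper)."""
    return AnalyzeCompleted(
        captcha_checker=captcha_checker,
        status_check=status_check,
        dom_analyze=dom_analyze,
        fetch_completed_ids=[non_tor_id, tor_id],
    )


# One record instead of two identical ones; the IDs are made-up sample values.
result = build_result(1, 0, 0, non_tor_id=101, tor_id=102)
print(result.fetch_completed_ids)
```

With this shape, the analysis values are stored once and only the list of fetch IDs grows.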
I will implement the method I mentioned, but I wanted to keep a record of the explanation here.
Assignee: Barkin Simsek

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/86
Implement Consensus Module
Updated: 2021-07-13T21:08:58Z
Author: hackhard
Implement the `Consensus Module`.

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/88
Metrics Draft
Updated: 2021-07-28T21:56:49Z
Author: hackhard
This issue or ticket is created to draft the metric ideas:
I plan to use Python libraries for the graphs/plots.
That means extracting data from the database, which would be the starting point for applying our graphing algorithms.
Also, before jumping straight into the graphing algorithm, I'm first looking into how well we could distribute our data plots to the website.
I'm more inclined to make an interactive website, as in zoomable/navigable, and I found the following libraries helpful:
+ [`matplotlib` + `mpld3`](https://www.freecodecamp.org/news/how-to-embed-interactive-python-visualizations-on-your-website-with-python-and-matplotlib/)
+ [`plotly`](https://plotly.com/python/interactive-html-export/)
+ [`Bokeh`](https://realpython.com/python-data-visualization-bokeh/)
Today, while I was going through [`plotly`](https://plotly.com/python/), I came across [`Dash`](https://dash.plotly.com/), a framework for building interactive web analytics applications, which would be helpful for our case.
So that just leaves us with:
### How should we display our graphs?
+ Relay information: We can display information about each individual `Relay`.
+ Website information: We can display information about each individual `Website`.
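
A minimal sketch of the data-extraction step behind both views, using only the standard library. The `measurements` table, its `relay`, `website`, and `captcha` columns, the `captcha_rates()` helper, and the sample rows are all hypothetical and do not reflect the project's actual schema. Each row of the result carries the sample count alongside the rate, so a plot can show how many measurements back each value.

```python
import sqlite3


def captcha_rates(conn, group_column):
    """Return (group, sample_count, captcha_rate) rows for one grouping column."""
    # Only allow known column names, since the name is placed into the query text.
    if group_column not in {"relay", "website"}:
        raise ValueError(f"unsupported grouping column: {group_column}")
    query = f"""
        SELECT {group_column}, COUNT(*) AS samples, AVG(captcha) AS rate
        FROM measurements
        GROUP BY {group_column}
        ORDER BY {group_column}
    """
    return conn.execute(query).fetchall()


# Hypothetical in-memory data standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (relay TEXT, website TEXT, captcha INTEGER)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [
        ("relayA", "example.com", 1),
        ("relayA", "example.org", 0),
        ("relayB", "example.com", 1),
        ("relayB", "example.org", 1),
    ],
)

by_relay = captcha_rates(conn, "relay")
by_website = captcha_rates(conn, "website")
print(by_relay)
print(by_website)
```

The resulting rows could then be handed to any of the plotting libraries listed above.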
##### I will update this frequently with more specific details.
Useful links:
+ https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/wikis/Dashboard-Graphs#weighted-captcha-rate-by-exit-probability
Assignee: hackhard

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/89
Make the analyzer module more modular
Updated: 2021-07-17T05:05:48Z
Author: hackhard
I'm raising this ticket in reply to the comments:
- https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/merge_requests/28#note_2744385
- https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/merge_requests/28#note_2744330
The progress will be mapped here:
- [ ] Modularize the captcha checks mentioned here: [L336 to L346](https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/blob/master/src/captchamonitor/core/analyzer.py#L336-346), and improve how `consensus_lite_captcha()` works overall
and the following discussions from !28 should be addressed: @woswos started a [discussion](https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/merge_requests/28#note_2744088): (+3 comments)
> Why do you have this string search here and have another function called `captcha_checker()` at the same time? Also what happened to #17? Why did you hardcode this string search here? I have many questions...
- [ ] More modularization work: https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/merge_requests/28#note_2744330
and the following discussions from !28 should be addressed: @woswos started a [discussion](https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/merge_requests/28#note_2744327): (+1 comment)
> I'm not a fan of this repetitive code design and repetitive fields in the database (captcha_checker_value, consensus_lite_captcha, dom_analyze, consensus_lite_dom). However, for the sake of time, I will merge this now.

Assignee: hackhard

https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/issues/91
Add an overall timeout option for jobs
Updated: 2021-07-17T16:39:51Z
Author: Barkin Simsek
For example, if a job takes more than 90 seconds, assume that the job has failed inside the worker.
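
One possible shape for such an overall timeout, sketched with the standard library's `concurrent.futures`. The `run_with_timeout()` helper and the sample jobs are hypothetical and are not taken from the worker code. A thread that exceeds the limit keeps running in the background here, so a real worker would probably need a separate process (or another mechanism) to actually stop the job.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_timeout(job, timeout_seconds=90.0):
    """Run job() and return (succeeded, result); a job over the limit counts as failed."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(job)
    try:
        return True, future.result(timeout=timeout_seconds)
    except FutureTimeout:
        return False, None
    finally:
        # Do not wait for a job that is still running; it cannot be killed from here.
        executor.shutdown(wait=False)


# Short durations stand in for the 90-second limit so the example runs quickly.
fast = run_with_timeout(lambda: "done", timeout_seconds=1.0)
slow = run_with_timeout(lambda: time.sleep(0.3), timeout_seconds=0.05)
print(fast, slow)
```

Exceptions raised by the job itself are not caught by this helper and propagate to the caller, which keeps timeouts distinguishable from other failures.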