|
|
|
|
|
|
|
= Net team Technical debt session 10am Oct 3.
|
|
|
|
|
|
|
|
== Notes, v1
|
|
|
|
|
|
|
|
What's frustrating:
|
|
|
|
* spaghetti dependencies
|
|
|
|
|
|
|
|
* spaghetti call graph
|
|
|
|
|
|
|
|
* Testability! It's hard to write tests for a function -- not so easy to
construct the big pile of objects needed to actually do things. We need to
mock too many things. Not enough helpers, and the ones we have aren't
reusable enough.
|
|
|
|
|
|
|
|
* Not easy to see where we might block.
|
|
|
|
|
|
|
|
|
|
|
|
Solutions for new code:
|
|
|
|
* Design for testability; design for debuggability.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Considerations:
|
|
|
|
* New code vs old code
|
|
|
|
|
|
|
|
* Hard to say whether we have a real test for something even when it's
|
|
|
|
covered.
|
|
|
|
|
|
|
|
* Should we reorganize our tests into bottom-up order, so we can make sure
that we're testing the parts before we test the code built from them?
|
|
|
|
|
|
|
|
* Can we have a quality-of-coverage metric? That is, how well covered is
this code really?
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
TODO:
|
|
|
|
|
|
|
|
* Write a checklist for reviews; include "Are there tests?" and "Does this
bug fix have a regression test?"
|
|
|
|
|
|
|
|
* Remove obsolete options more aggressively.
|
|
|
|
|
|
|
|
* Add a rule: require an explanation for upward includes? Identify and
review upward includes during code review?
|
|
|
|
|
|
|
|
* Have a counter for upward includes, and advertise it in PR status? (See
the counting sketch after this list.)
|
|
|
|
|
|
|
|
* Add rule/reminder: write patches against the oldest version they might
need to be backported to.
|
|
|
|
|
|
|
|
* Add CI to check RAM usage during chutney/stem runs, to make sure it
doesn't go up a lot?
|
|
|
|
|
|
|
|
* Make a test harness for "run inside tor" -- run tests inside a
|
|
|
|
bootstrapped client on a Chutney net? Or inside a fake-bootstrapped Tor?
|
|
|
|
|
|
|
|
* Maybe identify and clean up or remove bogus tests? Preemptively minimize
|
|
|
|
them, maybe?
|
|
|
|
|
|
|
|
* Can we have a mode to only count coverage from the tests that are meant
to test that particular code?
|
|
|
|
|
|
|
|
* Checklist: Do you check values across the valid range, and just off its
ends? Do you check success AND failure?
|
|
|
|
|
|
|
|
* Checklist: Do you check that the function actually does what it is
supposed to do? (e.g., check that sqrt(x) * sqrt(x) == x)
|
|
|
|
|
|
|
|
* Goal: make fuzz-static-testcases actually work without BUG or ERR.
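
(A rough sketch of what the upward-include counter could look like --
Python, and the layer names/ordering here are assumptions to be replaced
with the real layout:)

    #!/usr/bin/env python3
    # Hypothetical sketch: count "upward" includes, i.e. a file in a
    # lower layer including a header from a higher layer. The LAYERS
    # ordering and directory names are assumptions.
    import os, re, sys

    LAYERS = ["lib", "core", "feature", "app"]  # assumed, low -> high
    INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"')

    def layer_of(path):
        """Index of the layer a path belongs to, or None."""
        for i, name in enumerate(LAYERS):
            if ("/%s/" % name) in path or path.startswith(name + "/"):
                return i
        return None

    def count_upward_includes(srcdir):
        violations = []
        for root, _, files in os.walk(srcdir):
            for fname in files:
                if not fname.endswith((".c", ".h")):
                    continue
                path = os.path.join(root, fname)
                src_layer = layer_of(path)
                if src_layer is None:
                    continue
                with open(path, errors="replace") as f:
                    for lineno, line in enumerate(f, 1):
                        m = INCLUDE_RE.match(line)
                        if m:
                            dst = layer_of(m.group(1))
                            if dst is not None and dst > src_layer:
                                violations.append((path, lineno, m.group(1)))
        return violations

    if __name__ == "__main__":
        v = count_upward_includes(sys.argv[1] if len(sys.argv) > 1 else "src")
        print("%d upward includes" % len(v))
        for path, lineno, inc in v:
            print("%s:%d: includes %s" % (path, lineno, inc))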
|
|
|
|
|
|
|
|
|
|
|
|
Top stuff to simplify:
|
|
|
|
|
|
|
|
* Delete all the code for IPv6 DirPorts. (We don't have them.)
|
|
|
|
|
|
|
|
* Split control and config? Maybe we can do this with some kind of
subsystem-based mechanism. (x2)
|
|
|
|
- They are the biggest files
|
|
|
|
- They keep accumulating
|
|
|
|
|
|
|
|
* Separate and contain our crypto and networking state so that they can
run in separate threads without knowing about each other.
|
|
|
|
|
|
|
|
* Would be nice to reduce C-macro usage in the C/Rust dependency, and
reduce the amount of C/Rust coupling.
|
|
|
|
* Automate bindings? Look at the ticket.
|
|
|
|
* We have an automated test from Alex C.
|
|
|
|
|
|
|
|
* Re-design extend_info_t so the reachability checks are done at a more
|
|
|
|
logical point.
|
|
|
|
|
|
|
|
== Notes, v2
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Testability
|
|
|
|
|
|
|
|
- the amount of stuff you need to put together: to test cells you'd need a full circuit and a lot of set-up of test state
|
|
|
|
|
|
|
|
- lack of test functions and good abstractions; we're slowly adding these, but maybe not enough?
|
|
|
|
|
|
|
|
|
|
|
|
- solutions for new code versus solutions for existing code
|
|
|
|
|
|
|
|
- new code is easier to design for testing/debuggability
|
|
|
|
|
|
|
|
|
|
|
|
- tests are not carefully structured to invoke only the code they are testing
|
|
|
|
|
|
|
|
- example: in a test for a function that checks something about a router, router initialization gets invoked, but it isn't actually being tested thoroughly
|
|
|
|
|
|
|
|
|
|
|
|
- working out what a test failure means: it's not easy to tell which specific issue caused the failure. Would executing tests in order help with this?
|
|
|
|
|
|
|
|
|
|
|
|
- optimizing tests: reducing blocking operations
|
|
|
|
|
|
|
|
- DNS queries block
|
|
|
|
|
|
|
|
|
|
|
|
How to make progress here over the next 6 months?
|
|
|
|
|
|
|
|
- success with big code movement
|
|
|
|
|
|
|
|
- introducing new requirements for merging
|
|
|
|
|
|
|
|
- we should make it a requirement that all new code has tests; add checklists for review?
|
|
|
|
|
|
|
|
|
|
|
|
Code review checklist:
|
|
|
|
|
|
|
|
1. Is there a visible changes file?
|
|
|
|
|
|
|
|
2. If this is a new feature, are there unit tests?
|
|
|
|
|
|
|
|
3. Identify and review upward includes (and whether this functionality should belong in another layer)
|
|
|
|
|
|
|
|
- lower layers shouldn't call into upper layers
|
|
|
|
|
|
|
|
- the core and feature layers aren't enforced (layering isn't actually enforced here)
|
|
|
|
|
|
|
|
- a "no new upward includes" requirement? we can't actually do this right now, as we have to include config.c
|
|
|
|
|
|
|
|
- we should feel free to move code around in the layers right now
|
|
|
|
|
|
|
|
- should we have something that counts dependency violations?
|
|
|
|
|
|
|
|
4. Look at the _quality_ of the tests. Are they actually meaningful?
|
|
|
|
|
|
|
|
5. Test extreme inputs as well as the normal expected inputs; also test error cases (see the test sketch under "Improve test vectors?" below)
|
|
|
|
|
|
|
|
|
|
|
|
Improve test vectors?
|
|
|
|
|
|
|
|
- test that the function is doing what it is meant to do
|
|
|
|
|
|
|
|
- but also having arbitrary values is ok? (unclear)
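
(A sketch of what "check that the function actually does what it's
supposed to do" looks like as a test -- Python for illustration, our
real unit tests are C; my_isqrt is a made-up stand-in for the function
under test:)

    # Illustration: test the *property* the function is supposed to
    # satisfy, plus boundary values and the failure path, instead of
    # only comparing against fixed outputs. my_isqrt is hypothetical.
    import math

    def my_isqrt(n):
        if n < 0:
            raise ValueError("negative input")
        return int(math.sqrt(n))

    def test_my_isqrt():
        # Property check: does it do what it is meant to do?
        for n in [0, 1, 2, 3, 99, 100, 2**31 - 1]:
            r = my_isqrt(n)
            assert r * r <= n < (r + 1) * (r + 1)
        # Boundary: the low end of the valid range.
        assert my_isqrt(0) == 0
        # Failure path: check errors, not just success.
        try:
            my_isqrt(-1)
        except ValueError:
            pass
        else:
            raise AssertionError("expected ValueError for -1")

    test_my_isqrt()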
|
|
|
|
|
|
|
|
|
|
|
|
Should we have a new metric for test coverage?
|
|
|
|
|
|
|
|
- only code that is directly tested (as opposed to indirectly exercised)
|
|
|
|
|
|
|
|
- A static call graph might help with this? (rough sketch below)
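
(Rough sketch of the "direct coverage only" idea, assuming lcov-style
tracefiles; the test-to-target mapping is hypothetical and would have
to be hand-maintained or derived from a static call graph:)

    # Sketch: make a test only "count" coverage in the file it is
    # meant to test. TEST_TARGETS is a hypothetical mapping.
    TEST_TARGETS = {
        "test_router": "src/feature/relay/router.c",  # example entry
    }

    def direct_coverage(tracefile, target):
        """(covered, total) instrumented-line counts for `target` only."""
        covered = total = 0
        in_target = False
        with open(tracefile) as f:
            for line in f:
                line = line.strip()
                if line.startswith("SF:"):
                    in_target = line[3:].endswith(target)
                elif in_target and line.startswith("DA:"):
                    _, count = line[3:].split(",")[:2]
                    total += 1
                    if int(count) > 0:
                        covered += 1
                elif line == "end_of_record":
                    in_target = False
        return covered, total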
|
|
|
|
|
|
|
|
|
|
|
|
Automated PR checks:
|
|
|
|
|
|
|
|
- layer violations
|
|
|
|
|
|
|
|
- RAM usage (see the sketch below)
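
(Sketch of the RAM check, assuming psutil is available; the baseline
number is a placeholder, and a real version would compare against a
stored per-branch baseline:)

    # Sketch: track peak RSS of a process during a chutney/stem run
    # and fail if it exceeds a (placeholder) baseline.
    import sys, time, psutil

    BASELINE_BYTES = 200 * 1024 * 1024  # placeholder threshold

    def peak_rss(argv):
        proc = psutil.Popen(argv)
        peak = 0
        while proc.poll() is None:
            try:
                peak = max(peak, proc.memory_info().rss)
            except psutil.NoSuchProcess:
                break
            time.sleep(0.5)
        return peak

    if __name__ == "__main__":
        peak = peak_rss(sys.argv[1:])
        print("peak RSS: %d bytes" % peak)
        if peak > BASELINE_BYTES:
            sys.exit("RAM usage regressed past baseline")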
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
How do we make tests easier to write?
|
|
|
|
|
|
|
|
- run tests inside a bootstrapped tor: bootstrap a small chutney network and then run tests in it? - this is more like an integration test (see the sketch below)
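
(A rough shape for that harness, assuming a chutney checkout at
CHUTNEY_PATH and chutney's standard configure/start/verify/stop verbs;
the actual tests to run are a placeholder:)

    # Sketch: bootstrap a small chutney network, then run tests
    # against it. CHUTNEY_PATH and the test hook are placeholders.
    import os, subprocess

    CHUTNEY_DIR = os.environ.get("CHUTNEY_PATH", "./chutney")
    NETWORK = "networks/basic"

    def chutney(*args):
        subprocess.run(["./chutney", *args], cwd=CHUTNEY_DIR, check=True)

    def run_tests_in_network():
        chutney("configure", NETWORK)
        chutney("start", NETWORK)
        try:
            chutney("verify", NETWORK)  # network bootstrapped and working
            # ... run the real tests against the live network here ...
        finally:
            chutney("stop", NETWORK)

    if __name__ == "__main__":
        run_tests_in_network()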
|
|
|
|
|
|
|
|
|
|
|
|
Should we do any work to clean up existing tests?
|
|
|
|
|
|
|
|
- options tests (don't test what they are meant to be testing)
|
|
|
|
|
|
|
|
- Tor TLS tests were overfitted
|
|
|
|
|
|
|
|
- Config tests are overfitted as well? (they expect configs to be parsed in a particular order) - when we rewrite config, we should rewrite its tests
|
|
|
|
|
|
|
|
|
|
|
|
Should we remove bogus tests? At minimum we should shrink them so they don't pretend to test things they don't actually test (minimize their scope to only what they actually exercise).
|
|
|
|
|
|
|
|
|
|
|
|
git blame with -M/-C (detect moved and copied lines) - for looking at code that has moved as opposed to new code
|
|
|
|
|
|
|
|
git diff --color-moved will make your life much better (set diff.colorMoved in your git config to make it the default)
|
|
|
|
|
|
|
|
|
|
|
|
What are everyone's top "this should be simplified" wishes?
|
|
|
|
|
|
|
|
- delete all IPv6 dirports
|
|
|
|
|
|
|
|
- go for the biggest C files
|
|
|
|
|
|
|
|
- config.c (has bad tests) - move to subsystem-specific configuration?
|
|
|
|
|
|
|
|
- control.c (well tested by stem)
|
|
|
|
|
|
|
|
- crypto state and network code - abstract these so that we can move to multithreading
|
|
|
|
|
|
|
|
- things should be able to register for events (publish/subscribe) - good abstractions should help with this (what would the abstraction for inter-module communication be? a minimal sketch follows this list)
|
|
|
|
|
|
|
|
- Make C/Rust dependencies automatically checked: https://trac.torproject.org/projects/tor/ticket/24249
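
(On the publish/subscribe question, a minimal shape for the
abstraction -- Python for brevity, and all names here are made up:)

    # Minimal publish/subscribe sketch: modules register interest in
    # an event instead of calling each other directly.
    from collections import defaultdict

    _subscribers = defaultdict(list)

    def subscribe(event, callback):
        _subscribers[event].append(callback)

    def publish(event, payload=None):
        for callback in _subscribers[event]:
            callback(payload)

    # Example: the network layer announces an event; other modules
    # react without the network layer knowing who they are.
    subscribe("circuit_built", lambda c: print("circuit ready:", c))
    publish("circuit_built", "circ-1")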
|
|
|
|
|
|
|
|
|
|
|
|
TODO:
|
|
|
|
|
|
|
|
- make progress by deleting code and deprecating old options
|
|
|