The Tor Project / Core / Tor / Issues / #40623 (Closed)
Issue created Jun 10, 2022 by David Goulet (@dgoulet), Owner

Didn't recognize a cell, but circ stops here! Closing circuit

During the ongoing DDoS, this protocol warning log line showed up more than usual, likely due to network congestion:

        log_fn(LOG_PROTOCOL_WARN, LD_PROTOCOL,
               "Didn't recognize a cell, but circ stops here! Closing circuit. "
               "It was created %ld seconds ago.", (long)seconds_open);
      }

We looked into this to understand why, and whether it was a signal that would help us identify the root cause(s) of the DDoS.

We believe that after a circuit has been destroyed, the node emitting that log line is receiving "inflight" cells that the client sent before it received the TRUNCATED. We see many more of those right now because heavy congestion makes longer inflight times more likely.

Cause

Let's use this circuit setup as an example: C -> G -> M -> E. The node emitting the log line is M.

This means that E sent a DESTROY on the circuit. The current C-tor behavior at M is to transform the DESTROY into a TRUNCATED cell (because it is coming from ahead) and send that cell down to C. As a reminder, TRUNCATED is a relay command, so it is onion-encrypted down the circuit and only C can understand it.

C would then send a DESTROY on the circuit, which is forwarded up to M and stops there because the M->E link has already been removed for that circuit.

So the log line appears when M receives a non-DESTROY cell after sending the TRUNCATED down. Hitting that log line also means that the circuit gets closed with a "tor protocol" error reason, so there is a bit of a flurry of DESTROY cells going downward and upward on that circuit: we sent a TRUNCATED but then decided the circuit was no good because of a cell we can't relay forward.
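The sequence above can be sketched as a tiny state model of M. This is only an illustration of the behavior described in this issue, not the C-tor code: the type and function names (`cell_type_t`, `mid_circ_t`, `mid_handle_destroy_from_next_hop`, `mid_handle_cell_from_client`) are hypothetical, and the warning is modeled as a counter rather than a real `log_fn()` call.

```c
#include <stdbool.h>

/* Hypothetical, simplified cell types; not the real C-tor definitions. */
typedef enum { CELL_RELAY, CELL_DESTROY } cell_type_t;

/* Hypothetical view of one circuit as seen by the middle relay M. */
typedef struct {
  bool has_next_hop;      /* true while the M->E link exists for this circuit */
  bool closed;            /* true once M has torn the circuit down */
  int protocol_warnings;  /* stands in for the LOG_PROTOCOL_WARN line */
} mid_circ_t;

/* E sent a DESTROY: M drops the M->E link. In the behavior described
 * above, M would also send a TRUNCATED cell toward C at this point. */
static void
mid_handle_destroy_from_next_hop(mid_circ_t *circ)
{
  circ->has_next_hop = false;
}

/* A cell arrives from the client side that M does not recognize as
 * addressed to itself, so it would normally be relayed forward. */
static void
mid_handle_cell_from_client(mid_circ_t *circ, cell_type_t type)
{
  if (circ->closed)
    return;                     /* circuit is gone, nothing to do */
  if (type == CELL_DESTROY) {
    circ->closed = true;        /* normal teardown requested by C */
    return;
  }
  if (circ->has_next_hop)
    return;                     /* would be forwarded to E */
  /* "Didn't recognize a cell, but circ stops here!": an inflight cell
   * reached M after the M->E link was removed. */
  circ->protocol_warnings++;
  circ->closed = true;          /* closed with a protocol error reason */
}
```

In this model, a relay cell that arrives after `mid_handle_destroy_from_next_hop()` has run increments `protocol_warnings` and closes the circuit, which corresponds to the inflight-cell case.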

Solution

It is a bit interesting that C-tor does this, because tor-spec.txt stipulates:

Upon receiving an outgoing DESTROY cell, an OR frees resources associated with the corresponding circuit. If it's not the end of the circuit, it sends a DESTROY cell for that circuit to the next OR in the circuit. If the node is the end of the circuit, then it tears down any associated edge connections (see section 6.1).

... and says this about TRUNCATED:

To tear down part of a circuit, the OP may send a RELAY_TRUNCATE cell signaling a given OR (Stream ID zero). That OR sends a DESTROY cell to the next node in the circuit, and replies to the OP with a RELAY_TRUNCATED cell.

But we are not in that situation. The spec mentions an "outgoing DESTROY cell", and it is unclear what that means (from p_chan or n_chan?). Maybe the right behavior is that, regardless of direction, we just forward a DESTROY?

After talking to @nickm about this, it seems we have two possibly good solutions:

  1. M sends a DESTROY down the circuit instead, which gets propagated down to C. Any inflight cells would then just be ignored by M, since the circuit would no longer exist.

  2. Flag the circuit as being in a "truncated" state, which leads to ignoring any non-DESTROY cells and thus avoids that protocol warning.
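To make the two options concrete, here is a minimal sketch of how M might treat client-side cells under each one. All names (`tc_state_t`, `tc_circ_t`, `opt1_on_destroy_from_next_hop`, `opt2_on_destroy_from_next_hop`, `tc_on_cell_from_client`) are assumptions made for illustration and do not correspond to existing C-tor functions or to a settled design.

```c
/* Hypothetical, simplified types for illustration only. */
typedef enum { TC_CELL_RELAY, TC_CELL_DESTROY } tc_cell_t;
typedef enum { TC_OPEN, TC_TRUNCATED, TC_CLOSED } tc_state_t;
typedef enum { TC_FORWARD, TC_IGNORE, TC_CLOSE } tc_action_t;

typedef struct {
  tc_state_t state;
} tc_circ_t;

/* Option 1: on a DESTROY from E, M tears the circuit down and would
 * send a DESTROY (not a TRUNCATED) toward C. */
static void
opt1_on_destroy_from_next_hop(tc_circ_t *circ)
{
  circ->state = TC_CLOSED;
}

/* Option 2: on a DESTROY from E, M would still send TRUNCATED toward C
 * but remembers that the circuit is now truncated. */
static void
opt2_on_destroy_from_next_hop(tc_circ_t *circ)
{
  if (circ->state == TC_OPEN)
    circ->state = TC_TRUNCATED;
}

/* Decide what to do with a cell arriving from the client side. */
static tc_action_t
tc_on_cell_from_client(tc_circ_t *circ, tc_cell_t type)
{
  switch (circ->state) {
    case TC_OPEN:
      if (type == TC_CELL_DESTROY) {
        circ->state = TC_CLOSED;
        return TC_CLOSE;
      }
      return TC_FORWARD;
    case TC_TRUNCATED:
      if (type == TC_CELL_DESTROY) {
        circ->state = TC_CLOSED;
        return TC_CLOSE;
      }
      return TC_IGNORE;   /* inflight cell: drop without a warning */
    case TC_CLOSED:
      return TC_IGNORE;   /* circuit no longer exists */
  }
  return TC_IGNORE;
}
```

Under either option in this sketch, an inflight relay cell that reaches M after E's DESTROY yields `TC_IGNORE` instead of a protocol warning. The difference is that option 2 keeps the circuit around until C's own DESTROY arrives.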
