doc/HACKING +109 −104

@@ -6,31 +6,23 @@
 the code, add features, fix bugs, etc.

 Read the README file first, so you can get familiar with the basics.

-1. The programs.
+The pieces.

-1.1. "or". This is the main program here. It functions as either a server
-or a client, depending on which config file you give it.
-
-1.2. "orkeygen". Use "orkeygen file-for-privkey file-for-pubkey" to generate
-key files for an onion router.
-
-2. The pieces.
-
-2.1. Routers. Onion routers, as far as the 'or' program is concerned,
+Routers. Onion routers, as far as the 'tor' program is concerned,
 are a bunch of data items that are loaded into the router_array when
 the program starts. Periodically it downloads a new set of routers
 from a directory server, and updates the router_array. When a new OR
 connection is started (see below), the relevant information is copied
 from the router struct to the connection struct.

-2.2. Connections. A connection is a long-standing tcp socket between
+Connections. A connection is a long-standing tcp socket between
 nodes. A connection is named based on what it's connected to -- an "OR
 connection" has an onion router on the other end, an "OP connection" has
 an onion proxy on the other end, an "exit connection" has a website or
 other server on the other end, and an "AP connection" has an application
 proxy (and thus a user) on the other end.

-2.3. Circuits. A circuit is a path over the onion routing
+Circuits. A circuit is a path over the onion routing
 network. Applications can connect to one end of the circuit, and can
 create exit connections at the other end of the circuit. AP and exit
 connections have only one circuit associated with them (and thus these
@@ -38,22 +30,18 @@
 connection types are closed when the circuit is closed), whereas
 OP and OR connections multiplex many circuits at once, and stay standing
 even when there are no circuits running over them.

-2.4. Topics. Topics are specific conversations between an AP and an exit.
-Topics are multiplexed over circuits.
+Streams. Streams are specific conversations between an AP and an exit.
+Streams are multiplexed over circuits.

-2.4. Cells. Some connections, specifically OR and OP connections, speak
+Cells. Some connections, specifically OR and OP connections, speak
 "cells". This means that data over that connection is bundled into 256
 byte packets (8 bytes of header and 248 bytes of payload). Each cell has
 a type, or "command", which indicates what it's for.

+Robustness features.

-3. Important parameters in the code.
-
-4. Robustness features.
-
-4.1. Bandwidth throttling. Each cell-speaking connection has a maximum
+[XXX no longer up to date]
+Bandwidth throttling. Each cell-speaking connection has a maximum
 bandwidth it can use, as specified in the routers.or file. Bandwidth
 throttling can occur on both the sender side and the receiving side. If
 the LinkPadding option is on, the sending side sends cells at regularly
@@ -75,7 +63,7 @@
 The bandwidth throttling uses TCP to push back when we stop reading.
 We extend it with token buckets to allow more flexibility for traffic
 bursts.

-4.2. Data congestion control. Even with the above bandwidth throttling,
+Data congestion control. Even with the above bandwidth throttling,
 we still need to worry about congestion, either accidental or intentional.
 If a lot of people make circuits into same node, and they all come out
 through the same connection, then that connection may become saturated
@@ -102,7 +90,7 @@
 already guarantee in-order delivery of each cell. Rather than trying
 to build some sort of tcp-on-tcp scheme, we implement this minimal data
 congestion control; so far it's enough.

-4.3. Router twins. In many cases when we ask for a router with a given
+Router twins. In many cases when we ask for a router with a given
 address and port, we really mean a router who knows a given key. Router
 twins are two or more routers that share the same private key. We thus
 give routers extra flexibility in choosing the next hop in the circuit: if
@@ -111,3 +99,20 @@
 some of the twins are down or slow, it can choose the more available ones.

 Currently the code tries for the primary router first, and if it's down,
 chooses the first available twin.
+
+Coding conventions:
+
+Log convention: use only these four log severities.
+
+ERR is if something fatal just happened.
+WARNING is something bad happened, but we're still running. The
+bad thing is either a bug in the code, an attack or buggy
+protocol/implementation of the remote peer, etc. The operator should
+examine the bad thing and try to correct it.
+(No error or warning messages should be expected. I expect most
+people to run on -l warning eventually. If a library function is
+currently called such that failure always means ERR, then the
+library function should log WARNING and let the caller log ERR.)
+INFO means something happened (maybe bad, maybe ok), but there's nothing
+you need to (or can) do about it.
+DEBUG is for everything louder than INFO.
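
Note on the Cells paragraph: the framing it describes (256-byte cells
split into 8 bytes of header and 248 bytes of payload, each cell carrying
a "command") can be sketched in C roughly as below. This is only an
illustration under assumptions. The names cell_sketch_t, cell_sketch_fill
and cell_sketch_command are hypothetical, and the position of the command
byte inside the header is an assumption, since the text gives only the
sizes and not the header layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CELL_LEN 256                                   /* total size, from the text */
#define CELL_HEADER_LEN 8                              /* header size, from the text */
#define CELL_PAYLOAD_LEN (CELL_LEN - CELL_HEADER_LEN)  /* 248 bytes */

/* ASSUMPTION: the text does not say where the command lives in the
 * header; byte 0 is chosen here purely for illustration. */
#define CELL_COMMAND_OFFSET 0

/* Hypothetical fixed-size cell: an opaque header plus a payload. */
typedef struct {
  uint8_t header[CELL_HEADER_LEN];
  uint8_t payload[CELL_PAYLOAD_LEN];
} cell_sketch_t;

/* Zero the cell, record the command, and copy in up to 248 bytes of data.
 * Returns 0 on success, or -1 if the data does not fit in one cell. */
static int cell_sketch_fill(cell_sketch_t *cell, uint8_t command,
                            const uint8_t *data, size_t len)
{
  assert(cell != NULL);
  if (len > CELL_PAYLOAD_LEN)
    return -1;
  memset(cell, 0, sizeof(*cell));
  cell->header[CELL_COMMAND_OFFSET] = command;
  if (len > 0)
    memcpy(cell->payload, data, len);
  return 0;
}

/* Read back the command byte that says what the cell is for. */
static uint8_t cell_sketch_command(const cell_sketch_t *cell)
{
  assert(cell != NULL);
  return cell->header[CELL_COMMAND_OFFSET];
}
```

Because every cell has the same size, a receiver in this sketch can read
exactly CELL_LEN bytes at a time and dispatch on the command without any
length prefix. Data longer than one payload would have to be split across
several cells by the caller.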
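
Note on the Router twins paragraph: the fallback it describes (try the
primary router first, and if it is down choose the first available twin)
might look roughly like the C sketch below. Everything here is
hypothetical: router_sketch_t is a simplified stand-in and not the real
router struct, key_id stands in for "shares the same private key", and
is_up stands in for whatever liveness check the code actually performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified router entry (not the real router struct). */
typedef struct {
  const char *address;
  unsigned short port;
  const char *key_id;  /* ASSUMPTION: twins are entries with equal key_id */
  int is_up;           /* ASSUMPTION: nonzero means the router is reachable */
} router_sketch_t;

/* Pick the router to use for a hop that names routers[primary].
 * Returns the index of the primary if it is up, otherwise the index of
 * the first twin (same key_id) that is up, otherwise -1. */
static int choose_router_or_twin(const router_sketch_t *routers, size_t n,
                                 size_t primary)
{
  size_t i;
  assert(routers != NULL && primary < n);
  if (routers[primary].is_up)
    return (int)primary;
  for (i = 0; i < n; i++) {
    if (i == primary)
      continue;
    if (routers[i].is_up &&
        strcmp(routers[i].key_id, routers[primary].key_id) == 0)
      return (int)i;
  }
  return -1;
}
```

The sketch always prefers the primary and takes twins in list order, as
the text says the code currently does. It does not try to pick the least
loaded twin, which the text mentions only as a possibility.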