doc/design-paper/roadmap-future.tex +22 −89
@@ -14,8 +14,8 @@
 \begin{document}
-\title{Tor Development Roadmap: Wishlist for Nov 2006--Dec 2007}
-\author{Roger Dingledine \and Nick Mathewson \and Shava Nerad}
+\title{Tor Development Roadmap: Wishlist for 2008 and beyond}
+\author{Roger Dingledine \and Nick Mathewson}
 \maketitle
 \pagestyle{plain}
@@ -26,23 +26,13 @@
 \section{Introduction}
-%Hi, Roger! Hi, Shava. This paragraph should get deleted soon. Right now,
-%this document goes into about as much detail as I'd like to go into for a
-%technical audience, since that's the audience I know best. It doesn't have
-%time estimates everywhere. It isn't well prioritized, and it doesn't
-%distinguish well between things that need lots of research and things that
-%don't. The breakdowns don't all make sense. There are lots of things where
-%I don't make it clear how they fit into larger goals, and lots of larger
-%goals that don't break down into little things. It isn't all stuff we can do
-%for sure, and it isn't even all stuff we can do for sure in 2007. The
-%tmp\{\} macro indicates stuff I haven't said enough about. That said, here
-%plangoes...
 Tor (the software) and Tor (the overall software/network/support/document
-suite) are now experiencing all the crises of success. Over the next year,
-we're probably going to grow more in terms of users, developers, and funding
-than before. This gives us the opportunity to perform long-neglected
-maintenance tasks.
+suite) are now experiencing all the crises of success. Over the next years,
+we're probably going to grow more in terms of users, developers, and funding
+than before. This document attempts to lay out all the well-understood next
+steps that Tor needs to take. We should periodically reorganize it to reflect
+current and intended priorities.
 \section{Code and design infrastructure}
@@ -96,22 +86,6 @@ significantly.
 Sadly, many of these are patented and unavailable for us.
 \subsection{Scalability}
-\subsubsection{Improved directory efficiency}
-Right now, clients download a statement of the {\bf network status} made by
-each directory authority. We could reduce network bandwidth significantly by
-having the authorities jointly sign a statement reflecting their vote on the
-current network status. This would save clients up to 160K per hour, and
-make their view of the network more uniform. Of course, we'd need to make
-sure the voting process was secure and resilient to failures in the
-network.\plan{Must do; specify in 2006. 2 weeks to specify, 3-4 weeks to
-implement.}
-We should {\bf shorten router descriptors}, since the current format includes
-a great deal of information that's only of interest to the directory
-authorities, and not of interest to clients. We can do this by having each
-router upload a short-form and a long-form signed descriptor, and having
-clients download only the short form. Even a naive version of this would
-save about 40\% of the bandwidth currently spent by clients downloading
-descriptors.\plan{Must do; specify in 2006. 3-4 weeks.}
 We should {\bf have routers upload their descriptors even less often}, so
 that clients do not need to download replacements every 18 hours whether any
@@ -154,11 +128,12 @@ have some preliminary designs~\cite{incentives-txt,tor-challenges}, but
 need to perform some more research to make sure they would be safe and
 effective.\plan{Write a draft paper; 2 person-months.}
+(XXX we did that)
 \subsection{Portability}
 Our {\bf Windows implementation}, though much improved, continues to lag
 behind Unix and Mac OS X, especially when running as a server.
 We hope to
-merge promising patches from Mike Chiussi to address this point, and bring
+merge promising patches from Christian King to address this point, and bring
 Windows performance on par with other platforms.\plan{Do in 2007; 1.5 months
 to integrate not counting Mike's work.}
@@ -166,10 +141,6 @@ We should have {\bf better support for portable devices}, including modes of
 operation that require less RAM, and that write to disk less frequently (to
 avoid wearing out flash RAM).\plan{Optional; 2 weeks.}
-We should {\bf stop using socketpair on Windows}; instead, we can use
-in-memory structures to communicate between cpuworkers and the main thread,
-and between connections.\plan{Optional; 1 week.}
 \subsection{Performance: resource usage}
 We've been working on {\bf using less RAM}, especially on servers. This has
 paid off a lot for directory caches in the 0.1.2, which in some cases are
@@ -181,20 +152,8 @@ chunks produced with a specialized allocator.) This could potentially save
 around 25 to 50\% of the memory currently allocated for network buffers, and
 make Tor a more attractive proposition for restricted-memory environments
 like old computers, mobile devices, and the like.\plan{Do in 2007; 2-3 weeks
-plus one week measurement.}
-We should improve our {\bf bandwidth limiting}. The current system has been
-crucial in making users willing to run servers: nobody is willing to run a
-server if it might use an unbounded amount of bandwidth, especially if they
-are charged for their usage. We can make our system better by letting users
-configure bandwidth limits independently for their own traffic and traffic
-relayed for others; and by adding write limits for users running directory
-servers.\plan{Do in 2006; 2-3 weeks.}
-On many hosts, sockets are still in short supply, and will be until we can
-migrate our protocol to UDP.
-We can {\bf use fewer sockets} by making our
-self-to-self connections happen internally to the code rather than involving
-the operating system's socket implementation.\plan{Optional; 1 week.}
+plus one week measurement.} (XXX We did this, but we need to do something
+more/else.)
 \subsection{Performance: network usage}
 We know too little about how well our current path
@@ -272,39 +231,25 @@ tool.
 \subsection{Implementation: client-side and bridges-side}
-Our anticensorship design calls for some nodes to act as ``bridges'' that are
-outside a national firewall, and others inside the firewall to act as pure
-clients. This part of the design is quite clear-cut; we're probably ready to
-begin implementing it.
-To {\bf implement bridges}, we need to have servers publish themselves as
-limited-availability relays to a special bridge authority if they judge they'd
-make good servers. We will also need to help provide documentation for port
-forwarding, and an easy configuration tool for running as a bridge.
-To {\bf implement clients}, we need to provide a flexible interface to learn
-about bridges and to act on knowledge of bridges. We also need to teach them
-how to know to use bridges as their first hop, and how to fetch directory
-information from both classes of directory authority.
-Clients also need to {\bf use the encrypted directory variant} added in Tor
-0.1.2.3-alpha. This will let them retrieve directory information over Tor once
-they've got their initial bridges. We may want to get the rest of the Tor
-user base to begin using this encrypted directory variant too, to provide
-cover.
 Bridges will want to be able to {\bf listen on multiple addresses and ports}
 if they can, to give the adversary more ports to block.
 \subsection{Research: anonymity implications from becoming a bridge}
+see arma's bridge proposal; e.g. should bridge users use a second layer of
+entry guards?
 \subsection{Implementation: bridge authority}
-The design here is also reasonably clear-cut: we need to run some
+we run some
 directory authorities with a slightly modified protocol that doesn't leak
 the entire list of bridges. Thus users can learn up-to-date information
 for bridges they already know about, but they can't learn about arbitrary
 new bridges.
+we need a design for distributing the bridge authority over more than one
+server
 \subsection{Normalizing the Tor protocol on the wire}
 Additionally, we should {\bf resist content-based filters}. Though an
 adversary can't see what users are saying, some aspects of our protocol are
@@ -313,10 +258,6 @@ easy to fingerprint {\em as} Tor. We should correct this where possible.
 Look like Firefox; or look like nothing?
 Future research: investigate timing similarities with other protocols.
-\subsection{Access control for bridges}
-Design/impl: password-protecting bridges, in light of above. And/or more
-general access control.
 \subsection{Research: scanning-resistance}
 \subsection{Research/Design/Impl: how users discover bridges}
@@ -398,14 +339,6 @@ resist these attacks, or can improve our design to resist them, we should.
 unless a graduate student is interested.}
 \subsection{Implementation security}
-Right now, each Tor node stores its keys unencrypted. We should {\bf encrypt
-more Tor keys} so that Tor authorities can require a startup password. We
-should look into adding intermediary medium-term ``signing keys'' between
-identity keys and onion keys, so that a password could be required to replace
-a signing key, but not to start Tor. This would improve Tor's long-term
-security, especially in its directory authority infrastructure.\plan{Design
-this as a part of the revised ``v2.1'' directory protocol; implement it in 2007.
-3-4 weeks.}
 We should also {\bf mark RAM that holds key material as non-swappable} so that
 there is no risk of recovering key material from a hard disk
@@ -458,11 +391,11 @@ them as belonging to the same family.\plan{Do during v2.1 directory protocol
 To avoid attacks where an adversary claims good performance in order to
 attract traffic, we should {\bf have authorities measure node performance}
 (including stability and bandwidth) themselves, and not simply believe what
-they're told. Measuring stability can be done by tracking MTBF. Measuring
-bandwidth can be tricky, since it's hard to distinguish between a server with
+they're told. We also measure stability by tracking MTBF. Measuring
+bandwidth will be tricky, since it's hard to distinguish between a server with
 low capacity, and a high-capacity server with most of its capacity in
-use.\plan{Do ``Stable'' in 2007; 2-3 weeks. ``Fast'' will be harder; do it if
-we can interest a grad student.}
+use. See also Nikita's NDSS 2008 paper.\plan{Do it if we can interest a grad
+student.}
 {\bf Operating a directory authority should be easier.} We rely on authority
 operators to keep the network running well, but right now their job involves