How Big is the Dark Web?
Common media interpretations of the size of the "dark web", such as an early 2014 Businessweek article, indicate that the dark web is very large:
"In addition to facilitating anonymous communication online, Tor is an access point to the "dark Web," vast reaches of the Internet that are intentionally kept hidden and don't show up in Google or other search engines, often because they harbor the illicit, from child porn to stolen credit card information."
This, however, appears to be misleading, and may cause fear, uncertainty and doubt (FUD) among those not generally exposed to Tor and other elements of the dark web.
One source of confusion is the definition of "dark web" versus that of "deep web". Unfortunately, some sites don't distinguish between the two. As a result the terms have become confused in general usage, leading to a drift in meaning. The definitions used in this article are:
Web: the portion of the Internet which is accessible via a web browser; the World Wide Web.
Deep web (search): information which is not registered with any search engine (definition as per the August 2001 paper The Deep Web: Surfacing Hidden Value from the Journal of Electronic Publishing). This includes information which is housed in databases and which is only viewable through dynamic pages generated when the content is requested, and information which resides behind authentication such as on private organizational networks and public networks such as Facebook.
Deep web: often used, confusingly, as a synonym for the dark web.
Dark web: that portion of the web which cannot be easily reached from the public Internet, and which usually requires specialized software to access. Examples of the dark web are the Tor network and its hidden services, the I2P network and its eepsites, and the RetroShare network.
Public Internet: the 'regular' Internet, available for all to use, and open to filtering/censorship by governments and ISPs.
Private network: a computer network which is reserved for specific purposes, e.g. company networks.
Overlay network: a computer network which is built on the top of another network. Nodes in the overlay can be thought of as being connected by virtual or logical links, each of which corresponds to a path, perhaps through many physical links, in the underlying network [Wikipedia].
Size depends on exactly how we categorize things. For the purposes of this article we're talking about the web, i.e. the portions of the Internet which are accessible from a web browser; not the entire Internet itself. So from here, bearing in mind that we're discussing the dark web and the deep web:
The deep web is claimed to be approximately 500 times larger than the public Internet, based on figures in the Surfacing Hidden Value report above and on others (e.g. About.com), although these others often only reference the paper above.
The dark web, however, is considered to be much smaller. There are likely to be 1-2000 Tor hidden services [ freehaven.net, donncha.is ], although this is very hard to establish as hidden services are, by design, hidden! There are currently approximately 3,000,000 Tor users; this provides a rough upper bound on the number of hidden services, and it is likely that few of these users actually run hidden services.
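To put those two figures in proportion, here is a minimal sketch of the arithmetic. It uses only the numbers quoted above (the upper end of the hidden-service estimate and the approximate user count), both of which are rough estimates rather than measurements. The constant and function names are illustrative assumptions, not taken from any Tor tooling.

```python
# Figures quoted in the article; both are rough estimates, not measurements.
HIDDEN_SERVICES_UPPER_ESTIMATE = 2_000   # upper end of the quoted estimate
TOR_USERS_ESTIMATE = 3_000_000           # approximate number of Tor users


def services_per_user(services: int, users: int) -> float:
    """Return the ratio of hidden services to users."""
    if users <= 0:
        raise ValueError("users must be positive")
    return services / users


# Ratio of estimated hidden services to estimated users.
ratio = services_per_user(HIDDEN_SERVICES_UPPER_ESTIMATE, TOR_USERS_ESTIMATE)

# The same comparison expressed as "one service per N users".
users_per_service = TOR_USERS_ESTIMATE // HIDDEN_SERVICES_UPPER_ESTIMATE

print(f"Hidden services per user: {ratio:.4%}")
print(f"Roughly one hidden service per {users_per_service} users")
```

Even at the upper end of the estimate, this works out to roughly one hidden service per 1,500 users, which suggests the user count is a very generous ceiling on the number of hidden services.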
If we expand the discussion to include all dark networks, the picture becomes a little less clear. P2P, VPN and VoIP networks are all overlay networks, and may be considered dark networks in the same manner as the dark web above (specialized software is required for access). Tor is not the only dark or overlay network. The sizes of other dark networks such as I2P and RetroShare are hard to gauge; however, based on popularity, it would appear reasonable to guess that the combined size of dark networks is far smaller than the deep web, and highly likely to be smaller than the public Internet.
The dark web is not really all that large. It is important, but it is not "vast reaches of the Internet", and it is certainly not as large as the deep web. Overlay or dark networks may indeed be very large, as well as being more accepted as part of 'regular' Internet technology.
Some related and positive (or at least not entirely negative) articles: