A Comprehensive Privacy Analysis of Tor Hidden Services

Tor is one of the most popular darknets and is known for its high level of anonymity. Although Tor’s protocol and relay security have been studied thoroughly, there has so far been no detailed analysis of the structure and privacy of the hidden web services hosted on this widely used darknet.

To address this, a group of researchers built a specialized analysis platform and used it to crawl and analyze more than 1.5 million URLs of hidden web services hosted on 7,257 onion domains. For each page, they analyzed its resources, backlinks, and redirection graph, along with the category and language distribution. According to their experiments, Tor hidden web services form a sparse yet highly connected graph, in which approximately 10% of onion sites are completely isolated.

Methods Used:

Whereas connections between web pages are an innate feature of the surface web, websites on the dark web are more isolated and ephemeral. Crawling the dark web is therefore a daunting task.

To perform the study, the researchers manually collected a seed list of URLs linked to 195,748 onion domains from 25 public directories and forums. A custom crawler then processed the seed list, both to gather data for the analysis and to extract new domains. The crawler runs in one of two modes. In “collection mode”, it retrieves all HTML and JavaScript resources of the target onion domain, subject to limits on the maximum number and depth of links it analyzes per domain. Once those limits are reached, it switches to “connectivity mode”, in which it simply crawls the remaining pages looking for links to other onion domains. In this mode the crawler does not store a copy of the resources, so it has no depth limit, but it is capped at 10,000 URLs per onion domain.
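The switch between the two modes can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the researchers' crawler: the function name `crawl_domain`, the caller-supplied `fetch_links` callback, and the collection-mode limits are hypothetical (the article does not give those limits). Only the ceiling of 10,000 URLs per domain comes from the text.

```python
from collections import deque
from urllib.parse import urlsplit

# Hypothetical collection-mode limits; the article does not state the real values.
COLLECTION_MAX_URLS = 3
COLLECTION_MAX_DEPTH = 1
# Per-domain ceiling reported in the article.
DOMAIN_URL_CEILING = 10_000


def crawl_domain(start_url, fetch_links, max_urls=COLLECTION_MAX_URLS,
                 max_depth=COLLECTION_MAX_DEPTH, ceiling=DOMAIN_URL_CEILING):
    """Breadth-first crawl of one onion domain with a collection/connectivity switch.

    fetch_links(url) is supplied by the caller and returns the absolute links
    found on that page. Returns (stored_urls, external_onion_domains).
    """
    domain = urlsplit(start_url).hostname
    queue = deque([(start_url, 0)])
    seen = {start_url}
    stored = []        # pages whose resources would be saved (collection mode)
    external = set()   # other onion domains discovered in either mode
    visited = 0

    while queue and visited < ceiling:
        url, depth = queue.popleft()
        visited += 1
        # Collection mode applies only while both limits hold. Depth never
        # decreases in a breadth-first crawl, so the switch is one-way.
        if len(stored) < max_urls and depth <= max_depth:
            stored.append(url)
        for link in fetch_links(url):
            host = urlsplit(link).hostname
            if host is None:
                continue
            if host != domain:
                if host.endswith(".onion"):
                    external.add(host)  # connectivity information
                continue
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return stored, external
```

In this sketch, pages visited after either limit is reached are still followed for links to other onion domains, but nothing from them is stored, which mirrors the behaviour described for connectivity mode.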

Results of the Comprehensive Privacy Analysis:

For the first time in the literature, this study measures the tight connection between Tor’s hidden web services and websites on the surface web. Specifically, over 20% of the onion domains analyzed in the study imported resources from surface websites. Moreover, backlinks to surface websites were much more prevalent than backlinks to onion domains.
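To make "importing resources from the surface web" concrete, the following Python sketch lists the non-onion hosts that a hidden-service page loads scripts, images, or frames from. It is an assumption-laden illustration rather than the study's tooling: the names `ResourceCollector` and `surface_web_hosts` are hypothetical, and only three tag types are inspected.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ResourceCollector(HTMLParser):
    """Collects the URLs of scripts, images and frames embedded in a page."""

    # Tags whose "src" attribute names an embedded resource (simplified list).
    RESOURCE_TAGS = {"script", "img", "iframe"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.RESOURCE_TAGS:
            return
        for name, value in attrs:
            if name == "src" and value:
                # Resolve relative and protocol-relative URLs against the page URL.
                self.resources.append(urljoin(self.base_url, value))


def surface_web_hosts(page_url, html):
    """Return the set of non-.onion hosts that the page imports resources from."""
    collector = ResourceCollector(page_url)
    collector.feed(html)
    hosts = set()
    for url in collector.resources:
        host = urlsplit(url).hostname
        if host and not host.endswith(".onion"):
            hosts.add(host)
    return hosts
```

Each host returned by `surface_web_hosts` is contacted when the page loads, which is why a third party serving such resources could observe visits to the hidden service.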

The study showed that Tor hidden web services are usually highly connected, yet their internal structure is often sparse, with a large number of strongly connected domains. Overall, 10% of Tor hidden websites have no incoming links and, surprisingly, around 98.8% of all discovered onion domains are already listed in public directories. A single public directory points to over 70% of the onion domains studied. Furthermore, the study showed that Tor hidden web services are more connected to the surface web than to other web services on the Tor network: more than 21% of all onion domains import resources, such as JavaScript code, from the surface web. Looking at these external components, the researchers found that Google alone can monitor access to approximately 14% of the Tor hidden web services analyzed in the study.
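Figures such as "no incoming links" and "a single directory points to over 70% of the domains" are simple properties of the link graph between domains. The Python sketch below shows how they could be computed from a mapping of each domain to the domains it links to. The function names and the input format are hypothetical, and the sketch makes no claim about how the researchers computed their numbers.

```python
def all_domains(links):
    """Every domain that appears as a link source or as a link target."""
    domains = set(links)
    for outgoing in links.values():
        domains.update(outgoing)
    return domains


def domains_without_incoming_links(links):
    """links maps each domain to the set of domains it links to."""
    targets = set()
    for source, outgoing in links.items():
        # Self-links do not count as incoming links from elsewhere.
        targets.update(d for d in outgoing if d != source)
    return all_domains(links) - targets


def directory_coverage(links, directory):
    """Fraction of the other domains that the given directory links to."""
    others = all_domains(links) - {directory}
    if not others:
        return 0.0
    listed = set(links.get(directory, ())) & others
    return len(listed) / len(others)
```

With a crawl stored in this form, the share of domains without incoming links is `len(domains_without_incoming_links(links)) / len(all_domains(links))`.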

The study also estimated, for the first time, the nature and prevalence of web tracking in Tor’s hidden web services. The researchers concluded that web tracking is clearly present on the dark web, though it is less prominent than on the surface web: over 40% of the scripts are used for this purpose, and 70% of those are new tracking scripts that current anti-tracking solutions do not recognize. Using a generic web tracking analyzer, the researchers found that even though script usage on Tor hidden services is much more limited than on the surface web, the percentage of scripts used for web tracking is similar on both. Over 75% of onion domains that include at least one JavaScript file perform some form of web tracking. Furthermore, the majority of the web tracking scripts included in Tor hidden web services cannot be identified by current anti-tracking techniques.

One comment

  1. From a defensive point of view there isn’t anything new in this paper, but it is a useful reminder that (a) you should only access a Tor hidden service via Tor itself and never via a third-party proxy, and (b) whatever browser you use to visit a Tor hidden service, you should use that browser /exclusively/ for browsing Tor HIDDEN services and never for anything else, including the “normal web”. This may require two separate installs of Tor on two separate computers: one for browsing hidden services and one for browsing the normal web.
