4.2. Problems tackled in this work
This work aims to improve TOR anonymity by enforcing some security mechanisms and enhancing the performance of others. The proposed improvements focus on TOR's overall security and performance while preserving and reinforcing the current features that have shown weaknesses against recent advanced attacks. The goal of this work is stated clearly: an adversary external to TOR, at any location and with any capabilities, should be unable to link or identify any TOR user, either locally or at large scale. This property is known as end-to-end unlinkability, which is defined as guaranteeing the anonymity of the source regardless of the destination’s location. Note that the anonymity targeted covers both TOR sender anonymity and sender-receiver anonymity.
Figure 19.
TOR research and improvements work tree (AlSabah et al., 2015).
Meanwhile, to preserve TOR's reputation in the future, its security design should be reviewed and adapted to emerging threats in order to achieve better security and performance. In fact, several TOR flaws and vulnerabilities, including newly emerged ones, remain untreated. The following are some of the desired properties and capabilities to be included in TOR:
a) Cryptographic and performance enhancement
Although it is considered a low-latency anonymity solution, the current TOR deployment faces performance problems that affect the quality of service as well as the usability and flexibility of the platform across as many situations as possible. Reducing delays is a priority and is closely related to the cryptography used in TOR.
b) Circuit selection security
An attacker, whatever his capabilities, should not be able to redirect a TOR client into choosing a compromised circuit (controlled Guard and Exit ORs), or to modify the packet header to change the path without being detected. The adversary should not learn the forwarding information of an uncompromised OR, the geographic positions of ORs, or the total number of hops (length) of a circuit.
c) Sessions’ and Users’ Unlinkability
An adversary should not be able to perform a timing attack to link cells (packets) from different users or sessions, even between the same set of sources and destinations.
d) Network Confusion and Diffusion
An adversary eavesdropping on multiple links in the network should not be able to correlate two or more packets and determine that they are from the same user by observing the bit patterns in the packet headers or data.
e) Node-to-Node cells’ authentication
In addition to the existing data secrecy and end-to-end integrity, inter-OR TLS authentication and directory server authentication, TOR should include a selective authentication check to identify rogue or forged cells injected into the circuit by an attacker for any purpose (misusing TOR computing power, causing DoS, injecting fake cells).
4.3. Proposed improvements and enhancements
In this work, several improvements are proposed, covering different research directions that all aim to address the design weaknesses and the security and performance issues described in Chapter 3. Security is a broad term which, in the TOR context, refers to three things: information security (confidentiality, integrity and authenticity), anonymity, and unlinkability. The information security part is tackled by proposing an improved AES mode that guarantees confidentiality and authenticity simultaneously. Anonymity and unlinkability are tackled by reviewing the TOR wrapping mechanism, circuit construction and routing within TOR, which may appear unrelated to security but are in reality the most important factors for anonymity and unlinkability. Moreover, this work tackles the performance issues in TOR and the related security flaws and vulnerabilities, as several attacks against TOR have exploited delays and poor performance (timing attacks, user clustering and path selection attacks).
4.3.1. Multi-layer encryption improvement
In this part, we assume that the working principle of AES (Advanced Encryption Standard) is well known; it is nevertheless described in detail in the appendix. The purpose of this part is to investigate the suitability of replacing the current AES implementation in CBC mode, used for cell encryption, with the authenticated-encryption OCB mode (Bogdanov et al., 2014).
OCB (offset codebook) is a mode of operation for the AES block cipher that guarantees authenticity along with the traditional confidentiality (privacy) of user data; this type of scheme is called authenticated encryption. Moreover, OCB is remarkably fast: it achieves authenticated encryption in almost the same time as encryption-only AES in CTR mode. Therefore, by adopting OCB, the user can achieve two of the three information security goals at low cost and in an optimised and secure manner, as OCB is considered a simple and resource-efficient scheme to implement in either hardware or software (Krovetz & Rogaway, 2014). In present-day cryptography, OCB addresses three major issues:
- OCB addresses the problem of authenticated encryption with associated data (AEAD),
- The nonce that OCB requires to encrypt and decrypt does not need to be random, as a counter can be used,
- OCB can encrypt data of any size without padding it to a convenient length, and therefore saves some precious computing power.
Like all other cryptosystems, TOR presents cryptographic vulnerabilities related to its components. One of these is that unauthenticated cells travel throughout the network, leaving open the possibility of forging fake cells and injecting them into the network for malicious purposes. The OR authentication mechanism remains insufficient. To introduce a node-to-node authenticity check in TOR, two options are available. The first option is to use an AES mode that provides privacy and authenticity separately (CBC, CTR and others): the data is encrypted and the associated authentication value is then computed using two different keys, so the cost of authenticated encryption is the cumulative cost of encryption plus the cost of the MAC. The second option, investigated in this work, is an integrated authenticated-encryption mode such as OCB.
a) Working principle of the OCB mode
AES-OCB uses a block cipher with a block length and a key (K) of 128 bits each. It also uses a nonce (N) of 96 bits and an associated incremented counter value (Δ). The detailed working principle of OCB is as follows (algorithm in Appendix 2):
- First, the plaintext M is divided into blocks of 128 bits each, M = M1 ... Mm. There are two cases: either the data size in bits is a multiple of 128, or there is a remainder, in which case the final partial block is padded internally by the algorithm (without lengthening the ciphertext).
- Secondly, a 128-bit checksum is calculated, Checksum = M1 ⊕ … ⊕ Mm, which is used later during the authentication process.
- Thirdly, an initialisation function “Init” is applied: the nonce N is concatenated with a 32-bit constant value to produce a 128-bit value called “Top”. Then Ktop = EK(Top) is computed and stretched to produce the 256-bit value Stretch = Ktop || (Ktop ⊕ (Ktop << 8)), where Ktop << 8 is Ktop shifted left by 8 positions with the vacated positions filled with zeros. The value Init(N), derived from Stretch, is the initial value of Δ.
- Fourthly, for each block i, Δ is incremented and XORed with Mi, the result is encrypted with AES under the key K as shown in the scheme, and the output of this encryption is XORed again with Δ to produce Ci, that is, Ci = EK(Mi ⊕ Δ) ⊕ Δ. The authenticated ciphertext is CT = C1 C2 … Cm T.
Afterwards, the 128-bit authentication value is computed by processing the associated data A: each block of A is XORed with its value of Δ, in the same way as in the encryption part, and then encrypted using the same key K. The results for all blocks are XORed together to produce the 128-bit authentication value Auth. Finally, the checksum of the initial data is XORed with Δ, encrypted using the key K, and then XORed with Auth to produce the final authentication tag T for the whole data (Krovetz & Rogaway, 2014).
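The checksum and initialisation steps above can be sketched in a few lines of Python. This is only an illustration of the bit manipulation, not an OCB implementation: the AES call that produces Ktop is not shown (the Python standard library has no AES), so `ktop` is passed in as an integer. The `bottom` parameter, a 6-bit value taken from the formatted nonce that selects where the 128-bit window starts inside Stretch, follows the published OCB3 description rather than the text above. All function names are hypothetical.

```python
MASK128 = (1 << 128) - 1  # keeps values within one 128-bit block


def xor_checksum(blocks):
    """XOR all 16-byte plaintext blocks together (Checksum = M1 xor ... xor Mm)."""
    acc = 0
    for block in blocks:
        if len(block) != 16:
            raise ValueError("this sketch only handles full 128-bit blocks")
        acc ^= int.from_bytes(block, "big")
    return acc.to_bytes(16, "big")


def stretch(ktop):
    """Stretch = Ktop || (Ktop xor (Ktop << 8)), a 256-bit value.

    The shift is a 128-bit shift: bits pushed past the top are dropped
    and the vacated low bits are zero.
    """
    shifted = (ktop << 8) & MASK128
    return (ktop << 128) | (ktop ^ shifted)


def initial_offset(ktop, bottom):
    """Initial delta: the 128 bits of Stretch starting `bottom` bits from the top."""
    if not 0 <= bottom < 64:
        raise ValueError("bottom is a 6-bit value")
    return (stretch(ktop) >> (128 - bottom)) & MASK128
```

In a real deployment these steps would come from a vetted cryptographic library; the sketch is only meant to make the data flow of the description easier to follow.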
Figure 20.
OCB mode functioning schemes (Bogdanov et al, 2014).
Decryption under OCB mode is fast and simple. Given K, N and CT, the receiver recovers the initial message M by following the normal decryption procedure. Then, the authentication tag T is recomputed and compared with the received one to determine the authenticity of the received message (Krovetz & Rogaway, 2014).
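The final comparison of the recomputed and received tags could look like the minimal sketch below. Using a constant-time comparison is an assumption on our side, chosen because timing leaks are a recurring concern in this chapter; the function name `tag_is_valid` is hypothetical.

```python
import hmac


def tag_is_valid(received_tag: bytes, recomputed_tag: bytes) -> bool:
    """Return True only if the two tags are byte-for-byte identical."""
    # compare_digest is designed not to reveal where the first mismatch occurs
    return hmac.compare_digest(received_tag, recomputed_tag)
```

If the check fails, the receiver would be expected to discard the cell rather than use any part of the decrypted message.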
Figure 21.
AES-OCB version 3 encryption and authentication operations scheme (Krovetz & Rogaway, 2014).
There are three different versions of the OCB mode. The version adopted in this work is version 3, which is the most optimised in terms of number of operations and computational power. Although the performance improvement will be relatively small, it remains crucial for TOR to adopt authenticated encryption for onion construction. In fact, in addition to the performance enhancement, implementing AES in OCB mode brings the following properties:
- The block-cipher operations are fully parallelizable and can be performed simultaneously. Thus, OCB is very efficient and suitable for hardware encryption at high network speeds.
- Its block-cipher-based construction makes it strong and more resistant to the recent timing attacks to which other modes such as CBC would be vulnerable.
- OCB is a single-key scheme: it uses the same key for encryption and authentication, which makes it more efficient in terms of memory use.
- OCB can process data of any size without requiring it to be a multiple of the block length. Moreover, no external padding function is used, which saves time and means that no bits are wasted in the ciphertext due to padding.
- The main computational function used beyond the block cipher is XOR, which is very efficient in time and power (three 128-bit XORs per block).
- OCB is well suited to memory-limited systems, as its main memory cost is the amount needed to hold the AES sub-keys.
An authenticated-encryption scheme enables two parties sharing a secret symmetric key to communicate in a manner that ensures both privacy and authenticity. The AES implementation in OCB mode is designed to be efficient in time (time consumed to perform encryption) and resources (processor and memory) in both software and hardware. In fact, the algorithm is well adapted to restricted environments requiring accuracy and pseudo-synchronisation, while providing provable security and authenticity (Krovetz & Rogaway, 2014). In this work, we start by assessing the security and performance of OCB against competing integrated authenticated-encryption modes. The use of an incremented nonce for each encryption and decryption is one of the major strengths of OCB. The nonce must be unique for each message (but not necessarily random, secret or unpredictable), and relying on a counter value ensures that each nonce is different. The uniqueness of the nonce is crucial to maintaining authenticity and privacy.
On the other hand, the schemes competing with OCB, particularly CCM and GCM, which offer integrated authentication along with encryption, will be assessed in similar testing environments during this work. Moreover, the traditional approach to authenticated encryption, which relies on composite functions (encryption followed by MAC, or MAC followed by encryption), will be assessed alongside the OCB competitors: two implementations, CBC and CTR mode each followed by MAC computation, will serve as a reference in the performance evaluation. In this work, the CBC and CTR implementations will not use separate keys; instead, the same 128-bit key is used to encrypt the plaintext and then to calculate the MAC of the resulting ciphertext. Nevertheless, as a security measure, the CBC IV (Initialisation Vector) will not be derived from the key but will be generated using a different function. The following table summarises the different modes of implementation and the candidates picked for potential adoption in place of the existing CBC/CTR, namely OCB, CCM, GCM and EAX:
Figure 22.
AES implementation modes comparison (Bogdanov et al., 2014).
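To make the composite reference approach concrete, the sketch below shows the encrypt-then-MAC composition with a single shared key, as described for the reference implementations. It is only an illustration of the composition: because the Python standard library has no AES, the keystream is a SHA-256 based counter-mode stand-in (an assumption, not the cipher evaluated in this work), the MAC is HMAC-SHA256 (also an assumption), and all function names are hypothetical.

```python
import hashlib
import hmac


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in counter-mode keystream (SHA-256 based, NOT AES)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_then_mac(key: bytes, nonce: bytes, plaintext: bytes):
    """Encrypt first, then compute the MAC over the nonce and ciphertext."""
    stream = keystream(key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    # Same key for both steps, mirroring the test setup described above.
    tag = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag


def verify_then_decrypt(key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Check the MAC before decrypting; raise ValueError on mismatch."""
    expected = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("authentication failed")
    stream = keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))
```

The two passes over the data (one for encryption, one for the MAC) are what the integrated modes aim to avoid, which is why this composition serves as the cost baseline.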
c) Cryptographic features comparison against competitors
To evaluate the features of each candidate in addition to its practical performance, this work relies on the following points to determine the suitability of the authenticated-encryption mode. The features are summarised in the following table and divided into three major parts:
Table 1.
Authenticated encryption features comparison (Krovetz & Rogaway, 2014).
Feature | CCM | GCM | OCB
Security proved | Yes | Yes | Yes
Online ability | No | Yes | Yes
Cipher requirement (block size) | 128 bits | 128 or 64 bits | 128 or 64 bits
- Provably secure: all three modes are proved secure under the assumption that the underlying block cipher (AES) is a pseudorandom permutation. As far as current cryptanalysis allows, AES is considered secure, and thus all three modes are considered secure.
- Online message processing: this feature is crucial for the suitability of a mode, as it should be able to process data without knowing the whole length in advance, because TOR has no pre-set or pre-defined data length. Moreover, this feature is highly desirable in a memory-restricted environment, which is the case of ORs. On this point, CCM fails to meet the set baseline.
- Cipher requirements: CCM is designed to work only with ciphers using a block size of 128 bits, while GCM and OCB can work with ciphers using different block sizes (64/128 bits). Nevertheless, this restriction will not affect CCM here, as the block size in TOR is 128 bits, which is in any case more efficient and better for performance.
4.3.2. The Encapsulation approach (Onion wrapping method)
TOR uses cells as the means of transporting TCP/IP data through the network to the exit OR, which is in charge of transmitting it to the destination over TCP/IP. Currently, TOR performs a multi-layer encryption-encapsulation: the initial data is placed into fixed-size cells of 512 bytes each (509 bytes for data and 3 bytes for the header) and then encrypted three times, using the keys of the three ORs constituting the routing circuit in reverse order. In other words, the whole data, along with the address of the next OR or the final destination, is encrypted three times (Figure 23).
Figure 23.
The onion multi-layer encryption approach.
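The fixed-size cell layout described above (512 bytes, of which 3 are header and 509 are data) can be sketched as follows. The split of the 3-byte header into a 2-byte circuit ID and a 1-byte command, and the zero padding of short payloads, are assumptions made for illustration; the function names are hypothetical.

```python
import struct

CELL_SIZE = 512
HEADER_SIZE = 3
PAYLOAD_SIZE = CELL_SIZE - HEADER_SIZE  # 509 bytes of data per cell


def pack_cell(circuit_id: int, command: int, data: bytes) -> bytes:
    """Build one fixed-size cell: 3-byte header followed by a padded payload."""
    if len(data) > PAYLOAD_SIZE:
        raise ValueError("payload does not fit in one cell")
    # Assumed header layout: 2-byte circuit ID, then 1-byte command.
    header = struct.pack(">HB", circuit_id, command)
    return header + data.ljust(PAYLOAD_SIZE, b"\x00")


def split_into_cells(circuit_id: int, command: int, data: bytes):
    """Cut a byte stream into as many 512-byte cells as needed."""
    return [
        pack_cell(circuit_id, command, data[i:i + PAYLOAD_SIZE])
        for i in range(0, len(data), PAYLOAD_SIZE)
    ]
```

For example, 1000 bytes of application data would occupy two cells under these assumptions, the second one mostly padding.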
This multi-layer encryption approach is the heart of the TOR system, as it allows only the ORs that are part of the circuit, and only in order, to access the information about the next OR in the circuit or the final destination of the data, thus achieving the anonymity of the sender (Figure 23). However, given the delays caused by heavyweight encryption of relatively large data, this encapsulation and wrapping mechanism has become problematic: on the one hand it slows down performance, and on the other hand it has been shown mathematically that this approach does not bring additional security to the system. In fact, in cryptography, encrypting the same data with the same function (algorithm) several times using different keys gives the same security as encrypting it once using a composite key, which is the aggregation of all the keys (not their sum, but a mathematically determined combination).
To summarise, improving the encapsulation and wrapping in TOR will improve not only user security (privacy and anonymity) but also the overall security and resilience of the network, as the current TOR relays are being exploited by several attacks (correlation and timing attacks).
d) Proposed improvement
Instead of encrypting TOR cells several times to produce an onion wrapping that only the circuit ORs can unwrap, we propose a more efficient and time-saving approach. In the first phase, the whole original data, including both the exit OR address (cell header) and the TCP/IP header and data, is fully encrypted using the AES shared-secret key of the exit node. Then, for the remaining ORs (entry OR and middle ORs), only the cell header is encrypted. In cryptography, encrypting data several times using different keys (k1, k2, …, kn) is equivalent to one encryption using a composite key K. Thus, the current TOR encryption of the whole cell several times is unnecessary and only slows down performance, as one layer of strong encryption is enough.
Figure 24.
TOR current cell multi-layers and encapsulation approach.
Moreover, the current TOR cell structure is vulnerable and should be reviewed. We propose that only the internal cell header contain the circuit ID, whereas the external ones (OR1 and OR2) should contain only the address of the next OR and the command. Given that TOR is managed locally, including this kind of information creates redundancy, which slows down operation and also leaves the circuit ID exposed to threats, especially when a fully compromised OR is part of the circuit.
The proposed approach of encapsulation and onion construction works as follows:
Figure 25.
TOR proposed cell multi-layers and encapsulation approach.
Basically, the proposed wrapping algorithm saves crucial time and computational power. In theory, two (02) rounds of encrypting 509 bytes of data are saved while preserving the same security as the existing TOR. In cryptography, encrypting the same text in several layers using different keys (of the same size) gives the same security as encrypting it only once using the composite key. As AES is considered secure and the key used is also assumed to be secure, one layer of encryption is taken to suffice.
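The saving claimed above can be checked with simple arithmetic. The sketch below assumes that each layer of the current scheme covers the whole 512-byte cell, and that the proposed scheme encrypts the whole cell once (exit OR key) plus the 3-byte header for each remaining OR; the function names are hypothetical.

```python
CELL_SIZE = 512
HEADER_SIZE = 3


def bytes_encrypted_current(hops: int = 3) -> int:
    """Current approach: every hop's layer covers the whole cell."""
    return hops * CELL_SIZE


def bytes_encrypted_proposed(hops: int = 3) -> int:
    """Proposed approach: one full-cell layer, header-only layers for the other hops."""
    return CELL_SIZE + (hops - 1) * HEADER_SIZE


def bytes_saved(hops: int = 3) -> int:
    """Bytes of encryption work avoided per cell."""
    return bytes_encrypted_current(hops) - bytes_encrypted_proposed(hops)
```

For the default three-OR circuit this gives 1536 bytes against 518 bytes, a difference of 1018 bytes, which matches the two rounds of 509 bytes mentioned above.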
4.3.3. The number of intermediate ORs
The anonymity of TOR users' online activities relies mainly on two computing technologies: cryptography and routing. TOR uses a series of ORs and makes users’ data travel through a number of hops before it reaches its final destination. By ensuring that each OR knows no more than its predecessor and its successor in the circuit, TOR hides the origin and the destination of the cells containing data and therefore guarantees users’ anonymity. Given this principle, it follows that the more ORs there are in a circuit, the better the source of the data (the client) is hidden and thus anonymous; tracing back the communications becomes very complex and in most cases practically impossible. Meanwhile, the number of ORs influences connection performance, as long circuits cause more delay (latency), and running interactive, time-sensitive applications becomes impossible. Hence, TOR developers sought the best trade-off between a secure connection that provides strong anonymity and a bearable connection latency. Following extensive research and testing, TOR developers adopted the three-OR (03) circuit length used by the current TOR deployment. When a client communicates with a server, the data is routed through three intermediate ORs before leaving the TOR network and reaching its destination. This choice is regarded as the optimal balance between the security and usability of TOR.
The appropriate circuit length has been under continuous debate, especially after the 2015 FBI attack in which, with the help of researchers, the attackers were able on several occasions to control both the Guard and the Exit ORs and thereby perform an advanced attack to de-anonymise several TOR users, including the owner of the drug website “Silk-Road-2”. As a consequence, the current short three-OR (03) circuit length is critically reviewed in this work, and several improvement strategies are considered, including increasing the circuit length and adopting new routing strategies such as the controlled exit OR, which is discussed later in this work. The intention of TOR developers behind the default three-OR circuit is to provide the best balance between security and performance. This choice includes an entry OR, an exit OR and an additional OR that aims to obfuscate the link between the entry and exit ORs, in such a way that even if an attacker is able to compromise either of these ORs, the middle OR constitutes the last layer of defence, since the attacker can only observe encrypted traffic and cannot directly deduce the identity of the user. Nevertheless, with the rapid growth of computational power and the development of cryptanalysis attacks, this defence layer loses its value, as it can be bypassed by a timing correlation attack.
a) Proposed improvement
An intuitive idea is that increasing the circuit length further would increase TOR security. Unfortunately, this has a significant impact on TOR performance and penalises the network further, since the more ORs are involved in transporting one cell, the more times the same cell is relayed in the network before reaching its destination. To determine the impact of increasing the circuit length on TOR anonymity, we will experiment with different cases in which the variable is either the number of ORs in the circuit or the routing methodology itself, such as controlled exit OR selection and selection depending on network link status.
In this work, we implemented a dynamic TOR circuit construction function that takes as input a natural number P, the length of the circuit; we then use this function to measure the impact of a longer or shorter circuit on performance. Within this function we propose different circuit-building approaches for evaluation purposes; the rebuilding function was also modified along with the initial function, following two main criteria: time interval and circuit performance. Furthermore, as TOR connections terminate in the public internet, the weakest point of attack in the circuit is the exit OR (the last OR in the circuit). We introduce the notion of a “Controlled Exit OR” in circuit construction: the algorithm responsible for choosing the ORs that take part in the circuit is adapted to choose the exit OR from a pre-defined list, which corresponds in the real TOR to the list of trusted ORs. This solution is expected to reduce considerably the success of the FBI attack against TOR regardless of the number of rogue (fake) ORs inserted into the TOR network (Steven et al., 2011).
Figure 26.
Dynamic TOR circuit length scheme.
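A circuit construction function taking the length P as input, with the exit OR drawn from a pre-defined trusted list, might be sketched as below. This is a minimal illustration under stated assumptions, not the function implemented in this work: relays are represented by plain names, the remaining positions are filled uniformly at random, and `build_circuit` and its parameters are hypothetical.

```python
import random


def build_circuit(relays, trusted_exits, length, rng=random):
    """Pick `length` distinct relays; the last one comes from `trusted_exits`."""
    if length < 2:
        raise ValueError("a circuit needs at least an entry and an exit")
    # Controlled exit: the final hop is restricted to the trusted list.
    exit_relay = rng.choice(sorted(trusted_exits))
    candidates = [r for r in relays if r != exit_relay]
    if len(candidates) < length - 1:
        raise ValueError("not enough relays for the requested length")
    # Remaining hops: distinct relays chosen at random (no relay used twice).
    return rng.sample(candidates, length - 1) + [exit_relay]
```

Calling the function with different values of `length` on the same relay set is one way to compare shorter and longer circuits in an emulation.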
4.3.4. Circuit ORs’ Selection Approach
In this work, we improve the path selection process by proposing a novel hybrid path selection algorithm that relies on varying path length, a controlled exit OR and real-time performance assessment functions. We also use OR parameters from a TOR simulation as a baseline against which to compare the efficiency of the proposed algorithm.
In the current TOR implementation, ORs for circuit construction are selected on a uniformly random basis, which aims to increase the uncertainty faced by an attacker trying to de-anonymise users by guessing the ORs used in a particular circuit. However, because of the heterogeneity in resources among ORs of different capacity (computing power, bandwidth) and the emergence of new classes of attacks, the selection algorithm was modified by the TOR development team in the second generation of the network. The following changes were introduced as exceptions in the algorithm:
- No OR should be used more than once in the same circuit,
- ORs in the same circuit should belong to different classes of the TOR network,
- Co-administered ORs receive special treatment by being marked as belonging to the same family,
- Directory Authorities assign flags to ORs based on the following parameters: performance, status, position and role.
Moreover, some important features were added following the attacks on TOR circuit selection in 2014. The OR selection algorithm was changed again so that the entry OR (guard) is selected only from a subset of ORs classified as “entry guards”, which are particularly “trusted” and continuously authenticated to check their status. The entry guard subset is a group of ORs that are constantly active and have a bandwidth of at least 250 KB/s (Dingledine et al., 2014). In practice, the implemented selection algorithm allowed three guard ORs to be assigned to a user for a period of 30 to 60 days and used in combination for all circuits. However, to limit the probability of attacks, and because of the impact on the overall homogeneity of TOR traffic, this algorithm was abandoned in favour of allowing each client only one entry OR for the same period of time (Dingledine & Mathewson, 2015).
On the other hand, the current TOR circuit selection algorithm selects the remaining ORs of the circuit (middle and exit) with a probability proportional to their available bandwidth. This choice aims to ensure that powerful ORs are chosen more often. Nevertheless, the bandwidth information on which TOR bases its decision is advertised by the ORs themselves, which leaves rogue ORs the opportunity to advertise false data in order to attract more traffic. It follows that if a capable attacker with significant resources is able to inject a large number of rogue ORs advertising the best performance, the attacker will be able to redirect a significant amount of TOR traffic through these ORs (as middle or exit ORs of the selected circuits) and can therefore perform a timing-analysis attack to identify the user, since the entry OR information is already disclosed to the middle OR (Steven et al., 2011).
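The effect of self-advertised bandwidth on selection can be illustrated with a small sketch of bandwidth-weighted selection that also applies the "no reuse" and "same family" exceptions listed earlier. The relay records (dictionaries with `name`, `family` and `bandwidth` keys) and both function names are hypothetical, and the bandwidth figures in the usage note are made up for illustration.

```python
import random


def pick_weighted(relays, exclude_names=(), exclude_families=(), rng=random):
    """Pick one relay with probability proportional to its advertised bandwidth."""
    pool = [
        r for r in relays
        if r["name"] not in exclude_names and r["family"] not in exclude_families
    ]
    weights = [r["bandwidth"] for r in pool]
    return rng.choices(pool, weights=weights, k=1)[0]


def selection_probability(relays, name):
    """Share of the total advertised bandwidth claimed by one relay."""
    total = sum(r["bandwidth"] for r in relays)
    own = next(r["bandwidth"] for r in relays if r["name"] == name)
    return own / total
```

With two honest relays advertising 100 KB/s each and one rogue relay advertising 800 KB/s, the rogue relay would be selected for a given position with probability 0.8, which is the weakness described above.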
a) Proposed improvement
Figure 27.
TOR circuit selection scheme.
The proposed improvement to the circuit selection algorithm is twofold:
First, we introduce a new sub-function in the selection algorithm that imposes the selection of the exit OR from a pre-defined subset only. This subset is completely different from the entry guard subset (no common ORs) and contains more ORs. This method, called “Controlled Exit” in this work, will be implemented in the emulation platform for test purposes. Such a method will clearly have an impact on traffic homogeneity, but the security gain is expected to outweigh the performance cost. Moreover, the proposed method will help protect TOR hidden services from being disclosed during attacks.
The second proposed improvement relates to the choice of ORs based on their advertised capacity (bandwidth). The misleading information that rogue (fake) ORs can provide during the selection process could be fatal for user security. Thus, the directory authority responsible for collecting and processing such information should not trust it, but rather evaluate or estimate the capacity of each router itself, so that ORs claiming to be faster no longer automatically have a higher selection probability. In this work, the concept of prudential processing is introduced, in which the capacities of ORs are calculated in real time, relying on the information provided by the ORs together with their historical status and credibility. This method will be implemented and tested in the emulation platform.
Figure 28.
The proposed controlled exit approach for TOR circuit selection.
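One possible reading of the prudential-processing idea is sketched below: the capacity credited to an OR blends its self-advertised bandwidth with previously observed values, weighted by a credibility score. The text does not specify a formula, so the linear blend, the parameter names and the function name `estimated_capacity` are all assumptions made for illustration.

```python
def estimated_capacity(advertised, history, credibility):
    """Blend the self-reported bandwidth with independently observed values.

    advertised  : bandwidth claimed by the OR itself (KB/s)
    history     : bandwidths previously observed for this OR (KB/s)
    credibility : trust score in [0, 1]; 0 ignores the OR's own claim
    """
    if not 0.0 <= credibility <= 1.0:
        raise ValueError("credibility must be between 0 and 1")
    if not history:
        # No observations yet: only the credible fraction of the claim counts.
        return credibility * advertised
    observed = sum(history) / len(history)
    return credibility * advertised + (1.0 - credibility) * observed
```

Under this sketch, a new OR claiming 1000 KB/s with a credibility of 0.25 and an observed history of 200 KB/s would be credited with 400 KB/s rather than its full claim.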
b) Dynamic circuit construction with traffic management
To tackle the delays and security issues related to the choice of circuit ORs, we introduce an enhanced circuit selection algorithm named Controlled Exit with Congestion-Aware OR Selection at the client side. In this work, we adopt and implement a circuit construction method proposed by Wang et al. (2012), in which the multi-criteria circuit selection algorithm relies on status and performance data as well as real-time indicators. First, TOR’s default bandwidth-weighted OR selection algorithm is used to construct circuits. Then, the proposed algorithm uses an opportunistic and active-probing function to calculate the latency value of each circuit.
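The second stage described above, choosing among already constructed circuits according to their probed latency, might look like the following sketch. How the latency is actually measured is outside the scope of this illustration: `probe_latency` is a hypothetical callable supplied by the caller, and picking the single lowest value is a simplifying assumption.

```python
def pick_least_congested(circuits, probe_latency):
    """Return the candidate circuit with the lowest probed latency, and that latency."""
    if not circuits:
        raise ValueError("no candidate circuits")
    # Pair each measurement with its index so ties resolve to the earliest circuit.
    measured = [(probe_latency(c), i) for i, c in enumerate(circuits)]
    latency, index = min(measured)
    return circuits[index], latency
```

In an emulation, `probe_latency` could simply look up a recorded round-trip time for each candidate circuit.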
4.3.5. Cells’ Multi-circuit routing (limited to three Circuits)
The multipath routing approach in TOR was previously tackled by Snader et al. (2010) in the context of improving TOR networking performance. The research simulated the download of data fragments of 1 MB each over a dedicated TOR simulation network. The files were divided into blocks of 512 bytes and routed from source to destination over multiple circuits. The proposed mechanism worked as follows: an algorithm assigns each client (OP) two different entry and middle ORs, and the mechanism uses the same exit OR for both circuits (Figure 29).
This research observed that the security and throughput of the routed traffic were significantly enhanced. However, the performance of two circuits was lower than that of a single circuit, as the chance of choosing a slow OR as part of the circuits doubles, and thus the median transfer time increased for the two circuits. Nevertheless, the research highlighted that the security enhancement of this proposal is considerable and that the risk of including a compromised OR is certainly affected when this function is used.
a) Proposed improvement
In the current TOR deployment, each client (OP) is assigned by default three different entry (guard) ORs, which are the only entry guards that this client can use for a certain duration. This research relies on this feature and implements an algorithm that generates, for each linked traffic flow (having the same destination server), three different circuits. This multipath routing is introduced to enhance security for TOR users and also for the entry and exit ORs themselves. The TOR client starts by building three different circuits using the following algorithm:
Figure 30.
The proposed TOR multi-circuit routing mechanism.
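Once three circuits exist, the cells of one traffic flow have to be spread over them and put back in order at the other end. The sketch below assumes a simple round-robin assignment with sequence numbers, which is an illustrative choice and not necessarily the scheduling used by the algorithm in the figure; the function names and circuit labels are hypothetical.

```python
from itertools import cycle


def distribute_cells(cells, circuits):
    """Assign cells to circuits in round-robin order, keeping sequence numbers."""
    plan = {c: [] for c in circuits}
    for (seq, cell), circuit in zip(enumerate(cells), cycle(circuits)):
        plan[circuit].append((seq, cell))
    return plan


def reassemble(plan):
    """Merge the per-circuit lists back into the original cell order."""
    tagged = [item for items in plan.values() for item in items]
    return [cell for _, cell in sorted(tagged, key=lambda t: t[0])]
```

With seven cells and three circuits, the first circuit carries three cells and the other two carry two each, so no single circuit observes the whole flow.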