123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510 |
- Filename: 268-guard-selection.txt
- Title: New Guard Selection Behaviour
- Author: Isis Lovecruft, George Kadianakis, [Ola Bini]
- Created: 2015-10-28
- Status: Obsolete
- (Editorial note: this was origianlly written as a revision of
- proposal 259, but it diverges so substantially that it seemed
- better to assign it a new number for reference, so that we
- aren't always talking about "The old 259" and "the new 259". -NM)
- This proposal has been obsoleted by proposal #271.
- §1. Overview
- Tor uses entry guards to prevent an attacker who controls some
- fraction of the network from observing a fraction of every user's
- traffic. If users chose their entries and exits uniformly at
- random from the list of servers every time they build a circuit,
- then an adversary who had (k/N) of the network would deanonymize
- F=(k/N)^2 of all circuits... and after a given user had built C
- circuits, the attacker would see them at least once with
- probability 1-(1-F)^C. With large C, the attacker would get a
- sample of every user's traffic with probability 1.
- To prevent this from happening, Tor clients choose a small number of
- guard nodes (currently 3). These guard nodes are the only nodes
- that the client will connect to directly. If they are not
- compromised, the user's paths are not compromised.
- But attacks remain. Consider an attacker who can run a firewall
- between a target user and the Tor network, and make
- many of the guards they don't control appear to be unreachable.
- Or consider an attacker who can identify a user's guards, and mount
- denial-of-service attacks on them until the user picks a guard
- that the attacker controls.
- In the presence of these attacks, we can't continue to connect to
- the Tor network unconditionally. Doing so would eventually result
- in the user choosing a hostile node as their guard, and losing
- anonymity.
- This proposal outlines a new entry guard selection algorithm, which
- addresses the following concerns:
- - Heuristics and algorithms for determining how and which guard(s)
- is(/are) chosen should be kept as simple and easy to understand
- as possible.
- - Clients in censored regions or who are behind a fascist firewall
- who connect to the Tor network should not experience any
- significant disadvantage in terms of reachability or usability.
- - Tor should make a best attempt at discovering the most
- appropriate behaviour, with as little user input and
- configuration as possible.
- §2. Design
- Alice, an OP attempting to connect to the Tor network, should
- undertake the following steps to determine information about the
- local network and to select (some) appropriate entry guards. In the
- following scenario, it is assumed that Alice has already obtained a
- recent, valid, and verifiable consensus document.
- The algorithm is divided into four components such that the full
- algorithm is implemented by first invoking START, then repeatedly
- calling NEXT while adviced it SHOULD_CONTINUE and finally calling
- END. For an example usage see §A. Appendix.
- Several components of NEXT can be invoked asynchronously. SHOULD_CONTINUE
- is used for the algorithm to be able to tell the caller whether we
- consider the work done or not - this can be used to retry primary
- guards when we finally are able to connect to a guard after a long
- network outage, for example.
- This algorithm keeps track of the unreachability status for guards
- in state global to the system, so that repeated runs will not have
- to rediscover unreachability over and over again. However, this
- state does not need to be persisted permanently - it is purely an
- optimization.
- The algorithm expects several arguments to guide its behavior. These
- will be defined in §2.1.
- The goal of this algorithm is to strongly prefer connecting to the
- same guards we have connected to before, while also trying to detect
- conditions such as a network outage. The way it does this is by keeping
- track of how many guards we have exposed ourselves to, and if we have
- connected to too many we will fall back to only retrying the ones we have
- already tried. The algorithm also decides on sample set that should
- be persisted - in order to minimize the risk of an attacker forcing
- enumeration of the whole network by triggering rebuilding of
- circuits.
- §2.1. Definitions
- Bad guard: a guard is considered bad if it conforms with the function IS_BAD
- (see §G. Appendix for details).
- Dead guard: a guard is considered dead if it conforms with the function
- IS_DEAD (see §H. Appendix for details).
- Obsolete guard: a guard is considered obsolete if it conforms with the
- function IS_OBSOLETE (see §I. Appendix for details).
- Live entry guard: a guard is considered live if it conforms with the function
- IS_LIVE (see §D. Appendix for details).
- §2.1. The START algorithm
- In order to start choosing an entry guard, use the START
- algorithm. This takes four arguments that can be used to fine tune
- the workings:
- USED_GUARDS
- This is a list that contains all the guards that have been used
- before by this client. We will prioritize using guards from this
- list in order to minimize our exposure. The list is expected to
- be sorted based on priority, where the first entry will have the
- highest priority.
- SAMPLED_GUARDS
- This is a set that contains all guards that should be considered
- for connection. This set should be persisted between runs. It
- should be filled by using NEXT_BY_BANDWIDTH with GUARDS as an
- argument if it's empty, or if it contains less than SAMPLE_SET_THRESHOLD
- guards after winnowing out older guards.
- N_PRIMARY_GUARDS
- The number of guards we should consider our primary
- guards. These guards will be retried more frequently and will
- take precedence in most situations. By default the primary
- guards will be the first N_PRIMARY_GUARDS guards from USED_GUARDS.
- When the algorith is used in constrained mode (have bridges or entry
- nodes in the configuration file), this value should be 1 otherwise the
- proposed value is 3.
- DIR
- If this argument is set, we should only consider guards that can
- be directory guards. If not set, we will consider all guards.
- The primary work of START is to initialize the state machine depicted
- in §2.2. The initial state of the machine is defined by:
- GUARDS
- This is a set of all guards from the consensus. It will primarily be used
- to fill in SAMPLED_GUARDS
- FILTERED_SAMPLED
- This is a set that contains all guards that we are willing to connect to.
- It will be obtained from calling FILTER_SET with SAMPLED_GUARDS as
- argument.
- REMAINING_GUARDS
- This is a running set of the guards we have not yet tried to connect to.
- It should be initialized to be FILTERED_SAMPLED without USED_GUARDS.
- STATE
- A variable that keeps track of which state in the state
- machine we are currently in. It should be initialized to
- STATE_PRIMARY_GUARDS.
- PRIMARY_GUARDS
- This list keeps track of our primary guards. These are guards
- that we will prioritize when trying to connect, and will also
- retry more often in case of failure with other guards.
- It should be initialized by calling algorithm
- NEXT_PRIMARY_GUARD repeatedly until PRIMARY_GUARDS contains
- N_PRIMARY_GUARDS elements.
- §2.2. The NEXT algorithm
- The NEXT algorithm is composed of several different possibly flows. The
- first one is a simple state machine that can transfer between two
- different states. Every time NEXT is invoked, it will resume at the
- state where it left off previously. In the course of selecting an
- entry guard, a new consensus can arrive. When that happens we need
- to update the data structures used, but nothing else should change.
- Before jumping in to the state machine, we should first check if it
- was at least PRIMARY_GUARDS_RETRY_INTERVAL minutes since we tried
- any of the PRIMARY_GUARDS. If this is the case, and we are not in
- STATE_PRIMARY_GUARDS, we should save the previous state and set the
- state to STATE_PRIMARY_GUARDS.
- §2.2.1. The STATE_PRIMARY_GUARDS state
- Return each entry in PRIMARY_GUARDS in turn. For each entry, if the
- guard should be retried and considered suitable use it. A guard is
- considered to eligible to retry if is marked for retry or is live
- and id not bad. Also, a guard is considered to be suitable if is
- live and, if is a directory it should not be a cache.
- If all entries have been tried transition to STATE_TRY_REMAINING.
- §2.2.2. The STATE_TRY_REMAINING state
- Return each entry in USED_GUARDS that is not in PRIMARY_GUARDS in
- turn.For each entry, if a guard is found return it.
- Return each entry from REMAINING_GUARDS in turn.
- For each entry, if the guard should be retried and considered
- suitable use it and mark it as unreachable. A guard is
- considered to eligible to retry if is marked for retry or is live
- and id not bad. Also, a guard is considered to be suitable if is
- live and, if is a directory it should not be a cache.
- If no entries remain in REMAINING_GUARDS, transition to
- STATE_PRIMARY_GUARDS.
- §2.2.3. ON_NEW_CONSENSUS
- First, ensure that all guard profiles are updated with information
- about whether they were in the newest consensus or not.
- Update the bad status for all guards in USED_GUARDS and SAMPLED_GUARDS.
- Remove all dead guards from USED_GUARDS and SAMPLED_GUARDS.
- Remove all obsolete guards from USED_GUARDS and SAMPLED_GUARDS.
- §2.3. The SHOULD_CONTINUE algorithm
- This algorithm takes as an argument a boolean indicating whether the
- circuit was successfully built or not.
- After the caller have tried to build a circuit with a returned
- guard, they should invoke SHOULD_CONTINUE to understand if the
- algorithm is finished or not. SHOULD_CONTINUE will always return
- true if the circuit failed. If the circuit succeeded,
- SHOULD_CONTINUE will always return false, unless the guard that
- succeeded was the first guard to succeed after
- INTERNET_LIKELY_DOWN_INTERVAL minutes - in that case it will set the
- state to STATE_PRIMARY_GUARDS and return true.
- §2.4. The END algorithm
- The goal of this algorithm is simply to make sure that we keep track
- of successful connections made. This algorithm should be invoked
- with the guard that was used to correctly set up a circuit.
- Once invoked, this algorithm will mark the guard as used, and make
- sure it is in USED_GUARDS, by adding it at the end if it was not there.
- §2.5. Helper algorithms
- These algorithms are used in the above algorithms, but have been
- separated out here in order to make the flow clearer.
- NEXT_PRIMARY_GUARD
- - Return the first entry from USED_GUARDS that is not in
- PRIMARY_GUARDS and that is in the most recent consensus.
- - If USED_GUARDS is empty, use NEXT_BY_BANDWIDTH with
- REMAINING_GUARDS as the argument.
- NEXT_BY_BANDWIDTH
- - Takes G as an argument, which should be a set of guards to
- choose from.
- - Return a randomly select element from G, weighted by bandwidth.
- FILTER_SET
- - Takes G as an argument, which should be a set of guards to filter.
- - Filter out guards in G that don't comply with IS_LIVE (see
- §D. Appendix for details).
- - If the filtered set is smaller than MINIMUM_FILTERED_SAMPLE_SIZE and G
- is smaller than MAXIMUM_SAMPLE_SIZE_THRESHOLD, expand G and try to
- filter out again. G is expanded by adding one new guard at a time using
- NEXT_BY_BANDWIDTH with GUARDS as an argument.
- - If G is not smaller than MAXIMUM_SAMPLE_SIZE_THRESHOLD, G should not be
- expanded. Abort execution of this function by returning null and report
- an error to the user.
- §3. Consensus Parameters, & Configurable Variables
- This proposal introduces several new parameters that ideally should
- be set in the consensus but that should also be possible to
- set or override in the client configuration file. Some of these have
- proposed values, but for others more simulation and trial needs to
- happen.
- PRIMARY_GUARDS_RETRY_INTERVAL
- In order to make it more likely we connect to a primary guard,
- we would like to retry the primary guards more often than other
- types of guards. This parameter controls how many minutes should
- pass before we consider retrying primary guards again. The
- proposed value is 3.
- SAMPLE_SET_THRESHOLD
- In order to allow us to recognize completely unreachable network,
- we would like to avoid connecting to too many guards before switching
- modes. We also want to avoid exposing ourselves to too many nodes in a
- potentially hostile situation. This parameter, expressed as a
- fraction, determines the number of guards we should keep as the
- sampled set of the only guards we will consider connecting
- to. It will be used as a fraction for the sampled set.
- If we assume there are 1900 guards, a setting of 0.02
- means we will have a sample set of 38 guards.
- This limits our total exposure. Proposed value is 0.02.
- MINIMUM_FILTERED_SAMPLE_SIZE
- The minimum size of the sampled set after filtering out nodes based on
- client configuration (FILTERED_SAMPLED). Proposed value is ???.
- MAXIMUM_SAMPLE_SIZE_THRESHOLD
- In order to guarantee a minimum size of guards after filtering,
- we expand SAMPLED_GUARDS until a limit. This fraction of GUARDS will be
- used as an upper bound when expanding SAMPLED_GUARDS.
- Proposed value is 0.03.
- INTERNET_LIKELY_DOWN_INTERVAL
- The number of minutes since we started trying to find an entry
- guard before we should consider the network down and consider
- retrying primary guards before using a functioning guard
- found. Proposed value 5.
- §4. Security properties and behavior under various conditions
- Under normal conditions, this algorithm will allow us to quickly
- connect and use guards we have used before with high likelihood of
- working. Assuming the first primary guard is reachable and in the
- consensus, this algorithm will deterministically always return that
- guard.
- Under dystopic conditions (when a firewall is in place that blocks
- all ports except for potentially port 80 and 443), this algorithm
- will try to connect to 2% of all guards before switching modes to try
- dystopic guards. Currently, that means trying to connect to circa 40
- guards before getting a successful connection. If we assume a
- connection try will take maximum 10 seconds, that means it will take
- up to 6 minutes to get a working connection.
- When the network is completely down, we will try to connect to 2% of
- all guards plus 2% of all dystopic guards before realizing we are
- down. This means circa 50 guards tried assuming there are 1900 guards
- in the network.
- In terms of exposure, we will connect to a maximum of 2% of all
- guards plus 2% of all dystopic guards, or 3% of all guards,
- whichever is lower. If N is the number of guards, and k is the
- number of guards an attacker controls, that means an attacker would
- have a probability of 1-(1-(k/N)^2)^(N * 0.03) to have one of their
- guards selected before we fall back. In real terms, this means an
- attacker would need to control over 10% of all guards in order to
- have a larger than 50% chance of controlling a guard for any given client.
- In addition, since the sampled set changes slowly (the suggestion
- here is that guards in it expire every month) it is not possible for
- an attacker to force a connection to an entry guard that isn't
- already in the users sampled set.
- §A. Appendix: An example usage
- In order to clarify how this algorithm is supposed to be used, this
- pseudo code illustrates the building of a circuit:
- ESTABLISH_CIRCUIT:
- if chosen_entry_node = NULL
- if context = NULL
- context = ALGO_CHOOSE_ENTRY_GUARD_START(used_guards,
- sampled_guards=[],
- options,
- n_primary_guards=3,
- dir=false,
- guards_in_consensus)
- chosen_entry_node = ALGO_CHOOSE_ENTRY_GUARD_NEXT(context)
- if not IS_SUITABLE(chosen_entry_node)
- try another entry guard
- circuit = composeCircuit(chosen_entry_node)
- return circuit
- ON_FIRST_HOP_CALLBACK(channel):
- if !SHOULD_CONTINUE:
- ALGO_CHOOSE_ENTRY_GUARD_END(entryGuard)
- else
- chosen_entry_node = NULL
- §B. Appendix: Entry Points in Tor
- In order to clarify how this algorithm is supposed to be integrated with
- Tor, here are some entry points to trigger actions mentioned in spec:
- When establish_circuit:
- If *chosen_entry_node* doesn't exist
- If *context* exist, populate the first one as *context*
- Otherwise, use ALGO_CHOOSE_ENTRY_GUARD_START to initalize a new *context*.
- After this when we want to choose_good_entry_server, we will use
- ALGO_CHOOSE_ENTRY_GUARD_NEXT to get a candidate.
- Use chosen_entry_node to build_circuit and handle_first_hop,
- return this circuit
- When entry_guard_register_connect_status(should_continue):
- if !should_continue:
- Call ALGO_CHOOSE_ENTRY_GUARD_END(chosen_entry_node)
- else:
- Set chosen_entry_node to NULL
- When new directory_info_has_arrived:
- Do ON_NEW_CONSENSUS
- §C. Appendix: IS_SUITABLE helper function
- A guard is suitable if it satisfies all of the folowing conditions:
- - It's considered to be live, according to IS_LIVE.
- - It's a directory cache if a directory guard is requested.
- - It's not the chosen exit node.
- - It's not in the family of the chosen exit node.
- This conforms to the existing conditions in "populate_live_entry_guards()".
- §D. Appendix: IS_LIVE helper function
- A guard is considered live if it satisfies all of the folowing conditions:
- - It's not disabled because of path bias issues (path_bias_disabled).
- - It was not observed to become unusable according to the directory or
- the user configuration (bad_since).
- - It's marked for retry (can_retry) or it's been unreachable for some
- time (unreachable_since) but enough time has passed since we last tried
- to connect to it (entry_is_time_to_retry).
- - It's in our node list, meaninig it's present in the latest consensus.
- - It has a usable descriptor (either a routerdescriptor or a
- microdescriptor) unless a directory guard is requested.
- - It's a general-purpose router unless UseBridges is configured.
- - It's reachable by the configuration (fascist_firewall_allows_node).
- This conforms to the existing conditions in "entry_is_live()".
- A guard is observed to become unusable according to the directory or the
- user configuration if it satisfies any of the following conditions:
- - It's not in our node list, meaninig it's present in the latest consensus.
- - It's not currently running (is_running).
- - It's not a bridge and not a configured bridge
- (node_is_a_configured_bridge) and UseBridges is True.
- - It's not a possible guard and is not in EntryNodes and UseBridges is
- False.
- - It's in ExcludeNodes. Nevertheless this is ignored when
- loading from config.
- - It's not reachable by the configuration (fascist_firewall_allows_node).
- - It's disabled because of path bias issues (path_bias_disabled).
- This conforms to the existing conditions in "entry_guards_compute_status()".
- §E. Appendix: UseBridges and Bridges configurations
- This is mutually exclusive with EntryNodes.
- If options->UseBridges OR options->EntryNodes:
- - guards = populate_live_entry_guards() - this is the "bridge flavour" of
- IS_SUITABLE as mentioned before.
- - return node_sl_choose_by_bandwidth(guards, WEIGHT_FOR_GUARD)
- This is "choose a guard from S by bandwidth weight".
- UseBridges and Bridges must be set together. Bridges go to bridge_list (via
- bridge_add_from_config()), but how is it used?
- learned_bridge_descriptor() adds the bridge to the global entry_guards if
- UseBridges = True.
- We either keep the existing global entry_guards OR incorporate bridges in the
- proposal (remove non bridges from USED_GUARDS, and REMAINING_GUARDS = bridges?)
- If UseBridges is set as true, we need to fill the SAMPLED_GUARDS
- with bridges specified and learned from consensus.
- §F. Appendix: EntryNodes configuration
- This is mutually exclusive with Bridges.
- The global entry_guards will be updated with entries in EntryNodes
- (see entry_guards_set_from_config()).
- If EntryNodes is set, we need to fill the SAMPLED_GUARDS with
- EntryNodes specified in options.
- §G. Appendix: IS_BAD helper function
- A guard is considered bad if is not included in the newest
- consensus.
- §H. Appendix: IS_DEAD helper function
- A guard is considered dead if it's marked as bad for
- ENTRY_GUARD_REMOVE_AFTER period (30 days) unless they have been disabled
- because of path bias issues (path_bias_disabled).
- §I. Appendix: IS_OBSOLETE helper function
- A guard is considered obsolete if it was chosen by an Tor
- version we can't recognize or it was chosen more than GUARD_LIFETIME ago.
- -*- coding: utf-8 -*-
|