guard-spec.txt 32 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806807808809810811812813814815816817818819820821822823824825826827828829830831832833834835836837838839840841842843844845846847848849850
  1. Tor Guard Specification
  2. Isis Lovecruft
  3. George Kadianakis
  4. Ola Bini
  5. Nick Mathewson
  6. 1. Introduction and motivation
  7. Tor uses entry guards to prevent an attacker who controls some
  8. fraction of the network from observing a fraction of every user's
  9. traffic. If users chose their entries and exits uniformly at
  10. random from the list of servers every time they build a circuit,
  11. then an adversary who had (k/N) of the network would deanonymize
  12. F=(k/N)^2 of all circuits... and after a given user had built C
  13. circuits, the attacker would see them at least once with
  14. probability 1-(1-F)^C. With large C, the attacker would get a
  15. sample of every user's traffic with probability 1.
  16. To prevent this from happening, Tor clients choose a small number
  17. of guard nodes (e.g. 3). These guard nodes are the only
  18. nodes that the client will connect to directly. If they are not
  19. compromised, the user's paths are not compromised.
  20. This specification outlines Tor's guard selection algorithm,
  21. which tries to meet the following goals:
  22. - Heuristics and algorithms for determining how and which guards
  23. are chosen should be kept as simple and easy to understand as
  24. possible.
  25. - Clients in censored regions or who are behind a fascist
  26. firewall who connect to the Tor network should not experience
  27. any significant disadvantage in terms of reachability or
  28. usability.
  29. - Tor should make a best attempt at discovering the most
  30. appropriate behavior, with as little user input and
  31. configuration as possible.
  32. - Tor clients should discover usable guards without too much
  33. delay.
  34. - Tor clients should resist (to the extent possible) attacks
  35. that try to force them onto compromised guards.
  36. 2. State instances
  37. In the algorithm below, we describe a set of persistent and
  38. non-persistent state variables. These variables should be
  39. treated as an object, of which multiple instances can exist.
  40. In particular, we specify the use of three particular instances:
  41. A. UseBridges
  42. If UseBridges is set, then we replace the {GUARDS} set in
  43. [Sec:GUARDS] below with the list of configured
  44. bridges. We maintain a separate persistent instance of
  45. {SAMPLED_GUARDS} and {CONFIRMED_GUARDS} and other derived
  46. values for the UseBridges case.
  47. In this case, we impose no upper limit on the sample size.
  48. B. EntryNodes / ExcludeNodes / Reachable*Addresses /
  49. FascistFirewall / ClientUseIPv4=0
  50. If one of the above options is set, and UseBridges is not,
  51. then we compare the fraction of usable guards in the consensus
  52. to the total number of guards in the consensus.
  53. If this fraction is less than {MEANINGFUL_RESTRICTION_FRAC},
  54. we use a separate instance of the state.
  55. (While Tor is running, we do not change back and forth between
  56. the separate instance of the state and the default instance
  57. unless the fraction of usable guards is 5% higher than, or 5%
  58. lower than, {MEANINGFUL_RESTRICTION_FRAC}. This prevents us
  59. from flapping back and forth between instances if we happen to
  60. hit {MEANINGFUL_RESTRICTION_FRAC} exactly.
  61. If this fraction is less than {EXTREME_RESTRICTION_FRAC}, we use a
  62. separate instance of the state, and warn the user.
  63. [TODO: should we have a different instance for each set of heavily
  64. restricted options?]
  65. C. Default
  66. If neither of the above variant-state instances is used,
  67. we use a default instance.
  68. 3. Circuit Creation, Entry Guard Selection (1000 foot view)
  69. A circuit in Tor is a path through the network connecting a client to
  70. its destination. At a high-level, a three-hop exit circuit will look
  71. like this:
  72. Client <-> Entry Guard <-> Middle Node <-> Exit Node <-> Destination
  73. Entry guards are the only nodes which a client will connect to
  74. directly. Exit relays are the nodes by which traffic exits the
  75. Tor network in order to connect to an external destination.
  76. 3.1 Path selection
  77. For any multi-hop circuit, at least one entry guard and middle node(s) are
  78. required. An exit node is required if traffic will exit the Tor
  79. network. Depending on its configuration, a relay listed in a
  80. consensus could be used for any of these roles. However, this
  81. specification defines how entry guards specifically should be selected and
  82. managed, as opposed to middle or exit nodes.
  83. 3.1.1 Entry guard selection
  84. At a high level, a relay listed in a consensus will move through the
  85. following states in the process from initial selection to eventual
  86. usage as an entry guard:
  87. relays listed in consensus
  88. |
  89. sampled
  90. | |
  91. confirmed filtered
  92. | | |
  93. primary usable_filtered
  94. Relays listed in the latest consensus can be sampled for guard usage
  95. if they have the "Guard" flag. Sampling is random but weighted by
  96. bandwidth.
  97. Once a path is built and a circuit established using this guard, it
  98. is marked as confirmed. Until this point, guards are first sampled
  99. and then filtered based on information such as our current
  100. configuration (see SAMPLED and FILTERED sections) and later marked as
  101. usable_filtered if the guard is not primary but can be reached.
  102. It is always preferable to use a primary guard when building a new
  103. circuit in order to reduce guard churn; only on failure to connect to
  104. existing primary guards will new guards be used.
  105. 3.1.2 Middle and exit node selection
  106. Middle nodes are selected at random from relays listed in the
  107. latest consensus, weighted by bandwidth. Exit nodes are chosen
  108. similarly but restricted to relays with a sufficiently permissive
  109. exit policy.
  110. 3.2 Circuit Building
  111. Once a path is chosen, Tor will use this path to build a new circuit.
  112. If the circuit is built successfully, Tor will either use it
  113. immediately, or Tor will wait for a circuit with a more preferred
  114. guard if there's a good chance that it will be able to make one.
  115. If the circuit fails in a way that makes us conclude that a guard
  116. is not reachable, the guard is marked as unreachable, the circuit is
  117. closed, and waiting circuits are updated.
  118. 4. The algorithm.
  119. 4.0. The guards listed in the current consensus. [Section:GUARDS]
  120. By {set:GUARDS} we mean the set of all guards in the current
  121. consensus that are usable for all circuits and directory
  122. requests. (They must have the flags: Stable, Fast, V2Dir, Guard.)
  123. **Rationale**
  124. We require all guards to have the flags that we potentially need
  125. from any guard, so that all guards are usable for all circuits.
  126. 4.1. The Sampled Guard Set. [Section:SAMPLED]
  127. We maintain a set, {set:SAMPLED_GUARDS}, that persists across
  128. invocations of Tor. It is an unordered subset of the nodes that
  129. we have seen listed as a guard in the consensus at some point.
  130. For each such guard, we record persistently:
  131. - {pvar:ADDED_ON_DATE}: The date on which it was added to
  132. sampled_guards.
  133. We set this value to a point in the past, using
  134. RAND(now, {GUARD_LIFETIME}/10). See
  135. Appendix [RANDOM] below.
  136. - {pvar:ADDED_BY_VERSION}: The version of Tor that added it to
  137. sampled_guards.
  138. - {pvar:IS_LISTED}: Whether it was listed as a usable Guard in
  139. the _most recent_ consensus we have seen.
  140. - {pvar:FIRST_UNLISTED_AT}: If IS_LISTED is false, the publication date
  141. of the earliest consensus in which this guard was listed such that we
  142. have not seen it listed in any later consensus. Otherwise "None."
  143. We randomize this to a point in the past, based on
  144. RAND(added_at_time, {REMOVE_UNLISTED_GUARDS_AFTER} / 5)
  145. For each guard in {SAMPLED_GUARDS}, we also record this data,
  146. non-persistently:
  147. - {tvar:last_tried_connect}: A 'last tried to connect at'
  148. time. Default 'never'.
  149. - {tvar:is_reachable}: an "is reachable" tristate, with
  150. possible values { <state:yes>, <state:no>, <state:maybe> }.
  151. Default '<maybe>.'
  152. [Note: "yes" is not strictly necessary, but I'm
  153. making it distinct from "maybe" anyway, to make our
  154. logic clearer. A guard is "maybe" reachable if it's
  155. worth trying. A guard is "yes" reachable if we tried
  156. it and succeeded.]
  157. - {tvar:failing_since}: The first time when we failed to
  158. connect to this guard. Defaults to "never". Reset to
  159. "never" when we successfully connect to this guard.
  160. - {tvar:is_pending} A "pending" flag. This indicates that we
  161. are trying to build an exploratory circuit through the
  162. guard, and we don't know whether it will succeed.
  163. We require that {SAMPLED_GUARDS} contain at least
  164. {MIN_FILTERED_SAMPLE} guards from the consensus (if possible),
  165. but not more than {MAX_SAMPLE_THRESHOLD} of the number of guards
  166. in the consensus, and not more than {MAX_SAMPLE_SIZE} in total.
  167. (But if the maximum would be smaller than {MIN_FILTERED_SAMPLE}, we
  168. set the maximum at {MIN_FILTERED_SAMPLE}.)
  169. To add a new guard to {SAMPLED_GUARDS}, pick an entry at random
  170. from ({GUARDS} - {SAMPLED_GUARDS}), weighted by bandwidth.
  171. We remove an entry from {SAMPLED_GUARDS} if:
  172. * We have a live consensus, and {IS_LISTED} is false, and
  173. {FIRST_UNLISTED_AT} is over {REMOVE_UNLISTED_GUARDS_AFTER}
  174. days in the past.
  175. OR
  176. * We have a live consensus, and {ADDED_ON_DATE} is over
  177. {GUARD_LIFETIME} ago, *and* {CONFIRMED_ON_DATE} is either
  178. "never", or over {GUARD_CONFIRMED_MIN_LIFETIME} ago.
  179. Note that {SAMPLED_GUARDS} does not depend on our configuration.
  180. It is possible that we can't actually connect to any of these
  181. guards.
  182. **Rationale**
  183. The {SAMPLED_GUARDS} set is meant to limit the total number of
  184. guards that a client will connect to in a given period. The
  185. upper limit on its size prevents us from considering too many
  186. guards.
  187. The first expiration mechanism is there so that our
  188. {SAMPLED_GUARDS} list does not accumulate so many dead
  189. guards that we cannot add new ones.
  190. The second expiration mechanism makes us rotate our guards slowly
  191. over time.
  192. 4.2. The Usable Sample [Section:FILTERED]
  193. We maintain another set, {set:FILTERED_GUARDS}, that does not
  194. persist. It is derived from:
  195. - {SAMPLED_GUARDS}
  196. - our current configuration,
  197. - the path bias information.
  198. A guard is a member of {set:FILTERED_GUARDS} if and only if all
  199. of the following are true:
  200. - It is a member of {SAMPLED_GUARDS}, with {IS_LISTED} set to
  201. true.
  202. - It is not disabled because of path bias issues.
  203. - It is not disabled because of ReachableAddresses policy,
  204. the ClientUseIPv4 setting, the ClientUseIPv6 setting,
  205. the FascistFirewall setting, or some other
  206. option that prevents using some addresses.
  207. - It is not disabled because of ExcludeNodes.
  208. - It is a bridge if UseBridges is true; or it is not a
  209. bridge if UseBridges is false.
  210. - Is included in EntryNodes if EntryNodes is set and
  211. UseBridges is not. (But see 2.B above).
  212. We have an additional subset, {set:USABLE_FILTERED_GUARDS}, which
  213. is defined to be the subset of {FILTERED_GUARDS} where
  214. {is_reachable} is <yes> or <maybe>.
  215. We try to maintain a requirement that {USABLE_FILTERED_GUARDS}
  216. contain at least {MIN_FILTERED_SAMPLE} elements:
  217. Whenever we are going to sample from {USABLE_FILTERED_GUARDS},
  218. and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we
  219. add new elements to {SAMPLED_GUARDS} until one of the following
  220. is true:
  221. * {USABLE_FILTERED_GUARDS} is large enough,
  222. OR
  223. * {SAMPLED_GUARDS} is at its maximum size.
  224. ** Rationale **
  225. These filters are applied _after_ sampling: if we applied them
  226. before the sampling, then our sample would reflect the set of
  227. filtering restrictions that we had in the past.
  228. 4.3. The confirmed-guard list. [Section:CONFIRMED]
  229. [formerly USED_GUARDS]
  230. We maintain a persistent ordered list, {list:CONFIRMED_GUARDS}.
  231. It contains guards that we have used before, in our preference
  232. order of using them. It is a subset of {SAMPLED_GUARDS}. For
  233. each guard in this list, we store persistently:
  234. - {pvar:IDENTITY} Its fingerprint.
  235. - {pvar:CONFIRMED_ON_DATE} When we added this guard to
  236. {CONFIRMED_GUARDS}.
  237. Randomized to a point in the past as RAND(now, {GUARD_LIFETIME}/10).
  238. We append new members to {CONFIRMED_GUARDS} when we mark a circuit
  239. built through a guard as "for user traffic."
  240. Whenever we remove a member from {SAMPLED_GUARDS}, we also remove
  241. it from {CONFIRMED_GUARDS}.
  242. [Note: You can also regard the {CONFIRMED_GUARDS} list as a
  243. total ordering defined over a subset of {SAMPLED_GUARDS}.]
  244. Definition: we call Guard A "higher priority" than another Guard B
  245. if, when A and B are both reachable, we would rather use A. We
  246. define priority as follows:
  247. * Every guard in {CONFIRMED_GUARDS} has a higher priority
  248. than every guard not in {CONFIRMED_GUARDS}.
  249. * Among guards in {CONFIRMED_GUARDS}, the one appearing earlier
  250. on the {CONFIRMED_GUARDS} list has a higher priority.
  251. * Among guards that do not appear in {CONFIRMED_GUARDS},
  252. {is_pending}==true guards have higher priority.
  253. * Among those, the guard with earlier {last_tried_connect} time
  254. has higher priority.
  255. * Finally, among guards that do not appear in
  256. {CONFIRMED_GUARDS} with {is_pending==false}, all have equal
  257. priority.
  258. ** Rationale **
  259. We add elements to this ordering when we have actually used them
  260. for building a usable circuit. We could mark them at some other
  261. time (such as when we attempt to connect to them, or when we
  262. actually connect to them), but this approach keeps us from
  263. committing to a guard before we actually use it for sensitive
  264. traffic.
  265. 4.4. The Primary guards [Section:PRIMARY]
  266. We keep a run-time non-persistent ordered list of
  267. {list:PRIMARY_GUARDS}. It is a subset of {FILTERED_GUARDS}. It
  268. contains {N_PRIMARY_GUARDS} elements.
  269. To compute primary guards, take the ordered intersection of
  270. {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first
  271. {N_PRIMARY_GUARDS} elements. If there are fewer than
  272. {N_PRIMARY_GUARDS} elements, append additional elements to
  273. PRIMARY_GUARDS chosen _uniformly_ at random from
  274. ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}).
  275. Once an element has been added to {PRIMARY_GUARDS}, we do not remove it
  276. until it is replaced by some element from {CONFIRMED_GUARDS}. Confirmed
  277. elements always precede unconfirmed ones in the {PRIMARY_GUARDS} list.
  278. Note that {PRIMARY_GUARDS} do not have to be in
  279. {USABLE_FILTERED_GUARDS}: they might be unreachable.
  280. ** Rationale **
  281. These guards are treated differently from other guards. If one of
  282. them is usable, then we use it right away. For other guards
  283. {FILTERED_GUARDS}, if it's usable, then before using it we might
  284. first double-check whether perhaps one of the primary guards is
  285. usable after all.
  286. 4.5. Retrying guards. [Section:RETRYING]
  287. (We run this process as frequently as needed. It can be done once
  288. a second, or just-in-time.)
  289. If a primary sampled guard's {is_reachable} status is <no>, then
  290. we decide whether to update its {is_reachable} status to <maybe>
  291. based on its {last_tried_connect} time, its {failing_since} time,
  292. and the {PRIMARY_GUARDS_RETRY_SCHED} schedule.
  293. If a non-primary sampled guard's {is_reachable} status is <no>, then
  294. we decide whether to update its {is_reachable} status to <maybe>
  295. based on its {last_tried_connect} time, its {failing_since} time,
  296. and the {GUARDS_RETRY_SCHED} schedule.
  297. ** Rationale **
  298. An observation that a guard has been 'unreachable' only lasts for
  299. a given amount of time, since we can't infer that it's unreachable
  300. now from the fact that it was unreachable a few minutes ago.
  301. 4.6. Selecting guards for circuits. [Section:SELECTING]
  302. Every origin circuit is now in one of these states:
  303. <state:usable_on_completion>,
  304. <state:usable_if_no_better_guard>,
  305. <state:waiting_for_better_guard>, or
  306. <state:complete>.
  307. You may only attach streams to <complete> circuits.
  308. (Additionally, you may only send RENDEZVOUS cells, ESTABLISH_INTRO
  309. cells, and INTRODUCE cells on <complete> circuits.)
  310. The per-circuit state machine is:
  311. New circuits are <usable_on_completion> or
  312. <usable_if_no_better_guard>.
  313. A <usable_on_completion> circuit may become <complete>, or may
  314. fail.
  315. A <usable_if_no_better_guard> circuit may become
  316. <usable_on_completion>; may become <waiting_for_better_guard>; or may
  317. fail.
  318. A <waiting_for_better_guard> circuit will become <complete>, or will
  319. be closed, or will fail.
  320. A <complete> circuit remains <complete> until it fails or is
  321. closed.
  322. Each of these transitions is described below.
  323. We keep, as global transient state:
  324. * {tvar:last_time_on_internet} -- the last time at which we
  325. successfully used a circuit or connected to a guard. At
  326. startup we set this to "infinitely far in the past."
  327. When we want to build a circuit, and we need to pick a guard:
  328. * If any entry in PRIMARY_GUARDS has {is_reachable} status of
  329. <maybe> or <yes>, return one of the first
  330. {NUM_USABLE_PRIMARY_GUARDS} or
  331. {NUM_USABLE_PRIMARY_DIRECTORY_GUARDS} such guards, chosen
  332. uniformly at random. The circuit is <usable_on_completion>.
  333. [Note: We do not use {is_pending} on primary guards, since we
  334. are willing to try to build multiple circuits through them
  335. before we know for sure whether they work, and since we will
  336. not use any non-primary guards until we are sure that the
  337. primary guards are all down. (XX is this good?)]
  338. * Otherwise, if the ordered intersection of {CONFIRMED_GUARDS}
  339. and {USABLE_FILTERED_GUARDS} is nonempty, return the first
  340. entry in that intersection that has {is_pending} set to
  341. false. Set its value of {is_pending} to true. The circuit
  342. is now <usable_if_no_better_guard>. (If all entries have
  343. {is_pending} true, pick the first one.)
  344. * Otherwise, if there is no such entry, select a member at
  345. random from {USABLE_FILTERED_GUARDS}. Set its {is_pending}
  346. field to true. The circuit is <usable_if_no_better_guard>.
  347. * Otherwise, if USABLE_FILTERED_GUARDS is empty, we have exhausted
  348. all the sampled guards. In this case we proceed by marking all guards
  349. as <maybe> reachable so that we can keep on trying circuits.
  350. Whenever we select a guard for a new circuit attempt, we update the
  351. {last_tried_connect} time for the guard to 'now.'
  352. In some cases (for example, when we need a certain directory feature,
  353. or when we need to avoid using a certain exit as a guard), we need to
  354. restrict the guards that we use for a single circuit. When this happens, we
  355. remember the restrictions that applied when choosing the guard for
  356. that circuit, since we will need them later (see [UPDATE_WAITING].).
  357. ** Rationale **
  358. We're getting to the core of the algorithm here. Our main goals are to
  359. make sure that
  360. 1. If it's possible to use a primary guard, we do.
  361. 2. We probably use the first primary guard.
  362. So we only try non-primary guards if we're pretty sure that all
  363. the primary guards are down, and we only try a given primary guard
  364. if the earlier primary guards seem down.
  365. When we _do_ try non-primary guards, however, we only build one
  366. circuit through each, to give it a chance to succeed or fail. If
  367. ever such a circuit succeeds, we don't use it until we're pretty
  368. sure that it's the best guard we're getting. (see below).
  369. [XXX timeout.]
  370. 4.7. When a circuit fails. [Section:ON_FAIL]
  371. When a circuit fails in a way that makes us conclude that a guard
  372. is not reachable, we take the following steps:
  373. * Set the guard's {is_reachable} status to <no>. If it had
  374. {is_pending} set to true, we make it non-pending.
  375. * Close the circuit, of course. (This removes it from
  376. consideration by the algorithm in [UPDATE_WAITING].)
  377. * Update the list of waiting circuits. (See [UPDATE_WAITING]
  378. below.)
  379. [Note: the existing Tor logic will cause us to create more
  380. circuits in response to some of these steps; and also see
  381. [ON_CONSENSUS].]
  382. ** Rationale **
  383. See [SELECTING] above for rationale.
  384. 4.8. When a circuit succeeds [Section:ON_SUCCESS]
  385. When a circuit succeeds in a way that makes us conclude that a
  386. guard _was_ reachable, we take these steps:
  387. * We set its {is_reachable} status to <yes>.
  388. * We set its {failing_since} to "never".
  389. * If the guard was {is_pending}, we clear the {is_pending} flag.
  390. * If the guard was not a member of {CONFIRMED_GUARDS}, we add
  391. it to the end of {CONFIRMED_GUARDS}.
  392. * If this circuit was <usable_on_completion>, this circuit is
  393. now <complete>. You may attach streams to this circuit,
  394. and use it for hidden services.
  395. * If this circuit was <usable_if_no_better_guard>, it is now
  396. <waiting_for_better_guard>. You may not yet attach streams to it.
  397. Then check whether the {last_time_on_internet} is more than
  398. {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago:
  399. * If it is, then mark all {PRIMARY_GUARDS} as "maybe"
  400. reachable.
  401. * If it is not, update the list of waiting circuits. (See
  402. [UPDATE_WAITING] below)
  403. [Note: the existing Tor logic will cause us to create more
  404. circuits in response to some of these steps; and see
  405. [ON_CONSENSUS].]
  406. ** Rationale **
  407. See [SELECTING] above for rationale.
  408. 4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING]
  409. We run this procedure whenever it's possible that a
  410. <waiting_for_better_guard> circuit might be ready to be called
  411. <complete>.
  412. * If any circuit C1 is <waiting_for_better_guard>, AND:
  413. * All primary guards have reachable status of <no>.
  414. * There is no circuit C2 that "blocks" C1.
  415. Then, upgrade C1 to <complete>.
  416. Definition: In the algorithm above, C2 "blocks" C1 if:
  417. * C2 obeys all the restrictions that C1 had to obey, AND
  418. * C2 has higher priority than C1, AND
  419. * Either C2 is <complete>, or C2 is <waiting_for_better_guard>,
  420. or C2 has been <usable_if_no_better_guard> for no more than
  421. {NONPRIMARY_GUARD_CONNECT_TIMEOUT} seconds.
  422. We run this procedure periodically:
  423. * If any circuit stays in <waiting_for_better_guard>
  424. for more than {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds,
  425. time it out.
  426. **Rationale**
  427. If we open a connection to a guard, we might want to use it
  428. immediately (if we're sure that it's the best we can do), or we
  429. might want to wait a little while to see if some other circuit
  430. which we like better will finish.
  431. When we mark a circuit <complete>, we don't close the
  432. lower-priority circuits immediately: we might decide to use
  433. them after all if the <complete> circuit goes down before
  434. {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds.
  435. 4.10. Whenever we get a new consensus. [Section:ON_CONSENSUS]
  436. We update {GUARDS}.
  437. For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and
  438. {FIRST_UNLISTED_AT}.
  439. [**] We remove entries from {SAMPLED_GUARDS} if appropriate,
  440. according to the sampled-guards expiration rules. If they were
  441. in {CONFIRMED_GUARDS}, we also remove them from
  442. {CONFIRMED_GUARDS}.
  443. We recompute {FILTERED_GUARDS}, and everything that derives from
  444. it, including {USABLE_FILTERED_GUARDS}, and {PRIMARY_GUARDS}.
  445. (Whenever one of the configuration options that affects the
  446. filter is updated, we repeat the process above, starting at the
  447. [**] line.)
  448. 4.11. Deciding whether to generate a new circuit.
  449. [Section:NEW_CIRCUIT_NEEDED]
  450. We generate a new circuit when we don't have
  451. enough circuits either built or in-progress to handle a given
  452. stream, or an expected stream.
  453. For the purpose of this rule, we say that <waiting_for_better_guard>
  454. circuits are neither built nor in-progress; that <complete>
  455. circuits are built; and that the other states are in-progress.
  456. 4.12. When we are missing descriptors.
  457. [Section:MISSING_DESCRIPTORS]
  458. We need either a router descriptor or a microdescriptor in order
  459. to build a circuit through a guard. If we do not have such a
  460. descriptor for a guard, we can still use the guard for one-hop
  461. directory fetches, but not for longer circuits.
  462. (Also, when we are missing descriptors for our first
  463. {NUM_USABLE_PRIMARY_GUARDS} primary guards, we don't build
  464. circuits at all until we have fetched them.)
  465. A. Appendices
  466. A.0. Acknowledgements
  467. This research was supported in part by NSF grants CNS-1111539,
  468. CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.
  469. A.1. Parameters with suggested values. [Section:PARAM_VALS]
  470. (All suggested values chosen arbitrarily)
  471. {param:MAX_SAMPLE_THRESHOLD} -- 20%
  472. {param:MAX_SAMPLE_SIZE} -- 60
  473. {param:GUARD_LIFETIME} -- 120 days
  474. {param:REMOVE_UNLISTED_GUARDS_AFTER} -- 20 days
  475. [previously ENTRY_GUARD_REMOVE_AFTER]
  476. {param:MIN_FILTERED_SAMPLE} -- 20
  477. {param:N_PRIMARY_GUARDS} -- 3
  478. {param:PRIMARY_GUARDS_RETRY_SCHED}
  479. -- every 30 minutes for the first 6 hours.
  480. -- every 2 hours for the next 3.75 days.
  481. -- every 4 hours for the next 3 days.
  482. -- every 9 hours thereafter.
  483. {param:GUARDS_RETRY_SCHED} -- 1 hour
  484. -- every hour for the first 6 hours.
  485. -- every 4 hours for the next 3.75 days.
  486. -- every 18 hours for the next 3 days.
  487. -- every 36 hours thereafter.
  488. {param:INTERNET_LIKELY_DOWN_INTERVAL} -- 10 minutes
  489. {param:NONPRIMARY_GUARD_CONNECT_TIMEOUT} -- 15 seconds
  490. {param:NONPRIMARY_GUARD_IDLE_TIMEOUT} -- 10 minutes
  491. {param:MEANINGFUL_RESTRICTION_FRAC} -- .2
  492. {param:EXTREME_RESTRICTION_FRAC} -- .01
  493. {param:GUARD_CONFIRMED_MIN_LIFETIME} -- 60 days
  494. {param:NUM_USABLE_PRIMARY_GUARDS} -- 1
  495. {param:NUM_USABLE_PRIMARY_DIRECTORY_GUARDS} -- 3
  496. A.2. Random values [Section:RANDOM]
  497. Frequently, we want to randomize the expiration time of something
  498. so that it's not easy for an observer to match it to its start
  499. time. We do this by randomizing its start date a little, so that
  500. we only need to remember a fixed expiration interval.
  501. By RAND(now, INTERVAL) we mean a time between now and INTERVAL in
  502. the past, chosen uniformly at random.
  503. A.3. Why not a sliding scale of primaryness? [Section:CVP]
  504. At one meeting, I floated the idea of having "primaryness" be a
  505. continuous variable rather than a boolean.
  506. I'm no longer sure this is a great idea, but I'll try to outline
  507. how it might work.
  508. To begin with: being "primary" gives it a few different traits:
  509. 1) We retry primary guards more frequently. [Section:RETRYING]
  510. 2) We don't even _try_ building circuits through
  511. lower-priority guards until we're pretty sure that the
  512. higher-priority primary guards are down. (With non-primary
  513. guards, on the other hand, we launch exploratory circuits
  514. which we plan not to use if higher-priority guards
  515. succeed.) [Section:SELECTING]
  516. 3) We retry them all one more time if a circuit succeeds after
  517. the net has been down for a while. [Section:ON_SUCCESS]
  518. We could make each of the above traits continuous:
  519. 1) We could make the interval at which a guard is retried
  520. depend continuously on its position in CONFIRMED_GUARDS.
  521. 2) We could change the number of guards we test in parallel
  522. based on their position in CONFIRMED_GUARDS.
  523. 3) We could change the rule for how long the higher-priority
  524. guards need to have been down before we call a
  525. <usable_if_no_better_guard> circuit <complete> based on a
  526. possible network-down condition. For example, we could
  527. retry the first guard if we tried it more than 10 seconds
  528. ago, the second if we tried it more than 20 seconds ago,
  529. etc.
  530. I am pretty sure, however, that if these are worth doing, they
  531. need more analysis! Here's why:
  532. * They all have the potential to leak more information about a
  533. guard's exact position on the list. Is that safe? Is there
  534. any way to exploit that? I don't think we know.
  535. * They all seem like changes which it would be relatively
  536. simple to make to the code after we implement the simpler
  537. version of the algorithm described above.
  538. A.3. Controller changes
  539. We will add to control-spec.txt a new possible circuit state, GUARD_WAIT,
  540. that can be given as part of circuit events and GETINFO responses about
  541. circuits. A circuit is in the GUARD_WAIT state when it is fully built,
  542. but we will not use it because a circuit with a better guard might
  543. become built too.
  544. A.4. Persistent state format
  545. The persistent state format doesn't need to be part of this
  546. specification, since different implementations can do it
  547. differently. Nonetheless, here's the one Tor uses:
  548. The "state" file contains one Guard entry for each sampled guard
  549. in each instance of the guard state (see section 2). The value
  550. of this Guard entry is a set of space-separated K=V entries,
  551. where K contains any nonspace character except =, and V contains
  552. any nonspace characters.
  553. Implementations must retain any unrecognized K=V entries for a
  554. sampled guard when they regenerate the state file.
  555. The order of K=V entries is not allowed to matter.
  556. Recognized fields (values of K) are:
  557. "in" -- the name of the guard state instance that this
  558. sampled guard is in. If a sampled guard is in two guard
  559. states instances, it appears twice, with a different "in"
  560. field each time. Required.
  561. "rsa_id" -- the RSA id digest for this guard, encoded in
  562. hex. Required.
  563. "bridge_addr" -- If the guard is a bridge, its configured address and
  564. port (this can be the ORPort or a pluggable transport port). Optional.
  565. "nickname" -- the guard's nickname, if any. Optional.
  566. "sampled_on" -- the date when the guard was sampled. Required.
  567. "sampled_by" -- the Tor version that sampled this guard.
  568. Optional.
  569. "unlisted_since" -- the date since which the guard has been
  570. unlisted. Optional.
  571. "listed" -- 0 if the guard is not listed; 1 if it is. Required.
  572. "confirmed_on" -- date when the guard was
  573. confirmed. Optional.
  574. "confirmed_idx" -- position of the guard in the confirmed
  575. list. Optional.
  576. "pb_use_attempts", "pb_use_successes", "pb_circ_attempts",
  577. "pb_circ_successes", "pb_successful_circuits_closed",
  578. "pb_collapsed_circuits", "pb_unusable_circuits",
  579. "pb_timeouts" -- state for the circuit path bias algorithm,
  580. given in decimal fractions. Optional.
  581. All dates here are given as a (spaceless) ISO8601 combined date
  582. and time in UTC (e.g., 2016-11-29T19:39:31).
  583. TODO. Still non-addressed issues [Section:TODO]
  584. Simulate to answer: Will this work in a dystopic world?
  585. Simulate actual behavior.
  586. For all lifetimes: instead of storing the "this began at" time,
  587. store the "remove this at" time, slightly randomized.
  588. Clarify that when you get a <complete> circuit, you might need to
  589. relaunch circuits through that same guard immediately, if they
  590. are circuits that have to be independent.
  591. Fix all items marked XX or TODO.
  592. "Directory guards" -- do they matter?
  593. Suggestion: require that all guards support downloads via BEGINDIR.
  594. We don't need to worry about directory guards for relays, since we
  595. aren't trying to prevent relay enumeration.
  596. IP version preferences via ClientPreferIPv6ORPort
  597. Suggestion: Treat it as a preference when adding to
  598. {CONFIRMED_GUARDS}, but not otherwise.