202-improved-relay-crypto.txt 33 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806807808809810811812813814815816817818819820
  1. Filename: 202-improved-relay-crypto.txt
  2. Title: Two improved relay encryption protocols for Tor cells
  3. Author: Nick Mathewson
  4. Created: 19 Jun 2012
  5. Status: Meta
  6. Note: This is an important development step in improving our relay
  7. crypto, but it doesn't actually say how to do this.
  8. Overview:
  9. This document describes two candidate designs for a better Tor
  10. relay encryption/decryption protocol, designed to stymie tagging
  11. attacks and better accord with best practices in protocol design.
  12. My hope is that readers will examine these protocols, evaluate their
  13. security and practicality, improve on them, and help to pick one for
  14. implementation in Tor.
  15. In section 1, I describe Tor's current relay crypto protocol, its
  16. drawbacks, how it fits in with the rest of Tor, and some
  17. requirements/desiderata for a replacement. In sections 2 and 3, I
  18. propose two alternative replacements for this protocol. In section
  19. 4, I discuss their pros and cons.
  20. 1. Background and Motivation
  21. 1.0. A short overview of Tor's current protocols
  22. The core pieces of the Tor protocol are the link protocol,
  23. the circuit extend protocol, the relay protocol, and the stream
  24. protocol. All are documented in [TorSpec].
  25. Briefly:
  26. - The link protocol handles all direct pairwise communication
  27. between nodes. Everything else is transmitted over it. It
  28. uses TLS.
  29. - The circuit extend protocol uses public-key crypto to set up
  30. multi-node virtual tunnels, called "circuits", from a client
  31. through one or more nodes.
  32. *** The relay protocol uses established circuits to communicate
  33. from a client to a node on a circuit. That's the one we'll
  34. be talking about here. ***
  35. - The stream protocol is tunneled over relay protocol; clients
  36. use it to tell servers to open anonymous TCP connections, to
  37. send data, and so forth. Servers use it to report success or
  38. failure opening anonymous TCP connections, to send data from
  39. TCP connections back to clients, and so forth.
  40. In more detail: The link protocol's job is to connect two
  41. computers with an encrypted, authenticated stream, to
  42. authenticate one or both of them to the other, and to provide a
  43. channel for passing "cells" between them. The circuit extend
  44. protocol's job is to set up circuits: persistent tunnels that
  45. connect a Tor client to an exit node through a series of
  46. (typically three) hops, each of which knows only the previous and
  47. next hop, and each of which has a set of keys that they share
  48. only with the client. Finally, the relay protocol's job is to
  49. allow a client to communicate bidirectionally with the node(s) on
  50. the circuit, once their shared keys have been established.
  51. (We'll ignore the stream protocol for the rest of this document.)
  52. Note on terminology: Tor nodes are sometimes called "servers",
  53. "relays", or "routers". I'll use all these terms more or less
  54. interchangeably. For simplicity's sake, I will call the party
  55. who is constructing and using a circuit "the client" or "Alice",
  56. even though nodes sometimes construct circuits too.
  57. Tor's internal packets are called "cells". All the cells we deal
  58. with here are 512 bytes long.
  59. The nodes in a circuit are called its "hops"; most circuits are 3
  60. hops long. This doesn't include the client: when Alice builds a
  61. circuit through nodes Bob_1, Bob_2, and Bob_3, the first hop is
  62. "Bob_1" and the last hop is "Bob_3".
  63. 1.1. The current relay protocol and its drawbacks
  64. [This section describes Tor's current relay protocol. It is not a
  65. proposal; rather it is what we do now. Sections 2 and 3 have my
  66. proposed replacements for it.]
  67. A "relay cell" is a cell that is generated by the client to send
  68. to a node, or by a node to send to the client. It's called a
  69. "relay" cell because a node that receives one may need to relay
  70. it to the next or previous node in the circuit (or to handle the
  71. cell itself).
  72. A relay cell moving towards the client is called "inbound"; a
  73. cell moving away is called "outbound".
  74. When a relay cell is constructed by the client, the client adds one
  75. layer of crypto for each node that will process the cell, and gives
  76. the cell to the first node in the circuit. Each node in turn then
  77. removes one layer of crypto, and either forwards the cell to the next
  78. node in the circuit or acts on that cell itself.
  79. When a relay cell is constructed by a node, it encrypts it and sends
  80. it to the preceding node in the circuit. Each node between the
  81. originating node and the client then encrypts the cell and passes it
  82. back to the preceding node. When the client receives the cell, it
  83. removes layers of crypto until it has an unencrypted cell, which it
  84. then acts on.
  85. In the current protocol, the body of each relay cell contains, in
  86. its unencrypted form:
  87. Relay command [1 byte]
  88. Zeros [2 bytes]
  89. StreamID [2 bytes]
  90. Digest [4 bytes]
  91. Length [2 bytes]
  92. Data [498 bytes]
  93. (This only adds up to 509 bytes. That's because the Tor link
  94. protocol transfers 512-byte cells, and has a 3 byte overhead per
  95. cell. Not how we would do it if we were starting over at this
  96. point.)
  97. At every node of a circuit, the node relaying a cell
  98. encrypts/decrypts it with AES128-CTR. If the cell is outbound
  99. and the "zeros" field is set to all-zeros, the node additionally
  100. checks whether 'digest' is correct. A correct digest is the
  101. first 4 bytes of the running SHA1 digest of: a shared secret,
  102. concatenated with all the relay cells destined for this node on
  103. this circuit so far, including this cell. If _that's_ true, then
  104. the node accepts this cell. (See section 6 of [TorSpec] for full
  105. detail; see section A.1 for a more rigorous description.)
  106. The current approach has some actual problems. Notably:
  107. * It permits tagging attacks. Because AES_CTR is an XOR-based
  108. stream cipher, an adversary who controls the first node in a
  109. circuit can XOR anything he likes into the relay cell, and
  110. then see whether he/she encounters a correspondingly
  111. defaced cell at some exit that he also controls.
  112. That is, the attacker picks some pattern P, and when he
  113. would transmit some outbound relay cell C at hop 1, he
  114. instead transmits C xor P. If an honest exit receives the
  115. cell, the digest will probably be wrong, and the honest exit
  116. will reject it. But if the attacker controls the exit, he
  117. will notice that he has received a cell C' where the digest
  118. doesn't match, but where C' xor P _does_ have a good digest.
  119. The attacker will then know that his two nodes are on the
  120. same circuit, and thereby be able to link the user (whom the
  121. first node sees) to her activities (which the last node sees).
  122. Back when we did the Tor design, this didn't seem like a
  123. big deal, since an adversary who controls both the first and
  124. last node in a circuit is presumed to win already based on
  125. traffic correlation attacks. This attack seemed strictly
  126. worse than that, since it was trivially detectable in the
  127. case where the attacker _didn't_ control both ends. See
  128. section 4.4 of the Tor paper [TorDesign] for our early
  129. thoughts here; see Xinwen Fu et al's 2009 work for a more
  130. full explanation of the in-circuit tagging attack [XF]; and
  131. see "The 23 Raccoons'" March 2012 "Analysis of the Relative
  132. Severity of Tagging Attacks" mail on tor-dev (and the
  133. ensuing thread) for a discussion of why we may want to care
  134. after all, due to attacks that use tagging to amplify route
  135. capture. [23R]
  136. It also has some less practical issues.
  137. * The digest portion is too short. Yes, if you're an attacker
  138. trying to (say) change an "ls *" into an "rm *", you can
  139. only expect to get one altered cell out of 2^32 accepted --
  140. and all future cells on the circuit will be rejected with
  141. similar probability due to the change in the running hash
  142. -- but 1/2^32 is a pretty high success rate for crypto attacks.
  143. * It does MAC-then-encrypt. That makes smart people cringe.
  144. * Its approach to MAC is H(Key | Msg), which is vulnerable to
  145. length extension attack if you've got a Merkle-Damgard hash
  146. (which we do). This isn't relevant in practice right now,
  147. since the only parties who see the digest are the two
  148. parties that rely on it (because of the MAC-then-encrypt).
  149. 1.2. Some feature requirements
  150. Relay cells can originate at the client or at a relay. Relay cells
  151. that originate at the client are given to the first node in the
  152. circuit, and constructed so that they will be decrypted and forwarded
  153. by the first M-1 nodes in the circuit, and finally decrypted and
  154. processed by the Mth node, where the client chooses M. (Usually, the
  155. Mth node is the the last one, which will be an exit node.) Relay
  156. cells that originate at a hop in the circuit are given to the
  157. preceding node, and eventually delivered to the client.
  158. Tor provides a so called "leaky pipe" circuit topology
  159. [TorDesign]: a client can send a cell to any node in the circuit,
  160. not just the last node. I'll try to keep that property, although
  161. historically we haven't really made use of it.
  162. In order to implement telescoping circuit construction (where the
  163. circuit is built by iteratively asking the last node in the
  164. circuit to extend the circuit one hop more), it's vital that the
  165. last hop of the circuit be able to change.
  166. Proposal 188 [Prop188] suggests that we implement a "bridge guards"
  167. feature: making some (non-public) nodes insert an extra hop into
  168. the circuit after themselves, in a way that will make it harder
  169. for other nodes in the network to enumerate them. We
  170. therefore want our circuits to be one-hop re-extensible: when the
  171. client extends a circuit from Bob1 to Bob2, we want Bob1 to be
  172. able to introduce a new node "Bob1.5" into the circuit such that
  173. the circuit runs Alice->Bob1->Bob1.5->Bob2. (This feature has
  174. been called "loose source routing".)
  175. Any new approach should be able to coexist on a circuit
  176. with the old approach. That is, if Alice wants to build a
  177. circuit through Bob1, Bob2, and Bob3, and only Bob2 supports a
  178. revised relay protocol, then Alice should be able to build a
  179. circuit such that she can have Bob1 and Bob3 process each cell
  180. with the current protocol, and Bob2 process it with a revised
  181. protocol. (Why? Because if all nodes in a circuit needed to use
  182. the same relay protocol, then each node could learn information
  183. about the other nodes in the circuit from which relay protocol
  184. was chosen. For example, if Bob1 supports the new protocol, and
  185. sees that the old relay protocol is in use, and knows that Bob2
  186. supports the new one, then Bob1 has learned that Bob3 is some
  187. node that does not support the new relay protocol.)
  188. Cell length needs to be constant as cells move through the
  189. network. For historical reasons mentioned above in section 1.1,
  190. the length of the encrypted part of a relay cell needs to be 509
  191. bytes.
  192. 1.3. Some security requirements
  193. Two adjacent nodes on a circuit can trivially tell that they are
  194. on the same circuit, and the first node can trivially tell who
  195. the client is. Other than that, we'd like any attacker who
  196. controls a node on the circuit not to have a good way to learn
  197. any other nodes, even if he/she controls those nodes. [*]
  198. Relay cells should not be malleable: no relay should be able to
  199. alter a cell between an honest sender and an honest recipient in
  200. a way that they cannot detect.
  201. Relay cells should be secret: nobody but the sender and recipient
  202. of a relay cell should be able to learn what it says.
  203. Circuits should resist transparent, recoverable tagging attacks:
  204. if an attacker controls one node in a circuit and alters a relay
  205. cell there, no non-adjacent point in the circuit should be able
  206. to recover the relay cell as it would have received it had the
  207. attacker not altered it.
  208. The above properties should apply to sequences of cells too:
  209. an attacker shouldn't be able to change what sequence of cells
  210. arrives at a destination (for example, by removing, replaying, or
  211. reordering one or more cells) without being detected.
  212. (Feel free to substitute whatever formalization of the above
  213. requirements makes you happiest, and add whatever caveats are
  214. necessary to make you comfortable. I probably missed at least
  215. two critical properties.)
  216. [*] Of course, an attacker who controls two points on a circuit
  217. can probably confirm this through traffic correlation. But
  218. we'd prefer that the cryptography not create other, easier
  219. ways for them to do this.
  220. 1.4. A note on algorithms
  221. This document is deliberately agnostic concerning the choice of
  222. cryptographic primitives -- not because I have no opinions about
  223. good ciphers, MACs, and modes of operation -- but because
  224. experience has taught me that mentioning any particular
  225. cryptographic primitive will prevent discussion of anything else.
  226. Please DO NOT suggest algorithms to use in implementing these
  227. protocols yet. It will distract! There will be time later!
  228. If somebody _else_ suggests algorithms to use, for goodness' sake
  229. DON'T ARGUE WITH THEM! There will be time later!
  230. 2. Design 1: Large-block encryption
  231. In this approach, we use a tweakable large-block cipher for
  232. encryption and decryption, and a tweak-chaining function TC.
  233. 2.1. Chained large-block what now?
  234. We assume the existence of a primitive that provides the desired
  235. properties of a tweakable[Tweak] block cipher, taking blocks of any
  236. desired size. (In our case, the block size is 509 bytes[*].)
  237. It also takes a Key, and a per-block "tweak" parameter that plays
  238. the same role that an IV plays in CBC, or that the counter plays
  239. in counter mode.
  240. The Tweak-chaining function TC takes as input a previous tweak, a
  241. tweak chaining key, and a cell; it outputs a new tweak. Its
  242. purpose is to make future cells undecryptable unless you have
  243. received all previous cells. It could probably be something like
  244. a MAC of the old tweak and the cell using the tweak chaining key
  245. as the MAC key.
  246. (If the initial tweak is secret, I am not sure that TC needs to
  247. be keyed.)
  248. [*] Some large-block cipher constructions use blocks whose size is
  249. the multiple of some underlying cipher's block size. If we wind
  250. up having to use one of those, we can use 496-byte blocks instead
  251. at the cost of 2.5% wasted space.
  252. 2.2. The protocol
  253. 2.2.1. Setup phase
  254. The circuit construction algorithm needs to produce forward and
  255. backward keys Kf and Kb, the forward and backward tweak chaining
  256. keys TCKf and TCKb, as well as initial tweak values Tf and Tb.
  257. 2.2.2. The cell format
  258. We replace the "Digest" and "Zeros" fields of the cell with a
  259. single Z-byte "Zeros" field to determine when the cell is
  260. recognized and correctly decrypted; its length is a security
  261. parameter.
  262. 2.2.3. The decryption operations
  263. For a relay to handle an inbound RELAY cell, it sets Tb_next to
  264. TC(TCKb, Tb, Cell). Then it encrypts the cell using the large
  265. block cipher keyed with Kb and tweaked with Tb. Then it sets Tb
  266. to Tb_next.
  267. For a relay to handle an outbound RELAY cell, it sets Tf_next to
  268. TC(TCKf, Tf, Cell). Then it decrypts the cell using the large
  269. block cipher keyed with Kf and tweaked with Tf. Then it sets Tf
  270. to Tf_next. Then it checks the 'Zeros' field on the cell;
  271. if that field is all [00] bytes, the cell is for us.
  272. 2.3. Security discussion
  273. This approach is fairly simple (at least, no more complex than
  274. its primitives) and achieves some of our security goals. Because
  275. of the large block cipher approach, any change to a cell will
  276. render that cell undecryptable, and indistinguishable from random
  277. junk. Because of the tweak chaining approach, if even one cell
  278. is missed or corrupted or reordered, future cells will also
  279. decrypt into random junk.
  280. The tagging attack in this case is turned into a circuit-junking
  281. attack: an adversary who tries to mount it can probably confirm
  282. that he was indeed first and last node on a target circuit
  283. (assuming that circuits which turn to junk in this way are rare),
  284. but cannot recover the circuit after that point.
  285. As a neat side observation, note that this approach improves upon
  286. Tor's current forward secrecy, by providing forward secrecy while
  287. circuits are still operational, since each change to the tweak
  288. should make previous cells undecryptable if the old tweak value
  289. isn't recoverable.
  290. The length of Zeros is a parameter for what fraction of "random
  291. junk" cells will potentially be accepted by a router or client.
  292. If Zeros is Z bytes long, then junk cells will be accepted with
  293. P < 2^-(8*Z + 7). (The '+7' is there because the top 7 bits of
  294. the Length field must also be 0.)
  295. There's no trouble using this protocol in a mixed circuit, where
  296. some nodes speak the old protocol and some speak the
  297. large-block-encryption protocol.
  298. 3. Design 2: short-MAC-and-pad
  299. In this design, we behave more similarly to a mix-net design
  300. (such as Mixmaster or Mixminion's headers). Each node checks a
  301. MAC, and then re-pads the cell to its chosen length before
  302. decoding the cell.
  303. This design uses as a primitive a MAC and a stream cipher. It
  304. might also be possible to use an authenticating cipher mode,
  305. if we can find one that works like a stream cipher and allows us
  306. to efficiently output authenticators for the stream so far.
  307. NOTE TO AE/AEAD FANS: The encrypt-and-MAC model here could be
  308. replaced with an authenticated encryption mode without too much
  309. loss of generality.
  310. 3.1. The protocol
  311. 3.1.1 Setup phase
  312. The circuit construction algorithm needs to produce forward and
  313. backward keys Kf and Kb, forward and backward stream cipher IVs
  314. IVf and IVb, and forward and backward MAC keys Mf and Mb.
  315. Additionally, the circuit construction algorithm needs a way for
  316. the client to securely (and secretly? XXX) tell each hop in the
  317. circuit a value 'bf' for the number of bytes of MAC it should
  318. expect on outbound cells and 'bb' for the number of bytes it
  319. should use on cells it's generating. Each node can get a
  320. different 'bf' and 'bb'. These values can be 0: if a node's bf
  321. is 0, it doesn't authenticate cells; if its bb is 0, it doesn't
  322. originate them.
  323. The circuit construction algorithm also needs a way to tell each
  324. the client to securely (and secretly? XXX) tell each hop in the
  325. circuit whether it is allowed to be the final destination for
  326. relay cells.
  327. Set the stream Sf and the stream Sb to empty values.
  328. 3.1.2. The cell format
  329. The Zeros and Digest field of the cell format are removed.
  330. 3.1.3. The relay operations
  331. Upon receiving an outbound cell, a node removes the first b bytes
  332. of the cell, and puts them aside as 'M'. The node then computes
  333. between 0 and 2 MACs of the stream consisting of all previously
  334. MAC'd data, plus the remainder of the cell:
  335. If b>0 and there is a next hop, the node computes M_relay.
  336. If this node was told to deliver traffic, or it is the last
  337. node in the circuit so far, the node computes M_receive.
  338. M_relay is computed as MAC(stream | "relay"); M_receive is
  339. computed as MAC(stream | "receive").
  340. If M = M_receive, this cell is for the node; it should process
  341. it.
  342. If M = M_relay, or if b == 0, this cell should be relayed.
  343. If a MAC was computed and neither of the above cases was met,
  344. then the cell is bogus; the node should discard it and destroy
  345. the circuit.
  346. The node then removes the first bf bytes of the cell, and pads the
  347. cell at the end with bf zero bytes. Finally, the node decrypts
  348. the whole remaining padded cell with the stream cipher.
  349. To handle an inbound cell, the node simply does a stream cipher
  350. with no checking.
  351. 3.1.4. Generating inbound cells.
  352. To generate an inbound cell, a node makes a 509-bb byte RELAY
  353. cell C, encrypts that cell with Kb, appends the encrypted cell to
  354. Sb, and prepends M_receive(Sb) to the cell.
  355. 3.1.5. Generating outbound cells
  356. Generating an outbound cell is harder, since we need to know what
  357. padding the earlier nodes will generate in order to know what
  358. padding the later nodes will receive and compute their MACs, but
  359. we also need to know what MACs we'll send to the later nodes in
  360. order to compute which MACs we'll send to the earlier ones.
  361. Mixnet clients have needed to do this for ages, though, so the
  362. algorithms are pretty well settled. I'll give one below in A.3.
  363. 3.2. Security discussion
  364. This approach is also simple and (given good parameter choices)
  365. can achieve our goals. The trickiest part is the algorithm that
  366. clients must follow to package cells, but that's not so bad.
  367. It's not necessary to check MACs on inbound traffic, because
  368. nobody but the client can tell scrambled messages from good ones,
  369. and the client can be trusted to keep the client's own secrets.
  370. With this protocol, if the attacker tries to do a tagging attack,
  371. the circuit should get destroyed by the next node in the circuit
  372. that has a nonzero "bf" value, with probability == 1-2^-(8*bf).
  373. (If there are further intervening honest nodes, they also have a
  374. chance to detect the attack.)
  375. Similarly, any attempt to replay, or reorder outbound cells
  376. should fail similarly.
  377. The "bf" values could reveal to each node its position in the
  378. circuit and the client preferences, depending on how we set them;
  379. on the other hand, having a fixed bf value would reveal to the
  380. last node the length of the circuit. Neither option seems great.
  381. This protocol doesn't provide any additional forward secrecy
  382. beyond what Tor has today. We could fix that by changing our use
  383. of the stream cipher so that instead of running in counter mode
  384. between cells, we use a tweaked stream cipher and change the
  385. tweak with each cell (possibly based on the unused portion of the
  386. MAC).
  387. This protocol does support loose source routing, so long as
  388. no padding bytes are added by any router-added nodes.
  389. In a circuit, every node has at least one relay cell sent to it:
  390. even non-exit nodes get a RELAY_EXTEND cell.
  391. 4. Discussion
  392. I'm not currently seeing a reason to strongly prefer one of these
  393. approaches over another.
  394. In favor of large-block encryption:
  395. - The space overhead seems smaller: we need to use up fewer
  396. bytes in order to get equivalent looking security.
  397. (For example, if we want to have P < 2^64 that a cell altered
  398. by hop 1 could be accepted by hop 2 or hop 3, *and* we want P
  399. < 2^64 that a cell altered by hop 2 could be accepted by hop
  400. 3, the large-block approach needs about 8 bytes for the Zeros
  401. field, whereas the short-MAC-and-pad approach needs 16 bytes
  402. worth of MACs.)
  403. - We get forward secrecy pretty easily.
  404. - The format doesn't leak anything about the length of the
  405. circuit, or limit it.
  406. - We don't need complicated logic to set the 'b' parameters.
  407. - It doesn't need tricky padding code.
  408. In the favor of short-MAC-and-pad:
  409. - All of the primitives used are much better analyzed and
  410. understood. There's nothing esoteric there. The format
  411. itself is similar to older, well-analyzed formats.
  412. - Most of the constructions for the large-block-cipher approach
  413. seem less efficient in CPU cycles than a good stream cipher
  414. and a MAC. (But I don't want to discuss that now; see section
  415. 1.4 above!)
  416. Unclear:
  417. - Suppose that an attacker controls the first and last hop of a
  418. circuit, and tries an end-to-end tagging attack. With
  419. large-block encryption, the tagged cell and all future cells
  420. on the circuit turn to junk after the tagging attack, with
  421. P~~100%. With short-MAC-and-pad, the circuit is destroyed at
  422. the second hop, with P ~~ 1- 2^(-8*bf). Is one of these
  423. approaches materially worse for the attacker?
  424. - Can we do better than the "compute two MACs" approach for
  425. distinguishing the relay and receive cases of the
  426. short-MAC-and-pad protocol?
  427. - To what extent can we improve these protocols?
  428. - If we do short-MAC-and-pad, should we apply the forward
  429. security hack alluded to in section 3.2?
  430. 5. Acknowledgments
  431. Thanks to the many reviewers of the initial drafts of this
  432. proposal. If you can make any sense of what I'm saying, they
  433. deserve much of the credit.
  434. A. Formal description
  435. Note that in all these cases, more efficient descriptions exist.
  436. A.1. The current Tor relay protocol.
  437. Relay cell format:
  438. Relay command [1 byte]
  439. Zeros [2 bytes]
  440. StreamID [2 bytes]
  441. Digest [4 bytes]
  442. Length [2 bytes]
  443. Data [498 bytes]
  444. Circuit setup:
  445. (Specified elsewhere; the client negotiates with each router in
  446. a circuit the secret AES keys Kf, Kb, and the secret 'digest
  447. keys' Df, and Db. They initialize AES counters Cf and Cb to
  448. 0. They initialize the digest stream Sf to Df, and Sb to Db.)
  449. HELPER FUNCTION: CHECK(Cell [in], Stream [in,out]):
  450. 1. If the Zeros field of Cell is not [00 00], return False.
  451. 2. Let Cell' = Cell with its Digest field set to [00 00 00 00].
  452. 3. Let S' = Stream | Cell'.
  453. 4. If SHA1(S') = the Digest field of Cell, set Stream to S',
  454. and return True.
  455. 5. Otherwise return False.
  456. HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out])
  457. 1. Set the Zeros and Digest field of Cell' to [00] bytes.
  458. 2. Set Stream to Stream | Cell'.
  459. 3. Construct Cell from Cell' by setting the Digest field to
  460. SHA1(Stream), and taking all other fields as-is.
  461. 4. Return Cell.
  462. HELPER_FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out])
  463. 1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and
  464. counter 'Ctr'. Increment 'Ctr' by the length of the cell.
  465. HELPER_FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out])
  466. 1. Same as ENCRYPT.
  467. Router operation, upon receiving an inbound cell -- that is, one
  468. sent towards the client.
  469. 1. ENCRYPT(cell, Kb, Cb)
  470. 2. Send the decrypted contents towards the client.
  471. Router operation, upon receiving an outbound cell -- that is, one
  472. sent away from the client.
  473. 1. DECRYPT(cell, Kf, Cf)
  474. 2. If CHECK(Cell, Sf) is true, this cell is for us. Do not
  475. relay the cell.
  476. 3. Otherwise, this cell is not for us. Send the decrypted cell
  477. to the next hop on the circuit, or discard it if there is no
  478. next hop.
  479. Router operation, to create a relay cell that will be delivered
  480. to the client.
  481. 1. Construct a Relay cell Cell' with the relay command, length,
  482. stream ID, and body fields set as appropriate.
  483. 2. Let Cell = CONSTRUCT(Cell', Sb).
  484. 3. ENCRYPT(Cell, Kb, Cb)
  485. 4. Send the encrypted cell towards the client.
  486. Client operation, receiving an inbound cell.
  487. For each hop H in a circuit, starting with the first hop and
  488. ending (possibly) with the last:
  489. 1. DECRYPT(Cell, Kb_H, Cb_H)
  490. 2. If CHECK(Cell, Sb_H) is true, this cell was sent from hop
  491. H. Exit the loop, and return the cell in its current
  492. form.
  493. If we reach the end of the loop without finding the right hop,
  494. the cell is bogus or corrupted.
  495. Client operation, sending an outbound cell to hop H.
  496. 1. Construct a Relay cell Cell' with the relay command, length,
  497. stream ID, and body fields set as appropriate.
  498. 2. Let Cell = CONSTRUCT(Cell', Sf_H)
  499. 3. For i = H..1:
  500. A. ENCRYPT(Cell, Sf_i, Cf_i)
  501. 4. Deliver Cell to the first hop in the circuit.
  502. A.2. The large-block-cipher protocol
  503. Same as A.1, except for the following changes.
  504. The cell format is now:
  505. Zeros [Z bytes]
  506. Length [2 bytes]
  507. StreamID [2 bytes]
  508. Relay command [1 byte]
  509. Data [504-Z bytes]
  510. Ctr is no longer a counter, but a Tweak value.
  511. Each key is now a tuple of (Key_Crypt, Key_TC)
  512. Streams are no longer used.
  513. HELPER FUNCTION: CHECK(Cell [in], Stream [in,out])
  514. 1. If the Zeros field of Cell contains only [00] bytes,
  515. return True.
  516. 2. Otherwise return false.
  517. HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out])
  518. 1. Let Cell be Cell', with its "Zeros" field set to Z [00]
  519. bytes.
  520. 2. Return Cell'.
  521. HELPER FUNCTION: ENCRYPT(Cell [in,out], Key [in], Tweak [in,out])
  522. 2. Encrypt Cell using Key and Tweak
  523. 1. Let Tweak' = TC(Key_TC, Tweak, Cell)
  524. 3. Set Tweak to Tweak'.
  525. HELPER FUNCTION: DECRYPT(Cell [in,out], Key [in], Tweak [in,out])
  526. 1. Let Tweak' = TC(Key_TC, Tweak, Cell)
  527. 2. Decrypt Cell using Key and Tweak
  528. 3. Set Tweak to Tweak'.
  529. A.3. The short-MAC-and-pad protocol.
  530. Define M_relay(K,S) as MAC(K, S|"relay").
  531. Define M_receive(K,S) as MAC(K, S|"receive").
  532. Define Z(n) as a series of n [00] bytes.
  533. Define BODY_LEN as 509
  534. The cell message format is now:
  535. Relay command [1 byte]
  536. StreamID [2 bytes]
  537. Length [2 bytes]
  538. Data [variable bytes]
  539. Helper function: CHECK(Cell [in], b [in], K [in], S [in,out]):
  540. Let M = Cell[0:b]
  541. Let Rest = Cell[b:...]
  542. If b == 0:
  543. Return (nil, Rest)
  544. Let S' = S | Rest
  545. If M == M_relay(K,S')[0:b]:
  546. Set S = S'
  547. Return ("relay", Rest)
  548. If M == M_receive(K,S')[0:b]:
  549. Set S = S'
  550. Return ("receive", Rest)
  551. Return ("BAD", nil)
  552. HELPER_FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out])
  553. 1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and
  554. counter 'Ctr'. Increment 'Ctr' by the length of the cell.
  555. HELPER_FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out])
  556. 1. Same as ENCRYPT.
  557. Router operation, upon receiving an inbound cell:
  558. 1. ENCRYPT(cell, Kb, Cb)
  559. 2. Send the decrypted contents towards the client.
  560. Router operation, upon receiving an outbound cell:
  561. 1. Let Status, Msg = CHECK(Cell, bf, Mf, Sf)
  562. 2. If Status == "BAD", drop the cell and destroy the circuit.
  563. 3. Let Cell' = Msg | Z(BODY_LEN - len(Msg))
  564. 4. DECRYPT(Cell', Kf, Cf) [*]
  565. 5. If Status == "receive" or (Status == nil and there is no
  566. next hop), Cell' is for us: process it.
  567. 6. Otherwise, send Cell' to the next node.
  568. Router operation, sending a cell towards the client:
  569. 1. Let Body = a relay cell body of BODY_LEN-bb_i bytes.
  570. 2. Let Cell' = ENCRYPT(Body, Kb, Cb)
  571. 3. Let Sb = Sb | Cell'
  572. 4. Let M = M_receive(Mb, Sb)[0:b]
  573. 5. Send the cell M | Cell' back towards the client.
  574. Client operation, upon receiving an inbound cell:
  575. For each hop H in the circuit, from first to last:
  576. 1. Let Status, Msg = CHECK(Cell, bb_i, Mb_i, Sb_i)
  577. 2. If Status = "relay", drop the cell and destroy
  578. the circuit. (BAD is okay; it means that this hop didn't
  579. originate the cell.)
  580. 3. DECRYPT(Msg, Kb_i, Cb_i)
  581. 4. If Status = "receive", this cell is from hop i; process
  582. it.
  583. 5. Otherwise, set Cell = Msg.
  584. Client operation, sending an outbound cell:
  585. Let BF = the total of all bf_i values.
  586. 1. Construct a relay cell body Msg of BODY_LEN-BF bytes.
  587. 2. For each hop i, let Stream_i = ENCRYPT(Kf_i,Z(CELL_LEN),Cf_i)
  588. 3. Let Pad_0 = "".
  589. 4. For i in range 1..N, where N is the number of hops:
  590. Let Pad_i = Pad_{i-1} | Z(bf_i)
  591. Let S_last = the last len(Pad_i) bytes of Stream_i.
  592. Let Pad_i = Pad_i xor S_last
  593. Now Pad_i is the padding as it will stand after node i
  594. has processed it.
  595. 5. For i in range N..1, where N is the number of hops:
  596. If this is the last hop, let M_* = M_receive. Else let
  597. M_* = M_relay.
  598. Let Body = Msg xor the first len(Msg) bytes of Stream_i
  599. Let M = M_*(Mf, Body | Pad_(i-1))
  600. Set Msg = M[:bf_i] | Body
  601. 6. Send Msg outbound to the first relay in the circuit.
  602. [*] Strictly speaking, we could omit the pad-and-decrypt
  603. operation once we know we're the final hop.
  604. R. References
  605. [Prop188] Tor Proposal 188: Bridge Guards and other anti-enumeration defenses
  606. https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/188-bridge-guards.txt
  607. [TorSpec] The Tor Protocol Specification
  608. https://gitweb.torproject.org/torspec.git?a=blob_plain;hb=HEAD;f=tor-spec.txt
  609. [TorDesign] Dingledine et al, "Tor: The Second Generation Onion
  610. Router",
  611. https://svn.torproject.org/svn/projects/design-paper/tor-design.pdf
  612. [Tweak] Liskov et al, "Tweakable Block Ciphers",
  613. http://www.cs.berkeley.edu/~daw/papers/tweak-crypto02.pdf
  614. [XF] Xinwen Fu et al, "One Cell is Enough to Break Tor's Anonymity"
  615. [23R] The 23 Raccoons, "Analysis of the Relative Severity of Tagging
  616. Attacks" http://archives.seul.org/or/dev/Mar-2012/msg00019.html
  617. (You'll want to read the rest of the thread too.)