  1. Filename: 219-expanded-dns.txt
  2. Title: Support for full DNS and DNSSEC resolution in Tor
  3. Authors: Ondrej Mikle
  4. Created: 4 February 2012
  5. Modified: 2 August 2013
  6. Target: 0.2.5.x
  7. Status: Needs-Revision
  8. 0. Overview
  9. Adding support for any DNS query type to Tor.
  10. 0.1. Motivation
  11. Many applications running over Tor need more than just resolving an FQDN to
  12. an IPv4 address and vice versa. Sometimes, to prevent DNS leaks, applications
  13. have to be hacked to be supplied the necessary data by hand (e.g. SRV records
  14. in XMPP). TLS connections will benefit from the planned TLSA record, which
  15. provides certificate pinning to avoid another DigiNotar-like fiasco.
  16. 0.2. What about DNSSEC?
  17. Routine DNSSEC resolution is not practical with this proposal alone,
  18. because of round-trip issues: a single name lookup can require
  19. dozens of round trips across a circuit, rendering it very slow. (We
  20. don't want to add minutes to every webpage load time!)
  21. For records like TLSA that need extra signing, this might not be an
  22. unacceptable amount of overhead, but for routine hostname lookup, it's
  23. probably overkill.
  24. [Further, thanks to the changes of proposal 205, DNSSEC for routine
  25. hostname lookup is less useful in Tor than it might have been back
  26. when we cached IPv4 and IPv6 addresses and used them across multiple
  27. circuits and exit nodes.]
  28. See section 8 below for more discussion of DNSSEC issues.
  29. 1. Design
  30. 1.1 New cells
  31. There will be two new cells, RELAY_DNS_BEGIN and RELAY_DNS_RESPONSE (we'll
  32. use DNS_BEGIN and DNS_RESPONSE for short below).
  33. 1.1.1. DNS_BEGIN
  34. DNS_BEGIN payload:
  35. FLAGS [2 octets]
  36. DNS packet data (variable length, up to length of relay cell.)
  37. The DNS packet must be generated internally by Tor to avoid
  38. fingerprinting users by differences in client resolvers' behavior.
  39. [XXXX We need to specify the exact behavior here: saying "Just do what
  40. Libunbound does!" would make it impossible to implement a
  41. Tor-compatible client without reverse-engineering libunbound. - NM]
  42. The FLAGS field is reserved, and should be set to 0 by all clients.
  43. Because of the maximum length of the RELAY cell, the DNS packet may
  44. not be longer than 496 bytes. [XXXX Is this enough? -NM]
  45. Some fields in the query must be omitted or set to zero: see section 3
  46. below.
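As a rough illustration of the payload layout in 1.1.1, the sketch below builds a single-question query and wraps it in a DNS_BEGIN payload. The function names, the use of the RD bit, and the 498-byte relay payload constant (inferred from the 496-byte limit plus the 2-octet FLAGS field) are assumptions made for illustration; the proposal itself leaves the exact query construction unspecified.

```python
import struct

RELAY_PAYLOAD_MAX = 498                  # assumed: 496-byte limit + 2-octet FLAGS
MAX_DNS_PACKET = RELAY_PAYLOAD_MAX - 2   # 496, the limit stated above


def build_query(qname, qtype):
    """Build a single-question DNS query in RFC 1035 wire format."""
    # Header: ID=0 (the stream ID identifies the query), flags=RD (assumed),
    # QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    encoded = b""
    for label in qname.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("bad label length in %r" % qname)
        encoded += bytes([len(raw)]) + raw
    # QCLASS is always IN (1), per section 3.
    return header + encoded + b"\x00" + struct.pack("!HH", qtype, 1)


def encode_dns_begin(dns_packet):
    """Prefix the DNS packet with the reserved, all-zero FLAGS field."""
    if len(dns_packet) > MAX_DNS_PACKET:
        raise ValueError("DNS packet exceeds %d bytes" % MAX_DNS_PACKET)
    return struct.pack("!H", 0) + dns_packet


# Example: an SRV (type 33) lookup of the kind mentioned in the motivation.
payload = encode_dns_begin(build_query("_xmpp-client._tcp.example.org", 33))
```

Keeping the query builder inside Tor, as above, is what lets every client emit byte-identical packets for the same question.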
  47. 1.1.2. DNS_RESPONSE
  48. DNS_RESPONSE payload:
  49. STATUS [1 octet]
  50. CONTENT [variable, up to length of relay cell]
  51. If the low bit of STATUS is set, this is the last DNS_RESPONSE that
  52. the server will send in response to the given DNS_BEGIN. Otherwise,
  53. there will be more DNS_RESPONSE packets. The other bits are reserved,
  54. and should be set to zero for now.
  55. The CONTENT fields of the DNS_RESPONSE cells contain a DNS record,
  56. split across multiple cells as needed, encoded as:
  57. total length (2 octets)
  58. data (variable)
  59. So for example, if the DNS record R1 is only 300 bytes long, then it
  60. is sent in a single DNS_RESPONSE cell with payload [01 01 2C] R1. But
  61. if the DNS record R2 is 1024 bytes long, it's sent in 3 DNS_RESPONSE
  62. cells, with contents: [00 04 00] R2[0:495], [00] R2[495:992], and
  63. [01] R2[992:1024] respectively.
  64. [NOTE: I'm using the length field and the is-this-the-last-cell
  65. field to allow multi-packet responses in the future. -NM]
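The splitting rule in 1.1.2 can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, and the 497-byte CONTENT limit is derived from the R2 example above (495 record bytes after the 3 leading octets of the first cell).

```python
import struct

CONTENT_MAX = 497   # assumed: relay payload (498) minus the STATUS octet
LAST_CELL = 0x01    # low bit of STATUS


def split_record(record):
    """Return the list of DNS_RESPONSE payloads carrying one DNS record."""
    if len(record) > 0xFFFF:
        raise ValueError("record does not fit a 2-octet length field")
    stream = struct.pack("!H", len(record)) + record
    chunks = [stream[i:i + CONTENT_MAX]
              for i in range(0, len(stream), CONTENT_MAX)]
    cells = []
    for index, chunk in enumerate(chunks):
        status = LAST_CELL if index == len(chunks) - 1 else 0x00
        cells.append(bytes([status]) + chunk)
    return cells


def reassemble(cells):
    """Inverse of split_record; raises ValueError on inconsistent input."""
    stream = bytearray()
    for index, cell in enumerate(cells):
        if not cell:
            raise ValueError("empty DNS_RESPONSE payload")
        stream += cell[1:]
        if cell[0] & LAST_CELL:
            if index != len(cells) - 1:
                raise ValueError("data after the final cell")
            break
    else:
        raise ValueError("final cell missing")
    if len(stream) < 2:
        raise ValueError("length field missing")
    (total,) = struct.unpack("!H", bytes(stream[:2]))
    if total != len(stream) - 2:
        raise ValueError("length field does not match data")
    return bytes(stream[2:])


r1_cells = split_record(bytes(300))    # one cell, starts with 01 01 2C
r2_cells = split_record(bytes(1024))   # three cells, as in the example above
```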
  66. AXFR and IXFR are not supported in this cell by design (see
  67. specialized tool below in section 5).
  68. 1.1.3. Matching queries to responses.
  69. DNS_BEGIN must use a non-zero, distinct StreamID. The client MUST NOT
  70. re-use the same stream ID until it has received a complete response
  71. from the server or a RELAY_END cell.
  72. The client may cancel a DNS_BEGIN request by sending a RELAY_END cell.
  73. The server may refuse to answer, or abort answering, a DNS_BEGIN cell
  74. by sending a RELAY_END cell.
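One possible way for a client to honor the stream ID rules in 1.1.3 is a small table of pending queries. The class below is a hypothetical sketch, not existing Tor code; it assumes the usual 2-octet StreamID space and leaves cell handling to the caller.

```python
class DnsStreamTable:
    """Tracks outstanding DNS_BEGIN requests by stream ID (sketch only)."""

    MAX_ID = 0xFFFF  # assumed 2-octet StreamID; 0 is never handed out

    def __init__(self):
        self._pending = {}
        self._next = 1

    def open(self, query):
        """Pick a non-zero stream ID that is not currently in use."""
        for _ in range(self.MAX_ID):
            stream_id = self._next
            self._next = self._next % self.MAX_ID + 1   # cycles 1..65535
            if stream_id not in self._pending:
                self._pending[stream_id] = query
                return stream_id
        raise RuntimeError("no free stream IDs")

    def close(self, stream_id):
        """Call on a complete response or on RELAY_END (either direction).

        Returns the stored query, or None for an unknown stream ID, in which
        case the caller should drop the unexpected cell.
        """
        return self._pending.pop(stream_id, None)


table = DnsStreamTable()
first = table.open(b"query-1")
second = table.open(b"query-2")
```

An ID only becomes reusable after `close()`, which mirrors the MUST NOT re-use rule above.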
  75. 2. Interfaces to applications
  76. DNSPort evdns - existing implementation will be updated to use
  77. DNS_BEGIN.
  78. [XXXX we should add a dig-like tool that can work over the socksport
  79. via some extension, as tor-resolve does now. -NM]
  80. 3. Limitations on DNS query
  81. Clients must only set query class to IN (INTERNET), since the only
  82. other useful class, CHAOS, is practical only for directly querying
  83. authoritative servers (the OR in this case acts as a recursive resolver).
  84. Servers MUST return REFUSED for any class other than IN.
  85. Multiple questions in a single packet are not supported and OR will
  86. respond with REFUSED as the DNS error code.
  87. All query RR types are allowed.
  88. [XXXX I originally thought about some exit policy like "basic RR types" and
  89. "all RRs", but managing such list in deployed nodes with extra directory
  90. flags outweighs the benefit. Maybe disallow ANY RR type? -OM]
  91. Client as well as OR MUST block attempts to resolve local RFC 1918,
  92. 4193, or 4291 addresses (PTR). REFUSED will be returned as DNS error
  93. code from OR. [XXXX Must they also refuse to report addresses that
  94. resolve to these? -NM]
  95. [XXX I don't think so. People often use public DNS
  96. records that map to private addresses. We can't effectively separate
  97. "truly public" records from the ones client's dnsmasq or similar DNS
  98. resolver returns. - OM]
  99. [XXX Then do you mean "must be returned as the DNS error from the OP"?]
  100. Requests for special names (.onion, .exit, .noconnect) must never be
  101. sent, and will return REFUSED.
  102. The DNS transaction ID field MUST be set to zero in all requests and
  103. replies; the stream ID field plays the same function in Tor.
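The checks in this section could be combined roughly as shown below. This is a sketch under several assumptions: the function names are hypothetical, a non-zero transaction ID is answered with REFUSED (the text does not say how to react), the list of "local" networks is one reading of the RFCs cited above, and only full-length PTR names are recognized.

```python
import ipaddress
import struct

REFUSED = 5                       # DNS RCODE REFUSED (RFC 1035)
CLASS_IN, TYPE_PTR = 1, 12
SPECIAL_SUFFIXES = (".onion", ".exit", ".noconnect")
# Assumed reading of the "local RFC 1918, 4193, or 4291" ranges.
LOCAL_NETS = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",   # RFC 1918
    "fc00::/7",                                         # RFC 4193
    "::1/128", "fe80::/10",                             # RFC 4291
)]


def read_question(packet):
    """Return (qname, qtype, qclass) of the first question in the packet."""
    labels, pos = [], 12
    while True:
        if pos >= len(packet):
            raise ValueError("truncated question")
        length = packet[pos]
        pos += 1
        if length == 0:
            break
        if length > 63 or pos + length > len(packet):
            raise ValueError("bad label")
        labels.append(packet[pos:pos + length].decode("ascii").lower())
        pos += length
    if pos + 4 > len(packet):
        raise ValueError("truncated question")
    qtype, qclass = struct.unpack("!HH", packet[pos:pos + 4])
    return ".".join(labels), qtype, qclass


def reverse_name_to_address(qname):
    """Map a full in-addr.arpa / ip6.arpa name to an address, else None."""
    parts = qname.split(".")
    try:
        if qname.endswith(".in-addr.arpa") and len(parts) == 6:
            return ipaddress.IPv4Address(".".join(reversed(parts[:4])))
        if qname.endswith(".ip6.arpa") and len(parts) == 34:
            if all(len(p) == 1 for p in parts[:32]):
                nibbles = "".join(reversed(parts[:32]))
                return ipaddress.IPv6Address(int(nibbles, 16))
    except ValueError:
        pass
    return None


def check_query(packet):
    """Return None if the query may be forwarded, else REFUSED."""
    if len(packet) < 12:
        raise ValueError("truncated header")
    txid, _flags, qdcount = struct.unpack("!HHH", packet[:6])
    if txid != 0 or qdcount != 1:
        return REFUSED
    qname, qtype, qclass = read_question(packet)
    if qclass != CLASS_IN:
        return REFUSED
    if ("." + qname).endswith(SPECIAL_SUFFIXES):
        return REFUSED
    if qtype == TYPE_PTR:
        addr = reverse_name_to_address(qname)
        if addr is not None and any(addr in net for net in LOCAL_NETS):
            return REFUSED
    return None
```

The same function could run on both the client and the OR, since the text above requires both sides to block local PTR lookups.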
  104. 4. Implementation notes
  105. Client will periodically purge incomplete DNS replies. Any unexpected
  106. DNS_RESPONSE will be dropped.
  107. AD flag must be zeroed out on client unless validation is performed.
  108. [XXXX libunbound lowlevel API, Tor+libunbound libevent loop
  109. libunbound doesn't publicly expose all the necessary parts of low-level API.
  110. It can return the received DNS packet, but not let you construct a packet
  111. and get it in wire-format, for example.
  112. Options I see:
  113. a) patch libunbound to be able to feed wire-format DNS packets and add API to
  114. obtain constructed packets instead of sending over network
  115. b) replace bufferevents for sockets in unbound with something like
  116. libevent's paired bufferevents. This means that data extracted from
  117. DNS_RESPONSE/DNS_BEGIN cells would be fed directly to some evbuffers that
  118. would be picked up by libunbound. It could possibly result in avoiding
  119. background thread of libunbound's ub_resolve_async running separate libevent
  120. loop.
  121. c) bind to some arbitrary local address like 127.1.2.3:53 and use it as
  122. forwarder for libunbound. The code there would pack/unpack the DNS packets
  123. from/to libunbound into DNS_BEGIN/DNS_RESPONSE cells. It wouldn't require
  124. modification of libunbound code, but it's not pretty either. Also the bind
  125. port must be 53 which usually requires superuser privileges.
  126. The code of libunbound is too complex for me to see outright what the
  127. best approach would be.
  128. ]
  129. 5. Separate tool for AXFR
  130. The AXFR tool will have an interface similar to tor-resolve, but will
  131. return raw DNS data.
  132. Parameters are: query domain, server IP of authoritative DNS.
  133. The tool will transfer the data through "ordinary" tunnel using RELAY_BEGIN
  134. and related cells.
  135. This design decision serves two goals:
  136. - DNS_BEGIN and DNS_RESPONSE will be simpler to implement (lower chance of
  137. bugs)
  138. - in practice it's often useful to do AXFR queries on secondary authoritative
  139. DNS servers
  140. IXFR will not be supported (infrequent corner case, can be done by manual
  141. tunnel creation over Tor if truly necessary).
  142. 6. Security implications
  143. As proposal 171 mentions, we need to mitigate circuit correlation. One solution
  144. would be keeping multiple streams to multiple exit nodes and picking one at
  145. random for DNS resolution. Another would be keeping a DNS-resolving circuit open
  146. only for a short time (e.g. 1-2 minutes). Randomly changing the circuits,
  147. however, would probably incur additional latency, since there
  148. would likely be a few cache misses on the newly selected exits.
  149. [This needs more analysis; We need to consider the possible attacks
  150. here. It would be good to have a way to tie requests to
  151. SocksPorts, perhaps? -NM]
  152. 7. TTL normalization idea
  153. This is a bit complex to implement, because it requires parsing DNS packets at
  154. the exit node.
  155. The TTL in a reply DNS packet MUST be normalized at the exit node so that a
  156. client won't learn what other clients queried. The normalization is done in the
  157. following way:
  158. - for an RR, the original TTL value received from the authoritative DNS server
  159. should be used when sending DNS_RESPONSE, clamping the value to the interval
  160. [5, 600]
  161. - this does not allow a "ghost-cache-attack", since once an RR is flushed from
  162. libunbound's cache, it must be fetched anew
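The normalization rule above amounts to a clamp applied to the original TTL rather than to the time remaining in the exit's cache. A minimal sketch, with hypothetical names and a simplified cache entry standing in for whatever the resolver library actually stores:

```python
from dataclasses import dataclass

TTL_MIN, TTL_MAX = 5, 600   # interval given above, in seconds


def normalize_ttl(original_ttl):
    """Clamp a TTL to [TTL_MIN, TTL_MAX]."""
    return max(TTL_MIN, min(TTL_MAX, original_ttl))


@dataclass
class CachedRR:
    """Simplified stand-in for a cached resource record (assumption)."""
    rdata: bytes
    original_ttl: int     # TTL as received from the authoritative server
    fetched_at: float     # time of the fetch, seconds

    def expired(self, now):
        # Cache lifetime still follows the real TTL.
        return now - self.fetched_at >= self.original_ttl

    def ttl_for_response(self):
        # Deliberately independent of how long the RR has been cached, so
        # the reply does not reveal that another client asked earlier.
        return normalize_ttl(self.original_ttl)


rr = CachedRR(rdata=b"\x3f\xf5\xd9\x70", original_ttl=3600, fetched_at=0.0)
```

A client that sees a counted-down TTL could infer when the record was first fetched; returning the same clamped value every time removes that signal.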
  163. 8. DNSSEC notes
  164. 8.1. Where to do the resolution?
  165. DNSSEC is part of the DNS protocol, and the most appropriate place for a DNSSEC
  166. API would probably be in OS libraries (e.g. libc). However, it will
  167. probably take time until that becomes widespread.
  168. On Tor's side (as opposed to the application's side), DNSSEC will provide
  169. protection against DNS cache-poisoning attacks (provided that the exit is not
  170. malicious itself, but it still reduces the attack surface).
  171. 8.2. Round trips and serialization
  172. The following are examples of resolving two A records. The one for
  173. addons.mozilla.org is an example of a "common" RR without CNAME/DNAME; the
  174. other, for www.gov.cn, is an extreme example chained through 5 CNAMEs and 3 TLDs.
  175. The examples below show resolution that started with an empty DNS
  176. cache.
  177. Note that multiple queries are made by libunbound as it tries to adjust for
  178. network latency. A "Standard query response" below that does not list an
  179. RR type is a negative NOERROR reply with NSEC/NSEC3 (usually a reply to a DS
  180. query).
  181. The DNS cache plays a great role: once DS/DNSKEY for the root and a
  182. TLD is cached, at most 3 records usually need to be fetched for a record
  183. that does not utilize CNAME/DNAME (3 roundtrips for DS, DNSKEY and the
  184. record itself if there are no zone cuts below).
  185. Query for addons.mozilla.org, 6 roundtrips (not counting retries):
  186. Standard query A addons.mozilla.org
  187. Standard query A addons.mozilla.org
  188. Standard query A addons.mozilla.org
  189. Standard query A addons.mozilla.org
  190. Standard query A addons.mozilla.org
  191. Standard query response A 63.245.217.112 RRSIG
  192. Standard query response A 63.245.217.112 RRSIG
  193. Standard query response A 63.245.217.112 RRSIG
  194. Standard query A addons.mozilla.org
  195. Standard query response A 63.245.217.112 RRSIG
  196. Standard query response A 63.245.217.112 RRSIG
  197. Standard query A addons.mozilla.org
  198. Standard query response A 63.245.217.112 RRSIG
  199. Standard query response A 63.245.217.112 RRSIG
  200. Standard query DNSKEY <Root>
  201. Standard query DNSKEY <Root>
  202. Standard query response DNSKEY DNSKEY RRSIG
  203. Standard query response DNSKEY DNSKEY RRSIG
  204. Standard query DS org
  205. Standard query response DS DS RRSIG
  206. Standard query DNSKEY org
  207. Standard query response DNSKEY DNSKEY DNSKEY DNSKEY RRSIG RRSIG
  208. Standard query DS mozilla.org
  209. Standard query response DS RRSIG
  210. Standard query DNSKEY mozilla.org
  211. Standard query response DNSKEY DNSKEY DNSKEY RRSIG RRSIG
  212. Query for www.gov.cn, 16 roundtrips (not counting retries):
  213. Standard query A www.gov.cn
  214. Standard query A www.gov.cn
  215. Standard query A www.gov.cn
  216. Standard query A www.gov.cn
  217. Standard query A www.gov.cn
  218. Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
  219. Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
  220. Standard query A www.gov.cn
  221. Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
  222. Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
  223. Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
  224. Standard query A www.gov.cn
  225. Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
  226. Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
  227. Standard query A www.gov.chinacache.net
  228. Standard query response CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
  229. Standard query A www.gov.cncssr.chinacache.net
  230. Standard query response CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
  231. Standard query A www.gov.foreign.ccgslb.com
  232. Standard query response CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119
  233. Standard query A wac.0b51.edgecastcdn.net
  234. Standard query response CNAME gp1.wac.v2cdn.net A 68.232.35.119
  235. Standard query A gp1.wac.v2cdn.net
  236. Standard query response A 68.232.35.119
  237. Standard query DNSKEY <Root>
  238. Standard query response DNSKEY DNSKEY RRSIG
  239. Standard query DS cn
  240. Standard query response
  241. Standard query DS net
  242. Standard query response DS RRSIG
  243. Standard query DNSKEY net
  244. Standard query response DNSKEY DNSKEY RRSIG
  245. Standard query DS chinacache.net
  246. Standard query response
  247. Standard query DS com
  248. Standard query response DS RRSIG
  249. Standard query DNSKEY com
  250. Standard query response DNSKEY DNSKEY RRSIG
  251. Standard query DS ccgslb.com
  252. Standard query response
  253. Standard query DS edgecastcdn.net
  254. Standard query response
  255. Standard query DS v2cdn.net
  256. Standard query response
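The roundtrip figures quoted for the traces above can be reproduced by counting distinct (type, name) queries and treating repeated queries as retries. The helper below is a hypothetical sketch of that counting; the embedded trace is an abbreviated copy of the addons.mozilla.org listing with most retries and duplicate responses trimmed.

```python
MOZILLA_TRACE = """\
Standard query A addons.mozilla.org
Standard query A addons.mozilla.org
Standard query response A 63.245.217.112 RRSIG
Standard query DNSKEY <Root>
Standard query DNSKEY <Root>
Standard query response DNSKEY DNSKEY RRSIG
Standard query DS org
Standard query response DS DS RRSIG
Standard query DNSKEY org
Standard query response DNSKEY DNSKEY DNSKEY DNSKEY RRSIG RRSIG
Standard query DS mozilla.org
Standard query response DS RRSIG
Standard query DNSKEY mozilla.org
Standard query response DNSKEY DNSKEY DNSKEY RRSIG RRSIG
"""


def count_roundtrips(trace):
    """Count distinct queries in a trace, ignoring responses and retries."""
    seen = set()
    for line in trace.splitlines():
        words = line.split()
        # Query lines look like: "Standard query <TYPE> <NAME>".
        if len(words) != 4 or words[:2] != ["Standard", "query"]:
            continue
        if words[2] == "response":
            continue
        seen.add((words[2], words[3]))
    return len(seen)


roundtrips = count_roundtrips(MOZILLA_TRACE)   # 6 for this trace
```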
  257. An obvious idea to avoid so many roundtrips is to serialize them together.
  258. There has been an attempt to standardize such "DNSSEC stapling" [1]; however,
  259. it is incomplete for the general case, mainly due to various intricacies:
  260. proofs of non-existence, NSEC3 opt-out zones, and TTL handling (see RFC 4035
  261. section 5).
  262. References:
  263. [1] https://www.ietf.org/mail-archive/web/dane/current/msg02823.html