
This document gives a brief introduction to the caching
mechanisms in the sunrpc layer that are used, in particular,
for NFS authentication.

CACHES
======

The caching replaces the old exports table and allows for
a wide variety of values to be cached.

There are a number of caches that are similar in structure though
quite possibly very different in content and use. There is a corpus
of common code for managing these caches.

Examples of caches that are likely to be needed are:
  - mapping from IP address to client name
  - mapping from client name and filesystem to export options
  - mapping from UID to list of GIDs, to work around NFS's limitation
    of 16 gids.
  - mappings between local UID/GID and remote UID/GID for sites that
    do not have uniform uid assignment
  - mapping from network identity to public key for crypto authentication.

The common code handles such things as:
  - general cache lookup with correct locking
  - supporting 'NEGATIVE' as well as positive entries
  - allowing an EXPIRED time on cache items, and removing
    items after they expire and are no longer in use.
  - making requests to user-space to fill in cache entries
  - allowing user-space to directly set entries in the cache
  - delaying RPC requests that depend on as-yet incomplete
    cache entries, and replaying those requests when the cache entry
    is complete.
  - cleaning out old entries as they expire.

Creating a Cache
----------------

1/ A cache needs a datum to store. This is in the form of a
   structure definition that must contain a
     struct cache_head
   as an element, usually the first.
   It will also contain a key and some content.
   Each cache element is reference counted and contains
   expiry and update times for use in cache management.
2/ A cache needs a "cache_detail" structure that
   describes the cache. This stores the hash table, some
   parameters for cache management, and some operations detailing how
   to work with particular cache items.
   The operations required are:
     struct cache_head *alloc(void)
       This simply allocates appropriate memory and returns
       a pointer to the cache_head embedded within the
       structure.
     void cache_put(struct kref *)
       This is called when the last reference to an item is
       dropped. The pointer passed is to the 'ref' field
       in the cache_head. cache_put should release any
       references created by 'cache_init' and, if CACHE_VALID
       is set, any references created by cache_update.
       It should then release the memory allocated by
       'alloc'.
     int match(struct cache_head *orig, struct cache_head *new)
       Test if the keys in the two structures match. Return
       1 if they do, 0 if they don't.
     void init(struct cache_head *orig, struct cache_head *new)
       Set the 'key' fields in 'new' from 'orig'. This may
       include taking references to shared objects.
     void update(struct cache_head *orig, struct cache_head *new)
       Set the 'content' fields in 'new' from 'orig'.
     int cache_show(struct seq_file *m, struct cache_detail *cd,
                    struct cache_head *h)
       Optional. Used to provide a /proc file that lists the
       contents of a cache. This should show one item,
       usually on just one line.
     int cache_request(struct cache_detail *cd, struct cache_head *h,
                       char **bpp, int *blen)
       Format a request to be sent to user-space for an item
       to be instantiated. *bpp is a buffer of size *blen.
       *bpp should be moved forward over the encoded message,
       and *blen should be reduced to show how much free
       space remains. Return 0 on success or <0 if not
       enough room or other problem.
     int cache_parse(struct cache_detail *cd, char *buf, int len)
       A message from user space has arrived to fill out a
       cache entry. It is in 'buf' of length 'len'.
       cache_parse should parse this, find the item in the
       cache with sunrpc_cache_lookup, and update the item
       with sunrpc_cache_update.
3/ A cache needs to be registered using cache_register(). This
   includes it on a list of caches that will be regularly
   cleaned to discard old data.
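
The datum from step 1/ and the alloc, match, init and update operations
from step 2/ can be modelled in user space. This is only a sketch, not
kernel code: the 'struct cache_head' here is a stand-in with a single
flags word, and 'struct ip_map_demo', its fields, and the demo_*
functions are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the kernel's struct cache_head (assumption: the real
 * structure also carries the reference count and expiry/update times). */
struct cache_head {
	unsigned long flags;
};

/* Recover the enclosing item from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical datum: maps an address string (key) to a client name (content). */
struct ip_map_demo {
	struct cache_head h;	/* embedded head, placed first */
	char addr[40];		/* key */
	char client[64];	/* content */
};

/* alloc: allocate a zeroed item and return its embedded cache_head. */
struct cache_head *demo_alloc(void)
{
	struct ip_map_demo *im = calloc(1, sizeof(*im));

	return im ? &im->h : NULL;
}

/* match: return 1 if the keys of the two items are equal, 0 otherwise. */
int demo_match(struct cache_head *orig, struct cache_head *new)
{
	struct ip_map_demo *o, *n;

	assert(orig && new);
	o = container_of(orig, struct ip_map_demo, h);
	n = container_of(new, struct ip_map_demo, h);
	return strcmp(o->addr, n->addr) == 0;
}

/* init: set the key fields in 'new' from 'orig'. */
void demo_init(struct cache_head *orig, struct cache_head *new)
{
	struct ip_map_demo *o = container_of(orig, struct ip_map_demo, h);
	struct ip_map_demo *n = container_of(new, struct ip_map_demo, h);

	memcpy(n->addr, o->addr, sizeof(n->addr));
}

/* update: set the content fields in 'new' from 'orig'. */
void demo_update(struct cache_head *orig, struct cache_head *new)
{
	struct ip_map_demo *o = container_of(orig, struct ip_map_demo, h);
	struct ip_map_demo *n = container_of(new, struct ip_map_demo, h);

	memcpy(n->client, o->client, sizeof(n->client));
}
```

Because every operation receives only a cache_head pointer, the common
code never needs to know the layout of the enclosing structure; each
cache recovers its own type with container_of.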

Using a cache
-------------

To find a value in a cache, call sunrpc_cache_lookup passing a pointer
to the cache_head in a sample item with the 'key' fields filled in.
This will be passed to ->match to identify the target entry. If no
entry is found, a new entry will be created, added to the cache, and
marked as not containing valid data.

The item returned is typically passed to cache_check which will check
if the data is valid, and may initiate an upcall to get fresh data.
cache_check will return -ENOENT if the entry is negative or if an
upcall is needed but not possible, -EAGAIN if an upcall is pending,
or 0 if the data is valid.
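
The three results of cache_check suggest a three-way decision in the
caller. The following user-space sketch only illustrates that mapping;
the enum and the function name are hypothetical, and a real caller
would act on the result directly rather than translate it.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical actions a caller might take after cache_check. */
enum lookup_action {
	USE_ENTRY,	/* data is valid: proceed with the request */
	DENY,		/* negative entry, or upcall not possible */
	DEFER		/* upcall pending: retry once the entry is filled in */
};

/* Map a cache_check-style return value onto an action. */
enum lookup_action action_for_check_result(int rc)
{
	assert(rc <= 0);	/* the text lists only 0 or negative results */

	switch (rc) {
	case 0:
		return USE_ENTRY;
	case -EAGAIN:
		return DEFER;
	case -ENOENT:
	default:
		/* Assumption: any unlisted error is treated as a refusal. */
		return DENY;
	}
}
```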

cache_check can be passed a "struct cache_req *". This structure is
typically embedded in the actual request and can be used to create a
deferred copy of the request (struct cache_deferred_req). This is
done when the found cache item is not uptodate, but there is reason to
believe that userspace might provide information soon. When the cache
item does become valid, the deferred copy of the request will be
revisited (->revisit). It is expected that this method will
reschedule the request for processing.

The value returned by sunrpc_cache_lookup can also be passed to
sunrpc_cache_update to set the content for the item. A second item is
passed which should hold the content. If the item found by _lookup
has valid data, then it is discarded and a new item is created. This
saves any user of an item from worrying about content changing while
it is being inspected. If the item found by _lookup does not contain
valid data, then the content is copied across and CACHE_VALID is set.

Populating a cache
------------------

Each cache has a name, and when the cache is registered, a directory
with that name is created in /proc/net/rpc.

This directory contains a file called 'channel' which is a channel
for communicating between kernel and user for populating the cache.
This directory may later contain other files for interacting
with the cache.

The 'channel' works a bit like a datagram socket. Each 'write' is
passed as a whole to the cache for parsing and interpretation.
Each cache can treat the write requests differently, but it is
expected that a message written will contain:
  - a key
  - an expiry time
  - a content.
with the intention that an item in the cache with the given key
should be created or updated to have the given content, and the
expiry time should be set on that item.
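
As a sketch of what such a message could look like, the function below
builds a single line holding a key, an expiry time and a content. The
field order, the space separator, the numeric expiry and the name
format_update are all assumptions made for illustration; each cache
defines its own message format. Fields that would need quoting are
rejected to keep the example short.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "key expiry content\n" in buf (hypothetical layout).
 * Returns the message length, or -1 if a field contains a character
 * that would need quoting or if the buffer is too small.
 */
int format_update(char *buf, size_t len, const char *key,
		  long expiry, const char *content)
{
	int n;

	assert(buf && key && content);

	/* Assumption: callers pass fields free of space, newline and '\'. */
	if (strpbrk(key, " \n\\") || strpbrk(content, " \n\\"))
		return -1;

	n = snprintf(buf, len, "%s %ld %s\n", key, expiry, content);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

The whole message would then be handed to the channel in one write,
since each write is parsed as a unit.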

Reading from a channel is a bit more interesting. When a cache
lookup fails, or when it succeeds but finds an entry that may soon
expire, a request is lodged for that cache item to be updated by
user-space. These requests appear in the channel file.

Successive reads will return successive requests.
If there are no more requests to return, read will return EOF, but a
select or poll for read will block waiting for another request to be
added.

Thus a user-space helper is likely to:
  open the channel.
    select for readable
    read a request
    write a response
  loop.
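
One iteration of that loop might look like the sketch below. It uses
poll rather than select, and it takes separate read and write
descriptors so that it can be exercised with pipes; for a real channel
both would be the same descriptor. The names serve_one, answer_fn and
echo_answer are hypothetical, and echo_answer is only a placeholder
for whatever a real helper does to resolve a request.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical callback: fill 'resp' for one request, return its length
 * or a negative value on failure. */
typedef int (*answer_fn)(const char *req, int req_len,
			 char *resp, int resp_cap);

/* Placeholder answer that copies the request back unchanged. */
int echo_answer(const char *req, int req_len, char *resp, int resp_cap)
{
	int n = req_len < resp_cap ? req_len : resp_cap;

	memcpy(resp, req, (size_t)n);
	return n;
}

/*
 * Wait until in_fd is readable, read one request and write one response.
 * Returns 1 if a request was answered, 0 if read reported EOF (no
 * request available), or -1 on error.
 */
int serve_one(int in_fd, int out_fd, answer_fn answer)
{
	char req[4096], resp[4096];
	struct pollfd pfd = { .fd = in_fd, .events = POLLIN };
	ssize_t n;
	int len;

	assert(answer);

	if (poll(&pfd, 1, -1) < 0)
		return -1;

	n = read(in_fd, req, sizeof(req));
	if (n < 0)
		return -1;
	if (n == 0)
		return 0;

	len = answer(req, (int)n, resp, (int)sizeof(resp));
	if (len < 0)
		return -1;

	/* Each write is taken as one whole message, so write it in one call. */
	if (write(out_fd, resp, (size_t)len) != len)
		return -1;
	return 1;
}
```

A helper would open the channel once and then call serve_one(fd, fd,
...) repeatedly until it returns an error.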

If it dies and needs to be restarted, any requests that have not been
answered will still appear in the file and will be read by the new
instance of the helper.

Each cache should define a "cache_parse" method which takes a message
written from user-space and processes it. It should return an error
(which propagates back to the write syscall) or 0.

Each cache should also define a "cache_request" method which
takes a cache item and encodes a request into the buffer
provided.
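
The buffer convention used by cache_request (advance *bpp, reduce
*blen) can be shown with a small user-space helper. The name
add_word_demo and its separator argument are hypothetical, and the
sketch assumes the word needs no quoting.

```c
#include <assert.h>
#include <string.h>

/*
 * Append 'word' followed by the separator 'sep' at *bpp.
 * On success, move *bpp past the new text, reduce *blen by the number
 * of bytes used and return 0. Return -1, leaving both unchanged, if
 * there is not enough room.
 */
int add_word_demo(char **bpp, int *blen, const char *word, char sep)
{
	size_t wlen;

	assert(bpp && *bpp && blen && word);

	wlen = strlen(word);
	if (*blen < 0 || wlen + 1 > (size_t)*blen)
		return -1;

	memcpy(*bpp, word, wlen);
	(*bpp)[wlen] = sep;
	*bpp += wlen + 1;
	*blen -= (int)(wlen + 1);
	return 0;
}
```

A request method built this way would add each field with a space
separator and the last one with a newline, returning a negative value
as soon as any field does not fit.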

Note: If a cache has no active readers on the channel, and has had no
active readers for more than 60 seconds, further requests will not be
added to the channel but instead all lookups that do not find a valid
entry will fail. This is partly for backward compatibility: The
previous nfs exports table was deemed to be authoritative and a
failed lookup meant a definite 'no'.

request/response format
-----------------------

While each cache is free to use its own format for requests
and responses over the channel, the following is recommended as
appropriate and support routines are available to help:
Each request or response record should be printable ASCII
with precisely one newline character which should be at the end.
Fields within the record should be separated by spaces, normally one.
If spaces, newlines, or nul characters are needed in a field they
must be quoted. Two mechanisms are available:
1/ If a field begins '\x' then it must contain an even number of
   hex digits, and pairs of these digits provide the bytes in the
   field.
2/ Otherwise a \ in the field must be followed by 3 octal digits
   which give the code for a byte. Other characters are treated
   as themselves. At the very least, space, newline, nul, and
   '\' must be quoted in this way.
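
Both quoting mechanisms can be decoded with a short routine. The
sketch below is a user-space illustration and not the kernel's support
routine; the name unquote_field is hypothetical, and it assumes the
field has already been split from the record at a space or newline.

```c
#include <assert.h>

/* Value of one hex digit, or -1 if 'c' is not a hex digit. */
static int hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode one nul-terminated field into 'out' (capacity 'cap' bytes).
 * Returns the number of decoded bytes, or -1 if the field is malformed
 * or does not fit.
 */
int unquote_field(const char *field, unsigned char *out, int cap)
{
	int n = 0;

	assert(field && out);

	if (field[0] == '\\' && field[1] == 'x') {
		/* Mechanism 1: the rest of the field is pairs of hex digits. */
		const char *p = field + 2;

		while (*p) {
			int hi = hexval(p[0]);
			int lo = hi < 0 ? -1 : hexval(p[1]);

			if (hi < 0 || lo < 0 || n >= cap)
				return -1;
			out[n++] = (unsigned char)(hi * 16 + lo);
			p += 2;
		}
		return n;
	}

	/* Mechanism 2: '\' introduces exactly three octal digits. */
	while (*field) {
		if (n >= cap)
			return -1;
		if (*field == '\\') {
			int i, v = 0;

			for (i = 1; i <= 3; i++) {
				if (field[i] < '0' || field[i] > '7')
					return -1;
				v = v * 8 + (field[i] - '0');
			}
			if (v > 255)
				return -1;
			out[n++] = (unsigned char)v;
			field += 4;
		} else {
			out[n++] = (unsigned char)*field++;
		}
	}
	return n;
}
```

For example, the fields '\x6120' and 'a\040' both decode to the two
bytes 'a' and space.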