123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111 |
- Reference counting in pnfs:
- ==========================
- The are several inter-related caches. We have layouts which can
- reference multiple devices, each of which can reference multiple data servers.
- Each data server can be referenced by multiple devices. Each device
- can be referenced by multiple layouts. To keep all of this straight,
- we need to reference count.
- struct pnfs_layout_hdr
- ----------------------
- The on-the-wire command LAYOUTGET corresponds to struct
- pnfs_layout_segment, usually referred to by the variable name lseg.
- Each nfs_inode may hold a pointer to a cache of these layout
- segments in nfsi->layout, of type struct pnfs_layout_hdr.
- We reference the header for the inode pointing to it, across each
- outstanding RPC call that references it (LAYOUTGET, LAYOUTRETURN,
- LAYOUTCOMMIT), and for each lseg held within.
- Each header is also (when non-empty) put on a list associated with
- struct nfs_client (cl_layouts). Being put on this list does not bump
- the reference count, as the layout is kept around by the lseg that
- keeps it in the list.
- deviceid_cache
- --------------
- lsegs reference device ids, which are resolved per nfs_client and
- layout driver type. The device ids are held in a RCU cache (struct
- nfs4_deviceid_cache). The cache itself is referenced across each
- mount. The entries (struct nfs4_deviceid) themselves are held across
- the lifetime of each lseg referencing them.
- RCU is used because the deviceid is basically a write once, read many
- data structure. The hlist size of 32 buckets needs better
- justification, but seems reasonable given that we can have multiple
- deviceid's per filesystem, and multiple filesystems per nfs_client.
- The hash code is copied from the nfsd code base. A discussion of
- hashing and variations of this algorithm can be found at:
- http://groups.google.com/group/comp.lang.c/browse_thread/thread/9522965e2b8d3809
- data server cache
- -----------------
- file driver devices refer to data servers, which are kept in a module
- level cache. Its reference is held over the lifetime of the deviceid
- pointing to it.
- lseg
- ----
- lseg maintains an extra reference corresponding to the NFS_LSEG_VALID
- bit which holds it in the pnfs_layout_hdr's list. When the final lseg
- is removed from the pnfs_layout_hdr's list, the NFS_LAYOUT_DESTROYED
- bit is set, preventing any new lsegs from being added.
- layout drivers
- --------------
- PNFS utilizes what is called layout drivers. The STD defines 4 basic
- layout types: "files", "objects", "blocks", and "flexfiles". For each
- of these types there is a layout-driver with a common function-vectors
- table which are called by the nfs-client pnfs-core to implement the
- different layout types.
- Files-layout-driver code is in: fs/nfs/filelayout/.. directory
- Objects-layout-driver code is in: fs/nfs/objlayout/.. directory
- Blocks-layout-driver code is in: fs/nfs/blocklayout/.. directory
- Flexfiles-layout-driver code is in: fs/nfs/flexfilelayout/.. directory
- objects-layout setup
- --------------------
- As part of the full STD implementation the objlayoutdriver.ko needs, at times,
- to automatically login to yet undiscovered iscsi/osd devices. For this the
- driver makes up-calles to a user-mode script called *osd_login*
- The path_name of the script to use is by default:
- /sbin/osd_login.
- This name can be overridden by the Kernel module parameter:
- objlayoutdriver.osd_login_prog
- If Kernel does not find the osd_login_prog path it will zero it out
- and will not attempt farther logins. An admin can then write new value
- to the objlayoutdriver.osd_login_prog Kernel parameter to re-enable it.
- The /sbin/osd_login is part of the nfs-utils package, and should usually
- be installed on distributions that support this Kernel version.
- The API to the login script is as follows:
- Usage: $0 -u <URI> -o <OSDNAME> -s <SYSTEMID>
- Options:
- -u target uri e.g. iscsi://<ip>:<port>
- (always exists)
- (More protocols can be defined in the future.
- The client does not interpret this string it is
- passed unchanged as received from the Server)
- -o osdname of the requested target OSD
- (Might be empty)
- (A string which denotes the OSD name, there is a
- limit of 64 chars on this string)
- -s systemid of the requested target OSD
- (Might be empty)
- (This string, if not empty is always an hex
- representation of the 20 bytes osd_system_id)
- blocks-layout setup
- -------------------
- TODO: Document the setup needs of the blocks layout driver
|