- NOTE:
- This is one of the technical documents describing a component of
- Coda -- this document describes the client kernel-Venus interface.
- For more information:
- http://www.coda.cs.cmu.edu
- For user level software needed to run Coda:
- ftp://ftp.coda.cs.cmu.edu
- To run Coda you need to get a user level cache manager for the client,
- named Venus, as well as tools to manipulate ACLs, to log in, etc. The
- client needs to have the Coda filesystem selected in the kernel
- configuration.
- The server needs a user level server and at present does not depend on
- kernel support.
- The Venus kernel interface
- Peter J. Braam
- v1.0, Nov 9, 1997
- This document describes the communication between Venus and kernel
- level filesystem code needed for the operation of the Coda file system.
- This document version is meant to describe the current interface
- (version 1.0) as well as improvements we envisage.
- ______________________________________________________________________
- Table of Contents
- 1. Introduction
- 2. Servicing Coda filesystem calls
- 3. The message layer
- 3.1 Implementation details
- 4. The interface at the call level
- 4.1 Data structures shared by the kernel and Venus
- 4.2 The pioctl interface
- 4.3 root
- 4.4 lookup
- 4.5 getattr
- 4.6 setattr
- 4.7 access
- 4.8 create
- 4.9 mkdir
- 4.10 link
- 4.11 symlink
- 4.12 remove
- 4.13 rmdir
- 4.14 readlink
- 4.15 open
- 4.16 close
- 4.17 ioctl
- 4.18 rename
- 4.19 readdir
- 4.20 vget
- 4.21 fsync
- 4.22 inactive
- 4.23 rdwr
- 4.24 odymount
- 4.25 ody_lookup
- 4.26 ody_expand
- 4.27 prefetch
- 4.28 signal
- 5. The minicache and downcalls
- 5.1 INVALIDATE
- 5.2 FLUSH
- 5.3 PURGEUSER
- 5.4 ZAPFILE
- 5.5 ZAPDIR
- 5.6 ZAPVNODE
- 5.7 PURGEFID
- 5.8 REPLACE
- 6. Initialization and cleanup
- 6.1 Requirements
- ______________________________________________________________________
- 1. Introduction
- A key component in the Coda Distributed File System is the cache
- manager, Venus.
- When processes on a Coda enabled system access files in the Coda
- filesystem, requests are directed at the filesystem layer in the
- operating system. The operating system will communicate with Venus to
- service the request for the process. Venus manages a persistent
- client cache and makes remote procedure calls to Coda file servers and
- related servers (such as authentication servers) to service the
- requests it receives from the operating system. When Venus has
- serviced a request it replies to the operating system with appropriate
- return codes, and other data related to the request. Optionally the
- kernel support for Coda may maintain a minicache of recently processed
- requests to limit the number of interactions with Venus. Venus
- possesses the facility to inform the kernel when elements from its
- minicache are no longer valid.
- This document describes precisely this communication between the
- kernel and Venus. The definitions of so-called upcalls and downcalls
- will be given with the format of the data they handle. We shall also
- describe the semantic invariants resulting from the calls.
- Historically Coda was implemented in a BSD file system in Mach 2.6.
- The interface between the kernel and Venus is very similar to the BSD
- VFS interface. Similar functionality is provided, and the format of
- the parameters and returned data is very similar to the BSD VFS. This
- leads to an almost natural environment for implementing a kernel-level
- filesystem driver for Coda in a BSD system. However, other operating
- systems such as Linux and Windows 95 and NT have virtual filesystems
- with different interfaces.
- To implement Coda on these systems some reverse engineering of the
- Venus/Kernel protocol is necessary. Also it came to light that other
- systems could profit significantly from certain small optimizations
- and modifications to the protocol. To facilitate this work as well as
- to make future ports easier, communication between Venus and the
- kernel should be documented in great detail. This is the aim of this
- document.
- 2. Servicing Coda filesystem calls
- The servicing of a request for a Coda file system service originates in
- a process P which accesses a Coda file. It makes a system call which
- traps to the OS kernel. Examples of such calls trapping to the kernel
- are read, write, open, close, create, mkdir, rmdir, chmod in a Unix
- context. Similar calls exist in the Win32 environment, such as
- CreateFile.
- Generally the operating system handles the request in a virtual
- filesystem (VFS) layer, which is named I/O Manager in NT and IFS
- manager in Windows 95. The VFS is responsible for partial processing
- of the request and for locating the specific filesystem(s) which will
- service parts of the request. Usually the information in the path
- assists in locating the correct FS drivers. Sometimes after extensive
- pre-processing, the VFS starts invoking exported routines in the FS
- driver. This is the point where the FS specific processing of the
- request starts, and here the Coda specific kernel code comes into
- play.
- The FS layer for Coda must expose and implement several interfaces.
- First and foremost the VFS must be able to make all necessary calls to
- the Coda FS layer, so the Coda FS driver must expose the VFS interface
- as applicable in the operating system. These differ very significantly
- among operating systems, but share features such as facilities to
- read/write and create and remove objects. The Coda FS layer services
- such VFS requests by invoking one or more well defined services
- offered by the cache manager Venus. When the replies from Venus have
- come back to the FS driver, servicing of the VFS call continues and
- finishes with a reply to the kernel's VFS. Finally the VFS layer
- returns to the process.
- As a result of this design a basic interface exposed by the FS driver
- must allow Venus to manage message traffic. In particular Venus must
- be able to retrieve and place messages and to be notified of the
- arrival of a new message. The notification must be through a mechanism
- which does not block Venus since Venus must attend to other tasks even
- when no messages are waiting or being processed.
- (Figure: Interfaces of the Coda FS Driver)
- Furthermore the FS layer provides for a special path of communication
- between a user process and Venus, called the pioctl interface. The
- pioctl interface is used for Coda specific services, such as
- requesting detailed information about the persistent cache managed by
- Venus. Here the involvement of the kernel is minimal. It identifies
- the calling process and passes the information on to Venus. When
- Venus replies the response is passed back to the caller in unmodified
- form.
- Finally Venus allows the kernel FS driver to cache the results from
- certain services. This is done to avoid excessive context switches
- and results in an efficient system. However, Venus may acquire
- information, for example from the network, which implies that cached
- information must be flushed or replaced. Venus then makes a downcall
- to the Coda FS layer to request flushes or updates in the cache. The
- kernel FS driver handles such requests synchronously.
- Among these interfaces the VFS interface and the facility to place,
- receive and be notified of messages are platform specific. We will
- not go into the calls exported to the VFS layer but we will state the
- requirements of the message exchange mechanism.
- 3. The message layer
- At the lowest level the communication between Venus and the FS driver
- proceeds through messages. The synchronization between processes
- requesting Coda file service and Venus relies on blocking and waking
- up processes. The Coda FS driver processes VFS- and pioctl-requests
- on behalf of a process P, creates messages for Venus, awaits replies
- and finally returns to the caller. The implementation of the exchange
- of messages is platform specific, but the semantics have (so far)
- appeared to be generally applicable. Data buffers are created by the
- FS Driver in kernel memory on behalf of P and copied to user memory in
- Venus.
- The FS Driver while servicing P makes upcalls to Venus. Such an
- upcall is dispatched to Venus by creating a message structure. The
- structure contains the identification of P, the message sequence
- number, the size of the request and a pointer to the data in kernel
- memory for the request. Since the data buffer is re-used to hold the
- reply from Venus, there is a field for the size of the reply. A flags
- field is used in the message to precisely record the status of the
- message. Additional platform dependent structures involve pointers to
- determine the position of the message on queues and pointers to
- synchronization objects. In the upcall routine the message structure
- is filled in, flags are set to 0, and it is placed on the pending
- queue. The routine calling upcall is responsible for allocating the
- data buffer; its structure will be described in the next section.
- A facility must exist to notify Venus that the message has been
- created, and it must be implemented using available synchronization objects in
- the OS. This notification is done in the upcall context of the process
- P. When the message is on the pending queue, process P cannot proceed
- in upcall. The (kernel mode) processing of P in the filesystem
- request routine must be suspended until Venus has replied. Therefore
- the calling thread in P is blocked in upcall. A pointer in the
- message structure will locate the synchronization object on which P is
- sleeping.
- Venus detects the notification that a message has arrived, and the FS
- driver allows Venus to retrieve the message with a getmsg_from_kernel
- call. This action finishes in the kernel by putting the message on the
- queue of processing messages and setting flags to READ. Venus is
- passed the contents of the data buffer. The getmsg_from_kernel call
- now returns and Venus processes the request.
- At some later point the FS driver receives a message from Venus,
- namely when Venus calls sendmsg_to_kernel. At this moment the Coda FS
- driver looks at the contents of the message and decides if:
- o the message is a reply for a suspended thread P. If so it removes
- the message from the processing queue and marks the message as
- WRITTEN. Finally, the FS driver unblocks P (still in the kernel
- mode context of Venus) and the sendmsg_to_kernel call returns to
- Venus. The process P will be scheduled at some point and continues
- processing its upcall with the data buffer replaced with the reply
- from Venus.
- o the message is a downcall. A downcall is a request from Venus to
- the FS Driver. The FS driver processes the request immediately
- (usually a cache eviction or replacement) and when it finishes
- sendmsg_to_kernel returns.
- Now P awakes and continues processing upcall. There are some
- subtleties to take account of. First P will determine if it was woken
- up in upcall by a signal from some other source (for example an
- attempt to terminate P) or as is normally the case by Venus in its
- sendmsg_to_kernel call. In the normal case, the upcall routine will
- deallocate the message structure and return. The FS routine can proceed
- with its processing.
- (Figure: Sleeping and IPC arrangements)
- In case P is woken up by a signal and not by Venus, it will first look
- at the flags field. If the message is not yet READ, the process P can
- handle its signal without notifying Venus. If Venus has READ, and
- the request should not be processed, P can send Venus a signal message
- to indicate that it should disregard the previous message. Such
- signals are put in the queue at the head, and read first by Venus. If
- the message is already marked as WRITTEN it is too late to stop the
- processing. The VFS routine will now continue. (-- If a VFS request
- involves more than one upcall, this can lead to complicated state; an
- extra field "handle_signals" could be added in the message structure
- to indicate that points of no return have been passed.--)
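- The signal handling rules above can be summarized as a small decision
- function. This is only a sketch: the names msg_state, sig_action and
- on_signal, and the numeric values given to READ and WRITTEN, are
- assumptions made for illustration and are not taken from any driver.

```c
#include <assert.h>

/* Hypothetical message states mirroring the flags values in the text. */
enum msg_state { MSG_PENDING = 0, MSG_READ = 1, MSG_WRITTEN = 2 };

/* Hypothetical actions P can take when a signal wakes it in upcall. */
enum sig_action {
    ACT_DEQUEUE,      /* Venus never saw the request: just drop it */
    ACT_SEND_SIGNAL,  /* Venus has READ it: queue a signal message */
    ACT_CONTINUE      /* already WRITTEN: too late, use the reply */
};

/* Decide what P does after being woken by a signal instead of Venus. */
static enum sig_action on_signal(enum msg_state flags)
{
    assert(flags == MSG_PENDING || flags == MSG_READ ||
           flags == MSG_WRITTEN);
    switch (flags) {
    case MSG_PENDING:
        return ACT_DEQUEUE;
    case MSG_READ:
        return ACT_SEND_SIGNAL;
    default:
        return ACT_CONTINUE;
    }
}
```

- A driver following this sketch would call on_signal with the flags of
- the interrupted message and act on the returned value.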
- 3.1. Implementation details
- The Unix implementation of this mechanism uses a character device
- associated with Coda. Venus retrieves messages by doing a read on the
- device, replies are sent with a write, and notification is through the
- select system call on the file descriptor for the device. The process
- P is kept waiting on an interruptible wait queue object.
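- On the Venus side, the Unix arrangement amounts to a select followed
- by a read. The sketch below assumes only POSIX select and read; the
- function name get_msg and the timeout parameter are hypothetical, and
- any readable descriptor can stand in for the Coda character device.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for a message on fd, then read it into buf.
 * Returns the number of bytes read, 0 on timeout (or end of file),
 * or -1 on error.
 */
static ssize_t get_msg(int fd, void *buf, size_t len, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int ready;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    ready = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;   /* nothing pending; the caller can do other work */
    return read(fd, buf, len);
}
```

- Because select returns when the timeout expires, the caller is not
- blocked indefinitely while no messages are waiting.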
- In Windows NT and the DPMI Windows 95 implementation a DeviceIoControl
- call is used. The DeviceIoControl call is designed to copy buffers
- from user memory to kernel memory with OPCODES. The sendmsg_to_kernel
- is issued as a synchronous call, while the getmsg_from_kernel call is
- asynchronous. Windows EventObjects are used for notification of
- message arrival. The process P is kept waiting on a KernelEvent
- object in NT and a semaphore in Windows 95.
- 4. The interface at the call level
- This section describes the upcalls a Coda FS driver can make to Venus.
- Each of these upcalls makes use of two structures: inputArgs and
- outputArgs. In pseudo BNF form the structures take the following
- form:
- struct inputArgs {
-     u_long opcode;
-     u_long unique;         /* Keep multiple outstanding msgs distinct */
-     u_short pid;           /* Common to all */
-     u_short pgid;          /* Common to all */
-     struct CodaCred cred;  /* Common to all */
-     <union "in" of call dependent parts of inputArgs>
- };
- struct outputArgs {
-     u_long opcode;
-     u_long unique;         /* Keep multiple outstanding msgs distinct */
-     u_long result;
-     <union "out" of call dependent parts of outputArgs>
- };
- Before going on let us elucidate the role of the various fields. The
- inputArgs start with the opcode which defines the type of service
- requested from Venus. There are approximately 30 upcalls at present
- which we will discuss. The unique field labels the inputArgs with a
- number which identifies the message uniquely. A process and
- process group id are passed. Finally the credentials of the caller
- are included.
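- The role of the unique field can be illustrated with a sketch that
- stamps each request and pairs a reply with it. The structures are
- cut-down, fixed-width stand-ins for the common header fields, and the
- names in_hdr, out_hdr, fill_in_hdr and reply_matches are hypothetical.
- Checking the opcode as well as unique is an assumption of this
- sketch, not something the text requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the common parts of the two structures. */
struct in_hdr {
    uint32_t opcode;
    uint32_t unique;
    uint16_t pid;
    uint16_t pgid;
};

struct out_hdr {
    uint32_t opcode;
    uint32_t unique;
    uint32_t result;
};

/* Fill the common part of a request; seq is a caller-kept counter. */
static void fill_in_hdr(struct in_hdr *in, uint32_t opcode, uint32_t *seq,
                        uint16_t pid, uint16_t pgid)
{
    assert(in != NULL && seq != NULL);
    in->opcode = opcode;
    in->unique = ++*seq;   /* distinct for every outstanding message */
    in->pid = pid;
    in->pgid = pgid;
}

/* A reply is paired with a request when opcode and unique both agree. */
static int reply_matches(const struct in_hdr *in, const struct out_hdr *out)
{
    return in->opcode == out->opcode && in->unique == out->unique;
}
```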
- Before delving into the specific calls we need to discuss a variety of
- data structures shared by the kernel and Venus.
- 4.1. Data structures shared by the kernel and Venus
- The CodaCred structure defines a variety of user and group ids as
- they are set for the calling process. The vuid_t and vgid_t are 32 bit
- unsigned integers. It also defines group membership in an array. On
- Unix the CodaCred has proven sufficient to implement good security
- semantics for Coda but the structure may have to undergo modification
- for the Windows environment when it matures.
- struct CodaCred {
-     vuid_t cr_uid, cr_euid, cr_suid, cr_fsuid;  /* Real, effective, set, fs uid */
-     vgid_t cr_gid, cr_egid, cr_sgid, cr_fsgid;  /* same for groups */
-     vgid_t cr_groups[NGROUPS];                  /* Group membership for caller */
- };
- NOTE It is questionable if we need CodaCreds in Venus. Finally Venus
- doesn't know about groups, although it does create files with the
- default uid/gid. Perhaps the list of group membership is superfluous.
- The next item is the fundamental identifier used to identify Coda
- files, the ViceFid. A fid of a file uniquely defines a file or
- directory in the Coda filesystem within a cell. (-- A cell is a
- group of Coda servers acting under the aegis of a single system
- control machine or SCM. See the Coda Administration manual for a
- detailed description of the role of the SCM.--)
- typedef struct ViceFid {
-     VolumeId Volume;
-     VnodeId Vnode;
-     Unique_t Unique;
- } ViceFid;
- Each of the constituent fields, VolumeId, VnodeId and Unique_t, is an
- unsigned 32 bit integer. We envisage that a further field will need
- to be prefixed to identify the Coda cell; this will probably take the
- form of an IPv6-sized IP address naming the Coda cell through DNS.
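- Since a fid identifies an object within a cell, two fids name the same
- object only when all three fields agree. A minimal sketch, assuming
- the three field types are plain uint32_t typedefs; the helper
- fid_equal is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The text says all three fields are unsigned 32 bit integers. */
typedef uint32_t VolumeId;
typedef uint32_t VnodeId;
typedef uint32_t Unique_t;

typedef struct ViceFid {
    VolumeId Volume;
    VnodeId Vnode;
    Unique_t Unique;
} ViceFid;

/* Hypothetical helper: compare two fids field by field. */
static int fid_equal(const ViceFid *a, const ViceFid *b)
{
    assert(a != NULL && b != NULL);
    return a->Volume == b->Volume &&
           a->Vnode == b->Vnode &&
           a->Unique == b->Unique;
}
```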
- The next important structure shared between Venus and the kernel is
- the attributes of the file. The following structure is used to
- exchange information. It has room for future extensions such as
- support for device files (currently not present in Coda).
- struct coda_vattr {
-     enum coda_vtype va_type;   /* vnode type (for create) */
-     u_short va_mode;           /* files access mode and type */
-     short va_nlink;            /* number of references to file */
-     vuid_t va_uid;             /* owner user id */
-     vgid_t va_gid;             /* owner group id */
-     long va_fsid;              /* file system id (dev for now) */
-     long va_fileid;            /* file id */
-     u_quad_t va_size;          /* file size in bytes */
-     long va_blocksize;         /* blocksize preferred for i/o */
-     struct timespec va_atime;  /* time of last access */
-     struct timespec va_mtime;  /* time of last modification */
-     struct timespec va_ctime;  /* time file changed */
-     u_long va_gen;             /* generation number of file */
-     u_long va_flags;           /* flags defined for file */
-     dev_t va_rdev;             /* device special file represents */
-     u_quad_t va_bytes;         /* bytes of disk space held by file */
-     u_quad_t va_filerev;       /* file modification number */
-     u_int va_vaflags;          /* operations flags, see below */
-     long va_spare;             /* remain quad aligned */
- };
- 4.2. The pioctl interface
- Coda specific requests can be made by applications through the pioctl
- interface. The pioctl is implemented as an ordinary ioctl on a
- fictitious file /coda/.CONTROL. The pioctl call opens this file, gets
- a file handle and makes the ioctl call. Finally it closes the file.
- The kernel involvement in this is limited to providing the facility to
- open and close and pass the ioctl message and to verify that a path in
- the pioctl data buffers is a file in a Coda filesystem.
- The kernel is handed a data packet of the form:
- struct {
-     const char *path;
-     struct ViceIoctl vidata;
-     int follow;
- } data;
- where
- struct ViceIoctl {
-     caddr_t in, out;   /* Data to be transferred in, or out */
-     short in_size;     /* Size of input buffer <= 2K */
-     short out_size;    /* Maximum size of output buffer, <= 2K */
- };
- The path must be a Coda file, otherwise the ioctl upcall will not be
- made.
- NOTE The data structures and code are a mess. We need to clean this
- up.
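- The 2K limits noted in the ViceIoctl comments suggest a simple bounds
- check before the ioctl upcall is made. The following is a sketch
- under assumptions: the struct is a simplified copy in which char *
- stands in for caddr_t, and PIOCTL_MAX and vice_ioctl_sizes_ok are
- hypothetical names. The text does not say where such a check is
- performed.

```c
#include <assert.h>
#include <stddef.h>

#define PIOCTL_MAX 2048   /* the "2K" limit from the struct comments */

/* Simplified copy of ViceIoctl; char * stands in for caddr_t. */
struct vice_ioctl {
    char *in, *out;
    short in_size;
    short out_size;
};

/* Hypothetical check: both buffer sizes must lie in 0..PIOCTL_MAX. */
static int vice_ioctl_sizes_ok(const struct vice_ioctl *v)
{
    assert(v != NULL);
    return v->in_size >= 0 && v->in_size <= PIOCTL_MAX &&
           v->out_size >= 0 && v->out_size <= PIOCTL_MAX;
}
```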
- We now proceed to document the individual calls:
- 4.3. root
- Arguments
- in empty
- out
- struct cfs_root_out {
-     ViceFid VFid;
- } cfs_root;
- Description This call is made to Venus during the initialization of
- the Coda filesystem. If the result is zero, the cfs_root structure
- contains the ViceFid of the root of the Coda filesystem. If a non-zero
- result is generated, its value is a platform dependent error code
- indicating the difficulty Venus encountered in locating the root of
- the Coda filesystem.
- 4.4. lookup
- Summary Find the ViceFid and type of an object in a directory if it
- exists.
- Arguments
- in
- struct cfs_lookup_in {
-     ViceFid VFid;
-     char *name;    /* Place holder for data. */
- } cfs_lookup;
- out
- struct cfs_lookup_out {
-     ViceFid VFid;
-     int vtype;
- } cfs_lookup;
- Description This call is made to determine the ViceFid and filetype of
- a directory entry. The directory entry requested carries the name name
- and Venus will search the directory identified by cfs_lookup_in.VFid.
- The result may indicate that the name does not exist, or that
- difficulty was encountered in finding it (e.g. due to disconnection).
- If the result is zero, the field cfs_lookup_out.VFid contains the
- target's ViceFid and cfs_lookup_out.vtype the coda_vtype giving the
- type of object the name designates.
- The name of the object is an 8 bit character string of maximum length
- CFS_MAXNAMLEN, currently set to 256 (including a 0 terminator).
- It is extremely important to realize that Venus bitwise ors the field
- cfs_lookup.vtype with CFS_NOCACHE to indicate that the object should
- not be put in the kernel name cache.
- NOTE The type of the vtype is currently wrong. It should be
- coda_vtype. Linux does not take note of CFS_NOCACHE. It should.
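- A driver that does honour CFS_NOCACHE has to separate the flag from
- the object type before using either. The sketch below shows one way
- to do so. The value 0x80000000 for CFS_NOCACHE is an assumption, as
- the text names the flag without giving its value, and split_vtype is
- a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value: the text names CFS_NOCACHE but does not define it. */
#define CFS_NOCACHE 0x80000000u

/*
 * Split the vtype word of a lookup reply into the object type and a
 * flag telling the driver to keep the name out of its name cache.
 */
static void split_vtype(unsigned int raw, unsigned int *type, int *nocache)
{
    assert(type != NULL && nocache != NULL);
    *nocache = (raw & CFS_NOCACHE) != 0;
    *type = raw & ~CFS_NOCACHE;
}
```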
- 4.5. getattr
- Summary Get the attributes of a file.
- Arguments
- in
- struct cfs_getattr_in {
-     ViceFid VFid;
-     struct coda_vattr attr;    /* XXXXX */
- } cfs_getattr;
- out
- struct cfs_getattr_out {
-     struct coda_vattr attr;
- } cfs_getattr;
- Description This call returns the attributes of the file identified by
- VFid.
- Errors Errors can occur if the object identified by VFid does not
- exist, is inaccessible, or if the caller does not have permission to
- fetch attributes.
- Note Many kernel FS drivers (Linux, NT and Windows 95) need to acquire
- the attributes as well as the Fid for the instantiation of an internal
- "inode" or "FileHandle". A significant improvement in performance on
- such systems could be made by combining the lookup and getattr calls
- both at the Venus/kernel interaction level and at the RPC level.
- The vattr structure included in the input arguments is superfluous and
- should be removed.
- 4.6. setattr
- Summary Set the attributes of a file.
- Arguments
- in
- struct cfs_setattr_in {
-     ViceFid VFid;
-     struct coda_vattr attr;
- } cfs_setattr;
- out
- empty
- Description The structure attr is filled with attributes to be changed
- in BSD style. Attributes not to be changed are set to -1, apart from
- vtype which is set to VNON. Others are set to the value to be assigned.
- The only attributes which the FS driver may request to change are the
- mode, owner, groupid, atime, mtime and ctime. The return value
- indicates success or failure.
- Errors A variety of errors can occur. The object may not exist, may
- be inaccessible, or permission may not be granted by Venus.
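- The "-1 means unchanged" convention can be illustrated with a cut-down
- attribute record. Everything in this sketch is hypothetical: sk_vattr
- keeps only four of the coda_vattr fields, and the sk_ names are
- invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced attribute record; coda_vattr has more fields. */
enum sk_vtype { SK_VNON, SK_VREG, SK_VDIR };

struct sk_vattr {
    enum sk_vtype va_type;
    unsigned short va_mode;
    unsigned int va_uid;
    unsigned int va_gid;
};

/* Mark every attribute "do not change": -1 everywhere, VNON for type. */
static void sk_vattr_null(struct sk_vattr *va)
{
    assert(va != NULL);
    va->va_type = SK_VNON;
    va->va_mode = (unsigned short)-1;
    va->va_uid = (unsigned int)-1;
    va->va_gid = (unsigned int)-1;
}

/* Build a setattr request that changes only the mode (a chmod). */
static void sk_chmod_request(struct sk_vattr *va, unsigned short mode)
{
    sk_vattr_null(va);
    va->va_mode = mode;
}
```

- Starting from the all-unchanged record and then assigning single
- fields keeps the request limited to what the caller asked for.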
- 4.7. access
- Summary
- Arguments
- in
- struct cfs_access_in {
-     ViceFid VFid;
-     int flags;
- } cfs_access;
- out
- empty
- Description Verify if access to the object identified by VFid for
- operations described by flags is permitted. The result indicates if
- access will be granted. It is important to remember that Coda uses
- ACLs to enforce protection and that ultimately the servers, not the
- clients, enforce the security of the system. The result of this call
- will depend on whether a token is held by the user.
- Errors The object may not exist, or the ACL describing the protection
- may not be accessible.
- 4.8. create
- Summary Invoked to create a file.
- Arguments
- in
- struct cfs_create_in {
-     ViceFid VFid;
-     struct coda_vattr attr;
-     int excl;
-     int mode;
-     char *name;    /* Place holder for data. */
- } cfs_create;
- out
- struct cfs_create_out {
-     ViceFid VFid;
-     struct coda_vattr attr;
- } cfs_create;
- Description This upcall is invoked to request creation of a file.
- The file will be created in the directory identified by VFid, its name
- will be name, and the mode will be mode. If excl is set an error will
- be returned if the file already exists. If the size field in attr is
- set to zero the file will be truncated. The uid and gid of the file
- are set by converting the CodaCred to a uid using a macro CRTOUID
- (this macro is platform dependent). Upon success the VFid and
- attributes of the file are returned. The Coda FS Driver will normally
- instantiate a vnode, inode or file handle at kernel level for the new
- object.
- Errors A variety of errors can occur. Permissions may be insufficient.
- If the object exists and is not a file the error EISDIR is returned
- under Unix.
- NOTE The packing of parameters is very inefficient and appears to
- indicate confusion between the system call creat and the VFS operation
- create. The VFS operation create is only called to create new objects.
- This create call differs from the Unix one in that it is not invoked
- to return a file descriptor. The truncate and exclusive options,
- together with the mode, could simply be part of the mode as it is
- under Unix. There should be no flags argument; this is used in
- open(2) to return a file descriptor for READ or WRITE mode.
- The attributes of the directory should be returned too, since the size
- and mtime changed.
- 4.9. mkdir
- Summary Create a new directory.
- Arguments
- in
- struct cfs_mkdir_in {
-     ViceFid VFid;
-     struct coda_vattr attr;
-     char *name;    /* Place holder for data. */
- } cfs_mkdir;
- out
- struct cfs_mkdir_out {
-     ViceFid VFid;
-     struct coda_vattr attr;
- } cfs_mkdir;
- Description This call is similar to create but creates a directory.
- Only the mode field in the input parameters is used for creation.
- Upon successful creation, the attr returned contains the attributes of
- the new directory.
- Errors As for create.
- NOTE The input parameter should be changed to mode instead of
- attributes.
- The attributes of the parent should be returned since the size and
- mtime change.
- 4.10. link
- Summary Create a link to an existing file.
- Arguments
- in
- struct cfs_link_in {
-     ViceFid sourceFid;  /* cnode to link *to* */
-     ViceFid destFid;    /* Directory in which to place link */
-     char *tname;        /* Place holder for data. */
- } cfs_link;
- out
- empty
- Description This call creates a link to the sourceFid in the directory
- identified by destFid with name tname. The source must reside in the
- target's parent, i.e. the source must have parent destFid; Coda
- does not support cross directory hard links. Only the return value is
- relevant. It indicates success or the type of failure.
- Errors The usual errors can occur.
- 4.11. symlink
- Summary Create a symbolic link.
- Arguments
- in
- struct cfs_symlink_in {
-     ViceFid VFid;    /* Directory to put symlink in */
-     char *srcname;
-     struct coda_vattr attr;
-     char *tname;
- } cfs_symlink;
- out
- none
- Description Create a symbolic link. The link is to be placed in the
- directory identified by VFid and named tname. It should point to the
- pathname srcname. The attributes of the newly created object are to
- be set to attr.
- Errors
- NOTE The attributes of the target directory should be returned since
- its size changed.
- 4.12. remove
- Summary Remove a file.
- Arguments
- in
- struct cfs_remove_in {
- ViceFid VFid;
- char *name; /* Place holder for data. */
- } cfs_remove;
- out
- none
- Description Remove the file named cfs_remove_in.name in the directory
- identified by VFid.
- Errors
- NOTE The attributes of the directory should be returned since its
- mtime and size may change.
- 4.13. rmdir
- Summary Remove a directory.
- Arguments
- in
- struct cfs_rmdir_in {
- ViceFid VFid;
- char *name; /* Place holder for data. */
- } cfs_rmdir;
- out
- none
- Description Remove the directory with name name from the directory
- identified by VFid.
- Errors
- NOTE The attributes of the parent directory should be returned since
- its mtime and size may change.
- 4.14. readlink
- Summary Read the value of a symbolic link.
- Arguments
- in
- struct cfs_readlink_in {
- ViceFid VFid;
- } cfs_readlink;
- out
- struct cfs_readlink_out {
- int count;
- caddr_t data; /* Place holder for data. */
- } cfs_readlink;
- Description This routine reads the contents of the symbolic link
- identified by VFid into the buffer data. The buffer data must be able
- to hold any name up to CFS_MAXNAMLEN (PATH or NAM??).
- Errors No unusual errors.
- 4.15. open
- Summary Open a file.
- Arguments
- in
- struct cfs_open_in {
- ViceFid VFid;
- int flags;
- } cfs_open;
- out
- struct cfs_open_out {
- dev_t dev;
- ino_t inode;
- } cfs_open;
- Description This request asks Venus to place the file identified by
- VFid in its cache and to note that the calling process wishes to open
- it with flags as in open(2). The return value to the kernel differs
- for Unix and Windows systems. For Unix systems the Coda FS Driver is
- informed of the device and inode number of the container file in the
- fields dev and inode. For Windows the path of the container file is
- returned to the kernel.
- Errors
- NOTE Currently the cfs_open_out structure is not properly adapted to
- deal with the Windows case. It might be best to implement two
- upcalls, one to open aiming at a container file name, the other at a
- container file inode.
- 4.16. close
- Summary Close a file, update it on the servers.
- Arguments
- in
- struct cfs_close_in {
- ViceFid VFid;
- int flags;
- } cfs_close;
- out
- none
- Description Close the file identified by VFid.
- Errors
- NOTE The flags argument is bogus and not used. However, Venus' code
- has room to deal with an execp input field; this field should
- probably be used to inform Venus that the file was closed but is
- still memory mapped for execution. There are comments about fetching
- versus not fetching the data in Venus vproc_vfscalls. This seems
- silly. If a file is being closed, the data in the container file is
- to be the new data. Here again the execp flag might be in play to
- create confusion: currently Venus might think a file can be flushed
- from the cache when it is still memory mapped. This needs to be
- understood.
- 4.17. ioctl
- Summary Do an ioctl on a file. This includes the pioctl interface.
- Arguments
- in
- struct cfs_ioctl_in {
- ViceFid VFid;
- int cmd;
- int len;
- int rwflag;
- char *data; /* Place holder for data. */
- } cfs_ioctl;
- out
- struct cfs_ioctl_out {
- int len;
- caddr_t data; /* Place holder for data. */
- } cfs_ioctl;
- Description Do an ioctl operation on a file. The command, len and
- data arguments are filled as usual. flags is not used by Venus.
- Errors
- NOTE Another bogus parameter. flags is not used. What is the
- business about PREFETCHING in the Venus code?
- 4.18. rename
- Summary Rename a fid.
- Arguments
- in
- struct cfs_rename_in {
- ViceFid sourceFid;
- char *srcname;
- ViceFid destFid;
- char *destname;
- } cfs_rename;
- out
- none
- Description Rename the object with name srcname in directory
- sourceFid to destname in destFid. It is important that the names
- srcname and destname are 0-terminated strings. Strings in Unix
- kernels are not always null terminated.
- Errors
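
The 0-termination requirement above can be sketched as follows. This is
only an illustration, not code from any Coda driver: pack_name and
SKETCH_MAXNAMLEN are hypothetical names, and the limit of 255 is an
assumed stand-in for the real name length limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit; stands in for the driver's real maximum. */
#define SKETCH_MAXNAMLEN 255

/*
 * Copy a length-delimited kernel name (which may lack a trailing 0)
 * into buf and append the 0 byte that Venus expects.
 * Returns 0 on success, -1 if the name does not fit.
 */
int pack_name(char *buf, size_t bufsize, const char *name, size_t namelen)
{
    if (namelen > SKETCH_MAXNAMLEN || namelen + 1 > bufsize)
        return -1;
    memcpy(buf, name, namelen);
    buf[namelen] = '\0';
    return 0;
}
```

A driver following this approach would call such a helper once for
srcname and once for destname while building the upcall message.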
- 4.19. readdir
- Summary Read directory entries.
- Arguments
- in
- struct cfs_readdir_in {
- ViceFid VFid;
- int count;
- int offset;
- } cfs_readdir;
- out
- struct cfs_readdir_out {
- int size;
- caddr_t data; /* Place holder for data. */
- } cfs_readdir;
- Description Read directory entries from VFid, starting at offset, and
- read at most count bytes. The data is returned in data and its size
- in size.
- Errors
- NOTE This call is not used. Readdir operations exploit container
- files. We will re-evaluate this during the directory revamp which is
- about to take place.
- 4.20. vget
- Summary Instructs Venus to do an FSDB->Get.
- Arguments
- in
- struct cfs_vget_in {
- ViceFid VFid;
- } cfs_vget;
- out
- struct cfs_vget_out {
- ViceFid VFid;
- int vtype;
- } cfs_vget;
- Description This upcall asks Venus to do a get operation on an fsobj
- labelled by VFid.
- Errors
- NOTE This operation is not used. However, it is extremely useful
- since it can be used to deal with read/write memory mapped files.
- These can be "pinned" in the Venus cache using vget and released with
- inactive.
- 4.21. fsync
- Summary Tell Venus to update the RVM attributes of a file.
- Arguments
- in
- struct cfs_fsync_in {
- ViceFid VFid;
- } cfs_fsync;
- out
- none
- Description Ask Venus to update RVM attributes of object VFid. This
- should be called as part of kernel level fsync type calls. The
- result indicates if the syncing was successful.
- Errors
- NOTE Linux does not implement this call. It should.
- 4.22. inactive
- Summary Tell Venus a vnode is no longer in use.
- Arguments
- in
- struct cfs_inactive_in {
- ViceFid VFid;
- } cfs_inactive;
- out
- none
- Description This operation returns EOPNOTSUPP.
- Errors
- NOTE This should perhaps be removed.
- 4.23. rdwr
- Summary Read or write from a file.
- Arguments
- in
- struct cfs_rdwr_in {
- ViceFid VFid;
- int rwflag;
- int count;
- int offset;
- int ioflag;
- caddr_t data; /* Place holder for data. */
- } cfs_rdwr;
- out
- struct cfs_rdwr_out {
- int rwflag;
- int count;
- caddr_t data; /* Place holder for data. */
- } cfs_rdwr;
- Description This upcall asks Venus to read or write from a file.
- Errors
- NOTE It should be removed since it goes against the Coda philosophy,
- under which read/write operations never reach Venus. I have been told
- the operation does not work. It is not currently used.
- 4.24. odymount
- Summary Allows mounting multiple Coda "filesystems" on one Unix mount
- point.
- Arguments
- in
- struct ody_mount_in {
- char *name; /* Place holder for data. */
- } ody_mount;
- out
- struct ody_mount_out {
- ViceFid VFid;
- } ody_mount;
- Description Asks Venus to return the rootfid of a Coda system named
- name. The fid is returned in VFid.
- Errors
- NOTE This call was used by David for dynamic sets. It should be
- removed since it causes a jungle of pointers in the VFS mounting area.
- It is not used by Coda proper. The call is not implemented by Venus.
- 4.25. ody_lookup
- Summary Looks up something.
- Arguments
- in irrelevant
- out
- irrelevant
- Description
- Errors
- NOTE Gut it. Call is not implemented by Venus.
- 4.26. ody_expand
- Summary Expands something in a dynamic set.
- Arguments
- in irrelevant
- out
- irrelevant
- Description
- Errors
- NOTE Gut it. Call is not implemented by Venus.
- 4.27. prefetch
- Summary Prefetch a dynamic set.
- Arguments
- in Not documented.
- out
- Not documented.
- Description Venus worker.cc has support for this call, although it is
- noted that it doesn't work. Not surprising, since the kernel does not
- have support for it. (ODY_PREFETCH is not a defined operation).
- Errors
- NOTE Gut it. It isn't working and isn't used by Coda.
- 4.28. signal
- Summary Send Venus a signal about an upcall.
- Arguments
- in none
- out
- not applicable.
- Description This is an out-of-band upcall to Venus to inform Venus
- that the calling process received a signal after Venus read the
- message from the input queue. Venus is supposed to clean up the
- operation.
- Errors No reply is given.
- NOTE We need to better understand what Venus needs to clean up and if
- it is doing this correctly. Also we need to handle situations with
- multiple upcalls per system call correctly. It would be important to
- know which state changes in Venus, made after an upcall, the kernel
- is responsible for notifying Venus to clean up (e.g. open definitely
- is such a state change, but many others may not be).
- 5. The minicache and downcalls
- The Coda FS Driver can cache results of lookup and access upcalls, to
- limit the frequency of upcalls. Upcalls carry a price since a process
- context switch needs to take place. The counterpart of caching the
- information is that Venus will notify the FS Driver that cached
- entries must be flushed or renamed.
- The kernel code generally has to maintain a structure which links the
- internal file handles (called vnodes in BSD, inodes in Linux and
- FileHandles in Windows) with the ViceFid's which Venus maintains. The
- reason is that frequent translations back and forth are needed in
- order to make upcalls and use the results of upcalls. Such linking
- objects are called cnodes.
- The current minicache implementations have cache entries which record
- the following:
- 1. the name of the file
- 2. the cnode of the directory containing the object
- 3. a list of CodaCred's for which the lookup is permitted.
- 4. the cnode of the object
- The lookup call in the Coda FS Driver may request the cnode of the
- desired object from the cache, by passing its name, directory and the
- CodaCred's of the caller. The cache will return the cnode or indicate
- that it cannot be found. The Coda FS Driver must be careful to
- invalidate cache entries when it modifies or removes objects.
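
The cache entry and lookup described above can be sketched in C. This
is a toy model under stated assumptions, not any driver's actual
minicache: all sk_ names are hypothetical, a plain integer stands in
for a CodaCred, and a fixed-size table stands in for a real hashed
cache with eviction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_NAMELEN  32
#define SK_MAXCREDS 4
#define SK_ENTRIES  8

struct sk_cnode { int id; };           /* stand-in for a real cnode */

struct sk_entry {
    int in_use;
    char name[SK_NAMELEN];             /* 1. the name of the file */
    struct sk_cnode *dir;              /* 2. cnode of the containing directory */
    int creds[SK_MAXCREDS];            /* 3. creds for which lookup is permitted */
    int ncreds;
    struct sk_cnode *obj;              /* 4. the cnode of the object */
};

struct sk_cache { struct sk_entry e[SK_ENTRIES]; };

static struct sk_entry *sk_find(struct sk_cache *c,
                                const struct sk_cnode *dir, const char *name)
{
    for (size_t i = 0; i < SK_ENTRIES; i++) {
        struct sk_entry *e = &c->e[i];
        if (e->in_use && e->dir == dir && strcmp(e->name, name) == 0)
            return e;
    }
    return NULL;
}

/* Record that cred may look up name in dir; returns 0 or -1. */
int sk_enter(struct sk_cache *c, struct sk_cnode *dir, const char *name,
             int cred, struct sk_cnode *obj)
{
    struct sk_entry *e = sk_find(c, dir, name);
    if (e == NULL) {
        if (strlen(name) >= SK_NAMELEN)
            return -1;
        for (size_t i = 0; i < SK_ENTRIES && e == NULL; i++)
            if (!c->e[i].in_use)
                e = &c->e[i];
        if (e == NULL)
            return -1;                 /* full; a real cache would evict */
        memset(e, 0, sizeof *e);
        strcpy(e->name, name);
        e->dir = dir;
        e->obj = obj;
        e->in_use = 1;
    }
    for (int i = 0; i < e->ncreds; i++)
        if (e->creds[i] == cred)
            return 0;
    if (e->ncreds == SK_MAXCREDS)
        return -1;
    e->creds[e->ncreds++] = cred;
    return 0;
}

/* Return the cached cnode, or NULL if absent or not permitted for cred. */
struct sk_cnode *sk_lookup(struct sk_cache *c, const struct sk_cnode *dir,
                           const char *name, int cred)
{
    struct sk_entry *e = sk_find(c, dir, name);
    if (e == NULL)
        return NULL;
    for (int i = 0; i < e->ncreds; i++)
        if (e->creds[i] == cred)
            return e->obj;
    return NULL;
}

/* Invalidate one entry, e.g. after the driver removes the object. */
void sk_zap(struct sk_cache *c, const struct sk_cnode *dir, const char *name)
{
    struct sk_entry *e = sk_find(c, dir, name);
    if (e != NULL)
        e->in_use = 0;
}
```

A NULL result from sk_lookup corresponds to the "cannot be found" case,
after which the driver would fall back to a lookup upcall.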
- When Venus obtains information that indicates that cache entries are
- no longer valid, it will make a downcall to the kernel. Downcalls are
- intercepted by the Coda FS Driver and lead to cache invalidations of
- the kind described below. The Coda FS Driver does not return an error
- unless the downcall data could not be read into kernel memory.
- 5.1. INVALIDATE
- No information is available on this call.
- 5.2. FLUSH
- Arguments None
- Summary Flush the name cache entirely.
- Description Venus issues this call upon startup and when it dies. This
- is to prevent stale cache information being held. Some operating
- systems allow the kernel name cache to be switched off dynamically.
- When this is done, this downcall is made.
- 5.3. PURGEUSER
- Arguments
- struct cfs_purgeuser_out {/* CFS_PURGEUSER is a venus->kernel call */
- struct CodaCred cred;
- } cfs_purgeuser;
- Description Remove all entries in the cache carrying the Cred. This
- call is issued when tokens for a user expire or are flushed.
- 5.4. ZAPFILE
- Arguments
- struct cfs_zapfile_out { /* CFS_ZAPFILE is a venus->kernel call */
- ViceFid CodaFid;
- } cfs_zapfile;
- Description Remove all entries which have the (dir vnode, name) pair.
- This is issued as a result of an invalidation of cached attributes of
- a vnode.
- NOTE Call is not named correctly in NetBSD and Mach. The minicache
- zapfile routine takes different arguments. Linux does not implement
- the invalidation of attributes correctly.
- 5.5. ZAPDIR
- Arguments
- struct cfs_zapdir_out { /* CFS_ZAPDIR is a venus->kernel call */
- ViceFid CodaFid;
- } cfs_zapdir;
- Description Remove all entries in the cache lying in a directory
- CodaFid, and all children of this directory. This call is issued when
- Venus receives a callback on the directory.
- 5.6. ZAPVNODE
- Arguments
- struct cfs_zapvnode_out { /* CFS_ZAPVNODE is a venus->kernel call */
- struct CodaCred cred;
- ViceFid VFid;
- } cfs_zapvnode;
- Description Remove all entries in the cache carrying the cred and VFid
- as in the arguments. This downcall is probably never issued.
- 5.7. PURGEFID
- Summary
- Arguments
- struct cfs_purgefid_out { /* CFS_PURGEFID is a venus->kernel call */
- ViceFid CodaFid;
- } cfs_purgefid;
- Description Flush the attributes for the file. If it is a dir (odd
- vnode), purge its children from the namecache. Remove the file itself
- from the namecache.
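
The namecache side of PURGEFID can be sketched as below. This is a
simplified model, not driver code: the pf_ names are hypothetical, the
three-field fid layout is an assumption standing in for ViceFid, and
the attribute flush itself is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fid layout (volume, vnode, unique); stands in for ViceFid. */
struct pf_fid { uint32_t volume, vnode, unique; };

/* One namecache entry: the object's fid and its parent directory's fid. */
struct pf_entry {
    int in_use;
    struct pf_fid dir;
    struct pf_fid obj;
};

static int pf_fid_eq(const struct pf_fid *a, const struct pf_fid *b)
{
    return a->volume == b->volume && a->vnode == b->vnode &&
           a->unique == b->unique;
}

/* Directories are recognised by an odd vnode number, as stated above. */
int pf_fid_is_dir(const struct pf_fid *f)
{
    return (f->vnode & 1u) != 0;
}

/*
 * Drop the entry for fid itself and, if fid is a directory, every
 * entry whose parent is fid. Returns the number of entries removed.
 */
size_t pf_purgefid(struct pf_entry *tab, size_t n, const struct pf_fid *fid)
{
    size_t removed = 0;
    int is_dir = pf_fid_is_dir(fid);

    for (size_t i = 0; i < n; i++) {
        if (!tab[i].in_use)
            continue;
        if (pf_fid_eq(&tab[i].obj, fid) ||
            (is_dir && pf_fid_eq(&tab[i].dir, fid))) {
            tab[i].in_use = 0;
            removed++;
        }
    }
    return removed;
}
```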
- 5.8. REPLACE
- Summary Replace the Fid's for a collection of names.
- Arguments
- struct cfs_replace_out { /* cfs_replace is a venus->kernel call */
- ViceFid NewFid;
- ViceFid OldFid;
- } cfs_replace;
- Description This routine replaces a ViceFid in the name cache with
- another. It was added to allow Venus, during reintegration, to replace
- temporary fids allocated locally while disconnected with global fids,
- even when the reference counts on those fids are not zero.
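
A minimal sketch of the REPLACE idea follows, assuming cnodes live in a
flat table. The rp_ names and the three-field fid layout are
hypothetical and chosen for illustration only. The point shown is that
the fid is rewritten in place, so objects that are still referenced
keep their identity and reference count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fid layout; stands in for ViceFid. */
struct rp_fid { uint32_t volume, vnode, unique; };

struct rp_cnode {
    struct rp_fid fid;
    int refcount;        /* may be non-zero while the fid is replaced */
};

static int rp_fid_eq(const struct rp_fid *a, const struct rp_fid *b)
{
    return a->volume == b->volume && a->vnode == b->vnode &&
           a->unique == b->unique;
}

/*
 * Rewrite every cnode carrying oldfid so that it carries newfid.
 * Reference counts are left untouched. Returns the number rewritten.
 */
size_t rp_replace(struct rp_cnode *tab, size_t n,
                  const struct rp_fid *oldfid, const struct rp_fid *newfid)
{
    size_t changed = 0;

    for (size_t i = 0; i < n; i++) {
        if (rp_fid_eq(&tab[i].fid, oldfid)) {
            tab[i].fid = *newfid;
            changed++;
        }
    }
    return changed;
}
```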
- 6. Initialization and cleanup
- This section gives brief hints as to desirable features for the Coda
- FS Driver at startup and upon shutdown or Venus failures. Before
- entering the discussion it is useful to repeat that the Coda FS Driver
- maintains the following data:
- 1. message queues
- 2. cnodes
- 3. name cache entries
- The name cache entries are entirely private to the driver, so they
- can easily be manipulated. The message queues will generally have
- clear points of initialization and destruction. The cnodes are
- much more delicate. User processes hold reference counts in Coda
- filesystems and it can be difficult to clean up the cnodes.
- The driver can expect requests through:
- 1. the message subsystem
- 2. the VFS layer
- 3. pioctl interface
- Currently the pioctl passes through the VFS for Coda so we can
- treat these similarly.
- 6.1. Requirements
- The following requirements should be accommodated:
- 1. The message queues should have open and close routines. On Unix,
- opening and closing the character devices serve as these routines.
- o Before opening, no messages can be placed.
- o Opening will remove any old messages still pending.
- o Close will notify any sleeping processes that their upcall cannot
- be completed.
- o Close will free all memory allocated by the message queues.
- 2. At open the namecache shall be initialized to empty state.
- 3. Before the message queues are open, all VFS operations will fail.
- Fortunately this can be achieved by making sure that mounting the
- Coda filesystem cannot succeed before opening.
- 4. After closing of the queues, no VFS operations can succeed. Here
- one needs to be careful, since a few operations (lookup,
- read/write, readdir) can proceed without upcalls. These must be
- explicitly blocked.
- 5. Upon closing the namecache shall be flushed and disabled.
- 6. All memory held by cnodes can be freed without relying on upcalls.
- 7. Unmounting the file system can be done without relying on upcalls.
- 8. Mounting the Coda filesystem should fail gracefully if Venus cannot
- get the rootfid or the attributes of the rootfid. The latter is
- best implemented by Venus fetching these objects before attempting
- to mount.
- NOTE NetBSD in particular, but also Linux, has not implemented the
- above requirements fully. For smooth operation this needs to be
- corrected.
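
The open and close requirements for the message queues can be sketched
as a small state machine. This is an illustration under assumptions,
not the behaviour of any existing driver: the mq_ names are
hypothetical, "waking" a sleeper is modelled by setting a flag, and
ENODEV is an arbitrary choice of error for abandoned upcalls.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mq_msg {
    struct mq_msg *next;
    int woken;               /* set when the sleeping caller is notified */
    int error;               /* error handed to the caller, 0 if none */
};

struct mq_state {
    int is_open;
    int namecache_enabled;
    struct mq_msg *pending;
};

/* Open: discard stale messages, start with an empty, enabled namecache. */
void mq_open(struct mq_state *s)
{
    s->pending = NULL;
    s->namecache_enabled = 1;
    s->is_open = 1;
}

/* Before opening (or after closing), no message can be placed. */
int mq_enqueue(struct mq_state *s, struct mq_msg *m)
{
    if (!s->is_open)
        return -1;
    m->woken = 0;
    m->error = 0;
    m->next = s->pending;
    s->pending = m;
    return 0;
}

/* Close: tell every sleeper its upcall cannot complete, disable the cache. */
void mq_close(struct mq_state *s)
{
    struct mq_msg *m = s->pending;

    while (m != NULL) {
        struct mq_msg *next = m->next;
        m->error = ENODEV;
        m->woken = 1;
        m->next = NULL;
        m = next;
    }
    s->pending = NULL;
    s->namecache_enabled = 0;
    s->is_open = 0;
}

/* VFS operations, including those needing no upcall, check this first. */
int mq_vfs_op_allowed(const struct mq_state *s)
{
    return s->is_open;
}
```

Checking mq_vfs_op_allowed at the top of every VFS entry point is one
way to block the operations (lookup, read/write, readdir) that could
otherwise proceed without an upcall after the queues are closed.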
|