123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230 |
- APACHE 2.x ROADMAP
- ==================
- Last modified at [$Date: 2005-03-14 05:24:22 +0000 (Mon, 14 Mar 2005) $]
- WORKS IN PROGRESS
- -----------------
- * Source code should follow style guidelines.
- OK, we all agree pretty code is good. Probably best to clean this
- up by hand immediately upon branching a 2.1 tree.
- Status: Justin volunteers to hand-edit the entire source tree ;)
- Justin says:
- Recall when the release plan for 2.0 was written:
- Absolute Enforcement of an "Apache Style" for code.
- Watch this slip into 3.0.
- David says:
- The style guide needs to be reviewed before this can be done.
- http://httpd.apache.org/dev/styleguide.html
- The current file is dated April 20th 1998!
- OtherBill offers:
- It's survived since '98 because it's welldone :-) Suggest we
- simply follow whatever is documented in styleguide.html as we
- branch the next tree. Really sort of straightforward, if you
- dislike a bit within that doc, bring it up on the dev@httpd
- list prior to the next branch.
- So Bill sums up ... let's get the code cleaned up in CVS head.
- Remember, it just takes cvs diff -b (that is, --ignore-space-change)
- to see the code changes and ignore that cruft. Get editing Justin :)
- * Replace stat [deferred open] with open/fstat in directory_walk.
- Justin, Ian, OtherBill all interested in this. Implies setting up
- the apr_file_t member in request_rec, and having all modules use
- that file, and allow the cleanup to close it [if it isn't a shared,
- cached file handle.]
- * The Async Apache Server implemented in terms of APR.
- [Bill Stoddard's pet project.]
- Message-ID: <008301c17d42$9b446970$01000100@sashimi> (dev@apr)
- OtherBill notes that this can proceed in two parts...
- Async accept, setup, and tear-down of the request
- e.g. dealing with the incoming request headers, prior to
- dispatching the request to a thread for processing.
- This doesn't need to wait for a 2.x/3.0 bump.
- Async delegation of the entire request processing chain
- Too many handlers use stack storage and presume it is
- available for the life of the request, so a complete
- async implementation would need to happen 3.0 release.
- Brian notes that async writes will provide a bigger
- scalability win than async reads for most servers.
- We may want to try a hybrid sync-read/async-write MPM
- as a next step. This should be relatively easy to
- build: start with the current worker or leader/followers
- model, but hand off each response brigade to a "completion
- thread" that multiplexes writes on many connections, so
- that the worker thread doesn't have to wait around for
- the sendfile to complete.
- MAKING APACHE REPOSITORY-AGNOSTIC
- (or: remove knowledge of the filesystem)
- [ 2002/10/01: discussion in progress on items below; this isn't
- planned yet ]
- * dav_resource concept for an HTTP resource ("ap_resource")
- * r->filename, r->canonical_filename, r->finfo need to
- disappear. All users need to use new APIs on the ap_resource
- object.
-
- (backwards compat: today, when this occurs with mod_dav and a
- custom backend, the above items refer to the topmost directory
- mapped by a location; e.g. docroot)
- Need to preserve a 'filename'-like string for mime-by-name
- sorts of operations. But this only needs to be the name itself
- and not a full path.
- Justin: Can we leverage the path info, or do we not trust the
- user?
- gstein: well, it isn't the "path info", but the actual URI of
- the resource. And of course we trust the user... that is
- the resource they requested.
-
- dav_resource->uri is the field you want. path_info might
- still exist, but that portion might be related to the
- CGI concept of "path translated" or some other further
- resolution.
-
- To continue, I would suggest that "path translated" and
- having *any* path info is Badness. It means that you did
- not fully resolve a resource for the given URI. The
- "abs_path" in a URI identifies a resource, and that
- should get fully resolved. None of this "resolve to
- <here> and then we have a magical second resolution
- (inside the CGI script)" or somesuch.
-
- Justin: Well, let's consider mod_mbox for a second. It is sort of
- a virtual filesystem in its own right - as it introduces
- it's own notion of a URI space, but it is intrinsically
- tied to the filesystem to do the lookups. But, for the
- portion that isn't resolved on the file system, it has
- its own addressing scheme. Do we need the ability to
- layer resolution?
- * The translate_name hook goes away
- Wrowe altogether disagrees. translate_name today even operates
- on URIs ... this mechansim needs to be preserved.
-
- * The doc for map_to_storage is totally opaque to me. It has
- something to do with filesystems, but it also talks about
- security and per_dir_config and other stuff. I presume something
- needs to happen there -- at least better doc.
- Wrowe agrees and will write it up.
- * The directory_walk concept disappears. All configuration is
- tagged to Locations. The "mod_filesystem" module might have some
- internal concept of the same config appearing in multiple
- places, but that is handled internally rather than by Apache
- core.
- Wrowe suggests this is wrong, instead it's private to filesystem
- requests, and is already invoked from map_to_storage, not the core
- handler. <Directory > and <Files > blocks are preserved as-is,
- but <Directory > sections become specific to the filesystem handler
- alone. Because alternate filesystem schemes could be loaded, this
- should be exposed, from the core, for other file-based stores to
- share. Consider an archive store where the layers become
- <Directory path> -> <Archive store> -> <File name>
-
- Justin: How do we map Directory entries to Locations?
-
- * The "Location tree" is an in-memory representation of the URL
- namespace. Nodes of the tree have configuration specific to that
- location in the namespace.
-
- Something like:
-
- typedef struct {
- const char *name; /* name of this node relative to parent */
- struct ap_conf_vector_t *locn_config;
- apr_hash_t *children; /* NULL if no child configs */
- } ap_locn_node;
- The following config:
-
- <Location /server-status>
- SetHandler server-status
- Order deny,allow
- Deny from all
- Allow from 127.0.0.1
- </Location>
-
- Creates a node with name=="server_status", and the node is a
- child of the "/" node. (hmm. node->name is redundant with the
- hash key; maybe drop node->name)
-
- In the config vector, mod_access has stored its Order, Deny, and
- Allow configs. mod_core has stored the SetHandler.
-
- During the Location walk, we merge the config vectors normally.
-
- Note that an Alias simply associates a filesystem path (in
- mod_filesystem) with that Location in the tree. Merging
- continues with child locations, but a merge is never done
- through filesystem locations. Config on a specific subdir needs
- to be mapped back into the corresponding point in the Location
- tree for proper merging.
- * Config is parsed into a tree, as we did for the 2.0 timeframe,
- but that tree is just a representation of the config (for
- multiple runs and for in-memory manipulation and usage). It is
- unrelated to the "Location tree".
- * Calls to apr_file_io functions generally need to be replaced
- with operations against the ap_resource. For example, rather
- than calling apr_dir_open/read/close(), a caller uses
- resource->repos->get_children() or somesuch.
- Note that things like mod_dir, mod_autoindex, and mod_negotation
- need to be converted to use these mechanisms so that their
- functions will work on logical repositories rather than just
- filesystems.
- * How do we handle CGI scripts? Especially when the resource may
- not be backed by a file? Ideally, we should be able to come up
- with some mechanism to allow CGIs to work in a
- repository-independent manner.
- - Writing the virtual data as a file and then executing it?
- - Can a shell be executed in a streamy manner? (Portably?)
- - Have an 'execute_resource' hook/func that allows the
- repository to choose its manner - be it exec() or whatever.
- - Won't this approach lead to duplication of code? Helper fns?
- gstein: PHP, Perl, and Python scripts are nominally executed by
- a filter inserted by mod_php/perl/python. I'd suggest
- that shell/batch scripts are similar.
- But to ask further: what if it is an executable
- *program* rather than just a script? Do we yank that out
- of the repository, drop it onto the filesystem, and run
- it? eeewwwww...
-
- I'll vote -0.9 for CGIs as a filter. Keep 'em handlers.
- Justin: So, do we give up executing CGIs from virtual repositories?
- That seems like a sad tradeoff to make. I'd like to have
- my CGI scripts under DAV (SVN) control.
- * How do we handle overlaying of Location and Directory entries?
- Right now, we have a problem when /cgi-bin/ is ScriptAlias'd and
- mod_dav has control over /. Some people believe that /cgi-bin/
- shouldn't be under DAV control, while others do believe it
- should be. What's the right strategy?
|