123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233 |
- suppose rust uses a module system similar to the flatt & owens one.
- we have 2 types that make it up: units and modules.
- both are 1st class values: they can be passed around as values.
- a module defines a single letrec of types and constants. modules are
- *always* concrete, usable values. the types can be inhabited by
- values, the constants can be accessed and used. A top-level module
- needs to have all its dependencies satisfied to compile properly.
- a unit defines a statically-parameterized module. units cannot be
- "reached into". units have imports and exports. the set of imports and
- exports, as well as any relationships between those, makes up a unit
- type for the unit.
- units can be wired together into composite units, in a specific
- expression form, and if a unit value has no imports remaining it can
- be instantiated into a module containing all of its exports.
- the compilation model of rust is then that you write your code in a
- bunch of files and then you write a top-level "compilation entry"
- module that implements a standard tooling interface: one that, when
- it is itself compiled, produces a data structure that the compiler
- uses to drive compilation of the rest of your files into a module.
- the module produced by compile time should probably itself be one that
- implements a load-time tooling interface, and only *that* causes the
- construction of runtime artifacts (rather than baking a set of
- load-time actions into the language semantics).
- the module produced by compilation, and the subsequent module produced
- by loading, may not be done of course; you can always have it depend
- on some further runtime parameterization. but this organization gives
- you the *option* to do quite a lot of compile-time and load-time
- metaprogramming.
- so erm...
- units: static parts with imports and exports; what the compiler produces and the linker consumes
- modules: records of constants (including functions and types) that can be projected to interfaces
- programs: templates for processes
- processes: identities, running programs; each copy differs, arranged in a tree
- pure values: values constructed from literals and type-constructors alone
- general values: pure values AND channels to processes (that may go dead)
- limited values: general values AND processes
- all copies of general values are identical; they may be freely
- copied. limited values may not be copied, as they represent
- identities.
- All type constructors (including modules and functions) may be pure,
- general or limited. Purity requires that all subcomponents -- and all
- arguments to subcomponents -- are pure. Limitation requires that all
- *supercomponents* -- and all functions taking or returning the limited
- type -- are limited. Purity is delicate, limitation is
- infectious.
- Modules may implement a particular interface. There's an explicit
- structural subtyping "action" that can coerce a module of one
- interface to a subtype-compatible interface. This produces a remapping
- table on the fly.
- Note that functions can accept or return limited values, whereas
- channels cannot; channels can also not be pure, by definition. A
- general channel can therefore be promoted to a general function, and
- this is performed automatically by the explicit subtyping operation on
- interfaces (as is every form of type-state weakening on function
- signatures that follows subtype logic).
- When you spawn a process, you get a limited module holding the
- process, and a collection of channels for all the process' ports.
- This limited module can be coerced down to a non-limited (general)
- interface full of functions, which is the usual mechanism.
- ---- snip ----
- No. Let's back off from recursive "modules". It's a difficult and
- researchy area and we're not in the business of doing research. Rust
- is supposed to *work* predictably and reliably, even if that means
- being a little sub-optimal. Let's go back to the original design:
- A crate is a unit of compilation, loading, naming and distribution:
- an ELF .so or PE .dll.
- A crate depends on other crates in a strict DAG.
- crate dependencies are expressed by label, label+version, or
- label+version+authenticators (SHA1 hashes or such). We call each of
- these a crate specifier. A given specifier gets a versioned list of
- crates at runtime, so you can do a little bit of hot-code
- upgrade/rollback.
- crate linkage requires type compatibility on transparent exports, and
- specifier refinement for opaque exports. If P and Q both require
- specifiers that can be satisfied by the same crate R, then their
- dependencies on R are merged into a composite dependency that has the
- more-specific of the two dependencies.
- inter-crate calls twiddle crate refcounts.
- double vision: A uses B and C. B and C use D. D exports T. B defines
- a function f(int)->T and C defines a function g(T)->int. Code in A
- tries to do g(f(10)). Does this typecheck?
- Let's assume T is opaquely exported because if it's transparent, it's
- simple. So for the opaque case, typechecking succeeds if we decide
- the two Ts are the same. How?
- Every name reference (type or otherwise) resolves to a number inside a
- specifier. Every specifier in the dependency set of a crate causes the
- allocation of a crate-slot at runtime and is filled in by a crate that
- satisfies the specifier. For B or C to export f or g, they have to
- export their view of T, which means they have to export their
- specifiers for D. In other words: each crate exports a set of names
- and a map from those names to offsets in runtime crate-slots, each of
- which is loaded in order to satisfy a specifier. So the crate has to
- statically export every specifier used to support a name it's
- exporting, as part of the description of the exported name.
- A imports B and C, so its set of specifiers is at least the union of
- the exported specifiers of B and C. This includes two specifiers for
- D, of possibly-varying specificity. The expression g(f(10)) checks iff
- the specifiers for D from B and C can be merged into a single
- specifier / dependency for D.
- Conventional "double vision" as dreyer complains about is only a
- problem in the context of *recursive* modules, and we're doing a
- system that is not concerned with that. Recursion *within* a crate is
- fine, recursion *between* crates is verboten.
- This is a perfectly reasonable design, allows medium-level static
- recursion (within the crate), and has important practical
- interpretation (fast loading, clear load boundaries, hot loading, hot
- unloading, etc.) so I am happy with the restriction.
- Question: do we support existential ADTs? Or just plugs? I would say
- the latter, but what about "pure data structures", eg. "associative
- map"? How do you abstract it away w/o abstracting away from
- concurrency / direct function-call access? An existential would be
- ok... but syntactically perhaps awkward. Seems also to duplicate some
- of the machinery of crates. And plugs. Bleah.
- Maybe a plug should be an existential? Well sure. But let's call it a
- module.
- assoc_map = mod[T] { type C[T];
- fun* each(C[T])->T;
- fun* has(C[T],T)->bool; }
- Here is the key point: a module is not a functor, is not recursive,
- and does not imply independent compilation. A crate, a record *or* a
- process can be used to implement a module type.
- --- aug 2009 ---
- a worked-out container module example
- type map[k,v] = mod { type t;
- fn new(fn compare(~k a,~k b)->bool) -> t;
- fn set(^t m, ~k key, ~v val) -> ();
- fn? find(~t m, ~k key) -> ~v; };
- mod treemap[k,v] {
- type t = rec(...);
- fn new() -> t {
- ...
- }
- fn set(^t m, ~k key, ~v val) -> () {
- ...
- }
- fn? find(~t m, ~k key) -> ~v {
- ...
- }
- }
- fn get_a_map_impl[k,v]() -> map[k,v] {
- typecase (k) {
- ...
- }
- }
- fn compare(~int a, ~int b) -> bool { ... }
- let map[int,str] mv = get_a_map_impl[int,str]();
- mod m = mapv;
- // or ...
- mod m = get_a_map_impl[int,str]();
- auto t = m.new(compare);
- m.set(t, 10, "hello");
- m.set(t, 12, "there");
- // Note that from a semantics level, ophel suggests can treat generic
- // items as modules with an opaque type:
- type modt = mod { type t; fn f(~t) -> (); }
- ~=
- fn[t] f(~t) -> () {...}
- at least for the purposes of passing them around. I think this is a possible de-sugaring if you're
- doing a proof in coq, but I don't want to force users to write this way. The "throwaway module whose
- only purpose is to parameterize its contained item" idiom seems clunky at best. Also it has a
- weird falling-apart-ness when it comes to generic *types*:
- eg: how would you do the map case?
- fn mk_map(mod {type k; type v; } params) ->
- mod { type t;
- fn new() -> t;
- fn add(~params.k, ~params.v) -> ();
- fn? find(~params.k) -> ~params.v; };
- now we run into the sticky question of the type of the return value of mk_map. I think it's not
- denotable in this language; or if it is it involves nutty things like existentials and higher-kinds.
- even ophel defers here and says "let's have a separate binding form map[T]" for a type constructor.
|