module-system.txt 8.9 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233
  1. suppose rust uses a module system similar to the flatt & owens one.
  2. we have 2 types that make it up: units and modules.
  3. both are 1st class values: they can be passed around as values.
  4. a module defines a single letrec of types and constants. modules are
  5. *always* concrete, usable values. the types can be inhabited by
  6. values, the constants can be accessed and used. A top-level module
  7. needs to have all its dependencies satisfied to compile properly.
  8. a unit defines a statically-parameterized module. units cannot be
  9. "reached into". units have imports and exports. the set of imports and
  10. exports, as well as any relationships between those, makes up a unit
  11. type for the unit.
  12. units can be wired together into composite units, in a specific
  13. expression form, and if a unit value has no imports remaining it can
  14. be instantiated into a module containing all of its exports.
  15. the compilation model of rust is then that you write your code in a
  16. bunch of files and then you write a top-level "compilation entry"
  17. module that implements a standard tooling interface: one that, when
  18. it is itself compiled, produces a data structure that the compiler
  19. uses to drive compilation of the rest of your files into a module.
  20. the module produced by compile time should probably itself be one that
  21. implements a load-time tooling interface, and only *that* causes the
  22. construction of runtime artifacts (rather than baking a set of
  23. load-time actions into the language semantics).
  24. the module produced by compilation, and the subsequent module produced
  25. by loading, may not be done of course; you can always have it depend
  26. on some further runtime parameterization. but this organization gives
  27. you the *option* to do quite a lot of compile-time and load-time
  28. metaprogramming.
  29. so erm...
  30. units: static parts with imports and exports; what the compiler produces and the linker consumes
  31. modules: records of constants (including functions and types) that can be projected to interfaces
  32. programs: templates for processes
  33. processes: identities, running programs; each copy differs, arranged in a tree
  34. pure values: values constructed from literals and type-constructors alone
  35. general values: pure values AND channels to processes (that may go dead)
  36. limited values: general values AND processes
  37. all copies of general values are identical; they may be freely
  38. copied. limited values may not be copied, as they represent
  39. identities.
  40. All type constructors (including modules and functions) may be pure,
  41. general or limited. Purity requires that all subcomponents -- and all
  42. arguments to subcomponents -- are pure. Limitation requires that all
  43. *supercomponents* -- and all functions taking or returning the limited
  44. type -- are limited. Purity is delicate, limitation is
  45. infectious.
  46. Modules may implement a particular interface. There's an explicit
  47. structural subtyping "action" that can coerce a module of one
  48. interface to a subtype-compatible interface. This produces a remapping
  49. table on the fly.
  50. Note that functions can accept or return limited values, whereas
  51. channels cannot; channels can also not be pure, by definition. A
  52. general channel can therefore be promoted to a general function, and
  53. this is performed automatically by the explicit subtyping operation on
  54. interfaces (as is every form of type-state weakening on function
  55. signatures that follows subtype logic).
  56. When you spawn a process, you get a limited module holding the
  57. process, and a collection of channels for all the process' ports.
  58. This limited module can be coerced down to a non-limited (general)
  59. interface full of functions, which is the usual mechanism.
  60. ---- snip ----
  61. No. Let's back off from recursive "modules". It's a difficult and
  62. researchy area and we're not in the business of doing research. Rust
  63. is supposed to *work* predictably and reliably, even if that means
  64. being a little sub-optimal. Let's go back to the original design:
  65. A crate is a unit of compilation, loading, naming and distribution:
  66. an ELF .so or PE .dll.
  67. A crate depends on other crates in a strict DAG.
  68. crate dependencies are expressed by label, label+version, or
  69. label+version+authenticators (SHA1 hashes or such). We call each of
  70. these a crate specifier. A given specifier gets a versioned list of
  71. crates at runtime, so you can do a little bit of hot-code
  72. upgrade/rollback.
  73. crate linkage requires type compatibility on transparent exports, and
  74. specifier refinement for opaque exports. If P and Q both require
  75. specifiers that can be satisfied by the same crate R, then their
  76. dependencies on R are merged into a composite dependency that has the
  77. more-specific of the two dependencies.
  78. inter-crate calls twiddle crate refcounts.
  79. double vision: A uses B and C. B and C use D. D exports T. B defines
  80. a function f(int)->T and C defines a function g(T)->int. Code in A
  81. tries to do g(f(10)). Does this typecheck?
  82. Let's assume T is opaquely exported because if it's transparent, it's
  83. simple. So for the opaque case, typechecking succeeds if we decide
  84. the two Ts are the same. How?
  85. Every name reference (type or otherwise) resolves to a number inside a
  86. specifier. Every specifier in the dependency set of a crate causes the
  87. allocation of a crate-slot at runtime and is filled in by a crate that
  88. satisfies the specifier. For B or C to export f or g, they have to
  89. export their view of T, which means they have to export their
  90. specifiers for D. In other words: each crate exports a set of names
  91. and a map from those names to offsets in runtime crate-slots, each of
  92. which is loaded in order to satisfy a specifier. So the crate has to
  93. statically export every specifier used to support a name it's
  94. exporting, as part of the description of the exported name.
  95. A imports B and C, so its set of specifiers is at least the union of
  96. the exported specifiers of B and C. This includes two specifiers for
  97. D, of possibly-varying specificity. The expression g(f(10)) checks iff
  98. the specifiers for D from B and C can be merged into a single
  99. specifier / dependency for D.
  100. Conventional "double vision" as dreyer complains about is only a
  101. problem in the context of *recursive* modules, and we're doing a
  102. system that is not concerned with that. Recursion *within* a crate is
  103. fine, recursion *between* crates is verboten.
  104. This is a perfectly reasonable design, allows medium-level static
  105. recursion (within the crate), and has important practical
  106. interpretation (fast loading, clear load boundaries, hot loading, hot
  107. unloading, etc.) so I am happy with the restriction.
  108. Question: do we support existential ADTs? Or just plugs? I would say
  109. the latter, but what about "pure data structures", eg. "associative
  110. map"? How do you abstract it away w/o abstracting away from
  111. concurrency / direct function-call access? An existential would be
  112. ok... but syntactically perhaps awkward. Seems also to duplicate some
  113. of the machinery of crates. And plugs. Bleah.
  114. Maybe a plug should be an existential? Well sure. But let's call it a
  115. module.
  116. assoc_map = mod[T] { type C[T];
  117. fun* each(C[T])->T;
  118. fun* has(C[T],T)->bool; }
  119. Here is the key point: a module is not a functor, is not recursive,
  120. and does not imply independent compilation. A crate, a record *or* a
  121. process can be used to implement a module type.
  122. --- aug 2009 ---
  123. a worked-out container module example
  124. type map[k,v] = mod { type t;
  125. fn new(fn compare(~k a,~k b)->bool) -> t;
  126. fn set(^t m, ~k key, ~v val) -> ();
  127. fn? find(~t m, ~k key) -> ~v; };
  128. mod treemap[k,v] {
  129. type t = rec(...);
  130. fn new() -> t {
  131. ...
  132. }
  133. fn set(^t m, ~k key, ~v val) -> () {
  134. ...
  135. }
  136. fn? find(~t m, ~k key) -> ~v {
  137. ...
  138. }
  139. }
  140. fn get_a_map_impl[k,v]() -> map[k,v] {
  141. typecase (k) {
  142. ...
  143. }
  144. }
  145. fn compare(~int a, ~int b) -> bool { ... }
  146. let map[int,str] mv = get_a_map_impl[int,str]();
  147. mod m = mapv;
  148. // or ...
  149. mod m = get_a_map_impl[int,str]();
  150. auto t = m.new(compare);
  151. m.set(t, 10, "hello");
  152. m.set(t, 12, "there");
  153. // Note that from a semantics level, ophel suggests can treat generic
  154. // items as modules with an opaque type:
  155. type modt = mod { type t; fn f(~t) -> (); }
  156. ~=
  157. fn[t] f(~t) -> () {...}
  158. at least for the purposes of passing them around. I think this is a possible de-sugaring if you're
  159. doing a proof in coq, but I don't want to force users to write this way. The "throwaway module whose
  160. only purpose is to parameterize its contained item" idiom seems clunky at best. Also it has a
  161. weird falling-apart-ness when it comes to generic *types*:
  162. eg: how would you do the map case?
  163. fn mk_map(mod {type k; type v; } params) ->
  164. mod { type t;
  165. fn new() -> t;
  166. fn add(~params.k, ~params.v) -> ();
  167. fn? find(~params.k) -> ~params.v; };
  168. now we run into the sticky question of the type of the return value of mk_map. I think it's not
  169. denotable in this language; or if it is it involves nutty things like existentials and higher-kinds.
  170. even ophel defers here and says "let's have a separate binding form map[T]" for a type constructor.