12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788899091929394 |
- Nested arrays
- ==============
- It would be useful to support the same set of array primitives on nested
- vectors/arrays (nested arrays for short) as we do on n-D arrays. Clearly some
- operations don't make sense if the nested array is ragged,
- e.g. transpose-array. Some operations already work: (tally), (from) in some
- cases, ply with verbs of rank -1. Ply with verbs of rank 0 seems well
- defined but doesn't work now. Etc.
- Sparse arrays
- =============
- It's simple enough to define some data types to store sparse arrays in whatever
- common format (say CSR). As with nested arrays, the interest is in supporting
- the same set of array primitives as on n-D arrays. In many cases the complexity
- would not be the same, but it should be up to the user to determine the
- cost/convenience tradeoff.
- Temporaries
- ===========
- As it is, ply and fold will generate many intermediate array temporaries. Some
- of them are easy to remove, but others may require more work. Here's a
- classification.
- 1. Function return
- These are temporaries returned by cell operations. Removing them requires some
- way to transform (op arg ...) -> result functions into (op! result arg ...)
- functions. It may be possible to do this automatically by having some convention
- on how these functions are declared. It would be harder for general functions
- such as (make-verb op #f #f).
- This also applies to the returns of fold and ply themselves. Those should
- actually be the first to be fixed.
- 2. No oshape
- If a verb doesn't have an oshape, it's necessary to call the cell op at least
- once to compute it. Maybe the default op->verb conversion should be (make-verb
- op '() #f) instead of (make-verb op #f #f), as an encouragement to provide
- oshape. Another possibility is to compute the 1st cell, allocate, and continue
- the loop. Then only a cell temporary is required instead of a whole array.
- 3. Missed loop fusion
- This is part of a larger problem. A ply or fold application may not be the final
- result of the computation, but an intermediate one, such as when reducing an
- outer product.
- Another example is when the cell op can be composed with the ply, for example
- some selection operators:
- (ply (Jv (cut from <> 0) 1) (i. 2 2)) => (from (i. 2 2) #t 0)
- This also shows in the following limitation of amend!. It's cheaper to do
- [1] (ply! (from A i ...) op B ...)
- than it is to do
- [2] (amend! A (ply op B ...) i ...)
- since [2] allocates a temporary for (ply op B ...) while [1] writes directly
- into (from A i ...). However, [1] is bad practice because depending on the
- values of i ..., (from A i ...) may not be a shared array over A, so the
- statement would not amend A in those cases. On the other hand, amend! will
- amend A for any valid i ... . Loop fusion should remove the temporary in [2]
- and allow it to be as efficient as [1] in every case.
- Implementations of rank above the rank of the verb
- ==================================================
- Sometimes it's convenient to implement a verb of rank-k as a function that takes
- arrays of higher rank. This is normally done for efficiency reasons when the
- function is implemented in a foreign module. For example, if a function
- (rotate-vector v) is implemented in C, it makes sense for the function
- to take a whole block of vectors at once, instead of relying on the rank
- extension mechanism to call it for each 1-slice in this block.
- The verb definition for verbs of rank k should support using lambas of rank
- q≥k. When the argument's rank is in [k, q], it suffices to reshape in and out of
- the call. When the argument's rank is >q it may not be possible to reshape
- without copying, depending on the strides of the argument. Therefore the
- argument traversal should use the lambda rank q instead of the verb rank k.
|