A C++20 array and expression template library with some J/APL features

lloda e09b1ba08a Basic deduction guide for SmallArray 1 year ago
.github 71606da876 Fix test/example.test with g++-11 --std=c++2b 1 year ago
bench 43ac8cc085 Basic deduction guide for SmallArray 1 year ago
box 1baf822688 Fix conversion of SmallArray to SmallView 1 year ago
config 9401cdcd15 Accept pointer-to-member as op in expressions 1 year ago
docs 43ac8cc085 Basic deduction guide for SmallArray 1 year ago
examples 9ad78c5b57 Rename int_t/bool_t to int_c/bool_c and add dim_c 1 year ago
ra 43ac8cc085 Basic deduction guide for SmallArray 1 year ago
test 43ac8cc085 Basic deduction guide for SmallArray 1 year ago
.gitattributes a67665fbf4 Rename extensions from .C .H to .cc to .hh 4 years ago
CMakeLists.txt f5c6f24dd5 Fix a dynamic-dynamic case of ra::transpose 5 years ago
LICENSE cf1f41d580 Include license file 9 years ago
README.md 38eb00114e Chip at CellBig vs CellSmall 1 year ago
SConstruct f5c6f24dd5 Fix a dynamic-dynamic case of ra::transpose 5 years ago
TODO 43ac8cc085 Basic deduction guide for SmallArray 1 year ago

README.md

ra-ra

ra-ra is a C++20, header-only multidimensional array library in the spirit of Blitz++.

Multidimensional arrays can be indexed in multiple dimensions. For example, vectors can be seen as rank 1 and matrices are arrays of rank 2. C has built-in multidimensional array types, but even in modern C++ there's very little you can do with those, and anything practical requires a separate library.

ra-ra implements expression templates. This is a C++ technique (pioneered by Blitz++) to delay the execution of expressions involving large array operands, and in this way avoid the unnecessary creation of large temporary array objects.

ra-ra is small (about 6k loc), easy to extend, and generic — there are no arbitrary type restrictions or limits on rank or argument count.

In this example (examples/readme.cc), we add each element of a vector to each row of a matrix, and then print the result.

  #include "ra/ra.hh"
  #include <iostream>

  int main()
  {
    ra::Big<float, 2> A {{1, 2}, {3, 4}};  // compile-time rank, dynamic shape
    A += std::vector<float> {10, 20};      // rank-extending op with STL object
    std::cout << "A: " << A << std::endl;  // shape is dynamic, so it will be printed
  }

  A: 2 2
  11 12
  23 24

Please check the manual online at lloda.github.io/ra-ra, or have a look at the examples/ folder.

ra-ra offers:

  • Array types with arbitrary compile time or runtime rank, and compile time or runtime shape.
  • Memory owning types as well as views over any piece of memory.
  • Rank extension by prefix matching, as in APL/J, for functions of any number of arguments.
  • Compatibility with builtin arrays and with the standard library, including <ranges>.
  • Interoperability with other libraries and/or languages through transparent memory layout.
  • Iterators over cells (subarrays) of any rank.
  • Rank conjunction as in J (compile time ranks only).
  • Slicing with indices of arbitrary rank, beating of linear range indices, index skipping and elision.
  • Outer product operation.
  • Tensor index object.
  • Short-circuiting logical operators.
  • Argument list selection operators (where with bool selector, or pick with integer selector).
  • Axis insertion (e.g. for broadcasting).
  • Reshape, transpose, reverse, collapse/explode, stencils.
  • Arbitrary types as array elements, or as scalar operands.
  • Many predefined array operations. Adding yours is trivial.

constexpr is suported as much as possible. For example:

  constexpr ra::Small<int, 3> a = { 1, 2, 3 };
  static_assert(6==ra::sum(a));

Performance is competitive with hand written scalar (element by element) loops, but probably not with cache-tuned code such as your platform BLAS, or with code using SIMD. Please have a look at the benchmarks in bench/.

Building the tests and the benchmarks

The library itself is header-only and has no dependencies other than a C++20 compiler and the standard library.

The test suite in test/ runs under either SCons (CXXFLAGS=-O3 scons) or CMake (CXXFLAGS=-O3 cmake . && make && make test). Running the test suite will also build and run the examples (examples/) and the benchmarks (bench/). You can also build each of these separately. The performance of ra-ra depends heavily on the optimization level, so although the test suite should pass with -O0, that can take a long time.

  • Some of the benchmarks will try to use BLAS if you have define RA_USE_BLAS=1 in the environment.
  • The test suite is built with -fsanitize=address by default, and this can cause significant slowdown. Disable by passing -fno-sanitize=address to the compiler.

ra-ra requires support for -std=c++20, including <source_location>. The most recent versions tested are:

  • gcc 13.1: f632865f35fefe9e8cf00558cdc75e17f9fb9c2a (-std=c++2b)
  • gcc 12.2: f632865f35fefe9e8cf00558cdc75e17f9fb9c2a (-std=c++2b)
  • gcc 11.3: f632865f35fefe9e8cf00558cdc75e17f9fb9c2a (-std=c++20)

It's not practical for me to test Clang systematically, so some snags with that are likely.

Notes

  • Both index and size types are signed. Index base is 0.
  • The default array order is C or row-major (last dimension changes fastest). You can make array views with other orders, but newly created arrays use C-order.
  • The selection (subscripting) operator is () or [] indistinctly. Multi-argument [] requires __cpp_multidimensional_subscript > 202110L (in gcc 12 with -std=c++2b).
  • Indices are checked by default. This can be disabled with a compilation flag.
  • ra-ra doesn't itself use exceptions, but it provides a hook so you can throw your own exceptions on ra-ra errors. See ‘Error handling’ in the manual.
  • ra-ra uses zero size arrays and VLAs internally. I know standard C++ doesn't allow them, which is irritating.

Bugs & defects

  • Operations that require allocation, such as concatenation or search, are mostly absent.
  • Lack of a good abstraction for reductions. You can write reductions abusing rank extension, but it's awkward.
  • Traversal of arrays is naive – just unrolling of inner dimensions.
  • Handling of nested (‘ragged’) arrays is inconsistent.
  • No support for parallel operations or GPU.
  • No SIMD to speak of.

Please see the TODO file for a concrete list of known bugs.

Out of scope

  • Sparse arrays.
  • Linear algebra, quaternions, etc.

Motivation

I do numerical work in C++, so I need a library of this kind. At the time of C++11 when I started writing it, most C++ array libraries seemed to support only vectors and matrices, or small objects for vector algebra.

Blitz++ was a major inspiration as a generic library. But it was a heroic feat to write such a library in C++ in the late 90s. Variadic templates, lambdas, perfect forwarding, etc. make things much easier, for the library writer as well as for the user.

From APL and J I've taken the rank extension mechanism, and perhaps an inclination for carrying each feature to its logical end.

ra-ra wants to remain simple. I try not to second-guess the compiler and I don't stress performance as much as Blitz++ did. However, I'm wary of adding features that could become an obstacle if I ever tried to make things fast(er). I believe that the implementation of new traversal methods, or perhaps the optimization of specific expression patterns, should be possible without having to turn the library inside out.

Other C++ array libraries

Links