A C++20 array and expression template library with some J/APL features

lloda 8f55eb8feb Copy warning guard in ply.hh 2 years ago
.github ffba5e978d Enable -DNDEBUG 2 years ago
bench 0f41b89e21 Fix alignment issues with RA_DO_OPT_SMALLVECTOR 2 years ago
box 5e5230dedb Fix builtin vector type alignment issue 2 years ago
config 0f41b89e21 Fix alignment issues with RA_DO_OPT_SMALLVECTOR 2 years ago
docs ffba5e978d Enable -DNDEBUG 2 years ago
examples 5e5230dedb Fix builtin vector type alignment issue 2 years ago
ra 377e78092f Copy warning guard in ply.hh 2 years ago
test ffba5e978d Enable -DNDEBUG 2 years ago
.gitattributes a67665fbf4 Rename extensions from .C .H to .cc to .hh 4 years ago
CMakeLists.txt f5c6f24dd5 Fix a dynamic-dynamic case of ra::transpose 5 years ago
LICENSE cf1f41d580 Include license file 9 years ago
README.md 0f41b89e21 Fix alignment issues with RA_DO_OPT_SMALLVECTOR 2 years ago
SConstruct f5c6f24dd5 Fix a dynamic-dynamic case of ra::transpose 5 years ago
TODO 18f0dace38 More header fixes 2 years ago

README.md

ra-ra

ra-ra is a C++20, header-only multidimensional array library in the spirit of Blitz++.

Multidimensional arrays are containers that can be indexed in multiple dimensions. For example, vectors are arrays of rank 1 and matrices are arrays of rank 2. C has built-in multidimensional array types, but even in modern C++ there's very little you can do with those, and a separate library is required for any practical endeavor.

ra-ra implements expression templates. This is a C++ technique (pioneered by Blitz++) to delay the execution of expressions involving large array operands, and in this way avoid the unnecessary creation of large temporary array objects.
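To make the idea of delayed evaluation concrete, here is a minimal sketch of the expression template technique in plain C++. It is not ra-ra's implementation: the names `et_sketch`, `Vec`, `Sum` and `example` are hypothetical and exist only for this illustration. The point is that `a + b + c` builds a small object that describes the sum, and the elements are only computed when that object is assigned to a `Vec`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout: an illustration of the technique, not ra-ra code.
namespace et_sketch {

// Lazy node for l[i] + r[i]. It holds references only and allocates nothing.
template <class L, class R>
struct Sum
{
    L const & l;
    R const & r;
    double operator[](std::size_t i) const { return l[i] + r[i]; }
    std::size_t size() const { return l.size(); }
};

struct Vec
{
    std::vector<double> data;
    explicit Vec(std::size_t n, double x = 0.) : data(n, x) {}
    double operator[](std::size_t i) const { return data[i]; }
    double & operator[](std::size_t i) { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assigning from an expression evaluates it element by element, in one loop.
    template <class E>
    Vec & operator=(E const & e)
    {
        assert(e.size() == size());
        for (std::size_t i = 0; i < size(); ++i) { data[i] = e[i]; }
        return *this;
    }
};

// Found through argument-dependent lookup when an operand is a Vec or a Sum.
template <class L, class R>
Sum<L, R> operator+(L const & l, R const & r) { return {l, r}; }

inline double example()
{
    Vec a(3, 1.), b(3, 2.), c(3, 3.), d(3);
    d = a + b + c;   // builds Sum<Sum<Vec, Vec>, Vec>; no temporary Vec is created
    return d[0];
}

} // namespace et_sketch
```

The expression object stores references to its operands, so it must be consumed within the same full expression, as in `d = a + b + c`. A real library has to handle many more cases (mixed ranks, scalars, shape checks) than this sketch does.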

ra-ra tries to distinguish itself from established C++ libraries in this space (such as Eigen or Boost.MultiArray) by being more APLish, more general, smaller, and more hackable.

In this example (examples/readme.cc), we add each element of a vector to each row of a matrix, and then print the result.

  #include "ra/ra.hh"
  #include <iostream>
  #include <vector>

  int main()
  {
    ra::Big<float, 2> A {{1, 2}, {3, 4}};  // compile-time rank, dynamic shape
    A += std::vector<float> {10, 20};      // rank-extending op with STL object
    std::cout << "A: " << A << std::endl;  // shape is dynamic, so it will be printed
  }

  A: 2 2
  11 12
  23 24
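The output above follows from prefix matching: the shape of the vector, (2), is matched against the leading axes of the shape of A, (2, 2), so element i of the vector is added to every element of row i. The sketch below spells out that rule with explicit loops in plain C++. It does not use ra-ra, and `prefix_sketch`, `Matrix`, `add_prefix` and `example` are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper types, not part of ra-ra.
namespace prefix_sketch {

// A rank-2 array stored in row-major order.
struct Matrix
{
    std::size_t rows, cols;
    std::vector<float> data;   // rows*cols elements
    float & operator()(std::size_t i, std::size_t j) { return data[i*cols + j]; }
};

// Prefix matching: the shape of v, (rows), must agree with the leading
// axes of the shape of A, (rows, cols). v[i] is reused along the axis j.
inline void add_prefix(Matrix & A, std::vector<float> const & v)
{
    assert(v.size() == A.rows);
    for (std::size_t i = 0; i < A.rows; ++i) {
        for (std::size_t j = 0; j < A.cols; ++j) {
            A(i, j) += v[i];
        }
    }
}

inline Matrix example()
{
    Matrix A {2, 2, {1, 2, 3, 4}};
    add_prefix(A, {10, 20});
    return A;   // data is now {11, 12, 23, 24}
}

} // namespace prefix_sketch
```

With ra-ra the loops are implied by the expression itself, for any number of arguments and any ranks whose shapes agree on the common prefix.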

Please check the manual online at lloda.github.io/ra-ra, or have a look at the examples/ folder.

ra-ra offers:

  • Array types with arbitrary compile time or runtime rank, and compile time or runtime shape.
  • Memory owning types as well as views over any piece of memory.
  • Rank extension by prefix matching, as in APL/J, for functions of any number of arguments.
  • Compatibility with builtin arrays and with the STL, including ranges.
  • Transparent memory layout, for interoperability with other libraries and/or languages.
  • Iterators over cells (slices/subarrays) of any rank.
  • Rank conjunction as in J (compile time ranks only).
  • Slicing with indices of arbitrary rank, beating of linear range indices, index skipping and elision.
  • Outer product operation.
  • Tensor index object.
  • Short-circuiting logical operators.
  • Argument list selection operators (where with bool selector, or pick with integer selector).
  • Axis insertion (e.g. for broadcasting).
  • Reshape, transpose, reverse, collapse/explode, stencils.
  • Arbitrary types as array elements, or as scalar operands.
  • Multidimensional operator[] (with C++23).
  • Many predefined array operations. Adding yours is trivial.

constexpr is supported as much as possible. For example:

  constexpr ra::Small<int, 3> a = { 1, 2, 3 };
  static_assert(6==ra::sum(a));

Performance is competitive with hand-written scalar (element by element) loops, but probably not with cache-tuned code such as your platform BLAS, or with code using SIMD. Please have a look at the benchmarks in bench/.

Building the tests and the benchmarks

The library itself is header-only and has no dependencies other than a C++20 compiler and the standard library.

The test suite in test/ runs under either SCons (CXXFLAGS=-O3 scons) or CMake (CXXFLAGS=-O3 cmake . && make && make test). Running the test suite will also build and run the examples (examples/) and the benchmarks (bench/), although you can build each of these separately. ra-ra depends heavily on inlining, so although the test suite will run fine with -O0, that will take a long time. In practice, at least -O2 is necessary.

Other notes:

  • Some of the benchmarks will try to use BLAS if you have defined RA_USE_BLAS=1 in the environment.
  • The test suite is built with -fsanitize=address by default, which can cause significant slowdown. Disable by passing -fno-sanitize=address to the compiler.

ra-ra requires support for -std=c++20, including <source_location>. The most recent versions tested are:

  • gcc 12.2: 65076211eeeeecd8623877e3e3b5cc0a87af302c (-std=c++2b)
  • gcc 11.3: 65076211eeeeecd8623877e3e3b5cc0a87af302c (-std=c++20)

Clang doesn't currently work (last version I've tried is Clang 10) but the code is meant to be standard C++.

Notes

  • Both index and size types are signed. Index base is 0.
  • Default array order is C or row-major (last dimension changes fastest). You can make array views with other orders, but newly created arrays use C-order.
  • The selection (subscripting) operator is (). [] means exactly the same as (). It's unfortunate that [] was wasted on subscripting when () works perfectly well for that...
  • Indices are checked by default. This can be disabled with a compilation flag.
  • ra-ra doesn't itself use exceptions, but it provides a hook so you can throw your own exceptions on ra-ra errors. See ‘Error handling’ in the manual.
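As an illustration of the C-order note above, the following sketch computes row-major strides and the linear offset of an element using signed index arithmetic. It is plain C++ and does not reflect how ra-ra stores or exposes its strides: `layout_sketch`, `index_t`, `c_order_strides` and `offset` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration; not ra-ra's API.
namespace layout_sketch {

using index_t = std::ptrdiff_t;   // a signed type for both indices and sizes

// Strides, in elements, of a C-order (row-major) array: the last axis has stride 1.
template <std::size_t N>
constexpr std::array<index_t, N> c_order_strides(std::array<index_t, N> const & shape)
{
    std::array<index_t, N> strides {};
    index_t s = 1;
    for (std::size_t k = N; k-- > 0; ) {
        strides[k] = s;
        s *= shape[k];
    }
    return strides;
}

// Linear offset of a 0-based multi-index: the sum of stride*index over all axes.
template <std::size_t N>
constexpr index_t offset(std::array<index_t, N> const & strides,
                         std::array<index_t, N> const & index)
{
    index_t o = 0;
    for (std::size_t k = 0; k < N; ++k) { o += strides[k] * index[k]; }
    return o;
}

} // namespace layout_sketch
```

For a shape of (2, 3, 4) the strides come out as (12, 4, 1), so the element at index (1, 2, 3) sits at offset 23. In general, a view with a different order can address the same storage by permuting the strides, which is why the layout of newly created arrays and the layout of views can differ.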

Bugs & defects

  • Lack of good reduction mechanisms.
  • Operations that require allocation, such as concatenation or search, are mostly absent.
  • Traversal of arrays is naive (just unrolling of inner dimensions).
  • Handling of nested (‘ragged’) arrays is inconsistent.
  • No SIMD to speak of.

Please have a look at TODO for a concrete list of known bugs.

Out of scope

  • Parallelization (closer to wish...).
  • GPU / calls to external libraries.
  • Linear algebra, quaternions, etc. Those things belong in other libraries, and calling them with ra-ra objects is trivial.
  • Sparse arrays. You'd still want to mix & match with dense arrays, so maybe at some point.

Motivation

I do numerical work in C++, so I need a library of this kind. Most C++ array libraries seem to support only vectors and matrices, or small objects for low-dimensional vector algebra. Blitz++ was a great early generic array library (even though the focus was numerical) and it hasn't really been replaced as far as I can tell.

It was a heroic feat to write a library such as Blitz++ in C++ in the late 90s, even discounting the fragmented compiler landscape and the patchy support for the standard at that time. Variadic templates, lambdas, rvalue arguments, etc. make things much simpler, for the library writer as well as for the user.

From APL and J I've taken the rank extension mechanism, and perhaps an inclination for carrying each feature to its logical end.

ra-ra wants to remain a simple library. I try not to second-guess the compiler and I don't stress performance as much as Blitz++ did. However, I'm wary of adding features that could become an obstacle if I ever tried to make things fast(er). I believe that the implementation of new traversal methods, or perhaps the optimization of specific expression patterns, should be possible without having to turn the library inside out.

Other C++ array libraries

Links