// std::accumulate is right
// not meant to be read top to bottom, but rather following the function calls, starting from main

#include <vector> // something to accumulate
#include <numeric> // the accumulate is here
#include <cassert> // the side effects
#include <iterator> // std::begin, std::end, std::next
#include <type_traits> // std::is_same_v

// combines the provided lambdas into an "overload set", using multiple inheritance, yes, that totally useless feature of the language that is long dead and should be removed, ha ha, hoo hoo, oh my java, i can't write a hello world without a garbage collector ._.
template<class... Ts> struct overload_set : Ts... { using Ts::operator()...; };
// deduction guide, not necessary in c++20
template<class... Ts> overload_set(Ts...) -> overload_set<Ts...>;

int main(int argc, char const* argv[])
{
	// everyone and their grandma is like "omg, accumulate is so annoying, why the initial value i hate it so much omg, my favorite language dun have no initial value".

	const std::vector x {1,2,3,4,5}; // given some (or no) stuff to accumulate

	// here is how one would write accumulate by hand
	{
		int acc = 0;
		for(auto i = std::begin(x); i != std::end(x); ++i)
			acc = acc + *i;
		assert(acc == 15);
	}
	// see that acc there? it's there. did you know? have you ever written a loop?
	// no you can't make it go away
	// even if you assume non empty and initialize to first element you still have it and initialize it
	{
		assert(std::begin(x) != std::end(x));
		int acc = *std::begin(x); // <-- still here, can't be a reference cause we are modifying it
		for(auto i = std::next(std::begin(x)); i != std::end(x); ++i)
			acc = acc + *i;
		assert(acc == 15);
	}
	// it's just a special case that the previous version covers, it's not some fundamentally different way of doing it
	auto my_stupid_accumulate = [](auto begin, auto end){ assert(begin != end);
		return std::accumulate(std::next(begin), end, *begin); };
	assert(15 == my_stupid_accumulate(std::begin(x), std::end(x)));

	// so we established that acc is there, it's a necessary part of the algorithm, so if you were to introduce an abstraction for this loop, you need to account for it, or your abstraction would be overzealous.
	// But what is acc? Well, aside from a local scope name, which is not very relevant, it's the stuff on either side of it: the type (int) and the value ( = 0), both of which can be parameters to a function in c++, together as a value with inferred type. Hence std::accumulate doing exactly that with the init parameter, along with the begin and end iterators, and the operator. It's a natural parameterization, but is it useful?

	// let's first address a common non-argument for accumulate that comes in many forms, with sophisticated examples of various complicated types:
	// "you can use a different type for the accumulator":
	// chances are if you are accumulating in a different type, then you are also using a custom operator, and said type can be inferred from it.
	// even when custom op is not necessary (it's defined on the element type and you just want some implicit conversion), it makes sense to provide it for improved readability.
	// there might be a few exceptions, you might argue that when accumulating a bunch of chars, the implicit conversion to int should not surprise anyone, but in so far as it is not surprising it is also inferred from the plus operator there, if it wasn't it would have been surprising
	static_assert(std::is_same_v<decltype(char{} + char{}), int>);
	// surprising fundamental type conversions is a different topic
	// and i would accumulate const char* or string_view into a string, but i don't think that's a good way of doing it
	// std::accumulate(..., ""s) is fine, std::accumulate(..., std::string{}) is better, but a theoretical std::accumulate(..., string_plus_string_view) is the most readable

	// actual advantages
	// 1. choice of identity
	// a default constructed value is not necessarily a natural identity for all operations, and for some there might not even be a natural identity, but a choice between several meaningful ones.
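	// a rough sketch of that point (values and names are made up for illustration): for a product, a value initialized int (0) annihilates everything, while the actual identity (1) gives the expected result
	{
		const std::vector<int> v {1,2,3,4,5};
		auto times = [](int a, int b){ return a * b; };
		assert(120 == std::accumulate(std::begin(v), std::end(v), 1, times)); // 1 is the identity of multiplication
		assert(0 == std::accumulate(std::begin(v), std::end(v), int{}, times)); // int{} is 0, so the product collapses
	}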
	//
	// 2. re-entrancy: you can accumulate some, use the value or just wait, and then accumulate some more, by passing it as a new init parameter, as you would naturally tend to do if writing raw loops, with the final outcome being identical to continuous accumulation in all cases.
	// but i can do diiiis
	// auto part_1 = std::accumulate(first...);
	// ...
	// auto part_2 = part_1 + std::accumulate(next...);
	// sure, but you are assuming the op is associative, which might not be the case (gotta love floats), making std::accumulate a more useful interface compared to whatever your accumulate would look like.
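	// a sketch of the difference, assuming float is IEEE 754 single precision with round-to-nearest-even and no excess intermediate precision (the values are made up for illustration)
	{
		const std::vector<float> v {16777216.0f, 1.0f, 1.0f}; // 16777216 is 2^24, above which float can't represent odd integers
		const auto mid = std::next(std::begin(v));
		const float continuous = std::accumulate(std::begin(v), std::end(v), 0.0f);
		const float part_1 = std::accumulate(std::begin(v), mid, 0.0f);
		const float resumed = std::accumulate(mid, std::end(v), part_1); // part_1 goes back in as the init
		const float added = part_1 + std::accumulate(mid, std::end(v), 0.0f); // the "i can do diiiis" version
		assert(resumed == continuous); // same operations in the same order
		assert(added != continuous); // 1 + 1 got summed separately and survived the rounding
	}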

	// 3. optimality: with copy elision or move semantics, you can avoid any hidden costs even for complex heap allocated types, it's just what you write in the init parameter and the operator, plain and simple. Otherwise you will have to unconditionally pay for the default constructions.
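	// a sketch of accumulating into a heap allocated type (names and values are made up). The init is a prvalue, so it's constructed directly as accumulate's parameter. The operator takes the accumulator by value and returns it, which is a move on the way out, and since c++20 accumulate also moves it on the way in, so the accumulator itself is never copied (before c++20 it is copied into the operator on every step)
	{
		const std::vector<std::vector<int>> parts {{1,2},{3},{4,5}};
		const auto flat = std::accumulate(std::begin(parts), std::end(parts), std::vector<int>{},
			[](std::vector<int> acc, const std::vector<int>& part){
				acc.insert(std::end(acc), std::begin(part), std::end(part));
				return acc; });
		assert((flat == std::vector<int>{1,2,3,4,5}));
	}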

	// disadvantages
	// 1. crybabies: whahaaaa, i can't type a couple of extra words every once in a blue moon when i realize this stupid code i was writing over and over again all my life is just an STL algorithm, whahaaaaa, i can't be any more shallooooow

	// 2. sad reality: if as usual you are writing spaghetti code without ever thinking, the init allows you to mess things up with surprising implicit conversions/promotions or identity-operator mismatch
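	// a sketch of how that goes wrong (values are made up for illustration): the init decides the accumulator type, so an int init over doubles truncates after every single step
	{
		const std::vector<double> v {0.5, 0.5, 0.5};
		assert(0 == std::accumulate(std::begin(v), std::end(v), 0)); // int accumulator: 0 + 0.5 truncates back to 0 every time
		assert(1.5 == std::accumulate(std::begin(v), std::end(v), 0.0)); // double accumulator, the init you actually meant
	}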

	// now if you did actually want to provide some convenience version that would help all the crybabies out there to properly initialize the accumulator, you shouldn't just default initialize it (you freaking brain dead std::reduce, i freaking hate you so much, if you didn't have the init overload i would have just destroyed your whole career ffs)
	// the proper way to do it is to infer the initial value from the accumulating operator, as its identity. For plus<int> it's 0, for plus<float> it's -0.0f (cause adding positive zero to negative zero results in positive zero, which screws with optimization), for multiplies<int> it's 1 and so on... this requires a much more mature algebraic environment overall, so you can't properly do it in the confines of a single function, but that's what it takes you freakazoids, do that or don't do anything at all.
	// the minimal solution would be to define a type function that given an operator returns the identity and allows specializations, maybe even use some special magic for the custom operators, by passing a tag type that the user must overload on and return the initial value, nudging them towards keeping the operator and its identity together and thinking in those terms to avoid the common mistakes
	// watch me:
	// TODO
	return 0;
}
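
// A minimal sketch of the "type function" idea described at the end of main. Everything here (identity_of, accumulate_with_identity) is a hypothetical name made up for illustration, it is not the pending solution from the TODO and not a standard facility, and it only covers a few arithmetic cases.
#include <functional> // std::plus, std::multiplies

// the primary template is deliberately left undefined: no known identity, no convenience
template<class Op> struct identity_of;
template<class T> struct identity_of<std::plus<T>> { static constexpr T value() { return T(0); } };
template<> struct identity_of<std::plus<float>> { static constexpr float value() { return -0.0f; } }; // negative zero, as argued in main
template<class T> struct identity_of<std::multiplies<T>> { static constexpr T value() { return T(1); } };

// the init parameter is still there, it's just looked up from the operator type
template<class It, class Op>
auto accumulate_with_identity(It first, It last, Op op)
{ return std::accumulate(first, last, identity_of<Op>::value(), op); }

static_assert(identity_of<std::plus<int>>::value() == 0);
static_assert(identity_of<std::multiplies<int>>::value() == 1);
// e.g. accumulate_with_identity(std::begin(x), std::end(x), std::multiplies<int>{}) would give 120 for the x in main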