N-Dimentional Iteration in C++

C++ is the second computer language I ever learned (after BASIC), and I've been working mainly in C++ for the past few years. It's fast and powerful, but boy, do I miss Python sometimes. Some things are just easier in Python, but sometimes C++ can be convinced to catch up, and that's what we'll do here. One of these gaps that came to mind recently is multi-dimensional iteration.

Say you have a matrix, and each element needs to be treated without connection to the other elements. You could write a nested loop:

for i in range(10):
    for j range(20):
        pass

There are two problem with that, related to how one reads the code. The first is that it increases levels of nesting. Each level of nesting increases the load on our short-term memory, resulting in reduced cognitive function. That is to say, it's harder to read deeply-nested code. And as the dimensionality increases, this becomes worse. I once handled a code to find correspondences between identified objects in four images. That takes six levels of nesting. So you wouldn't be surprised to hear there were bugs lurking in such hard-to-read code, and I did manage to move just one if and make the code 30 times faster, after ~30 years that code existed!

Another problem is that we want to convey the idea that each element is independent, but the nested loop structure seems to give a meaning to rows and columns (and so on). We miss an opportunity to use the code as an explanation for itself.

In Python, you can remedy this with iterators, some of them supplied in the standard library:

import itertools as it

for i,j in it.product(range(10), range(20)):
    pass

And if you're working with a NumPy array, a similar facility is available:

import numpy as np

img = np.imread('lena.jpg')
for pixCoord in np.ndindex(img.shape):
    pass

So how to have this in C++? Let's assume that the number of dimensions are known at compile time. We'll use a template argument for that. Then, if you have the current position in the loop, it's easy to calculate the next one.

#include <array>

template <auto ND, typename Index_t=size_t>
using NDIndex = std::array<Index_t, ND>;

// returns true if overflowed, which means the iteration finished.
template <auto ND, typename Index_t=size_t>
bool advanceOne(NDIndex<ND, Index_t>& current, const NDIndex<ND, Index_t>& bounds) {
    current[0]++;

    unsigned dim = 0;
    while (current[dim] >= bounds[dim]) {
        current[dim++] = 0;
        if (dim >= ND) {
            return true;
        }
        current[dim]++;
    }
    return false;
}

NDIndex<2> current;
NDIndex<2> bounds{10, 20};
while (!advanceOne(current)) {
    // do something with current
}

Once we stash the advanceOne() function in some header file, we can just use it any time we want multi-dimensional iteration. However, this can still be made better. C++ has a range-for loop. This loop is similar to the Python loop in that it takes an iterable class and iterates it from start to finish. We just need to provide the necessities for an iterable class: the class should provide begin() and end() functions that return iterators, and iterators are classes that can, at the very least, be advanced with the ++ operator. Let's do it:

#include <array>

template <size_t ND, typename Index_t=size_t>
class NDIndexer
{
public:
    using NDIndex = std::array<Index_t, ND>;

    NDIndexer(NDIndex bounds) { m_bounds = bounds; }

    // This is the iterator class which provides the '++' behaviour.
    // Note it's a nested class, it's a member of NDIndexer.
    class NDIter {
        NDIndex m_current = {};
        NDIndex m_bounds; // copy to allow default construction.
        bool overflow = false;

    public:
        NDIter() {}
        NDIter(const NDIndex& bounds) { m_bounds = bounds; }

        const NDIndex& operator*() {
            return m_current;
        }

        // this is just like the `advanceOne()` function above.
        NDIter& operator++() {
            m_current[0]++;
            unsigned dim = 0;
            while (m_current[dim] >= m_bounds[dim]) {
                m_current[dim++] = 0;
                if (dim >= ND) {
                    overflow = true;
                    break;
                }
                m_current[dim]++;
            }

            return *this;
        }

        // The postfix case (x++, instead of ++x)
        NDIter operator++(int) {
            NDIter ret = *this;
            operator++();
            return ret;
        }

        bool operator!=(const NDIter& other) {
            return (m_bounds != other.m_bounds) || (m_current != other.m_current) || (overflow != other.overflow);
        }
        bool operator==(const NDIter& other) { return !operator!=(other); }

        friend class NDIndexer; // lets the container class access members so it can create 
                     // a proper end iterator.
    };

    NDIter begin() const {
        return NDIter{ m_bounds };
    }

    NDIter end() const {
        NDIter ret{ m_bounds };
        ret.overflow = true;
        return ret;
    }

private:
    std::array<Index_t, ND> m_bounds;
};

And now we just use it as such:

std::array<size_t, 3> dimBuckets{10, 20, 10};
for (const auto& idx : NDIndexer<3>{ dimBuckets }) {
    // Do something per cell of a 3D grid.
}

Now this already pays off by having only one level of indentation, but there's another payoff. Say the number of grid dimensions is determined by another template parameter - then you can't use nested loops in the normal way. Besides NDIndexer, your only other option is to write a recursive template function, with each level of recursion handling one dimension of the iteration.

That's it. Have fun with it like I did, and maybe you'll miss Python less.