[admin] Hamcrest: Improving Reducer Implementations

At the beginning of the year I posted about how you can use Hamcrest outside of test code, together with hamcrest-collections. This combination lets us write different kinds of matchers to select and reject items from lists, as well as apply map and reduce to them. After using them for a while on my current project, I want to share what I liked a lot and what I didn’t like a lot. My friend Liz Douglass has also written a post sharing our experience, and I will just complement it a bit…

What I Liked A Lot

There’s not much to write here, as we all know that this combination is quite powerful when you want to write code that reads more like English, which makes it much easier to express the intent of your code. Not to mention that we get rid of for loops everywhere in the codebase.

What I Didn’t Like A Lot

One aspect I haven’t liked from the beginning when implementing Reducers is that they are coupled to one specific type: a Reducer reduces a list of one type into a result of the same type. And from the Wikipedia definition of Map and Reduce…


“Map” step: The master node takes the input, chops it up into smaller sub-problems, and distributes those to worker nodes.


“Reduce” step: The master node then takes the answers to all the sub-problems and combines them in a way to get the output - the answer to the problem it was originally trying to solve.

… we can see that it doesn’t say the result must be of the same type as the original items after applying the reducer. And a result of a different type is exactly what I wanted. People said I was trying to combine both Map and Reduce into a single implementation. I kind of disagree with that: reducing a list into a result of a different type doesn’t mean that I am transforming the original input (like multiplying each item in a list of integers by 2 before concatenating them). Confusing?

Let me try to explain it using an example. Given we have a list of integers…

List<Integer> list = Lists.create(1, 2, 3);

and that I want to concatenate these numbers into a string. With the current hamcrest-collections implementation, that is possible by doing something like…

Iterable<String> listOfStrings = FunctionMapper.map(list, new Function<Integer, String>() {
    public String apply(Integer number) {
        return String.valueOf(number);
    }
});

String result = Reduction.reduce(listOfStrings, new Reducer<String>() {
    public String apply(String first, String previous) {
        return first.concat("+").concat(previous);
    }
});

Assert.assertEquals("1+2+3", result);

It’s quite a lot of code just to concatenate a list of numbers! One day, while pairing with Tom Czarniecki, we decided to reimplement the Reduction and Reducer classes so that we could create simpler, more flexible Reducer implementations and, of course, write almost half the lines of code.

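// Reducer.java
// T is the type of the list items, U is the type of the accumulated result.
public interface Reducer<T, U> {
    U apply(T first, U previous);
}

// Reduction.java
import java.util.List;

public class Reduction {

    // Walks the list from first to last item, starting from initialValue.
    public static <T, U> U reduce(List<T> list, U initialValue, Reducer<T, U> reducer) {
        U currentValue = initialValue;
        for (T item : list) {
            currentValue = reducer.apply(item, currentValue);
        }
        return currentValue;
    }
}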

Doing the same number concatenation with this new implementation is just a matter of defining a new Reducer that concatenates the numbers into a string.

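Assert.assertEquals("1+2+3", Reduction.reduce(list, "", new Reducer<Integer, String>() {
    public String apply(Integer first, String previous) {
        // Skip the separator for the first item; otherwise the result would be "+1+2+3".
        return previous.isEmpty()
                ? String.valueOf(first)
                : previous.concat("+").concat(String.valueOf(first));
    }
}));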

Much simpler!

Advantages Of This New Implementation

  1. Do I have to mention again that it’s much cleaner?
  2. We use java.util.List instead of Iterable.
  3. No exception is thrown if the list to reduce is empty. It just returns the initial value passed to reduce.
  4. Flexibility to reduce a list into a result of any type.
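
To make points 3 and 4 a bit more concrete, here is a minimal, self-contained sketch. It copies the Reducer and reduce shapes from this post into a single class so that it compiles on its own. The ReducerSketch class and the TOTAL_LENGTH reducer are hypothetical names made up for this illustration; they are not part of hamcrest-collections.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ReducerSketch {

    // Same shape as the Reducer in this post: T is the item type, U the result type.
    public interface Reducer<T, U> {
        U apply(T first, U previous);
    }

    // Same loop as Reduction.reduce in this post, copied here so the sketch stands alone.
    public static <T, U> U reduce(List<T> list, U initialValue, Reducer<T, U> reducer) {
        U currentValue = initialValue;
        for (T item : list) {
            currentValue = reducer.apply(item, currentValue);
        }
        return currentValue;
    }

    // Hypothetical reducer: sums the lengths of the words (String items, Integer result).
    public static final Reducer<String, Integer> TOTAL_LENGTH = new Reducer<String, Integer>() {
        public Integer apply(String first, Integer previous) {
            return previous + first.length();
        }
    };

    public static void main(String[] args) {
        // Point 4: a list of Strings is reduced into an Integer.
        List<String> words = Arrays.asList("map", "reduce");
        System.out.println(reduce(words, 0, TOTAL_LENGTH)); // 3 + 6 = 9

        // Point 3: for an empty list the reducer is never called,
        // so the initial value comes back untouched.
        System.out.println(reduce(Collections.<String>emptyList(), 0, TOTAL_LENGTH)); // 0
    }
}
```

The initial value does double duty here: it is the starting point of the accumulation and also the answer for an empty list, which is why no special case or exception is needed.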

Hope you enjoy it!