Posts by ‘Alexandre Martins’

[admin] Hamcrest: Improving Reducer Implementations

Thursday, March 18th, 2010

At the beginning of the year I posted about ways to use Hamcrest outside of test code, together with hamcrest-collections. This combination lets us write different kinds of matchers to select and reject items from lists, as well as apply map and reduce to them. After using them for a while on my current project, I want to share what I liked a lot, and what I didn’t like a lot. My friend Liz Douglass has also written a post sharing our experience, and I will just complement it a bit…

What I Liked A Lot

There’s not much to write here, as we all know that this combination is quite powerful when you want to write code that reads more like English, making it much easier to express the intent of your code. Not to mention that we get rid of for loops everywhere in the codebase.

What I Didn’t Like A Lot

One aspect I haven’t liked from the beginning when implementing Reducers is that they are coupled to one specific type: a Reducer reduces a list of one type into a result of the same type. And from the Wikipedia definition of Map and Reduce:


“Map” step: The master node takes the input, chops it up into smaller sub-problems, and distributes those to worker nodes.


“Reduce” step: The master node then takes the answers to all the sub-problems and combines them in a way to get the output - the answer to the problem it was originally trying to solve.

… we can see that it doesn’t say that the result of applying the reducer should be of the same type as the original input. And that’s exactly the restriction I wanted to drop. People said I was trying to combine both Map and Reduce into a single implementation. I kind of disagree with that: reducing a list into a result of a different type doesn’t mean that I am transforming the original input (like multiplying each item in a list of integers by 2 before concatenating them). Confusing?

Let me try to explain it using an example. Given we have a list of integers…

List<Integer> list = Lists.create(1, 2, 3);

and that I want to concatenate these numbers into a string. With the current hamcrest-collections implementation, that would be possible by doing something like…

Iterable<String> listOfStrings = FunctionMapper.map(list, new Function<Integer, String>() {
    public String apply(Integer number) {
        return String.valueOf(number);
    }
});

String result = Reduction.reduce(listOfStrings, new Reducer<String>() {
    public String apply(String first, String previous) {
        return first.concat("+").concat(previous);
    }
});

Assert.assertEquals("1+2+3", result);

It’s quite a lot of code just to concatenate a list of numbers! One day, while pairing with Tom Czarniecki, we decided to reimplement the Reduction and Reducer classes so that we could create simpler, more flexible Reducer implementations and, of course, write almost half the lines of code.

public interface Reducer<T, U> {
    U apply(T first, U previous);
}
public class Reduction {

    public static <T, U> U reduce(List<T> list, U initialValue, Reducer<T, U> reducer) {
        U currentValue = initialValue;
        for (T item : list) {
            currentValue = reducer.apply(item, currentValue);
        }
        return currentValue;
    }
}

Doing the same number concatenation with this new implementation is just a matter of defining a new Reducer that concatenates the numbers into a string.

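Assert.assertEquals("1+2+3", Reduction.reduce(list, "", new Reducer<Integer, String>() {
    public String apply(Integer first, String previous) {
        // Skip the separator for the first item; otherwise the result would start with "+".
        if (previous.isEmpty()) {
            return String.valueOf(first);
        }
        return previous.concat("+").concat(String.valueOf(first));
    }
}));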

Much simpler!

Advantages Of This New Implementation

  1. Do I have to mention again that it’s much cleaner?
  2. We use java.util.List instead of Iterable.
  3. No exception is thrown if the list to reduce is empty. It just returns the initial value passed to reduce.
  4. Flexibility to reduce a list into a result of any type.
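To make points 3 and 4 concrete, here is a minimal, self-contained sketch that reduces a list of strings into an integer. It repeats the Reducer and Reduction shapes from this post (without the public modifiers, so everything fits in one file) and uses a lambda instead of an anonymous class, which assumes Java 8 or later. ReductionDemo and totalLength are hypothetical names made up for this illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Same shape as the Reducer in this post: folds an item of type T into a result of type U.
interface Reducer<T, U> {
    U apply(T first, U previous);
}

class Reduction {
    static <T, U> U reduce(List<T> list, U initialValue, Reducer<T, U> reducer) {
        U currentValue = initialValue;
        for (T item : list) {
            currentValue = reducer.apply(item, currentValue);
        }
        return currentValue;
    }
}

// Hypothetical demo class, not part of hamcrest-collections.
class ReductionDemo {
    // Reduces a list of strings (T = String) into a total length (U = Integer).
    static int totalLength(List<String> words) {
        return Reduction.reduce(words, 0, (word, total) -> total + word.length());
    }

    public static void main(String[] args) {
        // Two words of 3 and 6 characters.
        System.out.println(totalLength(Arrays.asList("map", "reduce"))); // prints 9
        // An empty list never calls the reducer, so the initial value comes back.
        System.out.println(totalLength(Collections.<String>emptyList())); // prints 0
    }
}
```

The reducer here only accumulates; it never turns the list into an intermediate list of another type, which is the distinction from Map argued above.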

Hope you enjoy it!

[admin] RESTful Web Services: Preventing Race Conditions

Thursday, March 18th, 2010

One of the core premises of RESTful web services is that HTTP should be seen as an application protocol rather than just a transport protocol. It comprises a whole set of semantics that allow us to build robust distributed systems. And in cases where multiple consumers manipulate the same resource, and therefore change its state, the solution should be robust enough to prevent the system from getting into a race condition.

But how can HTTP prevent that?

HTTP provides a simple but powerful mechanism for aligning resource states by making use of entity tags (ETags) and conditional request headers. An ETag is an opaque value that identifies one particular state of a resource, such as a version number stored with a persisted resource, a checksum of the entity headers and body, etc. If the resource changes (that is, when one or more of its headers, or its entity body, changes) then the entity tag changes accordingly, reflecting the new resource state.
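As a rough illustration of the checksum option, the sketch below derives an ETag from a SHA-256 hash of the entity body. The class and method names (EntityTags, etagFor) are hypothetical, and hashing only the body rather than headers plus body is a simplifying assumption; a real service may well derive its tags differently.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not taken from any framework.
class EntityTags {
    // Derives an ETag from the bytes of a representation (assumption: body only).
    static String etagFor(String representation) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(representation.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            // Entity tags are sent as quoted strings.
            return "\"" + hex + "\"";
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }
}
```

The same body always yields the same tag, and any change to the body yields a different one, which is the property conditional requests rely on.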

When a response contains an ETag associated with a resource state and you want to continue working with this same resource, it’s recommended to send this tag in subsequent requests (called conditional requests). Otherwise your view of the resource might eventually become out of sync with the service’s, and an update could be rejected with something like a 409 Conflict.

A conditional request is made by supplying the current ETag in a conditional request header, such as If-Match or If-None-Match, for example when a user requests an update to a resource. The service then checks the precondition by comparing the resource’s current ETag with the one provided in the request. If the precondition is satisfied, the server proceeds and processes the request; otherwise it concludes that the resource has changed and responds with a 412 Precondition Failed.
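The precondition check for If-Match could be sketched roughly as follows. IfMatchCheck and its methods are hypothetical names, the header is split naively on commas, and returning 204 for a successful or unconditional update is an assumption made for this sketch rather than something HTTP mandates.

```java
// Hypothetical helper showing how a service might evaluate an If-Match header.
class IfMatchCheck {
    // True when the If-Match precondition holds for the resource's current ETag.
    static boolean ifMatchSatisfied(String ifMatchHeader, String currentEtag) {
        if (currentEtag == null) {
            return false; // the resource has no current representation
        }
        if (ifMatchHeader.trim().equals("*")) {
            return true; // "*" matches any existing representation
        }
        // Simplification: assumes the tags themselves contain no commas.
        for (String candidate : ifMatchHeader.split(",")) {
            String tag = candidate.trim();
            // If-Match compares strongly, so weak tags (W/"...") never match.
            if (!tag.startsWith("W/") && tag.equals(currentEtag)) {
                return true;
            }
        }
        return false;
    }

    // Status a PUT would receive in this sketch: 204 if it may proceed, 412 otherwise.
    static int statusForPut(String ifMatchHeader, String currentEtag) {
        if (ifMatchHeader == null) {
            return 204; // assumption: unconditional requests are allowed through
        }
        return ifMatchSatisfied(ifMatchHeader, currentEtag) ? 204 : 412;
    }
}
```

A service that wants to force clients to use conditional requests could reject the unconditional case instead of letting it through.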

Example

Consider an online shop for home goods, where two people, Admin1 and Admin2, are responsible for administering its contents. In our scenario both administrators are trying to change the state of the same product (the Weber BBQ) at around the same time. Admin1 wants to lower the product price to $300.00 and Admin2 wants to change its status to “Not Available”. First, both administrators GET the current product state independently of one another with the following request:

GET /product/1 HTTP/1.1
Host: myshop.com

The service returns the following resource (product) in response. Note that the response contains an ETag header.

HTTP/1.1 200 OK
Content-Length: 265
Content-Type: application/xml
ETag: "686897696a7c876b7e"

<product>
  <name>Weber Family BBQ</name>
  <description>Great for parties and cooks a neat roast too.</description>
  <price>$399.00</price>
  <status>In Stock</status>
</product>

Admin1 then does a conditional PUT, including an If-Match header with the ETag value from the previous GET.

PUT /product/1 HTTP/1.1
Host: myshop.com
If-Match: "686897696a7c876b7e"

<product>
  <name>Weber Family BBQ</name>
  <description>Great for parties and cooks a neat roast too.</description>
  <price>$300.00</price>
  <status>In Stock</status>
</product>

As the product state hasn’t changed since the last request, the request succeeds! Notice that the response returns an updated ETag value, reflecting the new product state.

HTTP/1.1 204 No Content
ETag: "616898r96a8cy86b8eee11"

Shortly after Admin1 has updated the product, Admin2 does another PUT to the same product, including an If-Match header with the same ETag value from the GET request.

PUT /product/1 HTTP/1.1
Host: myshop.com
If-Match: "686897696a7c876b7e"

<product>
  <name>Weber Family BBQ</name>
  <description>Great for parties and cooks a neat roast too.</description>
  <price>$399.00</price>
  <status>Not Available</status>
</product>

The service then determines that someone is trying to change the product using an out-of-date resource representation (the ETags are different!), and responds with a 412 Precondition Failed code. No race conditions whatsoever!

HTTP/1.1 412 Precondition Failed
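The whole exchange can be replayed in memory with a small sketch. ProductResource and TwoAdminsDemo are hypothetical, the ETag is derived from a simple version counter, and the product body is reduced to a plain string. The one design point worth noting is that the comparison and the update happen under the same lock; if they were separate steps, two PUTs arriving together could both pass the check.

```java
// Hypothetical in-memory stand-in for the product resource on the server.
class ProductResource {
    private String body;
    private long version = 1; // assumption: the ETag is derived from a version counter

    ProductResource(String body) {
        this.body = body;
    }

    synchronized String etag() {
        return "\"v" + version + "\"";
    }

    synchronized String body() {
        return body;
    }

    // Checks the precondition and applies the update atomically.
    synchronized int put(String ifMatch, String newBody) {
        if (!etag().equals(ifMatch)) {
            return 412; // Precondition Failed: the caller holds a stale ETag
        }
        body = newBody;
        version++; // the new state gets a new ETag
        return 204;
    }
}

class TwoAdminsDemo {
    public static void main(String[] args) {
        ProductResource product = new ProductResource("price=$399.00;status=In Stock");
        String seenByBoth = product.etag(); // both admins GET the same state

        // Admin1 lowers the price first.
        System.out.println(product.put(seenByBoth, "price=$300.00;status=In Stock")); // prints 204
        // Admin2 still holds the old ETag.
        System.out.println(product.put(seenByBoth, "price=$399.00;status=Not Available")); // prints 412
    }
}
```

After the 412, Admin2 would GET the product again, pick up the new ETag and the $300.00 price, and retry the status change on top of it.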

Conclusion

Although ETags and conditional request headers make up a powerful mechanism for dealing with concurrency, one thing to keep in mind is that, depending on the amount of computation the server performs to generate an ETag, response times might increase considerably. So use them only if you need them!