Saturday, May 26, 2012

Summary of "Your Mouse is a Database" - May 2012 CACM

In a typical web application, when you make an Ajax call to a web service, you have to wait for the entire response to be received before you can act on it. You cannot process the data as it's coming in off the wire. The article "Your Mouse is a Databse" in the May 2012 issue of CACM by Erik Meijer explores a programming API called Reactive Extensions that can asynchronously stream large or infinite amounts of data to an application (think of a stream of Twitter tweets--new tweets are always being created, so there's never an end to the data). Two key words here are "asynchronous" and "stream". Asynchronous means that the data is handled in a separate thread so the application's UI does not lock up. And stream means that the data is acted on as it's being received.

The term "data" is used very broadly--it doesn't have to be data from a web service over the Internet. The data can also be UI input from the user such mouse movements or typing characters into a text box. The data is push-based because it is sent to the consumer instead of the consumer having to explicitly request it.

To help explain the concept, Meijer proposes modifying Java's Future interface (an interface used to check on the status of threads that are running in the background) to handle such data.

interface Future<T> {
  Closable get(Observer<T> callback);
}

He slims the Future<T> interface down to just one method. The get method allows a consumer to subscribe to the data stream. The return value for the method is an instance of Closeable, which allows the programmer to cancel that particular subscription if she wishes.

The Observer<T> parameter processes data from the stream asynchronously. Meijer bases the Observer<T> interface on GWT's AsyncCallback interface by giving it onFailure() and onSuccess() methods. onFailure() is called if there's an error retrieving data from the stream. onSuccess() is called when (or if) the stream ends. The third method, onNext() defines how to handle each data item from the stream (like saving it to a database or displaying it on the screen).

interface Observer<T> {
  void onFailure(Throwable t);
  void onSuccess();
  void onNext(T value);
}

Meijer then describes a number of query opeators in the Reactive Extensions library that can be used to filter this streaming data. For instance, the "where" operator lets the programmer specify the criteria for whether or not a data item should be processed. If the data item does not meet the criteria, then it is discarded. Another operator is "throttle", which prevents too much data from being processed too quickly. For example, if the throttle value is set to 2 seconds and 10 data items are pushed within a 2 second time span, it will only process the most recent message within that 2 second time span. The other 9 messages will be ignored. These operators can be chained together to give the programmer strong control over how to filter a stream.

This idea of streaming push-based based data can help developers design more memory-efficient applications. Tt can also help developers filter the data from larger streams, like Twitter for example, so as to not overwhelm the application with data it doesn't need.

No comments: