Saturday, December 8, 2012

Java Custom Annotations - Intuitive view

Java Custom Annotations - Intuitive understanding

An annotation, in simple terms, is a description of something, but not every description counts as a valid annotation in the programming world. A plain Java comment could not be considered an annotation, but a Javadoc comment could be. Formally, an annotation is metadata: data about some other data. Metadata can be understood by the compiler or by a runtime such as the JVM, but it is treated differently from normal data.

For instance, annotations in a Java class are a means to provide extra information about that class. But why is that useful? You may already have used some annotations that the compiler acts on, such as @SuppressWarnings and @Deprecated, which give hints to the compiler. Annotations are much more powerful than that and make some great things possible. Let us start with an example and understand the power of annotations intuitively.

Let us think of a simple client-server protocol which exchanges messages over TCP. We define the messaging protocol as follows:

command + "END" + data

command -> defines the action to be performed
"END" -> delimiter for command message
data -> data on which action shall be performed

For example,
"echoENDHello World" shall be handled by the server by returning the data "Hello World", since echo is a command that performs no operation on the data.

Now let us think of the server's implementation for this. All messages are received by a socket and sent to a message processor object for processing. In the processor, the message is parsed and, according to the command, the right handler is selected for further processing.
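
The parsing step can be sketched as follows. This is only a minimal illustration, not the code from the original repository: the class name Message and its fields are hypothetical, and the sketch assumes that a command name never contains the text "END" itself, since it splits at the first occurrence of the delimiter.

```java
// Hypothetical value class for one parsed protocol message.
final class Message {
    static final String DELIMITER = "END";

    final String command;
    final String data;

    private Message(String command, String data) {
        this.command = command;
        this.data = data;
    }

    // Splits "command + END + data" at the first occurrence of the delimiter.
    // Assumption: the command itself never contains "END".
    static Message parse(String raw) {
        int split = raw.indexOf(DELIMITER);
        if (split < 0) {
            throw new IllegalArgumentException("Missing END delimiter: " + raw);
        }
        String command = raw.substring(0, split);
        String data = raw.substring(split + DELIMITER.length());
        return new Message(command, data);
    }
}
```

Parsing "echoENDHello World" this way gives the command "echo" and the data "Hello World".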

The complete example is uploaded to Git.

It is for demo purposes, so expect some TODOs here and there :)

For the echo command, we may define an EchoHandler class which implements the Handler interface. We may define a handler factory which returns the EchoHandler object, given the command parameter echo. This design is fairly loosely coupled: adding a new command needs a new handler class and some changes in the handler factory, and no changes in other components. Wait, what information is this handler factory holding? Is that dependency really needed? Yes, because a class derived from the Handler interface has no way of informing other components of its own purpose. There are several ways to put this information into the class itself; think of a final string which says which command the class can handle.

Annotations are a non-intrusive and programmatic way to clearly express this sort of extra information about a class. Non-intrusive means an annotation does no harm on its own to the host class. If we could annotate a new handler with its command, the message processor could use this annotation during its handler discovery phase. A perfect decoupling is possible with annotations. Let us get into some details of annotations.
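
To make the coupling concrete, the factory-based design might look roughly like the sketch below. All names are hypothetical; in particular the interface is called MessageHandler here only to keep it apart from the Handler annotation introduced later. Notice that the factory is the only place that knows which command belongs to which class, so it has to change for every new command.

```java
// Hypothetical handler contract: takes the data part, returns the reply.
interface MessageHandler {
    String handle(String data);
}

// Echo performs no operation on the data.
final class EchoHandler implements MessageHandler {
    @Override
    public String handle(String data) {
        return data;
    }
}

// The factory holds the command-to-class mapping; every new command
// requires another case here.
final class HandlerFactory {
    static MessageHandler forCommand(String command) {
        switch (command) {
            case "echo":
                return new EchoHandler();
            default:
                throw new IllegalArgumentException("Unknown command: " + command);
        }
    }
}
```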

An annotation type is actually a special kind of interface, and the Handler annotation could be defined as follows.
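
The original listing is not reproduced here, so the following is a minimal sketch of what such a definition might look like. The single element name value is an assumption, chosen because the text later speaks of the annotation's value returning "echo".

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Applicable to classes and interfaces only, and kept in the class file
// so that it can be read through reflection at run time.
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@interface Handler {
    // The command that the annotated class handles, e.g. "echo".
    String value();
}
```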

The @ in front of the interface keyword can be read as AT, for Annotation Type. Apart from this, the declaration looks like that of a normal interface, although the elements of an annotation type take no parameters and may only return a few kinds of types (primitives, String, Class, enums, annotations, and arrays of these).
Target and Retention are meta-annotations, which "describe" annotations. To make this example complete: Target says this annotation should be applied at class level; by the way, method and field level annotations are also possible. Retention says the annotation should be available at run time. We need it at run time so that the message processor object can discover the command handlers.

Given that an annotation is an interface, it has to be instantiated somewhere, right? Let us check the echo handler code to understand this.
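
Again the original listing is missing, so this is a sketch under assumptions: the Handler annotation is repeated to keep the example self-contained, and the MessageHandler interface and the signature of handle are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@interface Handler {
    String value();
}

// Hypothetical handler contract: takes the data part, returns the reply.
interface MessageHandler {
    String handle(String data);
}

// The annotation states which command this class handles.
@Handler("echo")
final class EchoMessageHandler implements MessageHandler {
    @Override
    public String handle(String data) {
        // Echo performs no operation on the data.
        return data;
    }
}
```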

Like a Java modifier, an annotation precedes the definition. In the above snippet, we annotate EchoMessageHandler with Handler and make its value return "echo". This could be thought of as instantiating the Handler annotation interface.

I need to digress a bit. Check the Override annotation in the above snippet, which annotates the "handle" method. This annotation says that the method must override a method declared in a parent class or interface. If it does not, the compiler reports an error. Since Override is retained only in source code, the run time has no idea about this annotation.

Handler discovery of MessageProcessor class can be defined as follows:
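
The original listing is not available here; below is a minimal sketch of the idea. It assumes the candidate classes are passed in explicitly (scanning the classpath for them is out of scope), that each handler has a no-argument constructor, and it uses getAnnotation to look up one annotation type directly. The names MessageHandler and process are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;

@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@interface Handler {
    String value();
}

interface MessageHandler {
    String handle(String data);
}

@Handler("echo")
final class EchoMessageHandler implements MessageHandler {
    @Override
    public String handle(String data) {
        return data;
    }
}

final class MessageProcessor {
    private final Map<String, MessageHandler> handlers = new HashMap<>();

    // Discovery phase: read the command name from each class's annotation.
    MessageProcessor(Class<?>... candidates) {
        for (Class<?> type : candidates) {
            Handler annotation = type.getAnnotation(Handler.class);
            if (annotation == null || !MessageHandler.class.isAssignableFrom(type)) {
                continue; // not an annotated handler, skip it
            }
            try {
                Object instance = type.getDeclaredConstructor().newInstance();
                handlers.put(annotation.value(), (MessageHandler) instance);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot create " + type.getName(), e);
            }
        }
    }

    // Parses "command + END + data" and dispatches to the discovered handler.
    String process(String raw) {
        int split = raw.indexOf("END");
        if (split < 0) {
            throw new IllegalArgumentException("Missing END delimiter: " + raw);
        }
        String command = raw.substring(0, split);
        MessageHandler handler = handlers.get(command);
        if (handler == null) {
            throw new IllegalArgumentException("No handler for command: " + command);
        }
        return handler.handle(raw.substring(split + "END".length()));
    }
}
```

With this in place, adding a new command means writing one new annotated class and handing it to the processor; no factory has to be edited.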

Every Class object, and every Method object obtained from a class, has a getAnnotations method which returns an array of the annotations present on it. This could be used for handler discovery in this example.

Tuesday, March 27, 2012

Thread Pooling in Java - Part 2 - Internals.

For the basics of Java threads, please check this post,

In the first part, we analyzed the needs for fixed and cached thread pools.

Fixed thread pools have a fixed number of running threads operating on an effectively unbounded task queue.
Cached thread pools spawn as many threads as there are tasks at any time and use a SynchronousQueue.

And we saw use cases for each of these thread pools in the previous part. Now, what if a use case needs a mix of the above behaviors? For instance, behave like a CachedThreadPool up to a fixed number of tasks.

Analysis of the implementations of the above thread pools would open new doors for solving interesting variants of thread-pool based problems.

Actually, creating either a fixed or a cached thread pool internally creates an instance of ThreadPoolExecutor, just with different parameters.
For instance, let us analyze the newFixedThreadPool call,

public static ExecutorService newFixedThreadPool(int nThreads) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>());
}

The declaration of the ThreadPoolExecutor constructor is,

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue) { ...

Let us try to understand each and every parameter.

corePoolSize: This is the number of threads kept alive even in the absence of any task. In a fixed thread pool it equals the maximum thread count, since we know the optimal number of threads and destroying and recreating threads hurts performance.

maximumPoolSize: This is the maximum number of threads that can be created in the pool. If the number of running threads has reached corePoolSize and the queue of waiting tasks is full, a new thread is created as long as the thread count is still below maximumPoolSize.

keepAliveTime: If more threads than corePoolSize have been created and some of them stay idle for keepAliveTime, those threads are terminated to save system resources. The next parameter, unit, is the time unit of keepAliveTime.

workQueue: This is the BlockingQueue used for waiting tasks. For a fixed thread pool, it is an unbounded LinkedBlockingQueue. For a cached thread pool, it is a SynchronousQueue, which has no capacity: a submitted task must be handed to a thread immediately, so no task can be queued for later processing.

So, if we can statistically analyze the peak and average traffic of incoming tasks, we can come up with optimal values for the core pool size, max pool size and keepAliveTime, which would make our thread pool efficient and conservative with resources. :)
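
As an illustration of the mixed behavior asked about earlier, one possible way (a sketch, not the only option) to get a pool that behaves like a cached pool up to a fixed number of threads is to pass a finite maximumPoolSize together with a SynchronousQueue. The class and method names below are hypothetical; the 60-second keep-alive mirrors what Executors.newCachedThreadPool uses.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedCachedPool {
    // No core threads, no queueing, at most maxThreads threads;
    // idle threads die after 60 seconds.
    static ThreadPoolExecutor create(int maxThreads) {
        return new ThreadPoolExecutor(
                0, maxThreads,
                60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    // Demo: keeps maxThreads tasks busy, then checks that one more task
    // is rejected instead of being queued.
    static boolean rejectsWhenSaturated(int maxThreads) {
        ThreadPoolExecutor pool = create(maxThreads);
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            for (int i = 0; i < maxThreads; i++) {
                pool.execute(blocker); // each of these gets its own thread
            }
            pool.execute(blocker);     // no free thread and no queue capacity
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }
}
```

Whether rejection is acceptable depends on the use case; a different RejectedExecutionHandler, or a bounded queue instead of the SynchronousQueue, would trade rejection for waiting.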

To make the discussion complete, we will try to understand the implementation of ThreadPoolExecutor.

Well, we need to discuss what happens when execute of ThreadPoolExecutor is called.

The algorithm behind execute is simple. If the number of threads is less than the core pool size, a new thread is spawned to handle the new task. Otherwise, the task is offered to the queue. If the queue is full, the algorithm checks whether an additional thread can be spawned within the max pool size limit; if not, the rejection handler is called.

ThreadPoolExecutor holds a control state variable ctl, an AtomicInteger, which packs together useful information such as the effective number of worker threads and the state of the pool (running, shutting down, etc.). There are several utility functions around this variable.

Apart from this, there are several supporting pieces such as termination of the thread pool, the thread factory, etc. Interested readers could dive into the source code for a complete understanding. I hope I have kept the information concise.
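
These steps can be observed from outside with a small experiment. The sketch below (class and method names are hypothetical) uses a pool with one core thread, a maximum of two threads and a queue of capacity one, and submits four tasks that all block until released: the first starts a core thread, the second is queued, the third starts an extra thread, and the fourth is rejected.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ExecuteOrderDemo {
    // Returns { pool size after task 1, queue size after task 2,
    //           pool size after task 3, 1 if task 4 was rejected else 0 }.
    static int[] run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 30L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int[] seen = new int[4];
        try {
            pool.execute(blocker);              // below core size: new thread
            seen[0] = pool.getPoolSize();
            pool.execute(blocker);              // core is busy: task is queued
            seen[1] = pool.getQueue().size();
            pool.execute(blocker);              // queue full: extra thread
            seen[2] = pool.getPoolSize();
            try {
                pool.execute(blocker);          // queue full and at max size
            } catch (RejectedExecutionException e) {
                seen[3] = 1;
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return seen;
    }
}
```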

Thanks for Reading.

Friday, March 23, 2012

Thread Pooling in Java - Intuitive overview. Part 1

You may want to check my post on basics of Java threads,

Today let us talk about thread pooling. Before that, let me give you an intuitive idea of why pooling of threads is needed.

Let us start with a trivial web server implementation. In it, the main thread keeps listening for incoming requests and processes them in order of arrival. This is easy to implement, and good for a single-processor web server, given that the tasks are CPU bound. Normally, though, servers have multiple processors. So, on a quad-core machine, even CPU-intensive tasks would use only about 25% of the system's capability if the service is single-threaded.

An N-processor web server could easily serve N requests simultaneously. So let us make our trivial web server run N threads simultaneously to improve performance up to N-fold. Cool. This way of scaling a service (generally, scaling means increasing the number of requests served per unit time) is known as vertical scaling.

Let us discuss how this system shall be designed. Since the requests served are CPU intensive, we know the optimal number of concurrent threads a priori. So, the system shall be designed as given below.
  1. At most the given number of threads shall be running, and no more than that.
  2. It should have an effectively unbounded queue of pending tasks. This is a moot point, by the way, but we will assume that eventual completion of a task is better than rejecting it.
  3. Already created threads shouldn't be killed, or die on their own, after completing a task, since creating threads is generally known to be costly.
  4. If a thread happens to crash during task execution, the thread pool must be intelligent enough to create a replacement.
The above design is generic and could be abstracted easily. Java's FixedThreadPool does exactly this job. It has to be defined with a number of threads, and it has an effectively unbounded queue for waiting tasks (the default LinkedBlockingQueue capacity is Integer.MAX_VALUE, roughly 2 billion entries).

It could be created using below line of code.

ExecutorService threadExec = Executors.newFixedThreadPool(numThreads);
ExecutorService is an interface which has APIs for submitting a task to the pool, shutting it down, etc. We will discuss this and "Future" in the next part of this blog, as discussing them right now would be a digression.

A complete test code could be found here. You can test with several parameters and see the power of thread pooling.
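
The linked test code is not reproduced here. As a stand-in, a minimal sketch along the same lines could look like this; the class FixedPoolDemo and its run method are hypothetical. It submits many small tasks to a fixed pool and records which worker threads ran them, so the number of distinct workers never exceeds the configured thread count.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

final class FixedPoolDemo {
    // Returns { number of completed tasks, number of distinct worker threads }.
    static int[] run(int numThreads, int numTasks) {
        ExecutorService threadExec = Executors.newFixedThreadPool(numThreads);
        Set<String> workerNames = ConcurrentHashMap.newKeySet();
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(numTasks);
        for (int i = 0; i < numTasks; i++) {
            threadExec.execute(() -> {
                workerNames.add(Thread.currentThread().getName());
                completed.incrementAndGet();
                done.countDown();
            });
        }
        try {
            done.await(); // wait until every task has run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } finally {
            threadExec.shutdown();
        }
        return new int[] { completed.get(), workerNames.size() };
    }
}
```

Calling FixedPoolDemo.run(4, 100), for example, completes all 100 tasks on at most 4 threads; the tasks that cannot run immediately simply wait in the queue.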

Now let us assume our web server has to handle very simple requests which don't take much time to complete and involve a lot of I/O activity: file I/O, or network activity such as calling another web service. In this scenario, it is not good to limit the number of threads, as most of the time would be spent on I/O and not on computing.

The main problem in this case is determining the maximum number of threads that could run on the system. Even if we could get that number statistically, it is not good to keep that many threads running always. For instance, suppose that after some analysis we learn that around 1000 threads may be needed at most. If we go for a FixedThreadPool, it is a waste of resources, as we wouldn't see peak traffic all the time. Since these tasks are asynchronous and probably short-lived, getting a proper upper bound on the optimal number of threads wouldn't always be possible.

The system that could handle this scenario shall be designed as follows.
  1. The system should create threads as and when needed.
  2. After completing its task, a thread could wait for a certain time and die if no other task becomes available.
  3. It shouldn't hold any waiting tasks; rejection shall be preferred instead. Think about a SynchronousQueue in Java.

Java's CachedThreadPool is designed with the above goals. It is great for short-lived, asynchronous tasks. Creating a CachedThreadPool and working with it is essentially the same as with a FixedThreadPool. So, the line below doesn't need any further explanation.

ExecutorService threadExec = Executors.newCachedThreadPool();

And a test program to analyze this is,
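
The original test program is not included here; the following is a hypothetical stand-in. Every task waits until all tasks have started, which is only possible if the pool gives each of them its own thread, so the number of distinct worker threads equals the number of tasks.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class CachedPoolDemo {
    // Returns how many distinct threads ran numTasks overlapping tasks.
    static int distinctThreads(int numTasks) {
        ExecutorService threadExec = Executors.newCachedThreadPool();
        Set<String> workerNames = ConcurrentHashMap.newKeySet();
        CountDownLatch allStarted = new CountDownLatch(numTasks);
        CountDownLatch done = new CountDownLatch(numTasks);
        for (int i = 0; i < numTasks; i++) {
            threadExec.execute(() -> {
                workerNames.add(Thread.currentThread().getName());
                allStarted.countDown();
                try {
                    allStarted.await(); // stay busy until every task is running
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } finally {
            threadExec.shutdown();
        }
        return workerNames.size();
    }
}
```

A fixed pool with fewer threads than tasks would never finish this particular experiment, which is exactly the difference between the two pool types.
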
We will discuss the implementation details of Fixed and Cached Thread Pool in the next part.

Thanks for Reading.