Tuesday, March 27, 2012

Thread Pooling in Java - Part 2 - Internals.

For the basics of Java threads, please check this post, http://karthikpresumes.blogspot.in/2013/02/java-multi-threading-basics.html

In the first part, we had analyzed the needs for Fixed and cached Thread Pools.

http://karthikpresumes.blogspot.in/2012/03/thread-pooling-in-java-intuitive.html


Fixed thread pools have fixed number of running threads operating on a finite unbounded tasks queue.
Cached thread pools spawn as many number of threads as the task count at any time and have a Synchronized Queue.

And we had seen use cases for each of the thread pools in the previous part. Now, what if an use case needs the mixed behaviors of the above. For instance, behave like a CachedThreadPool until a fixed number of tasks.

Analysis of the implementations of the above thread pools would open new doors for solving interesting variants of thread-pool based problems.

Actually, both Fixed and Cached thread pools creation, internally would create instance of ThreadPoolExecutor with different parameters.
For instance, let us analyze the FixedThreadPool call,

public static ExecutorService newFixedThreadPool(int nThreads) {
return new ThreadPoolExecutor(nThreads, nThreads,
0L, TimeUnit.MILLISECONDS,
new LinkedBlockingQueue());
}

The declaration of ThreadPoolExecutor is,

public ThreadPoolExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit,
BlockingQueue workQueue) { ...

Let us try to understand each and every parameter.

CorePoolSize: This represents the number of threads to be alive even in the absence of any task. In Fixed Thread Pool, it should be equal to the total or Max thread count as we know the optimal number of threads and destroying/ recreating the threads incur performance hurt.

Maximum Pool Size: This represents the maximum number of threads that could be created in the thread pool. If the count of running threads exceeds "corePoolSize" and queue of waiting tasks are filled completely, then a new thread could be created if Maximum Pool Size > Core Pool Size.

Keep Alive Time: In case threads created exceeds the corePoolSize and some of the threads are idle for "keepAliveTime" then those would be killed to save the resources in the System. And the next parameter is the unit for KeepAliveTime.

BlockingQueue: It describes the queue to be used for Waiting tasks. For Fixed Thread Pool, it is unbounded. And for CachedThreadPool, it is SynchronizedQueue, means at any time, queued task must be immediately served; means no task could be queued for processing later.

So, if we could statistically analyze the peak and average traffic of incoming tasks, we could come up with optimal values for Core, Max pool size and KeepAliveTime; which could make our thread-pool efficient and resources conservative. :)

To make the discussion complete, we will try to understand the implementation of ThreadPoolExecutor.

Well, we need to discuss what happens when Execute of ThreadPoolExecutor called.

Algorithm which backs Execute is simple. If number of threads is less than the core pool size, a new thread will be spawned to handle this new task. If the number of active threads exceeds the core pool size and queue is filled up fully, algorithm would check for the spawning of additional threads, constrained by the max pool size count, is possible; If not, rejection handler would be called.

ThreadPoolExecutor holds a control state variable ctl, which is an AtomicInteger, provides some useful information like effective worker threads and state of ThreadPool(Running, Shutting down, etc). And there are several utility functions around this variable.

Apart from this, there are several other functionalities which assist the main functionalities like termination of Thread Pool and thread factory, etc. People interested in that, could dive into the source code for complete understanding. I hope I tried my best to keep the information concise.

Thanks for Reading.

Friday, March 23, 2012

Thread Pooling in Java - Intuitive overview. Part 1

You may want to check my post on basics of Java threads, http://karthikpresumes.blogspot.in/2013/02/java-multi-threading-basics.html

Today Let us talk about Thread Pooling. Before that, let me give you an intuitive idea of why pooling of threads needed.

Let us start with a trivial web server implementation. In this, main thread would keep on listening to incoming requests and process those messages according to their arrival. This is easy to implement; Good for single processor web-server, given tasks are CPU bound or intensive. Normally, servers would have multiple processors. So, in Quad-core machine, even CPU intensive tasks would be utilizing about 25% of entire system's capability, if service is single-threaded.

Simultaneously, N number of requests could be easily served in N-processor based web server. Now, let us make our trivial web server to run N threads simultaneously to improve the performance by N fold. Cool. This way of scaling(generally, it means increasing number of requests served, per unit time) of a service is known as Vertical Scaling.

Let us discuss, how this system shall be designed. Since the requests served are of CPU intensive, we know the optimal number of concurrent threads., a priori. So, the system shall be designed as given below.
  1. At most, only the given number of threads be running and not more than that.
  2. It should be having finite unbounded queue of pending tasks; This is a moot point, by the way. But we will believe eventual completion of a task is better than rejecting that.
  3. Already created threads shouldn't be killed or shouldn't die on its own after the completion of a task. Since creating threads are generally known to be costly.
  4. During task execution, if a thread happens to crash itself, thread pool must be intelligent enough to create one.
The above design is so generic and could be abstracted easily. Java's FixedThreadPool does exactly the same job. It has to be defined with number of threads; It has finitely unbounded queue(roughly, 4 gig entries could be waiting in this queue, in a 32 bit machine) for waiting tasks.

It could be created using below line of code.

ExecutorService threadExec = Executors.newFixedThreadPool(numThreads);
ExecutorService is an interface which has APIs for submitting a task to the pool and Shutting down, etc. We will discuss this and "Future" in the next part of this blog as it would be digressing if we start discussing that right now.

A complete test code could be found here. You can test with several parameters and see the power of thread pooling.

http://code.google.com/p/threadpool-tests-java/source/browse/FixedThreadPoolTest.java

Let us assume, our web server has to handle very simple requests which doesn't take much time to complete and involves huge of I/O activities - Files I/O, network activity like another web service to process. In this scenario, it is not good to limit the number of threads as most of the time would be spent on I/O and not on computing.

Main problem with this case is, determining the number of threads at most could run on a system. Even if we could get that parameter statistically, it is not good to hold those many threads running always. For instance, after some analysis, we come to know that there may be around 1000 threads needed, at most. If we go for FixedThreadPool, it is a waste of resources as we wouldn't get the peak traffic always. Since these tasks are asynchronous and probably short lived, getting a proper max bound on number of optimal threads wouldn't be always possible.

The system that could handle this scenario, shall be designed as follows.
  1. The system should create threads as and when needed.
  2. After a thread's task completion, it could wait for certain time and die if no other task is available.
  3. It shouldn't have any waiting tasks, rejection shall be preferred instead. Think about a SynchronousQueue in Java.

Java's CachedThreadPool is designed with the above-said design goals. It is perfectly great for short-lived, asynchronous tasks. Creating a CachedThreadPool and working with that is essentially tantamount to FixedThreadPool. So, the line below, doesn't need any further explanations.

ExecutorService threadExec = Executors.newCachedThreadPool();

And a test program to analyze this is,
http://code.google.com/p/threadpool-tests-java/source/browse/CachedThreadPoolTest.java
We will discuss the implementation details of Fixed and Cached Thread Pool in the next part.

Thanks for Reading.