Wednesday, February 6, 2013

Java Threads - Synchronization

If you would like to look at my post on the basics of Java threads, here it is: http://karthikpresumes.blogspot.in/2013/02/java-multi-threading-basics.html

In this post, let us discuss the need for synchronization and the techniques Java provides to achieve it.

Threads that belong to the same process share resources and address space. Sharing an address space means that one thread can communicate with other threads simply through ordinary Java objects. This is not true when one process wants to communicate with another, as processes do not share address space with each other.

Let us assume a simple producer-consumer application. The producer runs in one thread and the consumer in another. The producer populates a queue which is shared with the consumer object. This becomes very easy to implement using threads, as sharing the queue is as simple as holding a reference to it in both the consumer and the producer.

Producer Consumer With Threads
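
The original listing is not reproduced here, so what follows is only a minimal sketch of the idea, not the post's own code. The class names Producer, Consumer and ProducerConsumerDemo are hypothetical. It deliberately uses a plain, non-thread-safe queue and no synchronization at all; to stay deterministic, main lets the producer thread finish before the consumer thread starts.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical producer: puts a fixed number of items into the shared queue.
class Producer implements Runnable {
    private final Queue<Integer> queue;
    private final int items;

    Producer(Queue<Integer> queue, int items) {
        this.queue = queue;
        this.items = items;
    }

    @Override
    public void run() {
        for (int i = 0; i < items; i++) {
            queue.add(i);
        }
    }
}

// Hypothetical consumer: drains whatever is currently in the shared queue.
class Consumer implements Runnable {
    private final Queue<Integer> queue;
    private long sum;

    Consumer(Queue<Integer> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        Integer item;
        while ((item = queue.poll()) != null) {
            sum += item;
        }
    }

    long getSum() {
        return sum;
    }
}

class ProducerConsumerDemo {
    public static void main(String[] args) throws InterruptedException {
        // Sharing is just passing the same reference to both objects.
        Queue<Integer> shared = new ArrayDeque<>();
        Producer producer = new Producer(shared, 5);
        Consumer consumer = new Consumer(shared);

        Thread producerThread = new Thread(producer);
        Thread consumerThread = new Thread(consumer);

        // ArrayDeque is not thread-safe, so this sketch lets the producer
        // finish before the consumer starts instead of overlapping them.
        producerThread.start();
        producerThread.join();
        consumerThread.start();
        consumerThread.join();

        System.out.println("Sum of consumed items: " + consumer.getSum());
    }
}
```

Running the two threads truly concurrently on this queue would need some form of synchronization.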

This code shows how easy it is to design with threads. But this comes at the cost of synchronization. Even though I didn't do anything special for multi-threading in the code above, synchronization is needed because the Queue object is used by more than one thread. The ill effects of a multi-threaded app without synchronization can be explained with the following code.

Simple counter without synchronization
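
Since the original listing is missing, here is a hedged reconstruction of what such a counter might look like. UnsafeCounter, UnsafeCounterDemo and the figure of 100,000 increments per thread are assumptions made for illustration, and the lambda syntax assumes Java 8 or later.

```java
// Hypothetical counter with no synchronization; increment() is not thread-safe.
class UnsafeCounter {
    private int count;

    void increment() {
        count++; // read, add one, write back: three separate steps
    }

    int get() {
        return count;
    }
}

class UnsafeCounterDemo {
    static final int INCREMENTS_PER_THREAD = 100_000;

    // Runs two threads against one counter and returns the final value.
    static int runOnce() throws InterruptedException {
        UnsafeCounter counter = new UnsafeCounter();
        Runnable task = () -> {
            for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
                counter.increment();
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        first.join();
        second.join();
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Expected " + (2 * INCREMENTS_PER_THREAD)
                + ", got " + runOnce());
    }
}
```

The printed value is not deterministic: it can match the expected total on one run and fall short on the next.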

Can you guess what the result could be? On my system, I got the following result.



We can explain this behavior with the concept of thread interleaving. The core of the counter is the statement "count++". This isn't the single action it appears to be. The CPU has to execute three sequential steps to perform it.

  • Fetch the value of count and store it locally
  • Increment stored value
  • Set it back to the variable, count


During the execution of these steps, a thread can be preempted by the scheduler at any step and another thread resumed. This is known as thread interleaving. Certain patterns of interleaving result in wrong behavior. Let us assume one thread is preempted just after completing the second step, and another thread starts. The second thread doesn't see the incremented value, because it has not been written back yet, so it increments the old value of count. When both threads finally write their results, one increment is lost, and the final count is wrong.

OK. How do we solve this? By making the non-thread-safe method synchronized, as below.
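
The original listing is missing here as well, so this is a minimal sketch under the same assumptions as before: SynchronizedCounter and SynchronizedCounterDemo are hypothetical names, and the lambda assumes Java 8 or later.

```java
// Hypothetical counter whose methods are guarded by the object's own lock.
class SynchronizedCounter {
    private int count;

    synchronized void increment() {
        count++; // the three steps now run as one unit per thread
    }

    synchronized int get() {
        return count;
    }
}

class SynchronizedCounterDemo {
    static final int INCREMENTS_PER_THREAD = 100_000;

    // Runs two threads against one counter and returns the final value.
    static int runOnce() throws InterruptedException {
        SynchronizedCounter counter = new SynchronizedCounter();
        Runnable task = () -> {
            for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
                counter.increment();
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        first.join();
        second.join();
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Final count: " + runOnce());
    }
}
```

With both methods synchronized, no increment can be lost, so the final count always equals the total number of increment() calls.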



The synchronized keyword makes the method execute completely, or atomically, on only one thread at a time. If one thread has not yet finished executing the synchronized method, other threads that call it have to wait. This is achieved by Java's object-level monitor, or lock, mechanism. Whenever a thread calls the synchronized method, the object's lock (monitor) is acquired by that thread, and it is released only when the method returns, whether normally or by throwing an exception. Other threads that call this method have to wait until the lock is released. This essentially serializes the method calls.

If a method modifies the state of an object which is shared across threads, then that method should be synchronized, and so should the methods that read that state. The trade-off of using synchronized methods is performance. If all methods are synchronized, there is little point in multi-threading, as the program is effectively single-threaded. In fact, performance can decrease if synchronized is heavily used. So it is important either to encapsulate the code which modifies the state in separate methods, or to apply synchronization only to the right block of code inside a method using a synchronized statement.
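
As a hedged illustration of the synchronized statement, the sketch below guards only the lines that touch shared state and leaves the thread-local work outside the lock. WordStats, its fields and the private lock object are hypothetical names chosen for this example.

```java
// Hypothetical statistics collector shared between threads.
class WordStats {
    private final Object lock = new Object();
    private long totalLength;
    private int samples;

    void record(String text) {
        // Work on local data needs no lock.
        int length = text.trim().length();

        // Only the shared-state update is guarded.
        synchronized (lock) {
            totalLength += length;
            samples++;
        }
    }

    int sampleCount() {
        synchronized (lock) {
            return samples;
        }
    }

    double averageLength() {
        synchronized (lock) {
            return samples == 0 ? 0.0 : (double) totalLength / samples;
        }
    }
}
```

Locking on a private object rather than on this is a design choice made here so that outside code cannot take the same lock; synchronized (this) would behave like a synchronized method.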



OK, what if static methods use the synchronized keyword? A class-level lock is used instead of the object-level one. So, a synchronized method is the easiest way of handling thread interleaving issues.

Synchronization not only solves the thread interleaving issue, it also establishes a happens-before relationship: changes made to the state while holding the lock become visible to any thread that later acquires the same lock. Under the Java memory model, a write made in one thread is not otherwise guaranteed to be visible to another thread. Have you ever heard of volatile variables? They exist exactly to solve this visibility issue. Any write to a volatile variable is visible to other threads that subsequently read the same variable. This may cost a bit of performance, as the compiler and CPU cannot apply certain optimizations, such as keeping the value in a register. Note that volatile gives visibility only; it does not make a compound operation like count++ atomic. Memory barriers are the low-level mechanism behind these visibility guarantees. Synchronization is one way to get them; volatile and atomic variables are others.
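
A common use of volatile is a stop flag that one thread writes and another reads. The following is a minimal sketch of that pattern; StoppableWorker and its members are hypothetical names, and the busy loop is only there to keep the example short.

```java
// Hypothetical worker that loops until another thread asks it to stop.
class StoppableWorker implements Runnable {
    // volatile: a write by the stopping thread is visible to the loop below.
    private volatile boolean running = true;
    private long iterations;

    @Override
    public void run() {
        while (running) {
            iterations++; // stand-in for real work
        }
    }

    void stop() {
        running = false;
    }

    // Safe to call after joining the worker thread.
    long getIterations() {
        return iterations;
    }
}

class StoppableWorkerDemo {
    public static void main(String[] args) throws InterruptedException {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker);
        thread.start();
        Thread.sleep(100);  // let the worker run for a moment
        worker.stop();      // the volatile write makes the loop exit
        thread.join();
        System.out.println("Worker stopped after " + worker.getIterations() + " iterations");
    }
}
```

Without volatile on the flag, the loop would not be guaranteed to ever notice the change made by stop().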

Some of the reasons for memory inconsistency or visibility issues:


  • The compiler and the CPU may reorder execution steps as long as the result within a single thread is unchanged.
  • A CPU cache need not always be in sync with main memory. On multi-processor machines, this may cause visibility issues.

If a thread wants its changes to be reliably visible to other threads, it has to use synchronization or one of the other mechanisms above. The simple rule of thumb is: whenever shared state is changed in a multi-threaded environment, keep access to it synchronized.

One more important topic to discuss is atomic variables in Java. As discussed above, the synchronized keyword makes sure that a method or block of code is executed atomically. Atomically means that the entire set of steps is finished by one thread before any other thread starts executing it. This comes at a cost, as the thread has to acquire the lock, and under contention that can involve the operating system kernel. There is a cheaper way to achieve atomic execution if the operation is simple, like an increment, a decrement, or a swap with another value. CPUs provide instructions such as CAS (compare-and-swap) for these atomic operations. Java has abstractions like AtomicLong built on them. Let us rewrite the Counter class with AtomicLong.
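
The original Counter listing is not available, so this is a sketch of how such a rewrite might look. AtomicCounter and AtomicCounterDemo are hypothetical names; AtomicLong, incrementAndGet() and get() are from java.util.concurrent.atomic. The lambda assumes Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter backed by AtomicLong instead of a lock.
class AtomicCounter {
    private final AtomicLong count = new AtomicLong();

    void increment() {
        count.incrementAndGet(); // atomic read-modify-write, no synchronized needed
    }

    long get() {
        return count.get();
    }
}

class AtomicCounterDemo {
    static final int INCREMENTS_PER_THREAD = 100_000;

    // Runs two threads against one counter and returns the final value.
    static long runOnce() throws InterruptedException {
        AtomicCounter counter = new AtomicCounter();
        Runnable task = () -> {
            for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
                counter.increment();
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        first.join();
        second.join();
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Final count: " + runOnce());
    }
}
```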


Now there is no need for a synchronized method, as the operation is already atomic because of AtomicLong.

Thanks for reading. 

I will follow up with one more post that goes further into multithreading. Comments are welcome.

Tuesday, February 5, 2013

Java Multi-threading - Basics

A thread, in simple terms, is the basic unit of execution; it runs the code. It has its own stack and shares the heap space and resources with other threads in the process. A process could also be considered a unit of execution, but threads are lightweight because they share so much with the other threads in the process.

You may want to look at my blog post on processes: http://karthikpresumes.blogspot.in/2013/01/linux-processes-essentials-part1.html

Why Multi-threading

Any program may have to perform both CPU activity and I/O activity during its execution. These activities are handled by different hardware, the CPU and the I/O devices respectively. If the program runs in a single thread, then at certain times either the CPU or the I/O device sits completely idle. Multi-threading may help in this scenario: I/O in one thread won't necessarily hinder CPU activity in another thread. Also, nowadays computers are powered by multiple cores or processors, so several threads can run simultaneously to improve performance even when all of them are CPU intensive. Multi-processing is another way to utilize the computing and I/O resources effectively; there, every task is assigned to its own process.

Multi-threading has certain problems which multi-processing doesn't have, such as synchronization of threads, because processes run in their own address spaces and have their own sets of resources. But since threads are lightweight, it is better to use threads if tasks are short-lived.

Java threads  

Java has a nice abstraction for threads in the class Thread. So, creating a thread is as simple as instantiating a Thread object. A thread has to be given some code to run, right? Java has an abstraction for that too: the Runnable interface, which has only one method, "run". Just have a look at the following snippet for thread creation.
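
The original snippet is missing, so here is a minimal sketch of thread creation. The variable names target and thread follow the description in the text; the wrapper class ThreadCreation and the printed message are assumptions made for illustration.

```java
class ThreadCreation {
    // Builds a thread but does not start it.
    static Thread create() {
        // The code the thread will run goes into Runnable.run().
        Runnable target = new Runnable() {
            @Override
            public void run() {
                System.out.println("Running in " + Thread.currentThread().getName());
            }
        };

        // Hand the Runnable to the Thread as its target.
        Thread thread = new Thread(target);
        return thread;
    }
}
```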


We have implemented the "run" method of the Runnable interface and passed the "target" object to the Thread object, thread. With that, the "thread" object is all set. In order to start this thread, we have to call the start method on the thread object. After that, a new thread is spawned and the main thread continues on. At some point, we may need to wait for the created thread to complete its task, right? To do so, we call the thread object's join method from the parent thread; join makes the calling thread wait until the thread it is called on has terminated. Let us have a look at the full code for this.
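
The full listing is not reproduced in this copy, so the following is a hedged sketch of start and join working together. StartAndJoin and SumTask are hypothetical names, and summing the numbers 1 to 100 is just a stand-in task.

```java
class StartAndJoin {
    // Hypothetical task that keeps its result so the caller can read it after join().
    static class SumTask implements Runnable {
        private long result;

        @Override
        public void run() {
            for (int i = 1; i <= 100; i++) {
                result += i;
            }
        }

        long getResult() {
            return result;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SumTask target = new SumTask();
        Thread thread = new Thread(target);

        thread.start(); // spawns the new thread; main continues immediately
        thread.join();  // main waits here until the thread has terminated

        // Reading the result after join() is safe: the thread's writes are visible.
        System.out.println("Result: " + target.getResult());
    }
}
```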


I forgot to mention that there is one more way to create a thread: extend the Thread class and override its run method. But I would suggest the Runnable interface approach, which is cleaner and more readable.
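
For comparison, a minimal sketch of the subclassing approach might look like this. CountingThread and its thread name are hypothetical.

```java
// Hypothetical thread created by subclassing Thread and overriding run().
class CountingThread extends Thread {
    private int ticks;

    CountingThread() {
        super("counting-thread"); // sets the thread's name
    }

    @Override
    public void run() {
        for (int i = 0; i < 3; i++) {
            ticks++;
        }
    }

    // Safe to call after join() has returned.
    int getTicks() {
        return ticks;
    }
}
```

It is started and joined the same way: create a CountingThread, call start(), then join(). The task and the thread are now one class, which is part of why the Runnable approach tends to read more cleanly.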


Since multiple threads are running in the process, it is good to make sure the threads are always doing useful work. Think of a thread which keeps polling for some information in a tight while loop. Won't it hurt overall system performance? In those scenarios, the thread can give up its execution so that the CPU can go ahead with other threads. For that, the thread invokes the "sleep" method, which asks the scheduler to suspend the thread for some time and switch to other threads. The sleep method takes a time argument which specifies how long the thread should be suspended. This period should be treated as approximate: there is no guarantee that the thread will resume exactly when it elapses, since that depends on the system timer and the scheduler.

A snippet of code to illustrate this:
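
The original snippet is missing; the sketch below shows the general shape under stated assumptions. PollingWorker, its constructor parameters and the completed counter are hypothetical names, and the counter stands in for whatever the real loop would check.

```java
// Hypothetical worker that pauses between polls instead of spinning.
class PollingWorker implements Runnable {
    private final int polls;
    private final long pauseMillis;
    private int completed;

    PollingWorker(int polls, long pauseMillis) {
        this.polls = polls;
        this.pauseMillis = pauseMillis;
    }

    @Override
    public void run() {
        for (int i = 0; i < polls; i++) {
            completed++; // stand-in for checking for new information
            try {
                Thread.sleep(pauseMillis); // give up the CPU between polls
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop polling.
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    int getCompleted() {
        return completed;
    }
}
```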


In this example, you can see a try/catch around the sleep call. Yes, sleep can throw InterruptedException. In the same way, join can also be interrupted. You have to handle this gracefully in your application. In simple programs this exception can often be ignored, but a long-running thread should treat it as a request to stop what it is doing. Let us see an example with an interrupt sent from the main thread and its handling in the child thread.
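
Since the original example is not available, this is a hedged sketch of the interaction. InterruptDemo, Sleeper and the sleep durations are assumptions made for illustration.

```java
class InterruptDemo {
    // Hypothetical child task that sleeps until it is interrupted.
    static class Sleeper implements Runnable {
        private volatile boolean interrupted;

        @Override
        public void run() {
            try {
                Thread.sleep(60_000); // long sleep; expected to be cut short
            } catch (InterruptedException e) {
                // The main thread asked us to stop; record it and finish.
                interrupted = true;
            }
        }

        boolean wasInterrupted() {
            return interrupted;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Sleeper sleeper = new Sleeper();
        Thread child = new Thread(sleeper);
        child.start();

        Thread.sleep(100); // give the child time to reach its sleep
        child.interrupt(); // makes the child's sleep throw InterruptedException
        child.join();

        System.out.println("Child interrupted: " + sleeper.wasInterrupted());
    }
}
```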


You may want to have a look at Java synchronization and atomic variables, which are discussed in a different post:

http://karthikpresumes.blogspot.in/2013/02/java-threads-synchronization.html

Thanks for reading, friends.