If you would like to read my earlier post on the basics of Java threads, here it is: http://karthikpresumes.blogspot.in/2013/02/java-multi-threading-basics.html
In this post, let us discuss the need for synchronization and the techniques Java provides to achieve it.
Threads that belong to the same process share resources and address space with one another. Sharing an address space means that one thread can easily communicate with other threads through ordinary Java objects. This is not true when one process wants to communicate with another, as processes do not share an address space.
Let us assume a simple producer-consumer application. The producer runs in one thread and the consumer in another. The producer populates a queue that is shared with the consumer. This is very easy to implement with threads, as sharing the queue is as simple as holding a reference to it in both the consumer and the producer.
Producer Consumer With Threads
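A minimal sketch of what such a producer and consumer might look like. The class names, the choice of LinkedList, and the item counts are assumptions made for illustration. The queue is deliberately left unsynchronized, so this shows only how the reference is shared, not a safe design.

```java
import java.util.LinkedList;
import java.util.Queue;

// All names here are hypothetical; the shared queue is NOT thread-safe on purpose.
class ProducerConsumerDemo {
    public static void main(String[] args) throws InterruptedException {
        Queue<Integer> shared = new LinkedList<>(); // one queue, referenced by both sides
        Consumer consumer = new Consumer(shared, 1000);
        Thread p = new Thread(new Producer(shared, 1000));
        Thread c = new Thread(consumer);
        p.start();
        c.start();
        p.join();
        c.join();
        // Without synchronization, no particular count is guaranteed here.
        System.out.println("consumed " + consumer.getConsumed() + " of 1000");
    }
}

class Producer implements Runnable {
    private final Queue<Integer> queue;
    private final int items;

    Producer(Queue<Integer> queue, int items) {
        this.queue = queue;
        this.items = items;
    }

    @Override
    public void run() {
        for (int i = 0; i < items; i++) {
            queue.add(i); // unsynchronized write to the shared queue
        }
    }
}

class Consumer implements Runnable {
    private final Queue<Integer> queue;
    private final int attempts;
    private int consumed; // number of items this consumer actually received

    Consumer(Queue<Integer> queue, int attempts) {
        this.queue = queue;
        this.attempts = attempts;
    }

    @Override
    public void run() {
        for (int i = 0; i < attempts; i++) {
            if (queue.poll() != null) { // null means "nothing visible right now"
                consumed++;
            }
        }
    }

    int getConsumed() {
        return consumed;
    }
}
```

Both objects hold the same Queue reference, which is all the "communication" that threads need. The same sharing is what makes the code unsafe once both threads run at the same time.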
This code shows how easy it is to design with threads. But that ease comes with the cost of synchronization. Even though the code above does nothing special for multi-threading, synchronization is needed because the Queue object is used by more than one thread. The ill effects of a multi-threaded application without synchronization can be seen in the following code.
Simple counter without synchronization
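A minimal sketch of such a counter, assuming two threads that each increment it 100,000 times. The names UnsafeCounter and UnsafeCounterDemo and the iteration count are hypothetical.

```java
// Hypothetical names; the counter is intentionally not thread-safe.
class UnsafeCounterDemo {
    public static void main(String[] args) throws InterruptedException {
        UnsafeCounter counter = new UnsafeCounter();
        int perThread = 100_000;
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // The printed value can differ from run to run and from machine to machine.
        System.out.println("expected " + (2 * perThread) + ", got " + counter.get());
    }
}

class UnsafeCounter {
    private long count;

    void increment() {
        count++; // read, add one, write back: not atomic
    }

    long get() {
        return count;
    }
}
```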
Can you guess what the result could be? On my system, I got the following result.
We can explain this behavior with the concept of thread interleaving. The core of the counter is the statement "count++". This is not a single operation, as it may seem. The CPU has to execute three sequential steps to carry it out:
- Fetch the value of count and store it locally
- Increment the stored value
- Write it back to the variable count
During these steps, a thread can be preempted by the scheduler at any point, and another thread resumed. This is known as thread interleaving. Certain patterns of interleaving result in wrong behavior. Suppose one thread is preempted just after completing the second step, and another thread starts. The second thread does not see the incremented value, since it has not been written back yet, and so it increments the old value of count. When the first thread resumes, it writes back its own result, and one of the two increments is lost. This is how the counter ends up with a wrong result.
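The interleaving described above can be replayed deterministically in a single thread by writing out the three steps of each "thread" by hand. This is only an illustration of the schedule; the variable names are made up, and no real threads are involved.

```java
// Hypothetical single-threaded replay of the problematic schedule.
class LostUpdateReplay {
    static long replay() {
        long count = 0;          // stands in for the shared variable

        long localA = count;     // thread A, step 1: fetch (reads 0)
        localA = localA + 1;     // thread A, step 2: increment its local copy
        // thread A is preempted here, before writing back

        long localB = count;     // thread B, step 1: fetch (still reads 0)
        localB = localB + 1;     // thread B, step 2: increment its local copy
        count = localB;          // thread B, step 3: write back (count is 1)

        count = localA;          // thread A resumes, step 3: write back (count is 1 again)
        return count;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 1, although two increments ran
    }
}
```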
OK, how do we solve this? By making the non-thread-safe method synchronized, as below.
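A minimal sketch of a synchronized counter, again with hypothetical names and iteration counts. The getter is synchronized as well, so that reads go through the same lock as writes.

```java
// Hypothetical names; same two-thread workload as before, now with a lock.
class SynchronizedCounterDemo {
    static long runTwoThreads(int perThread) throws InterruptedException {
        SynchronizedCounter counter = new SynchronizedCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTwoThreads(100_000)); // prints 200000
    }
}

class SynchronizedCounter {
    private long count;

    // Only one thread at a time can be inside a synchronized method of this object.
    synchronized void increment() {
        count++;
    }

    synchronized long get() {
        return count;
    }
}
```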
The synchronized keyword ensures that only one thread at a time executes the method, so the method runs atomically with respect to other threads. If one thread has not yet finished executing the synchronized method, other threads that call it have to wait. This is achieved through Java's object-level monitor, or lock. When a thread calls a synchronized method, it acquires the monitor of the object, and the monitor is released only when the method returns, whether normally or by throwing an exception. Other threads calling this method, or any other synchronized method on the same object, have to wait until the lock is released. This essentially serializes the method calls.
If a method modifies the state of an object that is shared across threads, access to that state must be synchronized, and that includes the methods that read it. The trade-off of using synchronized methods is performance. If all methods are synchronized, much of the benefit of multi-threading is lost, as the threads effectively run one at a time whenever they touch that object. In fact, performance can decrease if synchronized is used heavily. It is important either to encapsulate the code that modifies state in separate methods, or to apply synchronization only to the relevant block of code inside a method using the synchronized statement.
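A sketch of the synchronized statement, assuming a hypothetical LengthTotals class. The per-call work that touches no shared state stays outside the lock, and only the update of the shared fields is guarded.

```java
// Hypothetical example: keep the critical section as small as possible.
class SynchronizedBlockDemo {
    static LengthTotals run(int perThread) throws InterruptedException {
        LengthTotals totals = new LengthTotals();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                totals.add(" abc ");
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return totals;
    }

    public static void main(String[] args) throws InterruptedException {
        LengthTotals totals = run(10_000);
        System.out.println(totals.getSamples() + " samples, total length " + totals.getTotalLength());
    }
}

class LengthTotals {
    private long totalLength;
    private long samples;

    void add(String text) {
        int length = text.trim().length(); // local work, no lock needed
        synchronized (this) {              // lock only around the shared state
            totalLength += length;
            samples++;
        }
    }

    long getTotalLength() {
        synchronized (this) {
            return totalLength;
        }
    }

    long getSamples() {
        synchronized (this) {
            return samples;
        }
    }
}
```

Locking on `this` uses the same object-level monitor that a synchronized method would use. A private lock object is a common alternative, but it is not required for this sketch.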
What if a static method uses the synchronized keyword? Then the class-level lock is used instead of the object-level lock. So, a synchronized method is the easiest way of handling thread interleaving issues.
Synchronization not only solves the interleaving issue, it also establishes a happens-before relationship: changes made to the state while holding a lock are visible to any thread that subsequently acquires the same lock. Under the Java memory model, without such a relationship it is not guaranteed that a write made in one thread will be visible to another thread.
Have you ever heard of volatile variables? They exist exactly to solve the issue of visibility. A write to a volatile variable is visible to every other thread that subsequently reads the same variable. This may cause a bit of performance degradation, as the compiler and CPU cannot keep the value cached in a register or freely reorder accesses around it. Note that volatile provides visibility only; it does not make a compound operation such as count++ atomic.
Memory barriers are the low-level mechanism used to solve visibility issues. Synchronization is one way of introducing them; volatile and atomic variables are others.
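A small sketch of volatile used purely for visibility, assuming a hypothetical StoppableWorker that spins until another thread clears a flag. Without volatile, the worker would not be guaranteed to ever see the change.

```java
// Hypothetical names; demonstrates visibility of a volatile write across threads.
class VolatileFlagDemo {
    public static void main(String[] args) throws InterruptedException {
        StoppableWorker worker = new StoppableWorker();
        Thread t = new Thread(worker);
        t.start();
        Thread.sleep(50); // let the worker spin briefly
        worker.stop();    // volatile write
        t.join();         // returns once the worker has observed the flag
        System.out.println("worker stopped");
    }
}

class StoppableWorker implements Runnable {
    // volatile: a write by one thread is visible to later reads in other threads
    private volatile boolean running = true;
    private long iterations; // touched only by the worker thread while it runs

    @Override
    public void run() {
        while (running) { // volatile read on every pass
            iterations++;
        }
    }

    void stop() {
        running = false;
    }

    long getIterations() {
        return iterations;
    }
}
```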
Some of the reasons for memory inconsistency or visibility issues:
- The compiler and the CPU may reorder execution steps as long as doing so does not affect the correctness of the program as seen by a single thread.
- A CPU cache need not always be in sync with main memory. On multi-processor machines, this may cause visibility issues.
One more important topic is atomic variables in Java. As discussed, the synchronized keyword makes sure that a method or block of code is executed atomically. Atomically means that the entire set of steps is finished by one thread before any other thread starts executing the method. This comes at a cost, as the thread has to acquire the lock, which under contention can involve kernel-level activity. There is a cheaper way to achieve atomic execution if the operation is simple, such as an increment, a decrement, or a swap with another value. CPUs provide instructions for such atomic operations, notably CAS (compare-and-swap). Java offers abstractions such as AtomicLong on top of them. Let us rewrite the Counter class with AtomicLong.
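A minimal sketch of the counter on top of java.util.concurrent.atomic.AtomicLong. The class and method names are hypothetical. The incrementWithCas method is added only to show what a compare-and-swap retry loop looks like; incrementAndGet alone is enough in practice.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical names; same two-thread workload, no locks.
class AtomicCounterDemo {
    static long runTwoThreads(int perThread) throws InterruptedException {
        AtomicCounter counter = new AtomicCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTwoThreads(100_000)); // prints 200000
    }
}

class AtomicCounter {
    private final AtomicLong count = new AtomicLong();

    void increment() {
        count.incrementAndGet(); // atomic read-modify-write
    }

    // The same increment written as an explicit CAS retry loop.
    void incrementWithCas() {
        long current;
        do {
            current = count.get();
        } while (!count.compareAndSet(current, current + 1)); // retry if another thread won
    }

    long get() {
        return count.get();
    }
}
```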
Now there is no need for a synchronized method, as the increment is already atomic thanks to AtomicLong.
Thanks for reading.
I will follow up with another post that goes further into multithreading. Comments are welcome.
nice article.
questions:
1) "Memory barriers are techniques to solve the visibility issues." I can't understand the term memory barrier here.
2) "As per Java thread model, it isn't guaranteed that write happened in one thread wouldn't be essentially visible to other thread." So does this mean synchronized doesn't solve the visibility issue, unlike atomic variables or the volatile keyword?