|
The Java Specialists' Newsletter
Issue 125 2006-04-17
Category:
Book Review
Java version: JDK 6.0 Book Review: Java Concurrency in Practiceby Dr. Heinz M. KabutzAbstract: We review Java Concurrency in Practice by Brian Goetz. Brian's book is the most readable on the topic of concurrency in Java, and deals with this difficult subject with a wonderful hands-on approach. It is interesting, useful, and relevant to the problems facing Java developers today.
Welcome to the 125th edition of The Java(tm) Specialists' Newsletter, sent to you from the
"Dark Continent" (a term for Africa, coined about 130 years ago).
A few weeks ago we suffered recurrent power failures, which made
the name "Dark Continent" rather apt.
Would you like to really understand Java concurrency? Join us for an
in-depth study of how threading works in Java. During the course,
you will learn how to write correct and fast multi-threaded Java code.
Please
click here if you would like to learn more.
In my course on the new
features in Java 5, we examine the "new" concurrency
constructs of Java. Most of these are based on classes that
have been freely available on Doug Lea's website
for at least six years, and were well described in his
excellent book Concurrent Programming in
Java . However, I am yet to meet someone, either on a
course or during my contracting / consulting, that has read
Doug Lea's book.
Java Concurrency in Practice
is split into four distinct sections:
- Fundamentals
- Structuring Concurrent Applications
- Liveness, Performance, and Testing
- Advanced Topics
Brian does an excellent job of explaining, and his examples are
more similar to the real world than you would find with other
books.
Something else I like about the book is that it mentions all
sorts of new features that are available in Java 6. Whilst I
cannot use that yet in production, it is good to know what is
coming.
At the end of the first section on Fundamentals, Brian goes
through the steps of building an efficient, scalable result
cache that you could use in a web server. The idea is to
remember the result from a previous calculation in order to
reduce latency and increase throughput, at the cost of a bit
more memory usage.
The problem with building this cache is that if we are not
careful, we could easily turn it into a scalability bottleneck.
Brian starts with a basic HashMap, then looks at ways that we
can make it more scalable.
We first define interface Computable, which performs some
calculation:
public interface Computable<A, V> {
V compute(A arg) throws InterruptedException;
}
The next class is ExpensiveFunction, which takes a long time to
compute the result:
import java.math.BigInteger;
public class ExpensiveFunction
implements Computable<String, BigInteger> {
public BigInteger compute(String arg) {
// after deep thought...
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
return new BigInteger(arg);
}
}
We now build a Computable wrapper that remembers the result from
the previous calculations and returns these transparently. This
process is called memoization.
import java.util.*;
public class Memoizer1<A, V> implements Computable<A, V> {
@GuardedBy("this")
private final Map<A, V> cache = new HashMap<A, V>();
private final Computable<A, V> c;
public Memoizer1(Computable<A, V> c) {
this.c = c;
}
public synchronized V compute(A arg) throws InterruptedException {
V result = cache.get(arg);
if (result == null) {
result = c.compute(arg);
cache.put(arg, result);
}
return result;
}
}
The @GuardedBy annotation is one of several described in the
book that help us to document our assumptions regarding thread
safety:
import java.lang.annotation.*;
/**
* The field or method to which this annotation is applied can only
* be accessed when holding a particular lock, which may be a
* built-in (synchronization) lock, or may be an explicit
* java.util.concurrent.Lock.
*/
@Target({ElementType.FIELD, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
public @interface GuardedBy {
String value();
}
Since HashMap is not threadsafe, we need to take the
conservative approach and lock on all access. If you have
several threads queued up to execute compute, then it might
actually take longer than without caching.
A slight improvement occurs when you change the HashMap to be
a ConcurrentHashMap instance:
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
public class Memoizer2<A, V> implements Computable<A, V> {
private final Map<A, V> cache = new ConcurrentHashMap<A, V>();
private final Computable<A, V> c;
public Memoizer2(Computable<A, V> c) { this.c = c; }
public V compute(A arg) throws InterruptedException {
V result = cache.get(arg);
if (result == null) {
result = c.compute(arg);
cache.put(arg, result);
}
return result;
}
}
The problem with Memoizer2 is that even if one thread starts an
expensive computation, other threads may still start the same
computation. Instead, we should have some way of representing
the notion that "thread X is currently computing f (27)", so
that if another thread arrives looking for f (27), it knows that
it should wait until Thread X has finished its work.
Yet another option is to use the FutureTask class
import java.util.Map;
import java.util.concurrent.*;
public class Memoizer3<A, V> implements Computable<A, V> {
private final Map<A, Future<V>> cache
= new ConcurrentHashMap<A, Future<V>>();
private final Computable<A, V> c;
public Memoizer3(Computable<A, V> c) { this.c = c; }
public V compute(final A arg) throws InterruptedException {
Future<V> f = cache.get(arg);
if (f == null) {
Callable<V> eval = new Callable<V>() {
public V call() throws InterruptedException {
return c.compute(arg);
}
};
FutureTask<V> ft = new FutureTask<V>(eval);
f = ft;
cache.put(arg, ft);
ft.run(); // call to c.compute happens here
}
try {
return f.get();
} catch (ExecutionException e) {
// Kabutz: this is my addition to the code...
try {
throw e.getCause();
} catch (RuntimeException ex) {
throw ex;
} catch (Error ex) {
throw ex;
} catch (Throwable t) {
throw new IllegalStateException("Not unchecked", t);
}
}
}
}
The Memoizer3 is almost perfect, but there still is a window of
vulnerability when two threads might compute the same value.
The window is smaller than in Memoizer2, but it still is there.
Memoizer3 is vulnerable to this problem because a compound
action (putif-absent) is performed on the backing map that
cannot be made atomic using locking.
The next approach in the book is
to use the atomic putIfAbsent() method from ConcurrentMap.
import java.util.concurrent.*;
public class Memoizer<A, V> implements Computable<A, V> {
private final ConcurrentMap<A, Future<V>> cache
= new ConcurrentHashMap<A, Future<V>>();
private final Computable<A, V> c;
public Memoizer(Computable<A, V> c) { this.c = c; }
public V compute(final A arg) throws InterruptedException {
while (true) {
Future<V> f = cache.get(arg);
if (f == null) {
Callable<V> eval = new Callable<V>() {
public V call() throws InterruptedException {
return c.compute(arg);
}
};
FutureTask<V> ft = new FutureTask<V>(eval);
f = cache.putIfAbsent(arg, ft);
if (f == null) {
f = ft;
ft.run();
}
}
try {
return f.get();
} catch (CancellationException e) {
cache.remove(arg, f);
} catch (ExecutionException e) {
// Kabutz: this is my addition to the code...
try {
throw e.getCause();
} catch (RuntimeException ex) {
throw ex;
} catch (Error ex) {
throw ex;
} catch (Throwable t) {
throw new IllegalStateException("Not unchecked", t);
}
}
}
}
}
When I read the Java samples, the problem and the solution both
appeared quite straightforward to me. Brian has picked
plausible real-world example that clearly demonstrate the
techniques that he presents.
This book should be available in your bookshops in the next few
weeks, so keep your eye open for this one! Alternatively, you
can also pre-order it on Amazon.com .
Kind regards
Heinz
Book Review Articles
Related Java Course
Discuss at The Java Specialist Club
|