Benji Smith in Why I Hate Frameworks:

So this week, we're introducing a general-purpose tool-building factory factory factory, so that all of your different tool factory factories can be produced by a single, unified factory. The factory factory factory will produce only the tool factory factories that you actually need, and each of those factory factories will produce a single factory based on your custom tool specifications. The final set of tools that emerge from this process will be the ideal tools for your particular project.

A great read that not only shows why I too hate most frameworks, but also over-engineered code in general. YAGNI FTW! (via Don Box)

During a Java debug session today, I was investigating an exception and noticed that its cause was set to itself, effectively leading to an infinite stack trace (or so I thought).

When I added a watch on exception.getCause(), it returned null though, so I initially assumed this was a bug in my IDE (I'm using a beta release of IDEA 5.0). However, a quick check of the source code of Throwable disproved that assumption. As it turns out, the confusion is caused by a hack in the JDK's implementation of Throwable itself.

Take a look at the excerpt from Throwable below:

  /**
   * The throwable that caused this throwable to get thrown, or null if this
   * throwable was not caused by another throwable, or if the causative
   * throwable is unknown.  If this field is equal to this throwable itself,
   * it indicates that the cause of this throwable has not yet been
   * initialized.
   *
   * @serial
   * @since 1.4
   */
  private Throwable cause = this;

  public Throwable getCause() {
      return (cause==this ? null : cause);
  }

  public synchronized Throwable initCause(Throwable cause) {
      if (this.cause != this)
          throw new IllegalStateException("Can't overwrite cause");
      if (cause == this)
          throw new IllegalArgumentException("Self-causation not permitted");
      this.cause = cause;
      return this;
  }
It turns out that self-causation is the default state, indicating that the cause has not yet been initialized. In other words, it's nothing but a hack to save the developer from either adding a causeInitialized boolean to Throwable or (if they really felt they needed to save those 4 bytes) doing something like
  private static Throwable NOT_INITIALIZED =
             new Throwable("CAUSE NOT INITIALIZED", null);
    
  private Throwable cause = NOT_INITIALIZED;
Now from a runtime perspective, this hack really doesn't matter, as cause is private and this initial value is never returned to the user. Unfortunately, from a debugging perspective, this implementation is utterly confusing.
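
To see the behaviour from the outside, here is a small sketch that only uses the public Throwable API. The class name CauseDemo and the helper initCauseRejected are mine, purely for illustration. It shows that a fresh exception reports a null cause even though the private field points at the exception itself, and that the cause can only be set once.

```java
class CauseDemo {
    /** Returns true if initCause is rejected because a cause was already set. */
    static boolean initCauseRejected(Throwable t, Throwable cause) {
        try {
            t.initCause(cause);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Exception fresh = new Exception("no cause yet");
        // Internally cause == this, but the public API hides the sentinel.
        System.out.println(fresh.getCause());

        // The first initCause call is accepted, the second one is rejected.
        System.out.println(initCauseRejected(fresh, new RuntimeException("root")));
        System.out.println(initCauseRejected(fresh, new RuntimeException("other")));

        // Passing null to the constructor counts as "initialized to unknown",
        // so a later initCause call is rejected as well.
        Exception explicitNull = new Exception("explicit null", null);
        System.out.println(initCauseRejected(explicitNull, new RuntimeException("late")));
    }
}
```

In a debugger, the first exception above is the one that shows the confusing self-referencing cause field.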

So why do I still say I love Java? Because, unlike .NET for instance, I have access to the source-code in moments like this. Language-wise, I actually prefer C# over Java. Library-wise, I also tend to favor the .NET implementations over their Java counterparts. But with .NET, if something works somewhat differently from what I would expect, I don't have the option of checking the actual implementation. To me, this is a BIG DEAL. Having the source available is not only helpful in situations like this, but it's also a tremendous aid when you truly want to grok an API.

And yes, I'm aware of the existence of Rotor and decompilers, but

  1. Rotor is an incomplete implementation, lacking all of WinForms for instance.
  2. Decompilation is not perfect and sometimes leads to awkward looking code.
  3. Checking code from either Rotor or decompiling a class tends to not be integrated in VS.NET, so I'm forced to leave the IDE for tasks like this.
  4. .NET decompilers typically show only one decompiled method at a time (the ones I've tried did this anyway) instead of the full class, leading to a more fragmented view of the code.
  5. Decompiled code lacks comments and sometimes even lacks the original variable-names.

There's been some dialog going on for over a year now about open-sourcing Java, but as far as I'm concerned, Java's already open-source enough. I just wish .NET would follow Java's example on this aspect as well. I don't need a license to allow me to change the source, but it sure would be nice if Microsoft would surprise me and include the source in .NET 2.0...

Last week, Sam Ruby posted a very interesting article on continuations for "people older than dirt" (a category which I, according to his definition, fall into). The topic became even more interesting when Don Box posted how you can use a very similar syntax in the next iteration of C#. Shortly thereafter, Cedric Beust posted that he wasn't convinced on the usefulness of this construct, and that a simple Java class without continuations pretty much does the same thing.

Despite having, like Cedric, mainly a Java background, I do think this construct will be useful, and would welcome it in the next release of Java (that seems to be the established pattern anyway). Consider the example of trying to write a filtered Iterator. Using C# 2.0, the code almost writes itself:

    class Program
    {
        public static IEnumerable<T> IteratorFilter<T>(IEnumerable<T> iterator,
                                                       Predicate<T> predicate)
        {
            foreach (T value in iterator)
            {
                if (predicate(value))
                {
                    yield return value;
                }
            }
        }

        static void Main()
        {
            int[] values = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
            foreach(int i in IteratorFilter(values, 
                    delegate(int v){ return v%2 == 0; }))
            {
                Console.WriteLine(i); // even numbers only
            }
        }
    }
The same can definitely not be said for Java. My initial version took quite a bit longer to create than the C# version, and ended up looking like this:
    public interface Predicate<T> {
        boolean evaluate(T value);
    }

    public class IteratorFilter<T> implements Iterator<T> {
        private Iterator<T> _iterator;
        private Predicate<T> _predicate;
        private T _currentValue;
        private boolean _hasNext = true;
    
        public IteratorFilter(Iterator<T> iterator, Predicate<T> predicate) {
            _iterator = iterator;
            _predicate = predicate;
            skipFiltered();
        }
    
        private void skipFiltered() {
            while (_iterator.hasNext()) {
                _currentValue = _iterator.next();
                if (_predicate.evaluate(_currentValue)) {
                    return;
                }
            }
            _hasNext = false;
        }
    
        public boolean hasNext() {
            return _hasNext;
        }
    
        public T next() {
            if (!_hasNext) {
                throw new NoSuchElementException();
            }
            T result = _currentValue;
            skipFiltered();
            return result;
        }
    
        public void remove() {
            throw new UnsupportedOperationException();
        }
    
        public static void main(String[] args) {
            Integer[] values = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
            Iterator<Integer> i = new IteratorFilter<Integer>(
                    Arrays.asList(values).iterator(), 
                    new Predicate<Integer>() {
                        public boolean evaluate(Integer value) {
                            return value % 2 == 0;
                        }
                    });
    
            while(i.hasNext()) {
                System.out.println(i.next());
            }
        }
    }
Ouch! Without continuations, there's no option to just iterate over the elements and filter out the unwanted values, like I did in the C# implementation. Instead, I'm forced to create a class that implements the Iterator interface, store _currentValue and _hasNext as fields, and create a skipFiltered() method to skip unwanted values.

I am aware that C#'s IEnumerable isn't exactly the same as a Java Iterator; in a way it's more like a subset of Java's Collection. However, creating a Java function that creates a new Collection (containing the filtered subset of the original) wouldn't be quite the same, as it would create a second Collection and have all filtering done up-front, instead of just an iterator whose elements are fetched from the original in an on-demand fashion.
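
For comparison, this is roughly what the up-front variant would look like. It is only a sketch: the class name EagerFilter is made up, and the nested Predicate interface simply mirrors the one from my listing above. Note how the result is a separate list that no longer follows changes to the original.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class EagerFilter {
    /** Same shape as the Predicate interface used by IteratorFilter. */
    interface Predicate<T> {
        boolean evaluate(T value);
    }

    /** Copies every matching element into a new list, all at once. */
    static <T> List<T> filter(Collection<T> source, Predicate<T> predicate) {
        List<T> result = new ArrayList<T>();
        for (T value : source) {
            if (predicate.evaluate(value)) {
                result.add(value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> values =
                new ArrayList<Integer>(Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9));
        List<Integer> evens = filter(values, new Predicate<Integer>() {
            public boolean evaluate(Integer value) {
                return value % 2 == 0;
            }
        });
        // Added after filtering, so it never shows up in the copy.
        values.add(10);
        System.out.println(evens);
    }
}
```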

Disclaimer: I am by no means an expert on the new constructs in C# 2.0 or Java 5.0 (far from it) - if I've overlooked something, please let me know in the comment section below.

For me, the most interesting sessions of the conference were regarding the new MemoryModel and Concurrency Utilities. Having recently done quite a bit of threading code, it's clear that JDK 5.0 will vastly improve on the current primitives synchronized, wait and notify (which will, of course, still be supported in JDK 5.0, though they probably won't be used as much).

The new memory model will ensure that things will work much more intuitively than they currently do. The double-checked locking idiom for instance is broken right now due to allowed statement re-ordering and caching of variables in registers (this, by the way, is not just a Java-issue, it's also broken in languages like C++ and C#). Now statement reordering (either by the compiler or by the processor) and caching of values in registers are extremely important features for performance so they cannot simply be turned off to make stuff like this work. Using new definitions for volatile and synchronized, stating when they should read/write to memory (as opposed to using cached values), things will work much better and more intuitively in JDK 5.0.
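
As far as I understand it, the revised volatile semantics are what make double-checked locking workable again in JDK 5.0. The sketch below shows the idiom with a volatile field; the class name LazyHolder is just an example, and without the volatile keyword this code would still be broken.

```java
class LazyHolder {
    // volatile is essential: it prevents other threads from seeing a
    // reference to a partially constructed instance.
    private static volatile LazyHolder instance;

    private final long createdAt;

    private LazyHolder() {
        createdAt = System.nanoTime();
    }

    static LazyHolder getInstance() {
        LazyHolder local = instance;          // first check, no lock taken
        if (local == null) {
            synchronized (LazyHolder.class) {
                local = instance;             // second check, under the lock
                if (local == null) {
                    local = new LazyHolder();
                    instance = local;
                }
            }
        }
        return local;
    }

    long createdAt() {
        return createdAt;
    }

    public static void main(String[] args) {
        // Both calls return the same lazily created object.
        System.out.println(getInstance() == getInstance());
    }
}
```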

The new concurrency utilities package (based on Doug Lea's threading libraries) contains a lot of new Classes and Interfaces to make writing multi-threaded code much easier. The idea behind this package is to do for threading what the collections framework did for collections. When they first mentioned this, I was rather skeptical to say the least. At the end of the session though, I believed they might just pull it off.

The concurrent utilities package will implement thread pooling through Executors. The Executor interface consists of a single "void execute(Runnable command)" method, so is effectively just "something that runs Runnables". This can be a single threaded worker, a regular thread pool, a scheduled thread pool or even a custom implementation. There's an Executors class which has factory methods to create many commonly used types of Executors. The actual type returned by these factory methods is typically an "ExecutorService": a sub-interface of Executor with added methods to, among other things, manage termination of the pool.
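
A minimal sketch of what using such a pool might look like, with a fixed-size pool from the Executors factory class. The class name ExecutorDemo, the helper runTasks and the 10 second timeout are arbitrary choices of mine.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ExecutorDemo {
    /** Runs taskCount trivial tasks on a fixed pool and returns how many completed. */
    static int runTasks(int taskCount, int threads) {
        final AtomicInteger completed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < taskCount; i++) {
            // execute() comes from Executor: "something that runs Runnables".
            pool.execute(new Runnable() {
                public void run() {
                    completed.incrementAndGet();
                }
            });
        }
        // shutdown() and awaitTermination() are the ExecutorService additions.
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(100, 4));
    }
}
```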

JDK 5 also introduces concurrent collections. These collections achieve thread safety while still allowing certain operations to occur concurrently. While using the synchronized keyword often requires placing a lock on the entire collection during iteration, concurrent collections will allow multiple operations to overlap each other. The new ConcurrentHashMap class for instance allows overlap of multiple reads, reads over writes and, with its default settings, even up to 16 overlapping writes (it will be interesting to check out the code to see how they pulled off the concurrent writes). Also, iteration over a ConcurrentHashMap will never throw a ConcurrentModificationException.
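
The iteration guarantee is easy to try out. The sketch below (class and method names are mine) removes entries while iterating over the key set, once with a plain HashMap and once with a ConcurrentHashMap. It runs in a single thread, so it says nothing about performance, only about the absence of ConcurrentModificationException.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentMapDemo {
    /** Removes every key while iterating; returns false if the iterator objects. */
    static boolean removeWhileIterating(Map<String, Integer> map) {
        try {
            for (String key : map.keySet()) {
                map.remove(key);
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }

    /** Fills the given map with three sample entries and returns it. */
    static Map<String, Integer> fill(Map<String, Integer> map) {
        map.put("one", 1);
        map.put("two", 2);
        map.put("three", 3);
        return map;
    }

    public static void main(String[] args) {
        // The fail-fast HashMap iterator notices the structural change.
        System.out.println(removeWhileIterating(fill(new HashMap<String, Integer>())));
        // The ConcurrentHashMap iterator tolerates it.
        System.out.println(removeWhileIterating(fill(new ConcurrentHashMap<String, Integer>())));
    }
}
```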

Another example of a concurrent collection is the CopyOnWriteArrayList - this class is optimized for cases where iteration is much more frequent than insertion or removal. It is for instance ideal for EventListeners. As the name suggests, every write causes the current collection data to be copied, thus guaranteeing that no ConcurrentModificationExceptions will ever be thrown during iteration - if a collection gets changed while another thread is iterating, that other thread will continue seeing the original (non-changed) data.
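
Here is a small single-threaded sketch of that snapshot behaviour; the names SnapshotIterationDemo and iterateWhileAdding are made up for the example. The loop adds to the list it is iterating over, yet only ever sees the elements that were present when the iteration started.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIterationDemo {
    /** Adds one element per visited element and returns what the loop saw. */
    static List<String> iterateWhileAdding(List<String> listeners) {
        List<String> seen = new ArrayList<String>();
        for (String listener : listeners) {
            // Each add copies the backing array; the running iterator
            // keeps reading the old copy.
            listeners.add(listener + "-late");
            seen.add(listener);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> listeners =
                new CopyOnWriteArrayList<String>(Arrays.asList("a", "b"));
        System.out.println(iterateWhileAdding(listeners));
        System.out.println(listeners.size());
    }
}
```

The same loop over an ArrayList would fail with a ConcurrentModificationException on the second element.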

There are also a number of lower level concurrency utils like Locks and Conditions that will be introduced in JDK 5.0. A Lock allows for functionality similar to that provided by the synchronized keyword, but without forcing you to lock and unlock within a single block of code. Locks also have added options like the ability to check a Lock without waiting infinitely, or timing out Locks.

The ReentrantLock class implements a reentrant mutual exclusion lock with the same semantics as built-in monitor locks (synchronized), but with extra features like the ability to interrupt a thread waiting to acquire a lock, specifying a timeout while waiting for a lock, polling for lock availability, and support for multiple wait-sets per lock via the "Condition" interface. ReentrantLock outperforms built-in monitor locks in most cases, but is slightly less convenient to use as it requires a finally block to release the lock, like

Lock lock = new ReentrantLock();
...
lock.lock();
try {
    ...
} finally {
    lock.unlock();
}
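
The timeout option could be used along these lines. This is a sketch with made-up names (TryLockDemo, runIfAvailable) and an arbitrary 50 ms timeout: the action only runs if the lock can be acquired in time, and the second helper shows the attempt giving up while another thread holds the lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class TryLockDemo {
    /** Runs action under the lock if it can be acquired within the timeout. */
    static boolean runIfAvailable(Lock lock, long timeoutMillis, Runnable action) {
        boolean acquired;
        try {
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (!acquired) {
            return false;                 // gave up instead of waiting forever
        }
        try {
            action.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Tries to take a lock that another thread is holding; expected to time out. */
    static boolean tryWhileHeldByOtherThread() {
        final Lock lock = new ReentrantLock();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(new Runnable() {
            public void run() {
                lock.lock();
                try {
                    held.countDown();
                    release.await();      // keep the lock until told otherwise
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    lock.unlock();
                }
            }
        });
        holder.start();
        try {
            held.await();
            return runIfAvailable(lock, 50, new Runnable() {
                public void run() { }
            });
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryWhileHeldByOtherThread());
    }
}
```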

A Condition is the abstraction of wait-notify. You can get a Condition object from an existing Lock object, and subsequently call await() and signal() on them. This allows you to for instance have multiple Condition objects against a single Lock, instead of having to use notifyAll to wake up all threads, only to put all but one of them back into a wait-state.
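
The classic illustration is a bounded buffer with two Conditions on one Lock, so producers and consumers wait in separate wait-sets. The version below is my own simplified sketch (an int-only buffer, plus a hypothetical transfer helper to exercise it), not code from the session.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class BoundedBuffer {
    private final Lock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();
    private final Condition notEmpty = lock.newCondition();
    private final int[] items;
    private int putIndex, takeIndex, count;

    BoundedBuffer(int capacity) {
        items = new int[capacity];
    }

    void put(int item) throws InterruptedException {
        lock.lock();
        try {
            while (count == items.length) {
                notFull.await();          // only producers wait here
            }
            items[putIndex] = item;
            putIndex = (putIndex + 1) % items.length;
            count++;
            notEmpty.signal();            // wakes a single consumer
        } finally {
            lock.unlock();
        }
    }

    int take() throws InterruptedException {
        lock.lock();
        try {
            while (count == 0) {
                notEmpty.await();         // only consumers wait here
            }
            int item = items[takeIndex];
            takeIndex = (takeIndex + 1) % items.length;
            count--;
            notFull.signal();             // wakes a single producer
            return item;
        } finally {
            lock.unlock();
        }
    }

    /** Pushes 0..n-1 through a buffer of capacity 2 and returns what arrived. */
    static List<Integer> transfer(final int n) {
        final BoundedBuffer buffer = new BoundedBuffer(2);
        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 0; i < n; i++) {
                        buffer.put(i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        producer.start();
        List<Integer> received = new ArrayList<Integer>();
        try {
            for (int i = 0; i < n; i++) {
                received.add(buffer.take());
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(transfer(5));
    }
}
```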

There are many other useful classes and interfaces like ReadWriteLock, ReentrantReadWriteLock, Semaphore, CountDownLatch, etc.

And last but not least, JDK 5.0 will introduce Atomic Variables: classes that support atomic operations like "compare-and-set" and "get, set and arithmetic". At runtime, the JVM will use the best available implementation of this functionality, depending on the platform it runs on. So this may internally be implemented using a lock, or may use a native construct (if supported by the processor) to do these kinds of atomic operations. An example of an atomic variable class is "AtomicInteger", which will be useful for things like counters and sequence numbers.
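
A quick sketch of both styles with AtomicInteger: a shared counter incremented from several threads without any locking, and a hand-written compare-and-set retry loop. The class name and both helper methods are invented for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {
    /** Increments one shared counter from several threads and returns the total. */
    static int countConcurrently(int threads, final int incrementsPerThread) {
        final AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < incrementsPerThread; j++) {
                        counter.incrementAndGet();   // atomic, no synchronized needed
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }

    /** Doubles the value atomically using a compare-and-set retry loop. */
    static int doubleValue(AtomicInteger value) {
        while (true) {
            int current = value.get();
            int next = current * 2;
            // Only succeeds if nobody changed the value in the meantime.
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 1000));
        System.out.println(doubleValue(new AtomicInteger(21)));
    }
}
```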

Sun's studio creator has interesting collaboration-options built in: it comes with an instant messenger pane that has code-options, enabling things like copy'n paste of code (which will be sent with code-highlighting), doing code-completion in the IM window, and even sending entire files, which can subsequently be edited synchronously between both parties - very cool stuff.

It was announced that "There is no application in the gaming space today that cannot be written in Java". I find this somewhat hard to believe, to say the least. The first example they gave to show the power of Java in gaming was a third person shooter that was partially written in Java (that is, none of the actual rendering used Java), which is not quite what I'd call overwhelming evidence. The second game they showed was 100% Java though and can be downloaded from Java.net. It looked pretty good, but only showed a 3-D view walking through the environment. There were no other characters, let alone any actual action. I don't think managed code is quite at a point yet where it can be used to create games like HalfLife 2, but I'm willing to be proven wrong...

They also showed the Phantom gaming service hardware. This system, which supports Java (which I assume means it can do Java, but can do other stuff as well), was created by the co-creator of Microsoft's XBox. It is not a regular console, but instead can download games on demand from the network. The hardware will be free with a two-year subscription.

JDK 5 will have less dependence on command-line parameters and instead auto-tune itself for optimal performance. It will use your machine's configuration to come up with an initial config, and dynamically change settings at runtime. This is great news, as tweaking stuff like NewSize, SurvivorRatio, etc. can be a major pain. This feature will NOT be turned on by default on Windows though (it will be on other platforms) - use the concurrentGC option to enable it on Windows.

Lucene is a Java search-engine I'll definitely have to check more into. I think there's a .NET port for this project as well (though I don't know how up-to-date it is with the master Java project) so I may even be able to use it in SharpReader.

The Groovy scripting language is, well, groovy. Using a very powerful and highly condensed syntax, you can write things in Groovy that would take ten times the number of lines in Java. Closures are especially cool. Performance is currently about 20-90% of that of Java: closer to 20% when using dynamic typing (which I guess uses reflection for every method call), and closer to 90% with static typing (which takes away from the ease of use of Groovy). This makes me somewhat sceptical of how useful this language really is right now for production systems. I'm sure they'll be able to beef up performance at some point, but until they do, I think I'd rather spend some extra time and learn Python instead if I want to use a scripting language (although I have to admit that the Java support in Groovy is pretty nice...)

Prior to going to JavaOne, the main subjects I was looking forward to were generics, EJB3 and the AOP panel. The actual experience was different though: sessions on generics did not show me much I had not already read online, and the main thing to take away from the AOP panel was that it can be useful, but don't overdo it - duh.

The EJB persistence session was interesting though: it looks like the EJB 3.0 spec will finally make Entity beans useful again. No more interfaces to implement, POJOs instead of abstract classes (so you can actually test against them outside of the container) and no more home + remote + bean + 2 deployment descriptors per EJB.

Most importantly though, EJBQL becomes vastly more useful, allowing you to use it for almost anything you can do with SQL. You can, for instance, use bulk operations like

DELETE FROM Customer c 
WHERE c.status = 'inactive'

UPDATE Customer c 
SET c.status = 'outstanding' 
WHERE c.balance < 10000
Also, there will be support for sub-queries, as in
SELECT goodCustomer 
FROM Customer goodCustomer
WHERE goodCustomer.balance < (SELECT AVG(c.balance) FROM Customer c)

SELECT DISTINCT emp 
FROM Employee emp
WHERE EXISTS(SELECT spouseEmp 
             FROM Employee spouseEmp 
             WHERE spouseEmp = emp.spouse)
and joins will also be supported:
SELECT i.category 
FROM Item i JOIN i.bids b WHERE b.amount > :amount

SELECT i 
FROM Item i LEFT JOIN FETCH i.bids WHERE i.category = 'paintings'
The FETCH above means that the bids objects will be prefetched from the database when the items are retrieved, instead of the current situation where there will be one extra query for every bid object.

EJBQL will even support GROUP BY and HAVING. My personal favorite though is projection, allowing you to retrieve select fields from select objects, as in

SELECT c.id, c.status 
FROM Customer c JOIN c.orders o WHERE o.count > 100

SELECT new CustomerDetails(c.id, c.status, o.count) 
FROM Customer c JOIN c.orders o WHERE o.count > :ordercount
The first query will return an Object-array, the second will return actual CustomerDetails objects.
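
For the second form to work, CustomerDetails presumably needs a constructor matching the selected expressions. The session did not show that class, so the following is only my guess at what such a plain value class might look like; the field types (long, String, int) are assumptions on my part.

```java
// Hypothetical value class for the "SELECT new CustomerDetails(...)" query.
// Field types are assumed; the real ones depend on the Customer/Order mapping.
class CustomerDetails {
    private final long id;
    private final String status;
    private final int orderCount;

    // Argument order mirrors the query: c.id, c.status, o.count
    public CustomerDetails(long id, String status, int orderCount) {
        this.id = id;
        this.status = status;
        this.orderCount = orderCount;
    }

    public long getId() {
        return id;
    }

    public String getStatus() {
        return status;
    }

    public int getOrderCount() {
        return orderCount;
    }

    public static void main(String[] args) {
        // Built by hand here; in the container the query would create these.
        CustomerDetails details = new CustomerDetails(42L, "outstanding", 150);
        System.out.println(details.getId() + " " + details.getStatus()
                + " " + details.getOrderCount());
    }
}
```

Being a plain class with no container dependencies, it can be created and tested outside of the container just like any other POJO.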

As interesting as all this was though, it still wasn't my favorite session, on which I'll blog more later this weekend...

Last night's blogger meetup was a blast. It was great to actually meet the people behind the blogs. I talked to Tim Bray, Simon Phipps, MaryMary, Cedric, Cameron, Mike, Charles, Matt, Russel Beattie (which is pronounced Bee-Ah-Tee, not Bee-tee) and many others. Unfortunately I missed out on Hani (who I would've loved to introduce to Russel :-) and EricGu (who may have been chased out of there before I showed up on account of being from the so-called "dark" side). Even a new kid on the blog, Jonathan Schwartz, was there, though I did not get a chance to talk to him. Interesting to see someone like him start a blog; I hope he'll actually be able to put some personal opinions in there and not just use it as a marketing tool. Who knows: maybe Joshua Bloch will start a blog soon too (the Bloch Blog?); now THAT would be one to subscribe to.

Copyright © 2003, 2004 Luke Hutteman