In the early 1990s, I was a software engineer at Sun. We had an exceptionally bright team that was responsible for tuning the performance of our applications. Like most software engineers, I thought of performance tuning as seeking more optimal algorithms: find the most elegant way to search a string, use data structures that reduce search spaces. I was a bit taken aback to discover that the performance team worked exclusively on working set. In fact, they'd burn processor cycles to save working set. Why? Well, in those days our processors were running at 25 MHz and our disks had an average access time of 15 ms. Do the math (25 MHz times 15 ms is 375,000 clock cycles), and you realize that you could execute more than 350K instructions in the time required for one disk page fault.
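For anyone who wants to check that figure, here is a minimal sketch of the arithmetic. The class and method names are made up for illustration, and it assumes roughly one instruction per clock cycle, which is a simplification rather than a measured number for those machines.

```java
// Back-of-the-envelope check of the page fault figure quoted above.
// Assumption: roughly one instruction completes per clock cycle.
class PageFaultCost {

    /** Clock cycles that elapse while waiting for one disk access. */
    static long cyclesPerAccess(long clockMHz, long accessMicros) {
        // 1 MHz is one cycle per microsecond, so the product is a cycle count.
        return clockMHz * accessMicros;
    }

    public static void main(String[] args) {
        // 25 MHz processor, 15 ms (15,000 microseconds) average disk access time.
        long cycles = cyclesPerAccess(25, 15_000);
        System.out.println(cycles + " cycles per page fault");
    }
}
```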
And guess what? Nothing has changed. Well, that's not exactly true. Memory sizes have grown significantly, so high-performance services rarely incur a disk page fault. In fact, most applications are tuned to prevent it. But we've also shifted languages. C++ provides very literal control over memory while Java purports to hide it from developers. Unfortunately, what Java really does is allow developers to create memory problems that are ticking time bombs, impacting both the performance and the stability of their applications.
This is an issue of discipline though. C++ engineers become good at optimizing memory usage because it's impossible to develop a robust C++ application without thinking about memory constantly. Who creates the object? Who owns it? What is its life cycle? Who cleans it up? These subjects are always at the forefront of C++ design exercises. When Java arrived, everyone dropped these considerations because the language supposedly did magic and made them irrelevant.
Well, we're not that lucky. Leaks still exist. Even more insidious is the garbage churn that exists in virtually every Java application I review. Interfaces are great, but I often find situations where an object is transformed into a string only to be parsed back into a similar object by the caller. The new auto-boxing feature of Java 1.5 is great from a syntactic point of view, but now Java creates wrapper objects blindly, generating even more extraneous garbage. I rarely see intermediate results cached in code, so multiple calls to a method regenerate the same intermediate result over and over. There are many other examples I could cite, but the end result is a significant amount of extraneous garbage, which hurts scalability.
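To make the auto-boxing point concrete, here is a small sketch. The class and method names are hypothetical. Both methods return the same sum, but the first declares its accumulator as a `Long`, so every `+=` unboxes the value, adds, and boxes the result again. `Long.valueOf` only caches small values (-128 to 127), so past that range each iteration may allocate a fresh wrapper unless the JIT happens to optimize it away.

```java
// Illustrative only: the same loop with a boxed and a primitive accumulator.
class BoxingChurn {

    /** Sums 0..n-1 using a boxed accumulator, creating wrapper garbage. */
    static long sumBoxed(int n) {
        Long total = 0L;              // wrapper object, not a primitive
        for (long i = 0; i < n; i++) {
            total += i;               // unbox, add, box the result again
        }
        return total;                 // auto-unboxed on return
    }

    /** Sums 0..n-1 using a primitive accumulator, creating no objects. */
    static long sumPrimitive(int n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000) == sumPrimitive(1000));
    }
}
```

The only difference in the source is one capital letter, which is exactly why this kind of garbage is easy to create without noticing.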
How does this get addressed? Simply by admitting that Java doesn't make memory irrelevant. In fact, it makes it very relevant, because garbage collection is significantly more expensive than manually managing your memory à la C++. Talk about how interfaces will be used, to ensure proper return types. Think about the garbage you're creating as you develop. Make sure you understand what objects Java is creating behind the scenes for you. Design for efficiency.
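As a rough sketch of what "proper return types" can mean in practice, compare an interface that hands back a string the caller must re-parse with one that hands back the object itself. Every type and method name here (`Price`, `StringCatalog`, `TypedCatalog`, `ReturnTypes`) is invented for illustration and does not come from any real codebase.

```java
// Hypothetical types, for illustration only.
final class Price {
    final long cents;
    Price(long cents) { this.cents = cents; }
}

/** Returns a string, so every caller has to parse it back into a number. */
interface StringCatalog {
    String priceOf(String sku);
}

/** Returns the typed object, so no intermediate string is needed. */
interface TypedCatalog {
    Price priceOf(String sku);
}

class ReturnTypes {

    /** Each lookup involves a temporary String plus a parse step. */
    static long totalViaStrings(StringCatalog catalog, String... skus) {
        long total = 0;
        for (String sku : skus) {
            total += Long.parseLong(catalog.priceOf(sku));
        }
        return total;
    }

    /** Each lookup reads the value straight from the returned object. */
    static long totalTyped(TypedCatalog catalog, String... skus) {
        long total = 0;
        for (String sku : skus) {
            total += catalog.priceOf(sku).cents;
        }
        return total;
    }
}
```

Both methods compute the same total. The typed version simply skips the string that exists only to be thrown away, which is the kind of question worth raising when the interface is first discussed.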
So now that Java is open source, how about a reference-counting, just-in-time finalizing optional memory management implementation, configurable on a per-class basis?
Posted by: Vadim Geshel | Monday, November 20, 2006 at 02:40 PM
On another blog someone was talking about a presentation you made concerning "green software". I am interested in hearing your ideas on this. Thanks, Tim Van Tongeren.
Posted by: Tim Van Tongeren | Tuesday, September 11, 2007 at 01:06 PM