Thursday, November 30, 2006

Omniscient Inference, The Promise of SOA, REST, and ROA

I've been doing a lot of reading lately on SOA and REST. I've even spent a little time looking at ROA. There is a common theme that permeates all of these approaches to architecture. It's something I can best sum up as omniscient inference: the idea that software will be able to infer the usefulness of a service or resource through some omniscient capability. What do I mean? Well, let me tackle each architectural pattern individually.

SOA proponents argue that you can't have a true SOA without UDDI. The promise of UDDI is that software that needs a service can make SOAP calls and discover what services are available. Once discovering these services, your software is now able to infer the mechanisms by which the service can be invoked. It's quite impressive really. Your application will be able to infer from the WSDL the entities that are required to interact with the service, understand the boundary conditions of the service, and therefore be able to immediately leverage the service to extend its functionality.

Okay, this isn't completely accurate.  The intent of UDDI is to standardize the mechanism for organizations to discover what services are available. The problem is that it really tries to serve two masters. Software obviously can't leverage a service a priori. Developers need to design their applications based on the semantics of the service. The problem is that WSDL is a mediocre way to specify the full semantics of an interface for developers. It's difficult to read and even more difficult to document. Proponents would argue that developers don't use WSDL, they use code generators to translate into native languages familiar to the developer. But a Java method doesn't define semantics. Ultimately, the developer needs associated documentation, which challenges the need for UDDI as a service library.

For deployment and routing discovery, UDDI is useful. It's also incredibly complex, as most of the information in UDDI is not required simply to manage the rendezvous between a client and a service. In fact, most coordination of services does not require the WSDL but rather system qualities about the service provider. The result is that UDDI defines too few semantics for developers and too many for client/service binding.

REST tries to go in a different direction than SOA. The goal is simplicity and it does achieve a simpler interaction style than SOAP for simple tasks. This is a good thing. The problem with SOAP (and CORBA before it) is that even simple things require a significant amount of overhead. You have to design WSDL, after reading several hundred pages of specifications. You have to translate the WSDL with compilers, and install services frameworks. The performance of your application is hampered by the SOAP marshaling and the framework layers that exist to protect you from SOAP's complexities. This is reasonable for complex problems, but when all I want to retrieve is a list of news items, a simple object in itself, then this is unnecessarily complex. The REST camp is justified in crying foul.

But REST also proposes magic.  Micro-formats will save developers from complex entities.  They will standardize the way information is exchanged and allow applications to just work across the Internet.  In fact, micro-formats are simple enough that applications may be able to extract information from formats they have never seen before, or so the claim. And now we're back to this style of programming that allows our applications to possess omniscient inference.

Micro-formats face two major challenges. First, standardizing across organizations will prove to be challenging, especially where the content represents business entities that are offered by competing companies. What incentive exists for company A to cooperate with company B on a standard definition of their common entity? Businesses thrive in a competitive market through differentiation. It's conceivable that some least common denominator format can be agreed upon, but each will still want to extend it to attract customers to their business. Even relatively common data will face the temptation for extensions that give one company an edge in the market.

More problematic, though, is that micro-formats are appropriate only for simple entities. As entities become more complex, the format will no longer be micro. For example, Atom and MIME are both formats for representing a message. Atom is perfectly sufficient for a broad range of messages but it is not complete. That's okay because the goal is to make simple messages simple and it does that admirably. But as the message semantics expand, you have to resort to MIME to adequately model them. Messages are not the only entities that will either outgrow or potentially never fit within a micro-format.

ROA is an intriguing proposition. Applications are freed from worrying about services at all. A resource, which is effectively an entity, exposes the state transitions it is willing to accept. Applications don't even care what the resource is; they simply decide whether they are interested in the state transitions available. The level of omniscient inference with this approach is difficult to even explain. So, if the resource is a tennis court and the state transition is reserve (is that a state change or an operation?), an application can decide, without caring about the resource per se, that it wants to reserve it. Of course, if this is a seat at a concert you may be committing to a financial transaction. At the very least, the state may change back without the application's knowledge (except it's omniscient, so that isn't possible).

The reader may take away from this article that I'm anti-SOA, anti-REST, and anti-ROA. That would be unfortunate as it doesn't characterize my position at all. SOA is necessary for proper decoupling of large architectures. REST has tackled the problem of making simple services simple. ROA is truly intriguing, although I'd like to see more attention given to the concept. No, my objection is to bold claims that can't realistically be achieved. Such claims aren't necessary, and they detract from the real value that these approaches bring to the world of architecture. There are meritorious aspects of each approach, so let's focus on how best to leverage those in appropriate ways.

Friday, November 24, 2006

You Scaled Your What?

"We're embarking on a project to provide us a 4X improvement in scalability," said the enthusiastic architect.

"Really, in what way?" I replied, knowing that the trap had already been laid.

"What do you mean? It's obvious, we will have 4X the scalability of our current architecture when we're done!" my compadre stated with a confused brow.

"Is that transactional headroom, storage improvements, reduction in operations staff, or time to market for new features?" I offered, struggling to hold back a wry smile.

"Uh, well, I guess it's transactional headroom, or at least that is what I call scalability," he offered, realizing this was probably not going where he wanted.

"Well, can you achieve your 4X scalability if you experience a 4X increase in data volume?" I asked, realizing that I now felt sorry for him.

Yes, this conversation happens far too often. Scaling systems actually requires managing the scaling along all of the vectors. Ignore one and that will be the one that becomes your next bottleneck. But what are the vectors? Where is our friend, full of enthusiasm, misguided? Let's look at the minimal set of scalability vectors that should be considered in every architecture.

Transactional

This is the most obvious scalability vector. Everyone understands the concept of TPS and it is fairly straightforward to manage. However, there are some interesting challenges in defining what improving scalability means. Is the improvement in TPS based on higher TPS per system or is it achieved by supporting more systems? Either is possible but each influences other vectors (that we'll get to in a bit). How can that be? Well, let's look at these two options a little deeper.

If the approach is to improve TPS per system, that implies improving software efficiency. This would be accomplished by performing in-depth analysis of the bottlenecks and improving the implementations. This will require development and test resources that could be applied to business features. During the scalability work, feature time-to-market (TTM) will suffer. Beyond that, some performance improvements will require algorithms and implementations that may be more difficult to understand and manage. Interfaces may need to be compromised to achieve the optimizations. This will lead to potential negative impacts on developer productivity.

If scalability will be achieved through horizontal scaling, this implies more systems that need to be deployed and managed. The systems will require operational support. The operational model will need to have appropriate management tools or the incremental cost of the additional systems will quickly strain the organization's ability to manage the site.

The point is that even the obvious challenge of scaling transactions affects other aspects of the system. Let's move on to other scalability vectors.

Data

Data scalability, like transactional scalability, is something that most organizations feel they adequately understand. All too often though all data is treated the same which leads to less than optimal data scaling. Most organizations have different classifications of data and determining the most appropriate storage and data management approaches can often lead to dramatically improved cost efficiency.

The data that is critical to the mission of the business must be protected with the most robust technology. But non-critical data, for example derived data, can be maintained by less costly technology. Creating tiers of data categories is an important aspect of providing cost effective data scaling.

Operational

Operational scalability addresses the ability to manage the software once it's developed. Operationally the software has to be provisioned, monitored, and controlled. These aspects must be included in any complete system architecture. But more importantly, the incremental overhead of adding new software and new features must also be considered as a scaling vector. Minimizing the incremental operational cost for each new increment in transaction and feature growth should be a part of every scalability project.

Deployability

Deployability refers to the ability to deploy the system in a variety of different locations and conditions. Most large scale systems will require geographic diversity for a couple of reasons. First, geographic diversity protects the business against localized disasters. As this protection is required, most organizations will want to try to leverage the diversity to provide better responsiveness to their customers. As long as you have data centers on both coasts, you might as well use those resources to reduce latency.

Software architectures should take the challenges of spanning geographies into consideration. The best architected solutions will scale to global deployments. Unfortunately, this aspect of architectural scaling is often ignored and when the need arises to geographically distribute the applications, the latency between the locations becomes an issue, often restricting the geographic span of the system.

Another deployment resource that is becoming critical to most organizations is power and cooling. Architects have not considered power consumption as one of the primary scalability vectors until recently. But if you are not looking at power now, you will be shortly.

Productivity

Productivity refers to developer productivity. Arguably this is one of the more difficult metrics to measure accurately, but there are subjective ways to determine how an organization is doing on this front. There have been sufficient treatises on the topic of developer productivity and how to achieve it, and I won't attempt to summarize them here. What I will assert, though, is that ignoring the potential impact on developer productivity while attempting to improve the scalability of any of the other vectors discussed here will almost surely result in a decrease in that productivity. It should be considered whenever architectural changes are being made in the name of scalability.

Feature TTM

Feature time-to-market as a scalability vector? What kind of heresy is this? An architect grousing about how quickly those annoying features reach our customers! That's not a scalability problem! Okay, sit down, keep reading, I'll explain myself.

Business features pay the bills. The primary job of an architect is to provide the business with the flexibility it needs to respond to competitive situations. The good news is that if you do a good job on the other scalability vectors, this one will largely take care of itself. But like all vectors, you can't ignore it.

The challenge here is ensuring the architecture has the ability to move into new business models cost effectively. These may be incremental changes to the current model or completely new models. The best way to do this is to pay attention to the overall architecture and not let any of the other scalability vectors become too badly ignored.

What Does All This Mean?

The point is that you have to consider these scalability vectors wholly when scaling an architecture. That doesn't mean that you will be able to scale in all directions simultaneously. Quite the contrary, you always have to give up ground on one axis to gain ground on another. But the axis you ignore is the one that suffers the most, often to the long term detriment of the architecture.

I like to use spider charts to illustrate architectural scalability. They allow you to easily visualize which aspects of your architecture are scaling and which are suffering. They are also a useful tool for setting the ideal scaling model and comparing how your current model is tracking against that ideal. The red line on the accompanying spider graph represents the ideal scaling model for a hypothetical organization. Every organization will have its own ideal scaling graph. The blue line represents their current investment in scaling the architecture. It becomes immediately obvious that transactional scalability has taken priority over other scaling opportunities.
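The same comparison can be sketched in a few lines of code. This is only an illustration: the vector names follow the headings above, but the 0-10 scores are invented and are not read off the chart.

```python
# Hypothetical scores (0-10) for each scalability vector. The numbers are
# invented for illustration and do not come from any real organization.
IDEAL = {
    "transactional": 8, "data": 7, "operational": 7,
    "deployability": 6, "productivity": 7, "feature_ttm": 8,
}
CURRENT = {
    "transactional": 9, "data": 4, "operational": 3,
    "deployability": 2, "productivity": 5, "feature_ttm": 4,
}


def scaling_gaps(ideal, current):
    """Return (vector, shortfall) pairs, largest shortfall first."""
    gaps = {name: ideal[name] - current.get(name, 0) for name in ideal}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


for vector, gap in scaling_gaps(IDEAL, CURRENT):
    status = "behind ideal" if gap > 0 else "at or ahead of ideal"
    print(f"{vector:15s} gap={gap:+d}  {status}")
```

With these made-up numbers the transactional vector is the only one ahead of its ideal, which is the lopsided shape a spider chart makes obvious at a glance.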

Have some scaling vectors you think I've missed? Post a comment. I enjoy discussing the various aspects of scaling architectures.

Scaling eBay

Want to know more about how we scale eBay? Randy Shoup and I will be presenting on that topic at SDForum's SAM SIG on Wednesday, November 29th. If you are in the San Francisco Bay Area, and would like to see how eBay approaches scalability, I encourage you to come to the talk.

Tuesday, November 21, 2006

The REST Dialogues, Part 2 with a Real eBay Architect

In Part 2, Duncan continues the interview discussing the REST advantages for setting data. Once again, I will replace my imaginary co-worker and respond to the interview questions. And yes, the standard disclaimers are still in effect. Nothing I state should be construed as product plans and these opinions are mine alone.

Duncan Cragg: Now, let's look at those calls in eBay's SOAP API that are expected to change data. There is often a corresponding 'Get' function.

Examples are GetTaxTable, SetTaxTable and GetMyMessages, ReviseMyMessages, ReviseMyMessagesFolders, DeleteMyMessages, etc.

Here, there is an implicit model of data (or lists of data) that you can look at, modify, delete, etc.

Dan Pritchett: Okay

DC: Yes, but we're talking about what the API is telling us, here. It says things like GetTaxTable and SetTaxTable!

We'll come on to business processes later, but just look at the simple stuff first.

DP: Sure, let's see how simple this stuff really is.

DC: Again, scalability and interoperability.

In the same way as with GET, you can partition your POST handling across the URI space. Have your Tax Tables and Message folders handled by many machines, split by their URIs. This is an example of the inherent parallelisability of the Web's architecture.

DP: Absolutely, and we do. Horizontal scaling is an inherent aspect of our architecture. We partition along many vectors, not just URI or function. We also partition based on subsets of data and geographic proximity.

DC: Sure, but you had to 'hard-code' this partition: to use specific heuristics and code to achieve it: URI-based partitioning is much more generic and flexible, and can be much finer-grained. You seldom get single-point of failure or bottleneck issues with good REST architectures.

DP: Most of our partitioning is data driven and not hard-coded. Ultimately, though, you have to map an operation onto a pool that has the necessary algorithms and access to the data. eBay has over 1,500 dynamic URIs implemented by a large body of code. The scale of the problem is larger than just deploying all operations to all machines, so our partitioning will always involve a certain degree of fixed mapping. We have considerable flexibility in managing these mappings but they are a necessity of our scale.

And architectures with single points of failure aren't allowed at eBay. It's one of the fundamental principles that every architect strictly abides by.
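The two positions are not mutually exclusive, and a small sketch may help show how they combine: a data-driven table maps a URI prefix onto a pool, then a stable hash of the path picks a machine inside that pool. The prefixes, host names, and the `route` function are hypothetical and are not a description of eBay's partitioning.

```python
import hashlib
from urllib.parse import urlparse

# Hypothetical routing table: path prefix -> pool of hosts able to serve it.
# In a data-driven design this would be loaded from configuration, not code.
POOLS = {
    "/tax-tables": ["tax-1", "tax-2"],
    "/messages": ["msg-1", "msg-2", "msg-3"],
}


def route(uri, pools=POOLS):
    """Pick a host: fixed mapping to a pool, then a stable hash within it."""
    path = urlparse(uri).path
    for prefix, hosts in pools.items():
        if path == prefix or path.startswith(prefix + "/"):
            digest = hashlib.sha256(path.encode("utf-8")).digest()
            index = int.from_bytes(digest[:8], "big") % len(hosts)
            return hosts[index]
    raise LookupError(f"no pool serves {path!r}")
```

The same URI always lands on the same host while the table is unchanged. Plain modulo hashing reshuffles keys when a pool grows, so a real deployment would likely need something more careful.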

DC: Interoperability again derives from the standardisation of the Content-Types and the schemas of POSTed data. When you know the meaning of the data you fetch, you also know how to try and change it, if you can.

DP: Sure, the structure of the entities should remain consistent whether you are getting or setting them. That's logical. Of course there are aspects of entities that are derived and effectively read-only (e.g. current high bid on item) but that doesn't necessarily violate the principle.

DC: Well, the same applies, insofar as the GET data effectively tells you what any return data should look like. However, this time it's in an implicit way.

Where, in the read context, you understood what to do with a given content type you'd fetched, now, in the write context, you further understand what you can send back to it again.

DP: There are additional constraints on setting data though that are more problematic than reading. Attributes of entities will have restrictions on valid values. This is not necessarily defined when reading as the service is generating the entities so it controls these boundaries.

DC: Straightforward data updating is the simplest case - it's the same content type coming out as going back in. You see a Tax Table at some URI, then you just PUT a new Tax Table back at that URI to try and replace it.

All the API function calls beginning 'Set' should probably be implemented this way.

DP: Perhaps, but using the Tax Table as an example, you have attributes like region and tax rate. What are the acceptable regions? Is it an enumeration or an opaque token? This isn't something you were necessarily concerned about when you retrieved the data, but now it is essential. These semantics must be clearly understood to correctly set the data and many may actually vary from implementation to implementation. Standardization becomes problematic as you drill down to this level of semantics.
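To make the concern concrete, here is a minimal sketch of the checks a service might run on a Tax Table PUT. It assumes regions are an enumeration and rates are percentages; the field names, the region list, and the bounds are all invented for illustration, and another implementation could reasonably choose differently, which is exactly the problem.

```python
# Hypothetical rule set: this sketch assumes regions are an enumeration and
# rates are percentages between 0 and 100. A real service may differ.
ALLOWED_REGIONS = {"CA", "NY", "TX"}


def validate_tax_table(rows):
    """Return a list of error strings for a PUT body; empty means acceptable."""
    errors = []
    for i, row in enumerate(rows):
        region = row.get("region")
        rate = row.get("rate")
        if region not in ALLOWED_REGIONS:
            errors.append(f"row {i}: unknown region {region!r}")
        if not isinstance(rate, (int, float)) or not 0 <= rate <= 100:
            errors.append(f"row {i}: rate {rate!r} out of range")
    return errors
```

None of these rules are visible in the representation a client gets back from a GET, so a client has to learn them from documentation or from rejected requests.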

DC: An example of a data format that implies its edit capabilities by reference to a standard is Atom syndication.

You can GET a list of Atom entries, then POST a new one according to the Atom Publishing Protocol (APP) specification. You POST, not to a 'service' or handler, but to the URI of the actual collection you're adding to.

DP: I understand the example, but I am not following the point.

DC: The eBay Message lists are a great candidate for using this approach, with the benefit that any (recent) feed reader can be used to view them, and an APP-compliant client used to manage them.

All the API function calls beginning 'Add' should probably be implemented in the same way APP adds new entries, or you should consider using APP itself to implement them.

DP: I think this is a great example to illustrate the challenges of semantic compatibility versus syntactic compatibility. Atom and APP define the syntax and the semantics to a certain level but not as fully as you might think. Consider the author element. The name space for this element is undefined. Is it an email address, a full name, or a user name? If it is coming from eBay then it is a user name, but those have virtually no useful context outside of the eBay system, by design. Yes, you could leverage a good deal of your application's implementation to process message streams from eBay if they are in Atom format, but you couldn't really correlate those messages to any messages from other content providers.

Posting is even more interesting. One of the topics we've not discussed up to this point is authentication. The eBay message system requires authentication to avoid spoofing. REST supports authentication but there doesn't appear to be an agreed upon standard just yet. Even with a standard, each system will have its own authentication credentials which will be an interesting challenge to manage as you attempt to traverse multiple sites. Message format will ultimately be a minor challenge relative to the credential management.
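The author ambiguity is easy to see in a small sketch that builds two Atom entries with Python's standard library. The `build_entry` helper, the titles, the ids, and both author names are made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_entry(title, author_name, entry_id, updated):
    """Build a minimal Atom entry; author_name is free text per the format."""
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    author = ET.SubElement(entry, f"{{{ATOM_NS}}}author")
    ET.SubElement(author, f"{{{ATOM_NS}}}name").text = author_name
    return ET.tostring(entry, encoding="unicode")


# Both entries are equally well-formed. Nothing in the markup says whether
# the name is a site-local user name or a person's full name.
a = build_entry("Question about item", "seller_1234",
                "urn:example:1", "2006-11-21T00:00:00Z")
b = build_entry("Question about item", "Jane Smith",
                "urn:example:2", "2006-11-21T00:00:00Z")
```

A feed reader can display either entry, but a client that tries to correlate authors across providers has nothing to work with beyond the raw string.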

[Note: My imaginary co-worker made a point about SOA that I didn't find as relevant. Please forgive the incongruity in the dialogue.]

DC: Schema proliferation is SOA's problem - so I'm glad you brought it up! It's baked in to the SOA mindset that it's OK to design your own interfaces and schemas from scratch each time.

Conversely, the expectation and culture in the Web and especially in the REST-aware community is to constantly look out for opportunities to conform and to standardise schemas and interactions, and to build on layers below that are already standardised. It's a side-effect of the shareability of URIs.

With REST you get interoperability at many levels above the byte-transfer of HTTP. Content type understanding and standardisation can occur all the way up from characters to Tax Tables.

DP: I'm having a bit of a problem following this point. REST encourages the standardization of content types to gain the efficiencies. I'll concede that SOA has done a poor job of encouraging similar standardization, but there is nothing that inherently prevents SOA from doing so. At the same time, REST providers can unintentionally, or even intentionally if they feel the business model supports it, drive away from standardization.

[Note: My imaginary co-worker got a little lost and needed more explanation.]

DC: OK: there is some code that understands basic HTTP GET and PUT, and UTF-8 characters - and doesn't need to know what a Tax Table is - some that understands UTF-8 plus XML, some that understands that plus Atom, or plus XHTML, then XHTML tables and Microformats like hCalendar, hCard (and hAtom!), then conference schedules using hCalendar and hCard. Tax Tables can perhaps use XHTML tables.

Clients may or may not need to know what the schema of a Message looks like internally in order to be able to do useful work with Messages at their level of understanding - from character stream up to APP.

DP: I think you've described the OSI network model and what I would call good application layer design. At least that's how I design my applications, but I'll grant you not everyone is as diligent. To the extent that you can properly encapsulate the micro-formats in other documents, I agree that your application need only worry about the portions of the response that are interesting to it. I think this becomes a little more challenging in practice though.

DC: You never know, your eBay schemas might be taken into account when coming up with new standard content types for e-commerce!

DP: That would be nice.

[Note: Deleted a superfluous exchange on XHTML]

DC: This is one of the Myths of REST - that it's just for simplistic reading and writing of data.

It's actually a myth that's often propagated by the REST community itself, especially with their over-emphasis on the Four HTTP Verbs, which seem to map so conveniently onto Create, Read, Update and Delete.

DP: We call that CRUD and those operations are low level entity persistence methods. I would agree that they aren't terribly meaningful at a business interface level.

DC: It is often good, but can detract from the whole goodness!

One consequence of this mapping is that people inevitably go on to see an analogy between REST and databases, and then start to expect transactions and other database features.

Another consequence of the database analogy is that resources are seen as lifeless servants of the active client: it takes away responsibility from a resource to be master of its own destiny.

This then causes confusion (even within the REST community) about the power of the client and even the very meaning of such basics as the PUT method.

The fact is, GET and POST are more than enough to be REST compliant. This cut-down pair also help focus the mind on URIs, two-way content transfer, content type or schema and on the responsibilities of each resource as active players in an integration scenario.

DP: No argument.

DC: Tim Bray has a history of suggesting that GET and POST are enough.

And recently, there has been further high-level support from Sam Ruby and Leonard Richardson in their manifesto for an upcoming book.

DC: So, to summarise: use GET to read data, then POST back to the same URI to suggest changes to the resource there.

All based on a given level of understanding and interpretation of the content type and any corresponding interaction standards.

Interoperability and scalability in a nutshell!

Now I will explain how there's more to REST than simple reading and (attempted) writing of resources. We're still only two-ninths done...

DP: I look forward to seeing where we go next.

Monday, November 20, 2006

The REST Dialogues, A Real eBay Architect

In Getting Data | The REST Dialogues, Duncan Cragg conducts an interview with an imaginary eBay architect. While I don't play one on TV, I am a real eBay architect and would love to participate in this dialogue. As the first two parts are complete, I felt I should post my follow ups here and hopefully invite Mr. Cragg to conduct the remaining 7 parts with me. I must make the standard disclaimers though. I am not speaking for eBay. None of my comments reflect on current or future products. This is purely a technical discourse on the merits of REST vs SOAP styles of interaction, whether eBay ever chooses to offer such an interaction or not.

Duncan Cragg: So - let's get straight to my argument: I claim that your SOAP APIs, as instances of the SOA style, won't scale or interoperate as well as they would if they were implemented in the REST style. Which, in the form of the Web, has largely proven scalability and interoperability.

Dan Pritchett: The scaling argument is an interesting position. Most of the data that would be returned by eBay interfaces will involve structure that is best captured in XML. From a scaling perspective, XML is XML. Parsing is definitely more expensive than generation and there is little argument that REST can reduce the parse load placed on our resources but this is only a portion of the overall processing load.

Interoperability would depend largely on the relative similarity between the eBay entities and other Web 2.0 entities. To the extent there is overlap, I would concur that a standardized format improves interoperability. I would also assert that the most interesting entities at eBay are unique to eBay.

DC: That's true now, but if the Web 2.0 vision comes together, you may care: your API traffic could increase dramatically. It would be better to be the one prepared for the scale of the API-Web!

Can you really argue in your company that you don't need to be scalable? What if your port 80 traffic needs to be routed to your APIs for some reason?

DP: While my imaginary co-worker may state that scalability is not a concern for eBay, I would never make such a claim. Scalability is always an architectural consideration. Rather than expecting that port 80 traffic would be routed to the API though, I would expect that future traffic growth might come from applications that leverage the API.

DC: As for interoperability, you could be excluded from Web 2.0 industry-boosting consortia, or excluded from perhaps hugely popular Web 2.0 applications in the future...

Interoperability raises the level of the market as a whole. Market players shouldn't differentiate on what's common to them, they should differentiate on the level above.

It also depends on the value you place on having happy customers who don't have to do the same thing multiple ways or multiple times.

DP: This comes back to defining what operations and entities are common between eBay and other market players. There are probably subsets of entities that are common (e.g. messages or users) but even in that context, there are structured components that require extensions from a common base entity to prove useful. This becomes the substance of the conversation and I also believe the largest challenge that the semantic web currently faces.

DC: OK, let's look at your SOAP API. There are 72 function calls in there that begin with 'Get'. Each one specifies a particular piece of data that you can fetch.

DP: Go on

DC: Sure, but you don't need a new function call for everything you can get from your system: you can just use HTTP GET!

DP: Sure, I just need to parameterize the GET operation to differentiate what data you're requesting. But aren't we largely debating syntax and mechanism, not semantics?

DC: It's not just any 'data going in': the URI can be passed around for anyone to re-use. This URI is more interoperable because so much deployed software understands it. No-one understands 'GetSearchResults()'!

DP: Okay, fair enough. URI oriented requests can be more easily saved and shared.
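As a small illustration of the point being conceded, a parameterized request can be encoded as a plain URI that any HTTP client can dereference and any recipient can decode. The endpoint and parameter names below are hypothetical, not part of any real eBay API.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint and parameter names, chosen only for illustration.
BASE = "https://api.example.com/search"


def search_uri(query, category=None, page=1):
    """Encode a search request as a plain URI that can be saved or shared."""
    params = {"q": query, "page": page}
    if category is not None:
        params["category"] = category
    return f"{BASE}?{urlencode(params)}"


uri = search_uri("vintage camera", category="photo")
# Any HTTP client can dereference the URI, and any recipient can decode it.
decoded = parse_qs(urlsplit(uri).query)
```

The request is now just a string that can be bookmarked, emailed, or embedded in another document. What the parameters mean still has to be agreed upon separately.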

DC: Another example of how the URI can glue things together is that the data returned from your GETs can have more URIs in them, ready to go! You won't get data from your Web Service with 'GetItem()' in it..

DP: Another fair point.

DC: REST also talks about the formats of the data behind a URI. In a GET, the response data is given a Content-Type, and there's an expectation that clients will understand the types of data being returned: interoperability comes from broad standardisation of return data.

DP: But now we're back to the point I've made earlier. Standardization assumes common entities. We certainly have entities that can be declared common at a high level (e.g. users, products, messages). These entities become somewhat less common as you dive into the details. We also have several entities that are not common (e.g. items, bids).

DC: The explicit statement of Content-Type reflects a culture of agreement forced by the sharability of URIs: your URIs are more sharable when more clients understand the data they dereference to.

On the other hand, the culture of SOA is to declare custom WSDL and custom XML schemas.

Like I said, one day you may care about interoperability, and having an architecture that puts a high value on content type and schema standardisation, as REST does, puts you one step ahead.

DP: So the suggestion is that all vendors are going to agree on a common set of entities and their detailed schemas? I suppose that might happen for a subset of the entities but I think even that will prove challenging. I was at Sun in 1992 when we proposed an industry standard format for calendar appointments. Fourteen years later there are still competing standards and trying to give a Mac iCal appointment to an Outlook user is harder than it should be.

If there are REST standards around the entities that we publish, then it would make sense for us to consider them. To the extent that our entities are unique, then isn't the format we publish the standard by definition?

DC: You can also gain scalability by partitioning on those URIs.

DP: We partition along many dimensions, URIs just being one of them.

DC: Yes, but URI partitioning cuts right through the system in a very simple way: your partitioning is an application-specific optimisation which has to be hand-coded behind the SOAP interface.

DP: Our partitioning doesn't follow the model you've imagined but I can't really share all of our partitioning magic with you.

DC: Another benefit of using HTTP over using SOAP is that you get caching built in to the architecture, which you can start using as soon as you ask for it in the headers. This boosts scalability.

DP: Caching dynamically generated content is considerably more difficult than you think. There are portions of our results that can be cached but rarely the entire result set from a single request. We already do caching where it can be done and still provide correct results to the interface. Bear in mind that you are talking about a system with more than 5,000 state changes per second.

DC: Which is where you're potentially inefficient.

DP: Correctness must always override efficiency, especially where money is concerned.

DC: Again - it's application-specific.

So - even in the simple cases of fetching data, REST has given you much greater scalability and interoperability than your SOAP interface - as well as a simpler, more generic approach.

DP: In many cases our caches have to be application specific. The correctness of the data can only be ensured by understanding the logic used to generate it. We've studied caching opportunities extensively and apply caching where it can be done safely, with no risk of producing inconsistent or incorrect results. REST isn't going to change the business rules or our customers' expectations of accuracy.

DC: And we're only one-ninth of the way through our conversation!

DP: Great, this has been fun!


Sunday, November 19, 2006

It's Still About Memory

In the early 1990s, I was a software engineer at Sun. We had an exceptionally bright team that was responsible for tuning the performance of our applications. Like most software engineers, I thought of performance tuning as seeking more optimal algorithms: finding the most elegant way to search a string, or using data structures that reduce search spaces. I was a bit taken aback to discover that the performance team worked exclusively on working set. In fact, they'd burn processor cycles to save working set. Why? Well, in those days our processors were running at 25MHz and our disks had an average access time of 15ms. Do the math (25 million cycles per second times 0.015 seconds is 375,000 cycles) and you realize that you could execute more than 350K instructions in the time required for one disk page fault.

And guess what. Nothing has changed. Well, that's not exactly true. Memory sizes have grown significantly, so high performance services rarely actually incur a disk page fault. In fact, most applications are tuned to prevent it. But we've also shifted languages. C++ provides very literal control over memory while Java purports to hide it from developers. Unfortunately, what Java really does is allow developers to create memory problems that are ticking time bombs, impacting both the performance and the stability of their applications.

This is an issue of discipline though. C++ engineers become good at optimizing memory usage because it's impossible to develop a robust C++ application without thinking about memory constantly. Who creates the object, who owns the object, what is the life cycle, who cleans it up. These subjects are always at the forefront of C++ design exercises. When Java arrived, everyone eliminated these considerations because the language did magic and made them irrelevant.

Well, we're not that lucky. Leaks still exist. Even more insidious is garbage churn that exists in virtually every Java application I review. Interfaces are great, but I often find situations where an object is transformed into a string only to be parsed into a similar object by the caller. The new auto-boxing feature of Java 1.5 is great from a syntactic point of view, but now Java creates wrapper objects blindly, generating even more extraneous garbage. I rarely see intermediate object caching done in code, resulting in multiple calls to a method generating the same intermediate result repeatedly. There are many other examples I can cite, but the end result is significant amounts of extraneous garbage which impacts scalability.
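To make the auto-boxing and intermediate-caching points concrete, here is a minimal sketch. The class, method, and field names are made up for illustration, and the lambda and diamond syntax come from Java releases later than 1.5. Whether the boxed loop really allocates a wrapper on every pass depends on the JVM (small values come from the `Long.valueOf` cache, and a JIT may optimize some allocations away), so treat it as an illustration of the pattern rather than a measurement.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class GarbageChurnSketch {

    // Boxed accumulator: each "total += i" unboxes, adds, and re-boxes,
    // so the loop may allocate a fresh Long wrapper per iteration.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Primitive accumulator: same arithmetic, no wrapper objects created.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Cache of a hypothetical intermediate result, keyed by the raw input,
    // so repeated calls reuse one object instead of rebuilding it.
    private final Map<String, String> normalized = new HashMap<>();

    // Counts how often the intermediate result was actually computed.
    int computeCount = 0;

    String normalize(String raw) {
        return normalized.computeIfAbsent(raw, key -> {
            computeCount++;
            return key.trim().toLowerCase(Locale.ROOT);
        });
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000) + " " + sumPrimitive(1000));
        GarbageChurnSketch sketch = new GarbageChurnSketch();
        sketch.normalize("  Hello ");
        sketch.normalize("  Hello ");
        System.out.println(sketch.computeCount);
    }
}
```

Both sum methods return the same value; the only difference is the wrapper objects the boxed version can leave behind for the collector. The cached `normalize` trades a small, long-lived map for not regenerating the same intermediate string on every call, which only pays off when the same inputs recur.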

How does this get addressed? Simply by admitting that Java doesn't make memory irrelevant. In fact, it makes it very relevant because garbage collection is significantly more expensive than manually managing your memory à la C++. Talk about how interfaces will be used to ensure proper return types. Think about the garbage you're creating as you develop. Make sure you understand what objects Java is creating behind the scenes for you. Add efficiency.


Saturday, November 18, 2006

What Is This About?

The name of this blog and the quote by Antoine de Saint-Exupéry express a philosophy that I have followed most of my career. It carries over from work to the rest of my life as well. Albert Einstein said it a different way: "Everything should be as simple as possible, but no simpler." The phrase "Add Simplicity" is paraphrased from Colin Chapman, father of Lotus cars, who stated that they work to "add lightness".

The reality is that when engineering anything, your job is to add simplicity. That sounds like an oxymoron, but the designs that are easiest to produce are the most complex. Finding simple solutions is actually hard work. This is one of the hardest lessons for fresh software engineers to learn. Everything has a simple solution; finding it is hard.

I hope to post on a variety of engineering topics. They will vary from design philosophies to opinions on emerging trends. Throughout though, I will stick to the idea of "add simplicity".
