Wednesday, October 24, 2007

What Metadata?

I'm positive that is what operations teams often think developers are asking themselves. And what exactly is metadata? Well, the simple answer is any information outside the main data flows that is used to control or monitor the behavior of the application. With an answer that vague, it's little wonder it's often not part of the initial design. I break metadata into two major categories: configuration data, which controls the behavior of the application, and telemetry data, which is generated by the application.

Configuration Data

Configuration data is usually given little if any attention during application development. Just throw whatever you need in a properties file or perhaps an XML file. The file is picked up from somewhere in the file system and that's that. Unfortunately, this fails to take into consideration several factors that can impact the manageability of the component. Let's take a look at some of the common pitfalls.

Configuration vs. Code

This is one of the biggest issues I see when designing configuration data. People overlook the fact that properties and XML files may contain what is essentially code. Anything that would only be changed by a developer is code, regardless of the file format used to express it. Making files that are essentially code available for modification as configuration in a production deployment is a formula for disaster. Code must be tested before it is released to production, and you can't test code that you allow to be changed in production. Therefore, files that are essentially code should be delivered embedded in your deliverable (e.g. inside the jar file) and not alongside the configuration files you expect to be modified in production.

Configuration Resource

Where to put the configuration information and how to format it is a subject of continuous debate. Files are the obvious choice for most applications but this brings up several interesting questions. First is the format. Java properties? XML? Something else (probably not)? Fortunately this issue can usually be resolved by looking at the structure of the information. The more structured the information, the more likely XML is the right choice.

Files bring up the interesting problem of where do you put the file. If you deploy it inside your WAR, then it becomes a challenge for operations to locate the file. It's buried somewhere under your application server directory. You can put it in some other distinguished location but if you do this, you need to allow operations to override that location at server start or you will constrain their ability to manage the number of instances of the application per server.
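One way to leave that override in the hands of operations is a JVM system property with a built-in fallback. The following is only a minimal sketch: the property name app.config.dir, the default directory /etc/myapp, and the ConfigLocator class are all hypothetical names chosen for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ConfigLocator {
    // Hypothetical property name and default directory, for illustration only.
    static final String LOCATION_PROPERTY = "app.config.dir";
    static final String DEFAULT_DIR = "/etc/myapp";

    /** Returns the config file path, preferring the override directory when one is given. */
    static Path resolve(String overrideDir, String fileName) {
        boolean hasOverride = overrideDir != null && !overrideDir.trim().isEmpty();
        String dir = hasOverride ? overrideDir.trim() : DEFAULT_DIR;
        return Paths.get(dir, fileName);
    }

    /** Reads the override from the system property set at server start, if any. */
    static Path resolve(String fileName) {
        return resolve(System.getProperty(LOCATION_PROPERTY), fileName);
    }
}
```

With something like this in place, a second instance on the same server could be started with -Dapp.config.dir=/srv/instance2/conf and pick up its own files, while a single default instance needs no extra flags.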

Delivering the configuration with the application has other issues though. Invariably, there will be configuration values that must be set to reflect the production environment. Once operations has modified the file, how do you deliver your subsequent versions? You can't overwrite the version of the file on the production server, yet if you've added necessary configuration values, how do these get merged into the production file?
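One possible way to sidestep the merge problem, offered here as a sketch rather than a recommendation, is to never ship the file operations edits. The application carries its defaults inside the deliverable, and the production file holds only the values that differ. java.util.Properties supports this layering directly through its defaults constructor. The LayeredConfig class and the key names in the usage note are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class LayeredConfig {
    /**
     * Builds the effective configuration: values from siteOverrides win,
     * and any key they omit falls back to shippedDefaults.
     */
    static Properties layer(Reader shippedDefaults, Reader siteOverrides) {
        try {
            Properties defaults = new Properties();
            defaults.load(shippedDefaults);
            // The constructor argument is consulted whenever a key is missing.
            Properties effective = new Properties(defaults);
            effective.load(siteOverrides);
            return effective;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If a new release adds cache.size=100 to the shipped defaults, a production override file that only sets db.host keeps working unchanged, and getProperty("cache.size") returns the new default.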

Another challenge with file-based configuration is that it does not scale well. It works well enough for O(10^2) servers but beyond that it is unwieldy. Attempting to manage files on thousands of servers requires appropriate tools to have any chance at all. Even with the best tools, files tend to be missed, and very confusing production errors result.

So what is the alternative? Centralizing configuration into services such as LDAP or a configuration database (CDB) can alleviate the challenges associated with distributing files. This definitely scales better to a large number of servers. But there are some challenges with this approach as well. The primary challenge is providing developers with a usable environment for testing. Developers typically require a local version of their resources with the ability to change the content at any time. This is relatively straightforward with files but much more difficult with LDAP or CDB. Still, the scalability offered to production may well be worth this hassle.

Telemetry

Okay, logging, but I have actually borrowed this term from professional racing (or NASA, take your pick) for a good reason. From Wikipedia:

Telemetry is a technology that allows the remote measurement and reporting of information of interest to the system designer or operator.

Thinking of the information that a component can send to logs in terms of telemetry instead of just logs gives a better perspective on what belongs in the stream for operators. Developers tend to think of logs only as tools to help themselves debug, when in fact they are a necessity for properly monitoring the health of a running application.

The question of course is what kind of information should be in the telemetry? Considering a common web service, at a minimum, I'd expect to find the following information easily in the stream:

  • The request URI including parameters
  • Basic parametric information about the request
  • External resource interactions performed by the service. These should include status and timings.
  • The result status of the request.
  • The processing time of the request.

If this information is made available to operations in real time, there are several types of monitors that can be created. Alarms can be set on thresholds for error status ratios or dramatic drops in request rates. Operational graphs can be made for average response time, with potential alerts for response time drifting out of SLA. Dependency graphs can be constructed that will help operations correlate resource failures to client impacts. And this is from the small amount of information proposed above. Additional telemetry can provide even more operational monitoring capabilities.
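As an illustration of the first kind of monitor, here is a minimal sketch of an error-ratio alarm over the most recent requests. The ErrorRatioAlarm class, the window size, and the threshold are assumptions made for the example; a real monitor would more likely window by time and live in the operations tooling rather than in the application.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class ErrorRatioAlarm {
    private final int windowSize;
    private final double threshold;
    private final Deque<Boolean> window = new ArrayDeque<>();
    private int errors;

    ErrorRatioAlarm(int windowSize, double threshold) {
        this.windowSize = windowSize;
        this.threshold = threshold;
    }

    /** Records one request outcome; returns true when the alarm should fire. */
    boolean record(boolean isError) {
        window.addLast(isError);
        if (isError) {
            errors++;
        }
        // Drop the oldest outcome once the window is over capacity.
        if (window.size() > windowSize) {
            boolean oldestWasError = window.removeFirst();
            if (oldestWasError) {
                errors--;
            }
        }
        // Stay quiet until the window is full, then compare the ratio.
        return window.size() == windowSize
                && (double) errors / windowSize > threshold;
    }
}
```

Feeding it the result status from each telemetry record, with an error defined however operations chooses (for example any 5xx status), is enough to drive a simple threshold alert.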

The Java logging facility can meet the needs of generating telemetry, although you may want to separate telemetry out under its own logger name. For small-scale deployments, this telemetry can simply be sent to log files. Scripts that regularly scrape the logs can be used to extract the relevant bits of information. For larger-scale operations, however, a central logging scheme is more relevant. Logging using the socket handler may be sufficient, although for very large-scale installations it may be desirable to move to a less reliable but more scalable transport such as multicast.
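A minimal sketch of the separate logger name idea with java.util.logging follows. The logger name telemetry.request, the Telemetry class, and the key=value line format are hypothetical choices for illustration; the fields are a subset of the list above.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class Telemetry {
    // Hypothetical logger name; handlers can be attached to it separately
    // from the loggers developers use for debugging.
    static final Logger LOG = Logger.getLogger("telemetry.request");

    /** Formats one request record as a single, easily scraped line. */
    static String format(String uri, int status, long elapsedMillis) {
        return "uri=" + uri + " status=" + status + " elapsed_ms=" + elapsedMillis;
    }

    /** Emits one telemetry record for a completed request. */
    static void request(String uri, int status, long elapsedMillis) {
        LOG.log(Level.INFO, format(uri, status, elapsedMillis));
    }
}
```

Because the records flow through their own logger name, operations can route them to a java.util.logging.FileHandler on a small deployment or a java.util.logging.SocketHandler for central collection without touching application code.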

Summary

I know that I've only scratched the surface of metadata issues. The point of this posting wasn't to give you an exhaustive guide to configuration and telemetry, but rather to bring up some issues and initiate a dialog. As always, comments are most welcome.

Sunday, October 14, 2007

Gartner's Top 10 Technologies for 2008

Network World has published an article on Gartner's Top 10 Technologies for 2008. I love predictions although I will admit that October is a bit early. But why not get yours out early and beat the holiday rush, right? Actually as predictions go these are not bad. Like most, they range from the obvious to the "as-if" but it's risky to stick your neck out and we shouldn't be harsh with any organization that does. I do have a few comments on the predictions though.

Starting off in the #1 spot is green computing. I think they are right, especially for large organizations. It's no surprise to anybody trying to run a large data center that power is now the constraining resource. Unfortunately, the message is that green computing is a hardware problem. They also fall short of tying #5, virtualization, to the challenges of green computing. But if we get people to use virtualization to drive up utilization without realizing it's also saving power, I'm okay.

I'm a bit more dubious about #2. Not that I wouldn't love to have some unification of my communication world, but for me it seems my communications are diverging, not converging. I have far too many mailboxes, phone numbers, and calendars now. Every social network a friend joins just brings another communication channel for me to manage. At some point we may all cry "Uncle!" but right now it feels like we are moving in any direction but unified communications.

The next one that captured my attention was #4, metadata. This is definitely a problem that needs more attention. It is usually left as an afterthought to the overall architecture, and yet any system of scale is virtually unusable without a good metadata system, designed in, not added on. I couldn't agree more that this needs to become a serious focus in the year to come.

Virtualization is #5 on their list. Personally I'd move this higher. Abstracting deployments from physical instances is incredibly powerful. It is the next step in the logical progression of abstractions (machine language, high level language, virtual machines, and now virtual OS containers). Fortunately there is little for application developers to do on this one but if you aren't designing your applications to live happily inside a virtual container, start now.

It is interesting that they broke out their definition of computing fabrics into a separate item, #8. If my execution platform is sufficiently abstract, then a logical extension is that the hardware platform will be able to gain flexibility in how it allocates physical resources like processors and memory across boundaries within blades. The more interesting fabric to me is a better abstraction of systems like Beowulf so programming becomes more practical. There are an increasing number of social problems that scale far beyond the confines of one system but also don't split well due to the network nature of the data. Environments that let mere mortals program to loosely coupled clusters would help tremendously.

As always with predictions, it's fun to have a look back at the end of the year and see what you got right and what went far astray. I'm sure 2008 will follow several of these trends but will also bring us some interesting surprises. Feel free to share your predictions with comments!
