
Wednesday, October 24, 2007

Comments

Jim Standley

I agree with your configuration comments. I worked with a vendor product that configured everything in XML files distributed in the EAR. This made it impossible to have anything that varies from one environment to another from build through production. We overrode the configuration reader to read from XML and if not found read from database. Each environment has its own database, and each developer can optionally have their own XML for localhost testing. It's A Bad Thing (!) if a personal XML file happens to get deployed, though.
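A minimal sketch of the lookup order described above (local XML first, then the environment's database), assuming both sources can be reduced to key/value lookups. The names `ConfigSource`, `MapConfigSource` and `FallbackConfigReader` are hypothetical and are not the vendor product's API; the in-memory map merely stands in for a parsed XML file or a database table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical abstraction over "somewhere a setting can come from".
interface ConfigSource {
    Optional<String> get(String key);
}

// Stand-in for a parsed XML file or a database table (assumption for the sketch).
class MapConfigSource implements ConfigSource {
    private final Map<String, String> values;

    MapConfigSource(Map<String, String> values) {
        this.values = new LinkedHashMap<>(values);
    }

    @Override
    public Optional<String> get(String key) {
        return Optional.ofNullable(values.get(key));
    }
}

// Asks each source in order and returns the first value found.
class FallbackConfigReader {
    private final List<ConfigSource> sources;

    FallbackConfigReader(List<ConfigSource> sources) {
        this.sources = new ArrayList<>(sources);
    }

    Optional<String> get(String key) {
        for (ConfigSource source : sources) {
            Optional<String> value = source.get(key);
            if (value.isPresent()) {
                return value;
            }
        }
        return Optional.empty();
    }
}
```

With the developer's XML listed first and the environment database second, a key present in both resolves to the XML value, which is also why a personal XML file that reaches production silently wins over the database.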

Re telemetry ... I'm currently working with code that displays "Exception occurred" but does not display the exception or stack trace. Some team standards might be in order. ;-)
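One possible team standard, sketched with the JDK's own `java.util.logging`: always pass the `Throwable` to the logger so the stack trace survives. The class `ErrorReporting` and its method names are hypothetical; only the `Logger.log(Level, String, Throwable)` and `Throwable.printStackTrace(PrintWriter)` calls come from the standard library.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper class; not taken from the code base mentioned above.
class ErrorReporting {
    private static final Logger LOG = Logger.getLogger(ErrorReporting.class.getName());

    // Renders the exception type, message and full stack trace as text.
    static String describe(Throwable t) {
        StringWriter buffer = new StringWriter();
        t.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    // Logs with the Throwable attached so handlers can print the stack trace,
    // instead of a bare "Exception occurred".
    static void report(String operation, Throwable t) {
        LOG.log(Level.SEVERE, operation + " failed: " + t, t);
    }
}
```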

Bill de hOra

In non-Java systems you learn quickly to house everything (logs, config, scripts) outside the codebase. In designs, I look for layouts that allow code to be upgraded without touching config files and vice versa. The Linux file system is a good working example.

Java systems are different. I'm not sure whether this is technical (containers, classpaths) or cultural (file system independence, WORA); I suspect it's both. Certainly by 2007 I'd have hoped we'd be beyond throwing WAR/EAR files over to ops and driving production configurations out of software build chains. But it doesn't look that way yet.

SmartFrog is a project to watch:

http://www.hpl.hp.com/research/smartfrog/


Telemetry: I'm biased, but I think this will come to be done using Atom or some such over XMPP. What you've described in the information stream looks like an extended Atom Entry, and while XMPP is inefficient relative to JMX/SNMP, it's more flexible and scalable, and probably requires less upfront agreement between agents.

Sri Shivananda

Configuration aspects I have had to pay attention to:
1. Who needs to change configuration?
- Developer, Operations, Data Managers, Product staff . . .
2. How often does configuration update need to happen?
- Code release boundary, Time based, Demand driven . . .
3. Push vs. Pull?
- Deployer based, knows all end points (?)
- Repository subscription based, I know what I need principle

In a large system, configuration data and metadata tend to get scattered over various places. I have seen certain elements of configuration set up directly inside the code itself (I mean C, Java ...). The first level of externalization is configuration files shipped alongside the code as properties, XML or even text files. In both of these scenarios the frequency of change is limited by the code release cycle, and changes require development staff to be engaged. The second level of externalization is where the config files are deployed into alternate paths on the server file system. This gives operations staff control but, as you have mentioned, becomes hard to manage as the number of servers increases. The third level of externalization is a CMDB, which allows ad hoc, on-demand changes, but you cannot have one for each developer unless you can create a decent replicated environment on every developer's box.

Another aspect of configuration: in larger deployments, where servers are going in and out of rotation, making sure each server has the correct, latest configuration deployed also becomes a challenge. It is up to the maturity of the configuration audit systems to validate, verify, reconcile and fix issues. This is especially an issue with push-based systems.

As you may have experienced, hierarchical configuration based on multiple levels of overrides and customizations makes this problem even harder. I am personally a fan of a CM service: a well-abstracted interface that can hide the complexity of the location of configuration, overrides and specialization. It could fetch (pull) config based on functional, location and other application semantics, either at startup time, on an explicit instruction through the server's admin interface, or through a periodic pull.
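To make the override idea concrete, here is a rough sketch of how such a CM service might resolve a key, assuming scopes are passed from most general to most specific. The class `ConfigService`, the scope strings and the in-memory store are all hypothetical; a real service would sit in front of files, a database or a CMDB.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a pull-style configuration service.
class ConfigService {
    // scope -> (key -> value); an in-memory stand-in for the real backing store.
    private final Map<String, Map<String, String>> store = new HashMap<>();

    void put(String scope, String key, String value) {
        store.computeIfAbsent(scope, s -> new HashMap<>()).put(key, value);
    }

    // Scopes are ordered from most general to most specific, for example
    // ["global", "pool:search", "host:web42"]. The lookup walks backwards,
    // so the narrowest scope that defines the key wins.
    Optional<String> resolve(List<String> scopes, String key) {
        for (int i = scopes.size() - 1; i >= 0; i--) {
            Map<String, String> values = store.get(scopes.get(i));
            String value = (values == null) ? null : values.get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }
}
```

A server would call `resolve` at startup, on an admin instruction, or on a timer; the caller only names its scopes and never needs to know where each value physically lives.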

A well-thought-out, consistent, well-designed approach with "simplicity added" :-), along with a good portfolio of tools and, finally, the organizational discipline to keep it consistent, could work.

vmoharil

Very interesting and relevant post. Currently we run into some of the same issues you have mentioned in both areas.

Internally we developed a configuration mechanism that builds on top of java.util.prefs, adds more functionality and uses XML files as the backing store. The API hides the location/env/file-vs-db complexity from the app dev teams. We also ran into the same issues that you mentioned. Some of the basic requirements we had were:
- individual server level granularity for different configs (A/B testing)
- eliminate dependence on the db services team
- make it fast for devs
- must support hot deployment of changes

As you said we are in the O(10^2) server complexity and this has worked fairly ok for us. We also had to build scripts and gui tools to view and make changes to the production env so that non-devs could make changes to n number of servers without bringing the server down.

Some of the requirements I mentioned above made LDAP and db a more difficult choice, though at some point we went ahead and added db connectivity, so some rather static configs come from the db in addition to hundreds of others that come from XML files.

On the logging - totally agree. We built a fairly sophisticated enterprise logging library. This allows us to trace a single user and their activity across various systems that span Java and C++, including status, timings, errors etc. This same data gets plugged into a monitoring console to keep track of the systems' health. We also built a sophisticated viewer that sits on top of this data so that non-devs can keep track of farms, servers, and single users, down to a single page.
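The library described above is not public, so the following is only a guess at the general shape of cross-system tracing: every event carries a shared correlation ID that a viewer can filter on. `ActivityEvent`, `ActivityTrail` and all field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event record; field names are assumptions for illustration.
class ActivityEvent {
    final String correlationId; // same ID carried through every system a request touches
    final String system;        // which system emitted the event
    final String status;        // e.g. "OK" or "ERROR"
    final long elapsedMillis;   // timing for this step

    ActivityEvent(String correlationId, String system, String status, long elapsedMillis) {
        this.correlationId = correlationId;
        this.system = system;
        this.status = status;
        this.elapsedMillis = elapsedMillis;
    }

    // Flat key=value line that a log shipper or viewer could parse.
    String toLogLine() {
        return "cid=" + correlationId + " system=" + system
                + " status=" + status + " elapsedMs=" + elapsedMillis;
    }
}

class ActivityTrail {
    // Returns the events for one correlation ID, in the order they were recorded.
    static List<ActivityEvent> forId(List<ActivityEvent> all, String correlationId) {
        List<ActivityEvent> trail = new ArrayList<>();
        for (ActivityEvent event : all) {
            if (event.correlationId.equals(correlationId)) {
                trail.add(event);
            }
        }
        return trail;
    }
}
```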

In summary, I totally agree with you, and more often than not this is the last thing that gets resources when putting together the plans/stories.

Damon Edwards

Regarding working with configuration metadata, we've found that the most effective way to manage environments of any significant scale or complexity is to use a model-driven system like ControlTier (http://open.controltier.org) to manage application deployment and ongoing management.

Why model-driven? Because the metadata you were referring to is needed for purposes other than just configuring a single application. You need to use that same metadata to configure other applications to work in conjunction with your application. You also need to use that same metadata to provide context to the automation that effects change on your environment.

So I guess my point is that you not only have to think about where the "metadata" lives within each application... but how it gets managed at an environment wide level and how it gets delivered.

-Damon
http://dev2ops.org
