I put this line on a slide recently for a presentation at work. Looking around the room, I could tell that some of the people understood, some were perplexed, and some annoyed. What does latency have to do with architecture anyway? We're concerned with proper component factoring, interfaces, and a collection of "ilities". How could latency be relevant to any of these?
In any large system, there are a few inescapable facts:
- A broad customer base will demand reasonably consistent performance across the globe.
- Business continuity will demand geographic diversity in your deployments.
- The speed of light isn't going to change.
Given these facts, latency is a critical part of every system architecture. Yet making latency a first-order constraint in the architecture is not that common. The result is systems that become heavily influenced by the distance between deployments, limiting the business's ability to serve its customers effectively and to protect itself against localized disasters.
So how do you design for latency? There are a few strategies that can be applied to your architecture that will allow you to deploy your components across diverse geographic locations. Here are the ones that I find particularly important.
Good Decomposition - Highly coupled, monolithic applications are the bane of any distributed architecture. Allowing components with little functional overlap to be coupled either in code or during deployment will pretty much kill any hope of distributing your architecture across a collection of global data centers. Do it badly enough and you will kill any hope of distributing your architecture across two cities in the same state. This sounds obvious, but there are plenty of enterprise-level applications in use today that have forced themselves into data centers on the far edges of the same city as their only business contingency plan.
Asynchronous Interactions - This is more than just using messaging between components. It starts by setting the appropriate expectations on your external interfaces, be it SOA or a web page. Companies get tripped up here by exposing an early version of an interface that sets the client's expectation of synchronous, low-latency interactions. As the interface becomes more heavily used, it becomes more and more difficult to change that semantic. If the client has an expectation of a synchronous response, the likelihood of leveraging a collection of components with asynchronous interactions becomes low. Start with an expectation of asynchronous behavior and you can more readily add latency as needed to meet your deployment demands.
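To make that expectation concrete, here's a minimal sketch (all names and the in-memory store are hypothetical) of an interface that is asynchronous from day one: the client gets a ticket immediately, and the actual work can complete later, in a worker that may well live in a distant data center.

```python
import uuid

# Hypothetical in-memory job store standing in for a real persistence layer.
_jobs = {}

def submit_order(payload):
    """Accept the request and return a ticket at once -- no synchronous result.

    Because the contract promises only a ticket, the server is free to route
    the work anywhere, regardless of the latency to that location.
    """
    ticket = str(uuid.uuid4())
    _jobs[ticket] = {"status": "pending", "payload": payload}
    return ticket

def complete(ticket, result):
    """Called by a worker, possibly in another region, whenever it finishes."""
    _jobs[ticket] = {"status": "done", "result": result}

def poll(ticket):
    """Client checks back; 'pending' is a normal, expected answer."""
    return _jobs.get(ticket, {"status": "unknown"})

ticket = submit_order({"sku": "A-1"})
assert poll(ticket)["status"] == "pending"   # no one is blocked waiting
complete(ticket, {"confirmation": "ok"})
assert poll(ticket)["status"] == "done"
```

Notice that nothing in the client-facing contract promises a response time; that is exactly the semantic freedom that lets you add latency later without breaking anyone.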
Monolithic Data - You can decompose your applications into a collection of loosely coupled components, expose your services using asynchronous interfaces, and yet still leave yourself parked in one data center with little hope of escape. You have to tackle your persistence model early in your architecture and require that data can be split along both functional and scale vectors, or you will not be able to distribute your architecture across geographies. I recently read an article where the recommendation was to delay horizontal data spreading until you reach vertical scaling limits. I can think of few pieces of worse advice for an architect. Splitting data is more complex than splitting applications. But if you don't do it at the beginning, applications will ultimately take shortcuts that rely on a monolithic schema. These dependencies will be extremely difficult to break in the future.
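One way to bake in the split from day one is a routing layer along a scale vector such as customer ID. This is a hedged sketch, not a production scheme (the shard names and the choice of customer ID as the vector are illustrative assumptions); the point is that every read and write goes through the router from the start, so no application code ever gets to assume a single monolithic schema.

```python
import hashlib

# Illustrative shard locations -- in practice these would be real deployments.
SHARDS = ["us-east", "eu-west", "ap-south"]

def shard_for(customer_id: str) -> str:
    """Deterministically map a customer to a shard along the scale vector.

    A stable hash means every component agrees on placement without any
    central lookup, and shards can later be hosted in different geographies.
    """
    digest = hashlib.sha256(customer_id.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# All persistence calls take the route, even while everything still lives
# in one data center -- the discipline is the point, not the topology.
def save(customer_id: str, record: dict) -> str:
    target = shard_for(customer_id)
    # ... write `record` to the store in `target` ...
    return target
```

A real design would also handle resharding and functional splits, but even this toy router prevents the cross-shard joins that make later geographic distribution so painful.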
Design for Active/Active - If you do a good job with the preceding recommendations, then you've most likely created an architecture that can service your customers from all of your locations simultaneously. This is a more efficient and responsive approach than an active/passive pattern where only one location is serving traffic at a time. Utilization of your resources will be higher and by placing services nearer your customers, you are better meeting their needs as well. Additionally, active/active designs handle localized geographic events better as traffic can simply be rebalanced from the impacted data center to your remaining data centers. Business continuity is improved.
Latency is another example of how what you don't take into consideration in your architecture will ultimately undo your design. It is one of the more difficult constraints to design for correctly. As such, it should be given more attention, early in your architectural process. Are there other aspects of this that you think are important? I'd love to hear them.