Archive for July, 2007

Elliotte Rusty Harold has an interesting commentary on the REST vs WS-* war.

There’s one statement that I might contend with just a little:

"The WS-* community really believes that developers are too stupid to be allowed to manage themselves. Developers have to be told what to do and kept from getting their grubby little hands all over the network protocols because they can’t be trusted to make the right choices."

The WS-* community may well see things this way but I think there’s at least one other possibility that would place the fault elsewhere. Given that an awful lot of developers are heard to utter sentences like…

"I don’t want to be bothered with the nitty gritty details of network protocols, threads, persistence etc I just want to write my business logic"

…perhaps it’s natural to expect that various entities will construct technologies like WS-*. Developers are seemingly pushing responsibility elsewhere, placing their fate in others’ hands and paying the price. Perhaps they should be more careful what they wish for?

Comments 2 Comments »

When we write programs, one of the things we seek to do is encapsulate our data so that we can manage our dependencies and keep our code clean. Most languages, OO or otherwise, provide mechanisms to support this way of working.

The thing about the average database is that it doesn’t really encourage similar behaviour. It is all too tempting (and easy) to just allow everyone to access everything. Whilst we confine ourselves to a single application using the database, the problem is to some extent contained, but often what we actually do is allow multiple applications access to the same database. The exact way in which this is done varies:

  1. Sometimes we bundle all our middle tier code together even though it has separate roles and responsibilities and integrate all of it via a single database.
  2. Sometimes we have multiple applications each running in a different process.

With each application we put on top of the database, the problem gets worse: the number of invisible dependencies increases, tying unrelated elements of code together by virtue of their accessing a shared schema.

What’s happening is that we’re sharing too much intimate knowledge across our system, something we’re all taught to fear. The solution is, as always, to prevent direct access to this intimate knowledge by interposing layers of abstraction. One way to do this is by requiring access to data to be wrapped up behind an interface. Historically we’ve done this by having a system own the database and expose interfaces that other systems can use to get the data.

Unfortunately, there is a well-known issue with this approach: the level of granularity is wrong, and these additional integration interfaces rapidly balloon into complex beasts. What we need is a database-wrapping entity that has a finer level of granularity than an entire system. Then the integration interfaces will be simpler, because there will naturally be a less complex schema underpinning this more limited functionality.

What are we talking about? Services. We end up with a system of lots of discrete services each wrapping up their own data storage.
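
A minimal sketch of that idea in Python, assuming a hypothetical customer service (every name here is invented for illustration, not taken from any real system). Other code depends only on the small interface; the storage behind it is private and could be replaced without callers noticing.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class Customer:
    customer_id: str
    email: str


class CustomerService(Protocol):
    """The only contract other services may depend on (hypothetical)."""

    def register(self, customer_id: str, email: str) -> Customer: ...

    def lookup(self, customer_id: str) -> Optional[Customer]: ...


class InMemoryCustomerService:
    """One possible implementation; the storage choice stays private."""

    def __init__(self) -> None:
        # Private storage: could be swapped for SQL, a key-value store, etc.
        self._rows: Dict[str, str] = {}

    def register(self, customer_id: str, email: str) -> Customer:
        self._rows[customer_id] = email
        return Customer(customer_id, email)

    def lookup(self, customer_id: str) -> Optional[Customer]:
        email = self._rows.get(customer_id)
        return Customer(customer_id, email) if email is not None else None


def welcome_message(customers: CustomerService, customer_id: str) -> str:
    """A caller in another service: it sees the interface, never the schema."""
    customer = customers.lookup(customer_id)
    return f"Welcome, {customer.email}" if customer else "Unknown customer"
```

Because `welcome_message` never touches `_rows`, there is no shared schema for it to become invisibly coupled to.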

There are other benefits to this approach:

  1. Each service can utilize the most appropriate storage option for its contained data whilst having zero impact on other services that might have different needs.
  2. Each service is an independent entity that can be managed (monitored, deployed, etc.) separately.
  3. Centralized access patterns are more easily broken down which is useful in cases where we deploy across multiple data-centres.

Who would do such a thing?


Comments 3 Comments »

…..Three-tier architecture. Three-tier architecture is really a logical partitioning of system functions into presentation, business logic and data access. Of course, some frameworks have attempted to turn this into a physical reality, which is fine, but many people believe that such systems are or can be distributed, which makes little sense. Why?

Because, as has been said elsewhere, placing code far away from the data source makes little sense. If one is to place code away from the data source, it’s going to be for one or more of the following reasons:

  1. The computational weight demands separate scaling from the computational load inherent in managing the storage of the data.
  2. The computational load in the data-storage layer can be better scaled elsewhere.
  3. We can offset the additional latency introduced by network roundtrips.

Most three-tier architectures fail to satisfy any of the above criteria and thus aren’t good candidates for distribution.

…..Synchronous. This is because, in order to offset latency, we must exploit asynchronous behaviour. Note that this does not imply the use of messaging; rather, it means adopting suitable asynchronous design patterns, which can be implemented via RPC or messaging.
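
To make the latency point concrete, here is a small sketch using Python’s asyncio. The `fetch` function is a stand-in for a remote call (the service names and delays are made up), and the pattern shown is just one of many: overlapping independent requests so that the total wait approaches that of the slowest call rather than the sum of all of them.

```python
import asyncio
import time


async def fetch(name: str, delay: float) -> str:
    """Stand-in for a remote call; the sleep simulates a network round trip."""
    await asyncio.sleep(delay)
    return f"{name}-result"


async def gather_all() -> list:
    # Issue all three requests at once rather than one after another, so the
    # total wait is roughly the slowest call, not the sum of all three.
    return await asyncio.gather(
        fetch("pricing", 0.05),
        fetch("stock", 0.05),
        fetch("reviews", 0.05),
    )


def run() -> tuple:
    """Run the overlapped calls and report the results and elapsed seconds."""
    start = time.perf_counter()
    results = asyncio.run(gather_all())
    elapsed = time.perf_counter() - start
    return results, elapsed
```

Calling `run()` returns the three results in request order along with the wall-clock time taken; issued sequentially, the same calls would take at least the sum of the three delays.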

…..Completely consistent all of the time. Trying to enforce ACID properties across a distributed system is opening an enormous can of worms where one constantly attempts to defy the nature of the network. It’s not impossible to achieve, but there is a tradeoff to be made. It’s often better to prefer eventual consistency in good-enough time, i.e. something that approximates total consistency under most circumstances whilst degrading (hopefully gracefully) under load or in the presence of failure.

Many have attempted to implement distributed systems whilst falling into one or more of the traps above. Many have paid the price, and many have consequently pronounced that distribution is inappropriate, impossible or insane. This is the motoring equivalent of strapping a massive turbo to an unmodified engine and complaining when the pistons explode through the bonnet and the oil is dumped all over the floor. In both worlds the remedy is the same: talk to an expert and be prepared to throw out a few beliefs.


Comments Comments Off

Tom Ayerst pretty much hits the nail on the head.

I would suggest just one more refinement:

Architecture and code need ongoing concern, review and re-organization. One simply cannot leave what has been previously built untended and focus on the next feature. It only takes one broken window……

Comments Comments Off

My notes on the talk by Werner Vogels and Swami Sivasubramanian:

State management is the dominant factor in scaling – this is the stuff that is tough to look after, stateless is easy.

There’s a tight, complex interplay between scalability, availability, consistency, efficiency, management and performance.

Consider that billions of your body’s cells commit suicide in a day and yet you continue to function uninhibited. This process (apoptosis) is essential for the health and stability of the overall organism and can be usefully applied in distributed systems. There are other interesting aspects of our biology that are relevant; check out the paper "The Limits of the Alpha Male".

Amazon is a collection of seven websites. It started as a website and a database but is now a distributed system. These changes were driven by the natural brittleness of integration via the database, and by performance and scaling issues. It was noted that database technology is many years old (reference was made to this article in ACM Queue) and we really need to move on.

For Amazon, incremental scalability is key and it’s desirable to be able to scale dynamically both up and down with demand. Improved performance can be defined in many ways including serving more units or serving larger units such as is required when datasets grow.

An always-on service is said to be scalable if adding resources to facilitate redundancy does not result in a loss of performance. Other aspects of a scalable service are that it:

  • handles heterogeneity
  • is operationally efficient
  • is resilient
  • becomes more cost effective when it grows

We should never expect systems to be stable:

  • things leave, join and fail continuously
  • perturbations and disruptions happen
  • failures are highly correlated and systems do not fail by stopping

A key part of Amazon’s approach to defining service contracts is the SLA. Conventional wisdom treats SLAs as one-way contracts, but in fact they should be considered two-way contracts (what the service promises and how it is to be used). The contract might well include factors around:

  • latency in respect of a single service or of paths through the system
  • durability and availability
  • cost

SLAs introduce the right for a service to throttle in the face of various conditions. They should not be defined with single numbers; rather, they should be defined with ranges.
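
One way to read "ranges rather than single numbers" is as latency targets at several percentiles instead of one average. The sketch below takes that reading as an assumption; it is not something the talk spelled out, and the class, field names and figures are all hypothetical. It also models the caller’s side of the two-way contract as an agreed request rate, beyond which the service may throttle.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


def percentile(samples: List[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


@dataclass(frozen=True)
class LatencySla:
    # Each entry maps a percentile to the maximum latency (ms) promised at it.
    targets_ms: Dict[int, float]
    # The caller's side of the two-way contract: the request rate agreed to.
    max_requests_per_sec: float

    def violations(self, samples_ms: List[float]) -> Dict[int, float]:
        """Return the percentiles whose observed latency exceeds the target."""
        observed = {p: percentile(samples_ms, p) for p in self.targets_ms}
        return {p: v for p, v in observed.items() if v > self.targets_ms[p]}

    def may_throttle(self, requests_per_sec: float) -> bool:
        """The service may throttle callers who exceed the agreed rate."""
        return requests_per_sec > self.max_requests_per_sec
```

With targets at, say, the 50th and 90th percentiles, a service can be healthy on average and still be flagged for a slow tail, which a single number would hide.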

The remainder of the talk was concerned with Dynamo, which had previously been known as HASS due to constraints in respect of an upcoming unreleased paper (titled "Dynamo: Amazon’s Highly Available Key-Value Store", to be presented at SOSP 2007; my notes say it will be released on August 9th). Dynamo embodies much of what was talked about above, achieving its functional and non-functional targets with a mixture of:

  • Sloppy quorum and hinted handoff (Werner’s own terms)
  • Vector clocks for versioning and consistency (exposed to the client application, which is expected to define the model for merges etc.)
  • Consistent hashing and other p2p techniques for scalability (I’d recommend examination of examples such as Chord or Bamboo)
  • Anti-entropy using Merkle Trees
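
As an illustration of the vector clock item in the list above, here is a generic textbook-style sketch. It is not Dynamo’s implementation, and the function names are my own. Each node keeps a counter, a version descends from another if it has seen at least as much from every node, and two versions that do not descend from each other are concurrent, which is the case handed back to the client to merge.

```python
from typing import Dict

VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if `a` has seen everything `b` has (equal to or newer than b)."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify the relationship between two versions."""
    a_ge, b_ge = descends(a, b), descends(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a_newer"
    if b_ge:
        return "b_newer"
    return "concurrent"  # neither saw the other: the client must merge


def merge_clocks(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, used to stamp a value the client has reconciled."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}
```

For example, if two replicas each apply a different write on top of the same version, `compare` reports the results as concurrent; once the client has reconciled the two values, `merge_clocks` yields a clock that descends from both.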

Update: The paper is now available


Comments 2 Comments »