Archive for the “Enterprise Systems” Category

When we write programs, one of the things we seek to do is encapsulate our data so that we can manage our dependencies and keep our code clean. Most languages, OO or otherwise, provide mechanisms to support this way of working.

The thing about the average database is that it doesn’t really encourage similar behaviour. It is all too tempting (and easy) to just allow everyone to access everything. Whilst we confine ourselves to a single application using the database, the problem is to some extent contained, but often what we actually do is allow multiple applications access to the same database. The exact way in which this is done varies:

  1. Sometimes we bundle all our middle tier code together even though it has separate roles and responsibilities and integrate all of it via a single database.
  2. Sometimes we have multiple applications each running in a different process.

With each application we put on top of the database the problem gets worse: the number of invisible dependencies grows, tying unrelated pieces of code together by virtue of their access to a shared schema.

What’s happening is that we’re sharing too much intimate knowledge across our system, something we’re all taught to fear. The solution is, as always, to prevent direct access to this intimate knowledge by interposing layers of abstraction. One way to do this is to require that access to data be wrapped up behind an interface. Historically we’ve done this by having a system own the database and expose interfaces that other systems can use to get the data.

Unfortunately there is a well-known issue with this approach: the level of granularity is wrong, and these additional integration interfaces rapidly balloon into complex beasts. What we need is a database-wrapping entity that has a finer level of granularity than an entire system. Then the integration interfaces will be simpler because there will naturally be a less complex schema underpinning this more limited functionality.

What are we talking about? Services. We end up with a system of lots of discrete services each wrapping up their own data storage.
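As a rough illustration of what "a service wrapping up its own data storage" can look like, here is a minimal sketch. The names (`CustomerService`, `Customer`, `register`, `lookup`, `change_email`) are hypothetical, and an in-memory dict stands in for whatever store the service might really use; the point is only that callers see a narrow interface and never the schema behind it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Customer:
    """The only shape of customer data that callers ever see."""
    customer_id: str
    email: str


class CustomerService:
    """Owns customer data; no other component touches its storage."""

    def __init__(self) -> None:
        # Private storage. A dict is an assumption for the sketch; it
        # could be an RDBMS, a key-value store or flat files, and the
        # callers of this class would not be able to tell.
        self._rows: Dict[str, Dict[str, str]] = {}

    def register(self, customer_id: str, email: str) -> Customer:
        if customer_id in self._rows:
            raise ValueError(f"customer {customer_id} already exists")
        self._rows[customer_id] = {"email": email}
        return Customer(customer_id, email)

    def lookup(self, customer_id: str) -> Optional[Customer]:
        row = self._rows.get(customer_id)
        if row is None:
            return None
        return Customer(customer_id, row["email"])

    def change_email(self, customer_id: str, email: str) -> Customer:
        if customer_id not in self._rows:
            raise KeyError(customer_id)
        self._rows[customer_id]["email"] = email
        return Customer(customer_id, email)
```

Because the internal layout is reachable only through these three methods, it can change (or move to a different kind of store) without any other application needing to know.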

There are other benefits to this approach:

  1. Each service can utilize the most appropriate storage option for its contained data whilst having zero impact on other services that might have different needs.
  2. Each service is an independent entity that can be managed (monitored, deployed etc) separately.
  3. Centralized access patterns are more easily broken down which is useful in cases where we deploy across multiple data-centres.

Who would do such a thing?

Comments 3 Comments »

Well maybe – I certainly wouldn’t run Windows on a Thumper as they have over at Johns Hopkins but the tests are interesting for a couple of reasons:

  1. They represent what might well have been one of Jim Gray’s last pieces of work.
  2. There’s a headline figure of 9Ktps for an approximation of a large scale bank account processing system.

Comments Comments Off

“Make as much stateless as possible” is the mantra, but I wonder if we’re being a little over-zealous in applying it. Consider this note in Fielding’s REST thesis:

“Like most architectural choices, the stateless constraint reflects a design trade-off. The disadvantage is that it may decrease network performance by increasing the repetitive data (per-interaction overhead) sent in a series of requests, since that data cannot be left on the server in a shared context. In addition, placing the application state on the client-side reduces the server’s control over consistent application behavior, since the application becomes dependent on the correct implementation of semantics across multiple client versions.”

Thus while statelessness is often claimed to achieve scalability, in certain applications that may not be the case due to the resultant load on the network.

Our pursuit of statelessness leads us to behaviours such as making a single entity responsible for the maintenance of all state. Often it’s a database that becomes a black hole sucking up hardware, network bandwidth, admin time and endless tuning effort. It also becomes the focus of our reliability concerns, with a need for clustering, RAID arrays etc. Stand around long enough and you’ll hear terrified utterings from staff such as “if we ever lose the database….”

Making some single thing responsible for all these aspects of our system is asking for trouble. Having all these heavyweight concerns squeezing down on a single element ultimately leads to breakage.

History shows that we aren’t entirely happy with this “single point of responsibility for all state”. We have cookies in browsers, local storage in browsers, thin clients that rely on servers to store all state and so on.

Perhaps we’re ignoring an underlying message: maintenance of state is a shared responsibility for a system. We should seek to place that responsibility in appropriate places at appropriate times, be much more aware of responsibility boundaries and, when it’s appropriate, share that responsibility amongst components.

Generally we consider TCP to be responsible for ensuring that state makes it to the other end of the connection. We hand some data to the TCP layer and expect that it will ensure the data reaches the recipient. But is this true? What happens if we suffer a power outage before TCP transmits the data? When the machine restarts, is TCP going to restart and resend all that unsent data? Clearly not. Whoever delegated responsibility to TCP for this data will now need to take steps to recover the situation.
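One way a sender might keep its share of that responsibility is to journal each message locally before handing it to the transport, and forget it only once the receiver has acknowledged it at the application level. The sketch below is only an illustration under stated assumptions: the `Outbox` class, its method names, the JSON journal file and the `demo` function are all hypothetical, and the `send` callback stands in for whatever actually writes to the socket.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List, Tuple


class Outbox:
    """Keeps messages on local disk until the receiver acknowledges them."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._pending: Dict[str, str] = {}
        if os.path.exists(path):
            # After a restart, recover whatever was still unacknowledged.
            with open(path, "r", encoding="utf-8") as f:
                self._pending = json.load(f)

    def _persist(self) -> None:
        # Write a temp file and rename it over the journal so a crash
        # never leaves a half-written journal behind.
        tmp = self._path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self._pending, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self._path)

    def enqueue(self, msg_id: str, payload: str) -> None:
        # Record the message durably BEFORE it is handed to the transport.
        self._pending[msg_id] = payload
        self._persist()

    def acknowledge(self, msg_id: str) -> None:
        # Only an application-level ack lets us forget the message.
        if self._pending.pop(msg_id, None) is not None:
            self._persist()

    def resend_pending(self, send: Callable[[str, str], None]) -> int:
        # Called on startup: push everything unacknowledged again.
        for msg_id, payload in list(self._pending.items()):
            send(msg_id, payload)
        return len(self._pending)


def demo() -> List[Tuple[str, str]]:
    """Simulate a restart with one message still unacknowledged."""
    path = os.path.join(tempfile.mkdtemp(), "outbox.json")
    box = Outbox(path)
    box.enqueue("m1", "hello")
    box.enqueue("m2", "world")
    box.acknowledge("m1")      # the receiver confirmed m1 only
    restarted = Outbox(path)   # "power outage", then a fresh process
    resent: List[Tuple[str, str]] = []
    restarted.resend_pending(lambda i, p: resent.append((i, p)))
    return resent
```

Note that this shifts part of the burden to the receiver too: a message may be resent after it was in fact delivered, so the receiver has to tolerate duplicates, for instance by remembering message ids it has already processed.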

What about a message queue? Typically we place some data in the queue and demand that it absolutely must deliver that message and not lose it in the meantime. That’s an awful lot of responsibility for a single component to carry! As an aside we’re also potentially making that queue a performance bottleneck of the future.

Then there’s the Web which in many cases puts responsibility on the client for maintenance of state. This is achieved through retries, restoring backups, re-entering details etc. Notably, this is the case even if the client “fails” e.g. your home router goes down or the PC overheats. There’s a certain amount of illusion here too where we believe the responsibility for state maintenance has been placed elsewhere e.g. Flickr. Ideally they don’t want to lose all your precious pictures but if they do, who will have to restore all that information?

I think it’s interesting that placing such responsibility with some single entity is perceived as the easy solution but it has a lot of hidden costs like redundant hardware, clustering, strict data-centre environment control, backups etc.

Spreading responsibility might ultimately be easier and fit with our desire for utility computing, but it’s not commonplace and thus we’re lacking well-documented patterns, software components etc. We are seeing some examples, however. I would speculate that S3’s API is the way it is precisely because it relies on spreading responsibility for state across a co-operative shared-nothing system rather than placing it all in a single shared-everything cluster.

Comments 1 Comment »

This is interesting:

W3C members have issues surrounding enterprise computing. Not just distributed computing, but the typical concerns around transactions, scalability, high availability, and so on. Not only that they’re also dealing with 15, 20, and 25 year old technology that works just fine but is getting more and more difficult to maintain and doesn’t play well with others. Finally, there’s the issue of interconnecting these systems to each other, to partners, and to the Web.

Once 15+ year old technology is getting difficult to maintain and doesn’t play nice with others, surely it’s time to consider throwing it out? Surely keeping it just forces all the new stuff we do to get bogged down and stifled supporting the old stuff?

I’m just wondering if the logical destination for this will be that we can’t make progress any more because we get completely bogged down in our legacy. I certainly appreciate that the companies holding on to so much legacy consider it important that the issue is tackled, but should the rest of the world, which is perhaps more flexible, have to care? Should we constantly be bending all the good new stuff we do to cope with all this old crud?

If you buy a car there’s a limit to its useful life. You can keep running it after that, but maintenance bills get higher and higher to the point where you give up and just buy a new car because it’s cheaper.

Note also that the owner of this “legacy” car is the one suffering the high maintenance cost, not the manufacturer or the mechanic. Admittedly, the mechanic and the manufacturer must hold onto tools and parts but they can at a point of their choosing drop support. Should it not also be the case that companies holding onto legacy systems suffer similar high maintenance costs? Should these companies be able to displace those costs onto others in the form of horrendous complexity and endless legacy considerations in new products, specs etc?

Clearly there is no right answer for this dilemma but it might be time for us all to sit down and consider the whole cost of holding onto legacy which goes way beyond the simple increasing cost of maintenance.

If nothing else it strikes me as interesting that whilst IT pursues the concept of endless life for systems, nature provides many examples of limited lifespan as a useful tool for progress and sustainability.

Comments Comments Off

…..has been worshipped for a long time but there are various barbarian enclaves that are not ready to kneel before him.

Enterprises certainly want the money saving opportunities of utility computing options such as EC2 but there is evidence to suggest that RDBMS’en aren’t cut out for this role.

And recent statements on TSS suggest some people are at least beginning to think that we are mis-using RDBMS’en, ignoring their lack of suitability to a task because they’re an easy option.

Time to seek a new religion?

Comments 2 Comments »