Prioritisation is a solution that can be used in a few situations:
- Messaging – where some class of messages needs to be processed before one or more other classes.
- Job execution – where the results of some set of jobs need to be available before others.
- Levelling – where satisfying peak demand would require lots of hardware that in other periods would be significantly under-utilised.
It’s a very useful pattern but there are a few dark corners to think about:
- Even low priority items have some importance, otherwise they wouldn’t exist at all. If too many high priority items pass through the system, there is a significant risk that low priority items will not be processed within an acceptable time period.
- In the extreme, a sustained flood of high priority items means low priority items might not get processed at all, leading to huge backlogs that take an age to clear.
- If high priority items begin taking a long time to process, low priority items are delayed, resulting in a similar backlog.
In essence, a certain workload mix can mean that one must wait indefinitely for low priority items to be processed, and that is rarely acceptable. Making prioritisation work effectively means ensuring there is sufficient capacity to process each class of work within its acceptable time period.
For some applications there is a convenient “quiet” period overnight in which low priority items can be cleared out of the system because there’s a dearth of high priority items to process. In other cases processing of priority classes must be interleaved, e.g. process 100 high priority items, then 5 low priority items, and repeat. Alternatively one can dedicate pools of resource of varying size (partitioning) to processing each priority class, with each pool scaled according to that class’s timeliness requirements.
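The interleaving approach can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `InterleavedScheduler` and the parameters `high_burst` and `low_burst` are invented for the example, and a real system would also need to think about persistence, concurrency and measuring how long each class actually waits.

```python
from collections import deque


class InterleavedScheduler:
    """Hypothetical sketch: serve up to `high_burst` high priority items,
    then up to `low_burst` low priority items, and repeat."""

    def __init__(self, high_burst=100, low_burst=5):
        self.high = deque()
        self.low = deque()
        self.high_burst = high_burst
        self.low_burst = low_burst
        self._high_run = 0  # high items served in the current cycle
        self._low_run = 0   # low items served in the current cycle

    def submit(self, item, high_priority=False):
        (self.high if high_priority else self.low).append(item)

    def next_item(self):
        """Return the next item to process, or None if nothing is queued."""
        high_turn = self._high_run < self.high_burst
        # Serve high priority work while it is its turn, or whenever
        # there is no low priority work waiting anyway.
        if self.high and (high_turn or not self.low):
            self._high_run += 1
            return self.high.popleft()
        if self.low:
            self._low_run += 1
            item = self.low.popleft()
            # Start a new cycle once the low burst is used up or drained.
            if self._low_run >= self.low_burst or not self.low:
                self._high_run = 0
                self._low_run = 0
            return item
        return None


scheduler = InterleavedScheduler(high_burst=2, low_burst=1)
for i in range(5):
    scheduler.submit(f"h{i}", high_priority=True)
for i in range(2):
    scheduler.submit(f"l{i}")

order = []
while (item := scheduler.next_item()) is not None:
    order.append(item)
print(order)  # ['h0', 'h1', 'l0', 'h2', 'h3', 'l1', 'h4']
```

The ratio between the two bursts is what guarantees low priority items some share of capacity even under a constant stream of high priority work. It does nothing to help if total capacity is insufficient.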
Some technical staff naively use priority to solve a throughput problem where capacity is insufficient to cope with all work in parallel. This can appear to work for a while if there are lulls in demand, as mentioned above, but ultimately, as workload increases, such an approach will fail unless care is taken in profiling the workload and ensuring there is sufficient capacity to satisfy all priorities.
No Comments »
A programming language is a tool. These days, in fact, it’s more of a toolbox, as there’s an entire ecosystem associated with a language that makes it more or less suitable for a particular discipline (e.g. website development). There are many other tools beyond languages of course: CORBA, J2EE, SOAP, AJAX, Visual Studio .NET, Emacs etc.
The obsession we have with our tools is verging on the sexual. We worship them, we endlessly compare them, we get excited about this or that extension. It drives much conversation in corridors and at conferences but it’s largely worthless because there’s no context.
Does a carpenter get excited about a saw, a power-drill or the latest hammer? Not really, because long ago they realised that whilst one must know how to make effective use of a tool and how to maintain it whilst it goes unused, what really matters is figuring out what the job itself actually is. This is the context that dictates which tools are appropriate.
We speculate about concurrency, we speculate about building websites, we speculate about writing this or that application but it’s all pointless until we actually set about a specific task with intent.
The smart techie has a good grasp of a wide range of tools, knows when to use them and ensures they have meaningful escape plans (that may never be implemented) in case the day comes when those tools turn out to be the wrong choice or need replacing. Most of all a smart techie puts thinking and planning well before worrying about tools.
In simple terms, we need to stop playing with our tools and focus on the real challenge, tackling real-world problems with elegant, simple, well thought out, maintainable, cost-effective solutions. Tools help you build such things but they aren’t the essence of it.
4 Comments »
Building a concurrent system ultimately boils down to:
- Partitioning the data into chunks that can be separately acted upon
- Applying computations against those chunks to produce results
The smaller or more fine-grained the chunks, the more concurrent activity is possible. In theory, the closer one can get to one chunk per core the better. In reality it’s rare that one needs to compute across all chunks simultaneously (it depends on throughput and the size of each calculation), so a core can be assigned many chunks and, at any moment in time, dispatch operations against one of them.
There are many solutions for building concurrent systems but those that provide some abstraction which makes request routing easy to implement are likely to work best as it makes re-balancing of computation easier. One shouldn’t immediately assume that message passing is the answer as there are many ways to achieve routing (e.g. via DNS).
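To make the link between partitioning, routing and re-balancing concrete, here is a minimal sketch. It assumes string keys, a fixed number of chunks and a simple in-memory routing table; `ChunkRouter` and the worker names are made up for the illustration, and the same mapping could equally be realised through DNS or a message broker.

```python
import zlib


class ChunkRouter:
    """Hypothetical sketch: keys hash to chunks, chunks map to workers."""

    def __init__(self, num_chunks, workers):
        self.num_chunks = num_chunks
        # Initial assignment: spread chunks across workers round-robin.
        self.assignment = {
            chunk: workers[chunk % len(workers)] for chunk in range(num_chunks)
        }

    def chunk_for(self, key):
        # crc32 is stable across processes, unlike Python's built-in hash().
        return zlib.crc32(key.encode("utf-8")) % self.num_chunks

    def worker_for(self, key):
        return self.assignment[self.chunk_for(key)]

    def move_chunk(self, chunk, worker):
        """Re-balance by changing one routing entry; keys never re-hash."""
        self.assignment[chunk] = worker


router = ChunkRouter(num_chunks=8, workers=["worker-a", "worker-b"])
chunk = router.chunk_for("order-42")
print(chunk, router.worker_for("order-42"))

# Hand that chunk to a new worker without touching any other chunk.
router.move_chunk(chunk, "worker-c")
print(router.worker_for("order-42"))  # worker-c
```

Because there are more chunks than workers, moving load is a matter of editing the table rather than re-partitioning the data.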
Any solution represents a transparency tradeoff. If, for example, routing is hidden inside the solution, this can make it easy to get something up and running but we might find it difficult to transition from a one-box to a multi-box deployment. There are many tradeoffs to be made, and for any case where control is given to the developer/architect it’s likely there will be libraries/frameworks to ease the initial implementation burden; programming languages alone will not be enough (Scala makes such a differentiation quite difficult given its language extension capabilities).
One aspect discussed less often is the difference between processing on a set of cores all in one box versus processing across a set of cores on many boxes. The latter brings the following challenges all related to the fallacies of distributed computing:
- Cores are more likely to become inaccessible
- The latency of an operation can become substantially more variable
- Any centralised functions (e.g. job scheduler or watchdogs) are more vulnerable to becoming isolated from the resources they manage such that processing ceases.
The latency factor is particularly challenging as few concurrent approaches make it sufficiently explicit that developers/architects are encouraged to be appropriately mindful.
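One way to make latency explicit is to force every potentially remote operation to carry a time budget and to return a result that the caller must inspect. The following is a rough sketch using only the Python standard library; `call_with_budget` and the two stand-in functions are hypothetical, and a thread pool is used here purely to simulate a remote call.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def call_with_budget(executor, fn, budget_seconds, *args):
    """Run fn(*args) on the executor.

    Returns (True, result) if it finishes within the budget,
    otherwise (False, None) so the caller has to handle slowness.
    """
    future = executor.submit(fn, *args)
    try:
        return True, future.result(timeout=budget_seconds)
    except FutureTimeout:
        future.cancel()  # best effort only; a running call is not interrupted
        return False, None


def fast_square(x):
    return x * x


def slow_square(x):
    time.sleep(0.5)  # stands in for a slow or unreachable remote core
    return x * x


with ThreadPoolExecutor(max_workers=2) as pool:
    print(call_with_budget(pool, fast_square, 1.0, 2))   # (True, 4)
    print(call_with_budget(pool, slow_square, 0.05, 2))  # (False, None)
```

The point is less the mechanism than the shape of the API: the budget is a required argument and a timeout is an ordinary outcome rather than an afterthought.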
Thus far, as has been the case throughout our history, the solutions are polarising into those that work within the confines of a single box and those that work across multiple boxes with the emphasis on the former. I fully expect developers and architects to fall into the old trap of using a single-box solution to solve a multi-box problem with all the associated issues. Of the solutions that work across multiple boxes, very few account fully for the impact of the network.
Comments Off
As soon as we give something a name, it becomes open to abuse and misuse.
Vendors can claim they are doing it and support it, developers can claim they do it, use it or implement it. There are a bunch of ready examples: Agile, XP, SOA and REST. Naming something makes it easy to ignore or forget its underpinnings, the elements that deliver value.
As a martial artist, I’m familiar with this pattern of behaviour: various people claim to practise and teach authentic Silat, Karate, Kung Fu, Escrima and so on. Inevitably some of them are exposed as pretenders. One of the more notable martial artists, Bruce Lee, was sufficiently concerned about this that he gave serious consideration to leaving his approach to martial art (Jeet Kune Do) unnamed*.
Is it worth naming things? Might we be better served by making our knowledge, approaches and philosophies visible, without naming them, for others to adopt or not as they see fit? Would it reduce the number of valueless certifications, buzzword CVs and endless wars over which way is the way and who’s doing it right?
* Jeet Kune Do (1997) ‘Actually, I never wanted to give a name to the kind of Chinese gung fu that I have invented, but for convenience sake, I still call it “Jeet Kune Do”. However, I want to emphasize that there is no distinction between jeet kune do and any other kind of gung fu, for I strongly object to formality, and to the idea of distinction of branches.’
3 Comments »
I’ve spent a significant amount of my career helping to unpick messed up architectures and wondering how they ever came to be. Certainly it can’t be because they’re appealing to work with:
- Making changes becomes increasingly expensive – make one small change and it spiders into changes across many other areas and gets into corners one least expects.
- Replacing components of the system (because, for example, they’re no longer supported, don’t perform adequately or can’t scale) requires significant reverse engineering to understand dependencies etc.
- It only takes one piece of the system failing to bring everything to its knees.
- Isolating the root cause of a bug takes significant amounts of effort because it’s difficult to quickly eliminate large chunks of the system.
More often than not it’s believed (I’m guilty) that these systems come into being through incompetence or indiscipline on the part of the developers involved, but I think there may be another contributory factor: much of the advice on design and architecture is couched in terms of design from scratch, and there’s less guidance on working with an existing architecture.
The result is that when developers start out building a system they have a lot of advice they can apply but as it grows, it becomes more difficult to apply the advice and discern what changes are appropriate, so the architecture unravels. Is there a way to avoid this unravelling? I believe there is and it’s derived from the process for fixing up an errant architecture.
These architectures have smells equivalent to the code-level examples Fowler discusses in his book on refactoring such as:
- Some area of the system is too tightly coupled, making changes harder.
- Some part of the system contains an assumption that there is only one resource of some type (e.g. a database) limiting scaling.
- Many components of the system are reliant upon one key component being constantly available such that if it fails, nothing works.
Having identified these smells we need to perform appropriate cleanup which, for the list of examples above, might include:
- Placing additional APIs (interfaces) within the tightly coupled area of the system to reduce shared implementation knowledge and create well-bounded islands of data.
- Introducing a resource discovery pattern to abstract away the assumption of a single resource at a single address.
- Introducing concepts such as acceptable staleness of data (which allows caching for a period of time), eventual consistency (which supports making updates and resolving the outcome at a later date) and asynchronous operations.
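As an illustration of the last item, acceptable staleness can be expressed as a small cache that sits in front of the key component. This is a sketch under assumptions rather than a recommended design: `StaleTolerantCache`, its `loader` callback and the choice to serve an expired value when the loader fails are all invented for the example.

```python
import time


class StaleTolerantCache:
    """Hypothetical sketch: cache values for `max_age_seconds` and fall back
    to the last known value if the underlying component is unavailable."""

    def __init__(self, loader, max_age_seconds, clock=time.monotonic):
        self._loader = loader          # callable that fetches a fresh value
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so tests can control time
        self._entries = {}             # key -> (value, loaded_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] <= self._max_age:
            return entry[0]            # fresh enough, no call to the component
        try:
            value = self._loader(key)
        except Exception:
            if entry is not None:
                return entry[0]        # component is down: serve stale data
            raise                      # nothing cached, so surface the failure
        self._entries[key] = (value, now)
        return value


def load_price(symbol):
    # Stand-in for a call to the key component (e.g. a database).
    return {"ABC": 101.5}[symbol]


prices = StaleTolerantCache(load_price, max_age_seconds=30)
print(prices.get("ABC"))  # 101.5, fetched from the loader
print(prices.get("ABC"))  # 101.5, served from the cache
```

Whether serving expired data during an outage is acceptable is a business decision. The useful part is that the staleness limit becomes an explicit, adjustable number.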
It’s important to realise that in any substantial system we will be unable to eradicate a smell completely in a single update because it’s too risky. There will be many places in the code we might forget to patch up, a high likelihood we’ll miss something in testing, low probability we’ll get API designs exactly right etc. We must gradually introduce modifications over a period of time (months or even years) rather than perform significant rewrites. This isn’t as bad as it seems because no architecture is perfect for very long once it’s exposed to users. It also suggests that perhaps we need to focus on documenting techniques for gradual evolution of an architecture.
If we were to get better at spotting these architectural smells early (slight odour as opposed to horrific stench) and working to address them sooner rather than later, it might be possible to avoid having a system’s architecture unravel, leading to something more sustainable.
Updated: to include additional commentary on APIs and perfection.
Comments Off