A JISC-funded Managing Research Data project

Posts tagged resiliency

One of the cool things about Orbital from my point of view is that I’m not just responsible for putting together a bit of software that runs on a web server, but also for designing the reference platform which you run those bits of software on.

At this point I could digress into discussing exactly what boxes we’re running Orbital on top of, but that doesn’t really matter. What is more interesting is how the various servers click together into building the complete Orbital platform, and how those servers can help us scale and provide a resilient service.

You’re probably used to thinking of most web applications like this:

A 'traditional' server model

It’s simple. You install what you need to run your application on a server, hook it up to the internet, and off you go. Everything is contained on a single box which gives you epic simplicity benefits and is often a lot more cost efficient, but you lose scalability. If one day your application has a traffic spike your Serv-O-Matic 100 may not be able to cope. The solution is to make your server bigger!

Throw more power at it!

This is all well and good, until you start to factor in resiliency as well. Your Serv-O-Matic 500 may be sporting 16 processor cores and 96GB of RAM, but it’s only doing it’s job until the OS decides it’s going to fall over, or your network card gives up, or somebody knocks the Big Red Switch.

(more…)

With the Orbital project we’re looking at taking a leap into the brave new world (well, at least as far as university projects go) of cloud hosting. Now, I hate the word “cloud” when used to describe most services because people bander it about like it’s some magical world-fixing technology when in actual fact all they mean is “it’s on the internet”. We have “cloud” services which are fundamentally no different from doing the same thing on a server kept on a desk somewhere; but Orbital isn’t going to be like that.

I hope.

Instead, Orbital will be a true ‘cloud’ service in that it’s a resource which end users can tap into with no care at all for the underlying technologies. It’ll scale up and down with demand, extending both processing power and storage space as needed. Should one of our servers fail for any reason it won’t be met with a week of downtime whilst we rebuild things, but instead a seamless transition of work to one of the redundant, load balanced alternatives. If a process stops working then instead of the entire system crashing down it’ll adapt, queueing tasks until things are restored. Alongside this, the use of common standards is something that is essential to development. RESTful APIs follow well understood principles for interacting with data, and authentication using OAuth (the same authentication method used by Twitter, Facebook, Google and Microsoft) is core to how things behave.

Whilst the Orbital application itself is built to run in a cloudy manner using these loosely coupled methods and Rambo architecture, we’re also going to be hosting the thing in the cloud. This helps us with a few things including the aforementioned scalability, improved resiliency and the ability to properly analyse how the cloud works for higher education.

There’s also an unexpected benefit of this cloudy approach to Orbital: we gain the ability to pin a real-world cost on the storage of research data since we are quite literally being charged by the GB. At the moment researchers tend to treat storage as a one-off cost – for example buying a pile of hard disks – with less understanding of what it actually costs to keep them spinning. Since Orbital will know more about the intricacies of the stored data than the researchers we will be able (for the first time) to offer a number both in terms of how much it is costing to store data and also the estimated carbon impact.

Both these numbers are something that we want to be able to give researchers to help them understand that hanging on to research data has a cost, but also that it’s probably more efficient to hang on to it in a central, cloud-based platform. Of course, we also want to give people a clean exit strategy so we’re also going to be looking at ways of easily creating ‘hard’ copies for offline, non-cloud storage whilst still maintaining a virtual presence for the purposes of referencing and metadata.