A JISC-funded Managing Research Data project

Posts tagged Twitter

This is a post about our first release of Orbital.

About a month ago, Dr. Tom Duckett, Reader in The Department of Computing and Informatics approached the Orbital project because he urgently wanted to publish around 20GB of data for Long-term mobile robot operations. That afternoon, we gave Tom and Feras Dayoub, his Research Assistant, space on one of our servers and they uploaded a bunch of HTML pages and the zipped up data. We minted a proxy URL for them and advised them on an appropriate data license to choose.   We also set up Google Analytics, so they could see what interest in his data there was.

Job done. For the time being.

What Tom really wanted was to be able to email a link to his data to a robotics mailing list and tell an international community of likeminded researchers and manufacturers that the data was available to use. He says that long-term datasets for mobile robots are quite rare in his community, so there was a good chance people would be interested in them. He also wanted to be able to demonstrate his work when writing an EU bid. There will be a follow up blog post about what impact this has had on Tom’s research.

That afternoon got us thinking: What is the minimal set of functions that a researcher like Tom requires of a Research Data Management tool?

Tom wanted access (sign in) to a server (hosting) where he could upload his data (storage) and describe it so that other people could understand and download it (publish) under an appropriate license. The URL pointing to the data should be persistent, even if the data itself is migrated from one system to another. The impact (analytics) of the data should also be measurable.

Tom’s chance intervention in our project made us focus on Orbital v0.1 as the ‘minimum viable product‘ for researchers who need to publish open data. We thought his requirements were a great opportunity to release something early and start getting direct user feedback on our product. We decided to set a release date for Orbital v0.1 a month ahead and aim to deliver everything that Tom asked of us in this first release.

A Minimum Viable Product has just those features that allow the product to be deployed, and no more.

Today, we released Orbital v0.1 and it does everything described above. It’s an alpha release, but we’ve been testing it like crazy, we also had Feras test it and we’ve been pushing code through Jenkins since the beginning of the project so we know it passes our QA checks and we think it’s stable enough for use. From this point forward, Orbital and the URIs it mints will persist, too.

From today, a researcher at the University of Lincoln can sign in to Orbital, create and describe a project, upload their data to the project, choose a license for the data and add a Google Analytics code to measure project analytics (we’re also tracking each button click to better understand how people use Orbital). The data is published at a id.lincoln.ac.uk URI, which will persist indefinitely. At this stage, until we’ve got an approved business case for scaling it up and out to all academics, we’ll be limiting uploads on a case-by-case basis. You can view and request what other features we develop for Orbital on UserVoice, or in more detail on our project tracker. We’ve also written a basic development roadmap.

For developers, here are the basic technical details. You might also want to trawl through our implementation plan and the collected blog posts at the bottom of the plan.

Orbital is written in PHP using the CodeIgniter development framework.  It’s split into two main pieces of functionality. Orbital Core (database and APIs) is currently hosted on a Linux box on Rackspace’s cloud. Orbital Manager (the User Interface) is likewise hosted on Rackspace. A user signs in to Orbital Manager via OAuth 2.0 using their university credentials. Orbital Manager is using Twitter’s Bootstrap framework. The project metadata is stored in a MySQL database. Files are uploaded to Rackspace’s cloud files storage using Andrew Valums’s AJAX Uploader. APIs are exposed using Phil Sturgeon’s CodeIgniter REST server.

Orbital is licensed under the GNU Affero GPL 3 license and you can download, fork it and create pull requests on Github:

Orbital Core

Orbital Manager

New contributors to Orbital will be ritually applauded each weekday morning :-) Thanks.

With the Orbital project we’re looking at taking a leap into the brave new world (well, at least as far as university projects go) of cloud hosting. Now, I hate the word “cloud” when used to describe most services because people bander it about like it’s some magical world-fixing technology when in actual fact all they mean is “it’s on the internet”. We have “cloud” services which are fundamentally no different from doing the same thing on a server kept on a desk somewhere; but Orbital isn’t going to be like that.

I hope.

Instead, Orbital will be a true ‘cloud’ service in that it’s a resource which end users can tap into with no care at all for the underlying technologies. It’ll scale up and down with demand, extending both processing power and storage space as needed. Should one of our servers fail for any reason it won’t be met with a week of downtime whilst we rebuild things, but instead a seamless transition of work to one of the redundant, load balanced alternatives. If a process stops working then instead of the entire system crashing down it’ll adapt, queueing tasks until things are restored. Alongside this, the use of common standards is something that is essential to development. RESTful APIs follow well understood principles for interacting with data, and authentication using OAuth (the same authentication method used by Twitter, Facebook, Google and Microsoft) is core to how things behave.

Whilst the Orbital application itself is built to run in a cloudy manner using these loosely coupled methods and Rambo architecture, we’re also going to be hosting the thing in the cloud. This helps us with a few things including the aforementioned scalability, improved resiliency and the ability to properly analyse how the cloud works for higher education.

There’s also an unexpected benefit of this cloudy approach to Orbital: we gain the ability to pin a real-world cost on the storage of research data since we are quite literally being charged by the GB. At the moment researchers tend to treat storage as a one-off cost – for example buying a pile of hard disks – with less understanding of what it actually costs to keep them spinning. Since Orbital will know more about the intricacies of the stored data than the researchers we will be able (for the first time) to offer a number both in terms of how much it is costing to store data and also the estimated carbon impact.

Both these numbers are something that we want to be able to give researchers to help them understand that hanging on to research data has a cost, but also that it’s probably more efficient to hang on to it in a central, cloud-based platform. Of course, we also want to give people a clean exit strategy so we’re also going to be looking at ways of easily creating ‘hard’ copies for offline, non-cloud storage whilst still maintaining a virtual presence for the purposes of referencing and metadata.

Clare CollegeYesterday I was at Clare College, University of Cambridge for a meeting organised by USTLG, the University Science & Technology Librarians Group. The group—open to any librarians involved with engineering, science or technology in UK universities—has meetings once or twice a year. The theme of yesterday’s meeting (free to attend, thanks to sponsorship from the IEEE) was data management, with an implied focus on research data.

The meeting consisted of a series of presentations (plus a fantastic lunchtime diversion, below) with plenty of time for networking – there were about 40 people there, all with an interest in research data management – though interestingly, a show of hands suggested very few people were actively engaged in looking after their own institution’s researchers’ data.

As usual, this blog post has been partially reconstructed from the Twitter stream (hashtag #ustlg).

First up, Laura Molloy, substituting for Joy Davidson of the Digital Curation Centre (DCC), on a project called the Data Management Skills Support Initiative (DaMSSI), looking at the [shades of information literacy] skills needed by different people involved in the research data curation process. “DaMSSI aims to facilitate the use of tools like Vitae’s Researcher Development Framework (RDF) and the Seven Pillars of Information Literacy model” developed by SCONUL. Key question: how do you assess the effectiveness of research data management training?

(more…)

Hi everyone! For those who don’t know me (this is my first blog post for Orbital) I’m Nick Jackson, the lead web developer for the project. I’ll be blogging about all kinds of bits and pieces, ranging from discussion of user requirements through to the rationale for features, how certain bits of Orbital fit together with the wider world, overviews of technologies that we’re using and the minutiae of implementation.

If you’re interested in this kind of stuff at an institutional level you may also be interested in following my other blog, where I discuss things I’m working on in a wider context along with the occasional opinion piece on technology in education.

For those of you who enjoy Twitter you can find me at @jacksonj04, and don’t forget that you can keep up with the latest from Orbital at #OrbitalMRD. If you’re more a face-to-face person then you can find me most days at the University of Lincoln’s Brayford campus, and I’m open to being plied with coffee.