A JISC-funded Managing Research Data project

Posts tagged requirements

On 18th February, we ran a workshop in London that focused on the use of CKAN for research data management. The Orbital project decided to use CKAN last summer and was soon followed by Bristol’s data.bris project, which is using CKAN for its discovery catalogue. Simon Price from Bristol gave a very interesting presentation of their work with CKAN, which you can read about on their project blog.

The #CKAN4RDM workshop was fully booked with 40 delegates attending – many more than we originally anticipated. It was facilitated by Simon Hodson, the Programme Manager of JISC’s Managing Research Data programme. Following presentations from Lincoln and Bristol on our respective uses of CKAN (ours was a live demo of ‘Orbital Bridge’), we spent the latter part of the morning on a requirements-gathering exercise, in which tables of around 8-10 people acted as different types of user, providing ‘stories’ (requirements) for a research data management system. The exercise was introduced in the following few slides.

This was a useful exercise regardless of the software used. After collating all 70+ stories over lunch, we returned to our user groups, where each table worked with a CKAN expert from the Open Knowledge Foundation to discuss the existing constraints for each requirement and to start developing a gap analysis that identifies the work still to be done. The output of this work can be viewed on Google Docs.

Types of users
The ‘researcher’ user group


There was quite a positive buzz about the day and general feedback suggested that delegates got a lot out of the event. You can read write-ups from the DCC, LSE and the Datapool project at Southampton.

One of the original purposes of the workshop was research for a conference paper that I (Joss) am giving at the IASSIST conference in Cologne, in May. The abstract I submitted to the conference was as follows:

This paper offers a full and critical evaluation of the open source CKAN software <http://ckan.org> for use as a Research Data Management (RDM) tool within a university environment. It presents a case study of CKAN’s implementation and use at the University of Lincoln, UK, and highlights its strengths and current weaknesses as an institutional Research Data Management tool. The author draws on his prior experience of implementing a mixed media Digital Asset Management system (DAM), Institutional Repository (IR) and institutional Web Content Management System (CMS), to offer an outline proposal for how CKAN can be used effectively for data analysis, storage and publishing in academia. This will be of interest to researchers, data librarians, and developers, who are responsible for the implementation of institutional RDM infrastructure. This paper is presented as part of the dissemination activities of the JISC-funded Orbital project <http://orbital.blogs.lincoln.ac.uk>.

As well as using the outputs of last week’s CKAN4RDM workshop, I’ll be working closely with OKF staff to ensure that the evaluation is as thorough, accurate and up-to-date as possible by the time of the conference. It will focus on version 2.0 of CKAN, which is due for release soon.

I’d also like to appeal to other JISC MRD projects to send me any existing requirements documents you have produced during the course of your project. I will use the anonymised data to enrich the requirements we gathered last week. If you have such documents, please email me.

Finally, we have set up a CKAN4RDM mailing list, which anyone is welcome to join to discuss the use of CKAN within academia. One thing is clear to me: the academic community cannot expect OKF and existing CKAN developers to meet all of our requirements for research data management. We need to contribute developer time and other resources and effort to the overall CKAN open source project, just as other public sector organisations are doing.


Orbital v0.1 was released on 16 May 2012. Every two weeks, staff working on Orbital meet with Dr Bingo Wing-Kuen Ling and Dr Chunmei Qing to discuss their research and RDM practice. Until now these meetings have been all about requirements-gathering – today was the first opportunity for some real, hands-on user testing with the alpha release of Orbital.

The notes below have been turned into tasks on the Orbital project Pivotal Tracker site.

BL = Bingo Wing-Kuen Ling.

  1. BL successfully viewed Orbital v0.1 in Internet Explorer 7 on the UoL corporate desktop and was able to sign in and grant access to the application using his UoL credentials. BL was able to create and describe a new project.
  2. BL tried to upload a file from his desktop to Orbital using IE7 and received an error (this is a known bug with Orbital in Internet Explorer). He was then unable to delete this file.
  3. Switching to Firefox, BL uploaded multiple files from his desktop to his project in Orbital (it wasn’t clear from the page that this was possible). This completed successfully, but because the file sizes were small, he did not receive any feedback on his upload.
  4. Returning to the original file upload screen, BL had to manually refresh the page to view the changes made (files uploaded). Files scheduled for processing are marked as ‘queued’; however, this status does not update automatically without refreshing.
  5. Joss Winn demonstrated the file and project metadata pages, citable URLs for files, and Google Analytics on projects. The display of file metadata needs to be more complete, and G.A. needs a better explanation and links to sources of help.
  6. The group discussed BL’s requirements around project calendars/timelines. BL wants to be able to view project events (meetings, deadlines, etc.) for each project (but not aggregated) and is not particularly concerned about notifications on activity/changes to files. The group discussed this and will explore ways of presenting timelines made up of three sorts of events (project events, activity stream, and comments) with each type of event suppressible in the timeline. A timeline overview will be displayed on the Orbital ‘front page’ once a user has logged in.
  7. BL also would like to be able to organise project and data files in all Orbital workspaces using folders/tags, and to allow bundled file download by organising files into collections.
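
The timeline idea in point 6 can be sketched roughly as follows. This is only an illustration in Python, not Orbital code: the names `EventKind`, `TimelineEvent` and `build_timeline`, and the sample dates and events, are all hypothetical. It shows one way of merging the three sorts of events into a single chronological timeline while letting each type be suppressed.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class EventKind(Enum):
    """The three sorts of timeline event discussed in the meeting."""
    PROJECT_EVENT = "project event"   # meetings, deadlines, etc.
    ACTIVITY = "activity stream"      # e.g. uploads and changes to files
    COMMENT = "comment"


@dataclass(frozen=True)
class TimelineEvent:
    when: datetime
    kind: EventKind
    summary: str


def build_timeline(events, suppressed=frozenset()):
    """Return events in chronological order, leaving out suppressed kinds."""
    visible = [e for e in events if e.kind not in suppressed]
    return sorted(visible, key=lambda e: e.when)


# Hypothetical sample data for a single project.
events = [
    TimelineEvent(datetime(2012, 6, 1, 10, 0), EventKind.PROJECT_EVENT, "Project meeting"),
    TimelineEvent(datetime(2012, 5, 30, 9, 15), EventKind.ACTIVITY, "File uploaded"),
    TimelineEvent(datetime(2012, 5, 31, 14, 0), EventKind.COMMENT, "Comment on file"),
]

# A user who is not interested in file activity hides that kind of event.
for event in build_timeline(events, suppressed={EventKind.ACTIVITY}):
    print(event.when.isoformat(), event.kind.value, event.summary)
```

Keeping the event type as an explicit field means the same list could feed both a per-project view and the overview on the ‘front page’, with suppression applied only at display time.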

You can read about Orbital v0.1 in this blog post, and about the roadmap for development and release of future versions here.

Last week Joss and I had a chat with one of our engineering researchers about the kinds of data he handles in his research. This was an incredibly useful meeting, leading to a whole bunch of notes on data types, requirements and workflows. The one I’m taking a look at today is the flow that data takes from its source, through storage and processing, and into a useful research conclusion.

The existing workflow looks something like that shown above. Source data is manually transferred (often using ‘in-the-clear’ methods) from its point of origin to local storage on a researcher’s machine, where it will reside on the hard disk until it’s used. From there the data is processed (engineers love using MATLAB, as do researchers in a lot of other science disciplines, so that’s the example here) and the results of that analysis are potentially recombined with the local storage for further work. At some point the processing will arrive at conclusions for the data, and from those an output can be drawn.
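
As a rough sketch of that flow, the stages can be modelled in a few lines of Python. Everything here is an assumption made for illustration: the function names `transfer` and `process`, the dictionary standing in for the researcher's hard disk, and the sample readings are hypothetical, and the analysis step stands in for whatever would really be done in MATLAB.

```python
from statistics import mean


def transfer(source_readings):
    """Stand-in for the manual copy from the data's point of origin."""
    return list(source_readings)


def process(samples):
    """Stand-in for the analysis step (MATLAB in the example above)."""
    return {"mean": mean(samples), "peak": max(samples)}


# Local storage on the researcher's machine, modelled as a dictionary.
local_storage = {}

# 1. Source data is transferred to local storage, where it sits until used.
local_storage["raw"] = transfer([0.8, 1.1, 0.9, 1.4])

# 2. The data is processed and the results are recombined with local
#    storage so that further work can build on them.
local_storage["summary"] = process(local_storage["raw"])

# 3. Eventually the processing arrives at a conclusion, from which an
#    output can be drawn.
conclusion = f"Peak reading was {local_storage['summary']['peak']}"
print(conclusion)
```

The point of the sketch is that every stage reads from and writes to the same local store, which is why the researcher's own machine ends up as the single place where raw data and intermediate results live.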