Research data training at the University of Lincoln

As part of the Orbital project to build a pilot Research Data Management (RDM) infrastructure at the University of Lincoln, I’m looking particularly at support, training and documentation.

We aim to start offering—early in 2013—an introductory 1-hour workshop on managing your research data, aimed at early-career researchers and postgraduate research students. In particular, we want to promote this training through three avenues:

  1. As part of the Lincoln Graduate School’s standard timetable of postgrad training;
  2. Directly, to PhD students in the School of Engineering (our pilot group);
  3. To researchers who completed our Data Asset Framework questionnaire.

The training will be supported by documentation (written and maintained through WordPress and a dedicated RDM reading list), presented through the main Orbital “bridge” site, which we’re starting to treat as a VRE.

Here’s an outline of the initial workshop. I’m meeting the Graduate School this afternoon to agree this.

“Managing your research data”

  1. Definitions, terminology and scope (what do we mean by research data?)
  2. Policies and laws affecting your data
  3. The “research data lifecycle”
  4. Data Management Planning (DMP)
  5. Practical tools for looking after your data
  6. Data publishing and citation
  7. Where to go for further help and support

Comments welcome!

Orbital Team Meeting 22nd November

Present

Nick Jackson
Melanie Bullock
Joss Winn
Bev Jones
Paul Stainthorp
Harry Newton

Apologies

Annalisa Jones

Agenda

Policy & Business Case
Training
Technical
Dissemination
Budget

Policy & Business Case

Ieuan Owen has taken over senior management of research at the University.

JW met with Lisa Mooney and IO. Agreed policy and business case should be presented as “research services”, and include case studies where these have benefitted researchers.

JW to present business case to SMT on December 17.

Policy should be presented more as collegial support than as a university mandate. Policy to be re-worded to this effect.

Presentation to SMT – focus on long tail, benefits to institution, risk and benefits, “publishing of research data helped massively” in gaining new research grant. Roles required for RDM and institutional positions.

RDM is proposed to be delivered through existing roles, except for Data Scientist. LM suggested seeking research funding for a study into how this role will fit into Lincoln.

Action: JW to circulate draft documents for business case and SMT presentation to Orbital team.

Development of policy and business case are on schedule.

Training

PS has developed an outline of a 1 hour training workshop covering RDM. Action: PS to blog this outline.

No success in getting PhD students from engineering to partake in initial sessions; will talk instead to Graduate School.

Training is not technical or software/process specific, instead focussing on high level concepts and best practice for RDM. Target is for training to be prepared by Christmas, even if sessions are not running by that time.

Existing RDM blog is to be used as authoring environment for policy and training documents and syndicated to permanent RDM site. Action: PS to talk to NJ and HN about ingestion of this content to Bridge.

When new Library Repository Officer is appointed, it was agreed they should become part of Orbital team, and involved in RDM.

Technical

Technical development continuing broadly to plan, with exception of OpenStack which has suffered various setbacks.

JW to include conceptual overview of Orbital Bridge in presentation to SMT.

A meeting with ICT to discuss Awards Management System (AMS) integration, planned for last week, was cancelled due to illness and has been rescheduled. AMS integration is highly desirable, but not essential.

NJ/PS/BJ/HN spent time on mapping concept of a ‘dataset’ within Orbital Bridge to ePrints using the SWORD deposit method.

Most of SWORD mapping is also valid for DataCite specification, which also informs sanity checking of data within Orbital Bridge.
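As a rough illustration of the kind of sanity checking the DataCite specification suggests, here is a minimal Python sketch. It assumes a dataset is held as a plain dictionary; the key names are hypothetical and are not Orbital Bridge's actual schema. The five fields checked correspond to DataCite's mandatory properties (identifier, creator, title, publisher, publication year).

```python
# Hypothetical key names standing in for DataCite's mandatory properties.
REQUIRED_FIELDS = ("identifier", "creators", "title", "publisher", "publication_year")


def missing_datacite_fields(dataset):
    """Return the mandatory fields that are absent or empty in a dataset dict."""
    return [field for field in REQUIRED_FIELDS if not dataset.get(field)]


# An illustrative draft record that has not yet been given an identifier or year.
draft = {
    "title": "Engine vibration readings",
    "creators": ["A. Researcher"],
    "publisher": "University of Lincoln",
}

print(missing_datacite_fields(draft))
```

A check like this could run before a SWORD deposit is attempted, so that incomplete records are caught inside Bridge rather than rejected downstream.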

Cost for membership of a DOI service to go into business case to SMT. DOIs should be minted at point of deposit to ePrints (‘publication’), and not at point of original dataset creation.
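The "mint at publication, not at creation" rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Dataset class, the publish function and the fake_mint stand-in are all hypothetical, and a real deployment would call whichever DOI service the University joins.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Dataset:
    """Hypothetical dataset record; it starts life without a DOI."""
    title: str
    doi: Optional[str] = None
    published: bool = False


def publish(dataset: Dataset, mint_doi: Callable[[Dataset], str]) -> Dataset:
    """Mark a dataset as published, minting a DOI only at this point."""
    if dataset.doi is None:
        dataset.doi = mint_doi(dataset)
    dataset.published = True
    return dataset


def fake_mint(dataset: Dataset) -> str:
    """Stand-in for a DOI service call; the identifier is made up for illustration."""
    return "10.5072/example-" + dataset.title.lower().replace(" ", "-")


working = Dataset("Test Data")   # created: no DOI yet
publish(working, fake_mint)      # deposited: DOI assigned now
print(working.doi)
```

Because publish only mints when no DOI exists, re-depositing the same dataset keeps its original identifier.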

AMS project IDs are key for collecting items in a project together. AMS could also handle unfunded projects, but this will require extension of the system and is outside its current scope. Orbital avoids reliance on AMS by taking its primary keys and project IDs from the Nucleus data store.

Dedicated time should be set aside for OpenStack work.

Dissemination

JW and PS went to JISC project meeting in Nottingham. JW presented on adoption of CKAN; as a result, Bristol have adopted CKAN. JISC programme manager is encouraged by seeing us using CKAN.

JW went to DCC forum in Cambridge.

Management of active data is now a high priority in the RDM field.

JW submitted abstract for conference in Cologne on providing critical evaluation of CKAN for academic use. The resultant paper, if accepted, could inform JISC on the use of CKAN in academia.

A member of the Orbital team should attend the DCC conference in Amsterdam, which includes specific themes on “what is a data scientist”. This will help inform a new role at Lincoln.

Budget

Funds remain for hardware and dissemination. It was suggested that some of this might be spent on developing a more permanent hosting solution for the Nucleus data warehousing platform.

Money is also required for a dedicated CKAN server and Orbital Bridge server, as well as possibly a dedicated database server for CKAN’s DataStore.

It is necessary to integrate Orbital with research impact analysis and recording systems. Action: MB to send NJ/HN information on impact recording systems.

A job description is being written for the post of “Research Information Management Developer” in the Library.

Presentations from the JISC MRD Programme Progress Meeting

Below are two short presentations I gave at the JISC programme meeting today. Both concern different aspects and advantages of using CKAN to manage research data. They simply link through to blog posts that have been written here which offer more detailed information. During the presentations, I gave demonstrations of using CKAN in practice.

The Development Goes On…

It’s been a while since I gave you an update on the technical side of Orbital, so here’s a lightning-fast overview of what’s going on.

CKAN

We’re still working on fine-tuning CKAN for our needs. Although we’ve made advances in theming, the datastore, HTTPS and a few other tweaks, we’re still plagued by mixed HTTP/HTTPS resources, plugins which are difficult to install, broken sign-in using our OAuth 2 SSO service, a broken search and a complete unwillingness of the Recline preview to work. I suspect a lot of this is down to unfamiliarity with the codebase and with Python in general, although some areas of CKAN do feel like they’re a collection of hacks built on top of some more hacks built on a framework which is built on another framework which is built on a collection of libraries which is built on a hack.

In short, CKAN is still in need of a lot of work before our deployment can be considered production ready (hence the “beta” tag). That said, we are already using it to store some research data and the aspects which we’ve managed to get working are working well. We’re going easy though, because CKAN 1.8 and 2.0 are apparently due to land in the next couple of months.

Orbital Bridge

Our awesomely named Orbital Bridge will serve as the central point for all RDM activity around a project, as well as helping people through the process of general project management by being a springboard to our existing policy and training documentation.

Currently Bridge’s public-facing side is in a very basic state, with only static content, but is serving as a test of our deployment toolchain. However, behind the scenes Harry has been working on ways of shuffling data around between systems using abstraction layers for aspects such as datasets, files, people and projects. Today we sat down with Paul and went through some aspects of minimal metadata which are required to construct things to an acceptable standard, which will lead to additional work both on CKAN and our existing ePrints repository to smooth the transfer of things between them.
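To give a flavour of what an abstraction layer like this might look like, here is a minimal Python sketch. It is not Bridge's actual code: the class names and the to_ckan_package function are hypothetical, and the output dictionary simply uses common CKAN package fields (name, title, notes, author, resources). The idea is that Bridge holds one neutral description of a dataset and translates it for each target system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    name: str


@dataclass
class DataFile:
    name: str
    url: str


@dataclass
class BridgeDataset:
    """System-neutral description of a dataset (hypothetical structure)."""
    slug: str
    title: str
    description: str
    creators: List[Person] = field(default_factory=list)
    files: List[DataFile] = field(default_factory=list)


def to_ckan_package(ds: BridgeDataset) -> dict:
    """Translate the neutral dataset into a CKAN-style package dictionary."""
    return {
        "name": ds.slug,
        "title": ds.title,
        "notes": ds.description,
        "author": "; ".join(p.name for p in ds.creators),
        "resources": [{"name": f.name, "url": f.url} for f in ds.files],
    }


example = BridgeDataset(
    slug="vibration-readings",
    title="Vibration readings",
    description="Illustrative record only.",
    creators=[Person("A. Researcher"), Person("B. Student")],
    files=[DataFile("readings.csv", "https://example.org/readings.csv")],
)
print(to_ckan_package(example))
```

A second function of the same shape could target ePrints, which is where agreeing a minimal metadata set pays off: each translator only needs the fields the neutral model guarantees.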

AMS

The University’s new Awards Management System is designed to help researchers plan their funded research, walking them through the process of building their bid. The system itself has begun its roll-out across the University, and as soon as we’re given access to the APIs we’ll be integrating the AMS with Orbital Bridge, allowing seamless creation of a research project based on the data in the AMS.

This work also helps to inform stuff we’re doing in Bridge around abstracting the notion of a ‘project’ between all our different systems.

Kumo

Our ongoing OpenStack project, which we will use as the bed to provide the technical infrastructure, is slowly moving closer to a state in which we can begin to develop on it. Tied in with this effort is our continued work on automating our provisioning, configuring, deployment, maintenance, monitoring and scaling.