Data Management Planning

In February, Joy Davidson and Kerry Miller from the Digital Curation Centre visited Lincoln to run a workshop on ‘Data Management Planning’. The workshop was aimed at staff who support researchers at Lincoln and was very well attended, with 14 colleagues from the Library and Research and Enterprise taking part. Both Joy and Kerry gave presentations, which are available below. We were very pleased with the way the workshop went and continue to work with the DCC on the use of DMPOnline and integrating it with our Researcher Dashboard.

DMP Online

RDM Policy Overview

 

Research data documentation and training materials

The final within-project version of the Orbital Research Data Management training materials is now live on the Orbital Researcher Dashboard website. The materials have been written collaboratively by the Orbital project team, and draw on a lot of existing RDM training and guidance material from across the web (in particular, from the DCC).

We intend that these materials will continue to be maintained and developed as part of the new University-wide research information service mentioned in a previous blog post.

Screenshot of the Researcher Dashboard

The training materials can be accessed at https://orbital.lincoln.ac.uk/ and cover the following areas:

  1. What is research data?
  2. The research data lifecycle
  3. Policies affecting your research data
  4. Data Management Planning (DMP)
  5. Data search and discovery tools
  6. Data storage and security
  7. Legal and ethical issues
  8. Tools for working with your data
  9. Data publishing and citation
  10. Licences for sharing your data
  11. Data curation and preservation
  12. Workshops and training events
  13. Help and support

The source text for each page is stored in Markdown format in an open GitHub repository (at http://github.com/unilincoln/rdm). The page admin tools in the Researcher Dashboard can then be used to link to the source document, which is then formatted in the University’s Common Web Design.

These web pages will be used to support the ongoing RDM training for postgraduate students, which will shortly be rolled out to University staff.

Open Resources and Open Standards

The Orbital project is about a lot more than just developing a cool bit of software. In fact, the majority of the project impact is to do with policy and training rather than development. However, we think there are some good practices in software development which apply equally to the development of documentation around policy and training. Specifically, revision control.

Throughout the day, as we make changes to the source code which makes up Orbital Bridge, we record significant states in the development in our revision control software (specifically Git). We can then rewind the state of the entire codebase to any one of these states, compare the differences between any two of them, and even pick and choose specific changes to move between states on a line-by-line basis. We can create diverging versions to test new features in isolation and merge them together again with no fear of messing up the working version.

Given that we’re planning to release all of our RDM policy and documentation under an Open licence (specifically CC-BY) it made a lot of sense to use a platform for revision control which makes the most of the community and both allows and encourages people to view our stuff, take it, make changes and even propose changes back to us. Enter GitHub, the most popular source code sharing site in the world. GitHub provides us with a ready to go Git hosting platform, as well as a load of really easy to use tools to help us and other people make the most of our resources.

At the University of Lincoln we already use GitHub for Open Source software projects from both the Online Services Team and the LNCD development group, so it made sense to use it for our RDM documentation as well. The definitive copy of our RDM policy and training materials can now be viewed in the state it was in at any given point in time, branched, merged and so on. But there’s a problem with making documents the Old Fashioned Way that people in the University may be used to. Microsoft Word doesn’t just keep the text of a document, but a whole load of other stuff, all of which is then compressed down into a single binary blob. Using Word would mean that although the main features of revision control (versions, branching etc.) would technically work, we’d lose some of the more elegant features, such as line-by-line comparison of versions and merging of different branches.
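To make the point about line-by-line comparison a bit more concrete, here is a minimal sketch using Python's standard difflib module. It is not how Git itself works internally, just an illustration of the kind of comparison that plain text makes possible. The two policy snippets and the file names policy_v1.md and policy_v2.md are invented for the example and are not taken from our actual policy.

```python
import difflib

# Two hypothetical revisions of a plain-text policy document.
# The wording is invented purely for illustration.
old_version = """\
# Research Data Management Policy

1. Research data should be stored securely.
2. Data should be retained for five years.
""".splitlines(keepends=True)

new_version = """\
# Research Data Management Policy

1. Research data should be stored securely.
2. Data should be retained for ten years.
3. Data should be shared under an open licence where possible.
""".splitlines(keepends=True)


def line_diff(old, new):
    """Return a unified diff between two lists of lines as a list of strings."""
    return list(
        difflib.unified_diff(old, new, fromfile="policy_v1.md", tofile="policy_v2.md")
    )


diff = line_diff(old_version, new_version)
# Lines starting with "-" were removed, lines starting with "+" were added.
print("".join(diff))
```

Because every change shows up as a removed or added line, a reviewer can see exactly what was edited. The same comparison on a compressed binary file would only be able to say that the file had changed.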

A better solution was needed for writing documents, and we ended up with a shortlist of three potential plain-text markup standards. These are ways of marking up a plain text document (such as you’d write in Notepad) with semantic structure and styling so that we can take the document and re-render it in a number of different places. Our three contenders were LaTeX, Markdown and reStructuredText. All three have pros and cons, but have the same basic idea behind the scenes – plain text is surrounded with bits of other plain text that give it meaning. All three result in a document that is fundamentally human readable without the need for any proprietary software, and all three allow for the document to be re-rendered in a form appropriate for the audience.
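As a rough sketch of what "re-rendering" means in practice, the Python below takes one small Markdown-style source and produces two different outputs from it. It is a toy that only understands "# " headings, "- " bullets and paragraphs, not a real Markdown implementation, and the function names render_html and render_plain are made up for this example.

```python
import html


def render_html(source):
    """Render a tiny Markdown subset ('# ' headings, '- ' bullets, paragraphs) as HTML."""
    out = []
    in_list = False
    for line in source.splitlines():
        stripped = line.strip()
        is_item = stripped.startswith("- ")
        # Close an open list as soon as we meet a line that is not a bullet.
        if in_list and not is_item:
            out.append("</ul>")
            in_list = False
        if not stripped:
            continue
        if stripped.startswith("# "):
            out.append("<h1>" + html.escape(stripped[2:]) + "</h1>")
        elif is_item:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append("<li>" + html.escape(stripped[2:]) + "</li>")
        else:
            out.append("<p>" + html.escape(stripped) + "</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)


def render_plain(source):
    """Render the same subset as plain text, e.g. for an email."""
    out = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("# "):
            out.append(stripped[2:].upper())
        elif stripped.startswith("- "):
            out.append("  * " + stripped[2:])
        else:
            out.append(stripped)
    return "\n".join(out)


SOURCE = """# Managing your research data

- Policies affecting your data
- Data Management Planning (DMP)
"""

print(render_html(SOURCE))
print(render_plain(SOURCE))
```

The source stays readable as it is, and each renderer decides how the heading and the list should look for its own audience.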

LaTeX is by far the most powerful of the three, having a background in typesetting complex scientific academic papers. It would allow for policy documents to be rendered for both the web and print, but has the downside of being the most complex to use and having a less user-friendly syntax. We want the policy to be as accessible as possible, without readers needing to understand what a set of tags means.

Markdown and reStructuredText both take a much simpler approach, and use almost identical syntax for most things. However, reStructuredText has a bundle of other markup which makes it better suited to long, structured documents with nested lists. reStructuredText would be ideal if we ever decided to convert the University’s Regulations to a plain text format, but for a simple document such as the RDM policy it doesn’t really have any advantage over Markdown.

The tipping point for our decision then lay in the technical implementation of Markdown versus reStructuredText. Fortunately this was an easy call, as reStructuredText is very tightly linked into the Python ecosystem whereas Bridge is built entirely in PHP. We could easily drop a PHP library to do Markdown rendering into Bridge, whereas reStructuredText would need additional work to call an external Python library to do the best job of rendering. Should we decide in the future that we need the extra capability of reStructuredText, the work needed to migrate the documents themselves would be minimal.
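To give a feel for how small that migration could be for a simple document, here is a hedged Python sketch that converts Markdown "#" and "##" headings into reStructuredText underlined titles and leaves everything else alone. Simple bullet lists are written the same way in both formats. The function name markdown_headings_to_rst is invented for this example, and a real migration would also need to handle links, emphasis and anything else the documents use.

```python
def markdown_headings_to_rst(source):
    """Convert '#' and '##' Markdown headings to reStructuredText underlined titles.

    All other lines, including '- ' bullet items, are passed through unchanged.
    """
    underline_chars = {1: "=", 2: "-"}  # assumed mapping of heading level to underline
    out = []
    for line in source.splitlines():
        # Count the leading '#' characters to find the heading level.
        level = len(line) - len(line.lstrip("#"))
        if level in underline_chars and line[level:level + 1] == " ":
            title = line[level + 1:].strip()
            out.append(title)
            # The underline must be at least as long as the title text.
            out.append(underline_chars[level] * len(title))
        else:
            out.append(line)
    return "\n".join(out)


example = "# Policy\n\n- one\n- two"
print(markdown_headings_to_rst(example))
```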

You can view our current draft RDM policy in Markdown in our RDM repository on GitHub, as well as fork it and submit pull requests if you want to use it as a basis for your own or propose changes. We will be moving all our training presentations to use a Markdown based in-browser format in the near future.

“Managing Your Research Data” – training for postgrad students

As part of the JISC-funded Orbital project, we are starting to offer introductory training to (initially) postgraduate students, on how to look after their research data.

The first workshop is on 23 January 2013 at 10.00 in the Graduate School classroom, and there are further workshops every couple of weeks throughout 2013.

I’ll be arranging further workshops aimed more at staff in due course.

MANAGING YOUR RESEARCH DATA

The Graduate School – University of Lincoln Multiple dates throughout 2013

Research data management is an important part of the research process, and a vital part of academic practice. This one-hour workshop will include a presentation and discussion of what you should consider when creating, looking after, and sharing/publishing your research data.

The workshop will cover:

  • What do we mean by research data?
  • Policies affecting your data
  • Data Management Planning (DMP)
  • The research data lifecycle
  • Practical tools for looking after your data
  • Data publishing and citation
  • Where to go for help

Postgraduate students can book a place on a workshop online at: http://uolresearchdata.eventbrite.co.uk/

Research data training at the University of Lincoln

As part of the Orbital project to build a pilot Research Data Management (RDM) infrastructure at the University of Lincoln, I’m looking particularly at support, training and documentation.

We aim to start offering—early in 2013—an introductory 1-hour workshop on managing your research data, aimed at early-career researchers and postgraduate research students. In particular, we want to promote this training through three avenues:

  1. As part of the Lincoln Graduate School’s standard timetable of postgrad training;
  2. Directly, to PhD students in the School of Engineering (our pilot group);
  3. To researchers who completed our Data Asset Framework questionnaire.

The training will be supported by documentation (written and maintained through WordPress and a dedicated RDM reading list), presented through the main Orbital “bridge” site, which we’re starting to treat as a VRE.

Here’s an outline of the initial workshop. I’m meeting the Graduate School this afternoon to agree this.

“Managing your research data”

  1. Definitions, terminology and scope (what do we mean by research data?)
  2. Policies and laws affecting your data
  3. The “research data lifecycle”
  4. Data Management Planning (DMP)
  5. Practical tools for looking after your data
  6. Data publishing and citation
  7. Where to go for further help and support

Comments welcome!