A JISC-funded Managing Research Data project

Posts tagged DCC

The final within-project version of the Orbital Research Data Management training materials is now live on the Orbital Researcher Dashboard website. The materials have been written collaboratively by the Orbital project team, and draw on a lot of existing RDM training and guidance material from across the web (in particular, from the DCC).

We intend that these materials will continue to be maintained and developed as part of the new University-wide research information service mentioned in a previous blog post.

Screenshot of the Researcher Dashboard

The training materials can be accessed at https://orbital.lincoln.ac.uk/ and cover the following areas:

  1. What is research data?
  2. The research data lifecycle
  3. Policies affecting your research data
  4. Data Management Planning (DMP)
  5. Data search and discovery tools
  6. Data storage and security
  7. Legal and ethical issues
  8. Tools for working with your data
  9. Data publishing and citation
  10. Licences for sharing your data
  11. Data curation and preservation
  12. Workshops and training events
  13. Help and support

The source text for each page is stored in Markdown format in an open GitHub repository (at http://github.com/unilincoln/rdm). The page admin tools in the Researcher Dashboard can then be used to link a page to its source document, which is formatted in the University’s Common Web Design when displayed.
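As a rough illustration of that pipeline (this is not the Researcher Dashboard's actual code), the sketch below builds the address of a Markdown source file and wraps the rendered HTML in a page template. The raw-file URL pattern with its default branch name, the example file name, the template markup, and the deliberately tiny Markdown subset (single-line headings and plain paragraphs only) are all assumptions made for the example.

```python
import html
import re

# Assumed raw-file URL pattern; the default branch name is hypothetical.
RAW_URL = "https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{path}"

# Hypothetical stand-in for the University's Common Web Design template.
PAGE_TEMPLATE = '<article class="cwd-page">\n{body}\n</article>'


def source_url(owner, repo, path, branch="master"):
    """Build the address of a Markdown source file in a public repository."""
    return RAW_URL.format(owner=owner, repo=repo, branch=branch, path=path)


def render_markdown(text):
    """Convert a tiny Markdown subset (headings and paragraphs) to HTML."""
    blocks = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        if not block:
            continue
        # A heading is assumed to be a single-line block starting with 1-6 '#'.
        heading = re.match(r"(#{1,6})\s+(.*)", block)
        if heading:
            level = len(heading.group(1))
            title = html.escape(heading.group(2).strip())
            blocks.append(f"<h{level}>{title}</h{level}>")
        else:
            # Join the wrapped lines of a paragraph into a single line.
            paragraph = " ".join(line.strip() for line in block.splitlines())
            blocks.append(f"<p>{html.escape(paragraph)}</p>")
    return "\n".join(blocks)


def build_page(markdown_text):
    """Render the Markdown source and place it inside the page template."""
    return PAGE_TEMPLATE.format(body=render_markdown(markdown_text))


if __name__ == "__main__":
    # 'what-is-research-data.md' is a made-up file name for illustration.
    print(source_url("unilincoln", "rdm", "what-is-research-data.md"))
    print(build_page("# What is research data?\n\nData produced\nby research."))
```

Keeping the source text separate from the template in this way means the same Markdown file can be edited in the repository without touching the page design.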

These web pages will be used to support the ongoing RDM training for postgraduate students, which will shortly be rolled out to University staff.

This week sees the formal two-day launch event for the JISC Managing Research Data programme 2011–2013 (the programme which is funding Orbital). It’s being held in the National College for School Leadership, next to the University of Nottingham’s Jubilee Campus.

Unfortunately, after schlepping it from the furthest fringes of Lincolnshire (and then having to go back home for the evening), I was only able to attend a couple of hours of day 1. But it was worth it.

I arrived just in time for a workshop about a number of research data management tools developed/provided by the Digital Curation Centre (DCC). Dr Mansur Darlington, who’s acting as external assessor/consultant to the Orbital project, was also in this workshop and contributed greatly to the discussions. (My Orbital colleagues Joss Winn and Nick Jackson attended the [parallel] workshop on various JANET, Eduserv and UMF SaaS/cloud storage services.)


Clare College

Yesterday I was at Clare College, University of Cambridge for a meeting organised by USTLG, the University Science & Technology Librarians Group. The group (open to any librarians involved with engineering, science or technology in UK universities) has meetings once or twice a year. The theme of yesterday’s meeting (free to attend, thanks to sponsorship from the IEEE) was data management, with an implied focus on research data.

The meeting consisted of a series of presentations (plus a fantastic lunchtime diversion, below) with plenty of time for networking – there were about 40 people there, all with an interest in research data management – though interestingly, a show of hands suggested very few people were actively engaged in looking after their own institution’s researchers’ data.

As usual, this blog post has been partially reconstructed from the Twitter stream (hashtag #ustlg).

First up, Laura Molloy, substituting for Joy Davidson of the Digital Curation Centre (DCC), on a project called the Data Management Skills Support Initiative (DaMSSI), looking at the [shades of information literacy] skills needed by different people involved in the research data curation process. “DaMSSI aims to facilitate the use of tools like Vitae’s Researcher Development Framework (RDF) and the Seven Pillars of Information Literacy model” developed by SCONUL. Key question: how do you assess the effectiveness of research data management training?


Attending: Nick Jackson, Annalisa Jones, Bev Jones, Chris Leach, Paul Stainthorp, Joss Winn

Apologies: Lee Mitchell, David Young


  1. Review Project Plan and Workpackages
  2. Status updates: Literature Review, User Requirements Analysis, Technology/Standards evaluation
  3. Forthcoming meetings and conferences (Agile method, Open Source policy, ERIM, Engineers, OR12, DCC, Start-up)
  4. Poster, papers, website
  5. Staffing and accommodation
  6. AOB


Joss Winn (JW) reported in detail:

  • JW reported on the work done to date (mostly relating to workpackage WP1), and reported back on:
    • The successful first meeting with users from the School of Engineering
    • The first Steering Group meeting on 3 November
    • The submission of the project plan
    • The appointment of NJ as lead developer
    • The relocation of NJ and PS (part-time) to CERD’s offices to work on Orbital
  • JW ran through the project outputs and workpackages in detail, identifying deadlines – most notably the Implementation Plan, which must be submitted by February 2012, with the following four pieces of work completed by then:
    • Data sources (NJ/CL)
    • User requirements (NJ)
    • Literature review (PS/BJ/CL)
    • Technical review (NJ/JW)
  • The group discussed the further user-engagement work to be completed in workpackage WP5, including Nick Jackson’s work with the School of Engineering to assess their requirements (through workshops, questionnaires, observation, and use of the Data Asset Framework – DAF), and a planned round-table meeting about ERIM in late January
  • ACTION (NJ): dates need to be set for user requirements exercises.
  • ACTION (PS): Date in late January needs to be set for ERIM workshop with Engineers.
  • PS reported on the work that he and NJ have begun to benchmark against the EPrints deposit workflow (WP8). NJ will work closely with BJ on this.
  • The group discussed WP9—the planned assessment of data sources—and CL’s role as library user. There are three obvious areas where Orbital crosses over with the Library’s priorities:
    • Integration with the Library’s Discovery selection & implementation project (CL)
    • Integration with the Repository (BJ)
    • Authentication (CL)
  • The Research & Enterprise office (i.e. AJ) will lead on WP11 – developing training materials & workshops.
  • JW will carry on the work with the University’s IP manager, James Murray, on the correct approach to open-sourcing code from Orbital – WP13.
  • ACTION: JW to follow up contacts with EPrints Services and OSSWatch.
  • Dissemination (WP14):
    • PS has been invited to speak at two events in January/February. The group will aim to have a publishable conference paper ready by Summer 2012. Submit abstract to OR12 by ?.
    • NJ, PS and JW are attending the project startup meeting in Nottingham on 1-2 December; presenting a poster. Also attending the DCC roadshow in Cardiff in mid-December.
  • Any other business:
    • JW is convening a meeting (8th December) about agile software and project development methods.
  • ACTION: as many people as possible from Orbital to attend ‘agile’ meeting.

Long day on the train

I’ve been at the University of Warwick today, for a workshop organised by the Digital Curation Centre (DCC), entitled RDMF7: Incentivising Data Management & Sharing. There appeared to be a wide range of attendees: data curators & data scientists, ICT/database folk, actual researchers and academics, as well as at least one fellow library/repository rat.

Unfortunately I was only able to attend part of the event (which ran over two days). The following notes have been reconstructed from the Twitter stream (hashtag #RDMF7)!

The first speaker I heard was Ben Ryan of the funding council, the EPSRC. He talked about the “long-established” principles of responsible data management [links below]… this may be my own interpretation of Ben’s presentation, but I don’t think I was imagining undertones of “…so there’s really no excuse!”. He also covered individual and institutional motivations for taking care of data [much more about which later], policy and the enforcement of policy, dataset discoverability/metadata, funding (including the EPSRC’s expectation that institutions will make room in existing budgets to meet the costs of RDM), and embargo periods (inc. researchers’ entitlement to a period of “privileged use of the data they have collected, to enable them to publish” first – important to stress this in order to allay fears/get researchers on board?).

Some links:

Next up was Miggie Pickton, ‘queen bee’ of the University of Northampton’s repository (and self-described RDM “novice”, indeed!), talking about their participation in the multi-institution, JISC-funded KeepIt project, which aimed to design “not one repository but many that, viewed as a whole, represent all the content types that an institutional repository might present (research papers, science data, arts, teaching materials and theses).” This work led almost by chance to Northampton undertaking a university-wide audit of its research data management processes using the DCC’s Data Asset Framework (DAF) methodology. This helped them to make the case for an institutional research data management working group and [eventually, and not without resistance] to establish a mandatory, central policy for RDM. (Show of hands at this point: how many other institutions have completed a DAF? I counted perhaps only three, Lincoln certainly not being amongst them. Q. Should the University of Lincoln complete a Data Asset Framework exercise as part of the Orbital project?)

After coffee, we heard a third presentation from Neil Beagrie of (management consultancy partnership) Charles Beagrie Ltd. Neil delivered a very comprehensive explanation of the KRDS (“Keeping Research Data Safe”) project, which has developed both an activity model and a benefits analysis toolkit for the management and preservation-of-access to ‘long-lived data’. I have to come clean here and admit that I was a little bewildered by the detail: much of it went through both ears without sticking to the brain on the way through. I need to go back over the tweets more carefully and have a look at the KRDS toolkit and reports at: beagrie.com/krds.php

The morning’s presentations over, we split into three groups for breakout discussion.

I attached myself to the second of the three groups, led by (JISC programme manager for Orbital) Simon Hodson; our job was to consider the question: “What really are the sticks and carrots that will make a long-term difference to the pursuit of structured data management processes?”. After spending some time picking apart the terminology, and what each of the various ‘processes’ might include, we had a wide-ranging (and allocated-time-overrunning) discussion about the things that genuinely motivate scientists, universities, and funding councils(!) to care about RDM; about some of the problems caused by the complexity and inconsistency of metadata for datasets; and about the issue of citations/digital object identifiers for data (how those citations might be treated by publishers and citation data services) and how that relates to any notions of ‘peer review’ in experimental data.

As requested, our group came up with three actions which we believe will help address the question of motivation:

  1. Data citation – publishers should consistently include e.g. DOIs for datasets in final published articles, so that citations of the data can be measured.
  2. Measurement of RDM “maturity” – departments and whole institutions should adopt a standardised quality mark for research data management, to give [potential] researchers, funding bodies, and the public confidence in their ability to handle data appropriately.
  3. Discovery – the research councils (probably) should push for common metadata standards for describing datasets and underlying data-generating research/experimental processes.

Lunch followed, and I had time to hear two more presentations in the afternoon before I had to run for a bus:

Catherine Moyes of the Malaria Atlas Project: in effect, demonstrating what really clear and consistent management of large-scale (geo)data looks like. This seems to consist of an extremely rigorous approach to requesting, tracking, and licensing data from the contributors of the project’s data… and an equally strict (but in a good way) expectation of clarity when dealing with requests from third parties to use the data. If that all comes across as restrictive, I’d point to Catherine’s slide on ‘legalities’ of the data that the Malaria Atlas Project has released openly – it’s about as open as it gets, with no registration needed, no terms & conditions placed on re-use of the published data, and all software/artefacts released under very permissive and free licences (Creative Commons or GNU). N.B. the Orbital project should look at the Malaria Atlas Project’s “data explorer”, available via map.ox.ac.uk, as an example of a really nifty set of applications built on top of openly accessible and re-usable data.

Finally (and I’m sorry I only got to hear part of his presentation), University of So’ton chemistry professor Jeremy Frey on their IDMB (Institutional Data Management Blueprint) Project—southamptondata.org—and some rather funny anecdotes about the underlying knowledge, expectations, and problems faced by researchers managing their own data, which emerged when they were surveyed as part of the above project.

Lots to take in (lots). But some useful suggestions for Orbital, which I’ll be bringing to the next project meeting: and plenty more reading material which I’ll add to the project reading list asap.

Paul Stainthorp, lead researcher on the Orbital project.