A JISC-funded Managing Research Data project

Posts tagged Mansur Darlington

In December, colleagues in the Web Team (who manage the corporate web site in the Department of Marketing and Communications) approached a few of us about building a tool to allow staff to edit their profile for the new version of the lincoln.ac.uk website. We suggested that much of the work was already done and it just needed gluing together. Yesterday we met with the Web Team again to tell them that our part of the work is pretty much complete. Here’s how it works.

Quick sketch of profile building at Lincoln

This requires a bit of explanation, but let me tell you, it’s the holy grail as far as I’m concerned and having this in place brings benefits to Orbital and any other new application we might develop. Here’s a clearer rendering.


Building staff profiles

The chart above strips out the stuff around authentication that you see in the bottom right of the whiteboard photo. That’s for another post – something Alex is better placed to write.

Information about staff at the university starts with the HR database. This feeds the Active Directory, which authenticates people against different web services. Last year, Nick and Alex pulled this data into Nucleus, our MongoDB datastore, and with it built a new, slick staff directory. Then they started bolting things on to it, like research outputs from the repository and blog posts from our WordPress/BuddyPress platform. To illustrate what was possible, they started pulling information from my BuddyPress profile, which I could edit anytime I wanted to. It got to the point where I started using my staff directory link in my email signature because it offered the most comprehensive profile of me anywhere on a Lincoln website.
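To make the "bolting on" concrete, here is a rough sketch of the idea rather than the actual Nucleus code: one record per person, keyed on a staff id, with extra sources merged in. The function name, field names and sample data are all hypothetical.

```python
def build_profile(staff, outputs, posts, editable):
    """Merge one directory entry with other sources that share its staff id.

    All field names here are illustrative assumptions, not Nucleus's schema.
    """
    staff_id = staff["id"]
    profile = dict(staff)  # copy, so the directory entry itself is untouched
    # Bolt on research outputs from the repository.
    profile["research_outputs"] = [
        o["title"] for o in outputs if o["author_id"] == staff_id
    ]
    # Bolt on blog posts from the blogging platform.
    profile["blog_posts"] = [
        p["title"] for p in posts if p["author_id"] == staff_id
    ]
    # Fields the member of staff edits themselves are applied last.
    profile.update(editable.get(staff_id, {}))
    return profile


# Made-up sample data standing in for the real sources.
staff = {"id": "s001", "name": "Example Person", "department": "Example Department"}
outputs = [
    {"author_id": "s001", "title": "An example paper"},
    {"author_id": "s002", "title": "Someone else's paper"},
]
posts = [{"author_id": "s001", "title": "An example post"}]
editable = {"s001": {"research_interests": ["research data management"]}}

profile = build_profile(staff, outputs, posts, editable)
```

The point is only that once everything shares an identifier, each new source is another small merge step.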

By the time we first met with the Web Team about the possibility of helping them with staff profiles, Alex and Nick had 80% of the work already done. What remained was to create a richer set of required fields in BuddyPress for staff to edit about themselves and a scheduled XML dump for the Web Team to wrangle into their new templates on www.lincoln.ac.uk.

So the work is nearly done. The XML file is RDF Linked Data, which means that we have a rich aggregation of staff information and some simple relationships, feeding the Staff Directory, being refreshed every three hours and then being output either as HTML, JSON or RDF/XML.
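For anyone wondering what "the same record as JSON or RDF/XML" might look like, here is a minimal sketch using only Python's standard library. It is not our implementation: the use of FOAF for the person and name, the `http://example.org/staff#` namespace for the other properties, and the sample record are all assumptions made for illustration.

```python
import json
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"
EX_NS = "http://example.org/staff#"  # hypothetical vocabulary, for illustration only


def profile_to_json(profile):
    """Serialise a profile dictionary as JSON."""
    return json.dumps(profile, sort_keys=True)


def profile_to_rdf_xml(profile):
    """Serialise the same profile as a small RDF/XML document."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("foaf", FOAF_NS)
    ET.register_namespace("ex", EX_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # The person is identified by a URI via rdf:about.
    person = ET.SubElement(
        root, f"{{{FOAF_NS}}}Person", {f"{{{RDF_NS}}}about": profile["uri"]}
    )
    ET.SubElement(person, f"{{{FOAF_NS}}}name").text = profile["name"]
    ET.SubElement(person, f"{{{EX_NS}}}department").text = profile["department"]
    for interest in profile.get("interests", []):
        ET.SubElement(person, f"{{{EX_NS}}}interest").text = interest
    return ET.tostring(root, encoding="unicode")


# Made-up record; the URI is a placeholder, not a real directory address.
example = {
    "uri": "http://example.org/staff/s001",
    "name": "Example Person",
    "department": "Example Department",
    "interests": ["research data management"],
}

as_json = profile_to_json(example)
as_rdf = profile_to_rdf_xml(example)
```

Both outputs come from the same dictionary, which is the useful property: aggregate once, then render in whichever format the consumer wants.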

For the Orbital project, all this glue is invaluable. When staff log in to Orbital (Nick’s working on this part right now), we’ll already know who they are, which department they work in, what research outputs they’ve deposited in the institutional repository, what their research interests are, what projects they’re working on, the research groups they’re members of, their recent awards and grants, and the keywords they’ve chosen to tag their profile with. It’s our intention that with some simple AI, we’ll be able to make Orbital a space where Researchers find themselves in an environment which already knows quite a bit about their work and the context of the research they’re undertaking. Once Orbital starts collecting specific staff data of its own, it can feed that back into Nucleus, too.

This reminds me of our discussion last month with Mansur Darlington of the ERIM/REDm-MED project. Mansur stressed the importance of gathering data about the context of the research itself, emphasising that without context, research data becomes increasingly meaningless over time. Having rich user profiles in Orbital, and ensuring that we record data about the Researcher’s activity while using Orbital, should help provide that context to the research data itself.

Orbital, therefore, becomes an infrastructure not only for storing and managing research data, but also a system for storing and managing data about the research itself.
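One way to picture this, purely as a sketch and not a description of how Orbital stores anything, is a dataset record that carries a snapshot of the Researcher's profile and a log of activity alongside the data. The class and field names below are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ActivityEvent:
    """One thing a Researcher did to a dataset, and when (UTC, ISO 8601)."""
    actor: str
    action: str
    timestamp: str


@dataclass
class DatasetRecord:
    """A dataset together with the context needed to make sense of it later."""
    title: str
    owner_profile: dict  # snapshot of the Researcher's profile at creation time
    activity: list = field(default_factory=list)

    def log(self, actor, action):
        """Append an activity event stamped with the current UTC time."""
        event = ActivityEvent(actor, action, datetime.now(timezone.utc).isoformat())
        self.activity.append(event)
        return event


# Made-up example data.
record = DatasetRecord(
    title="Example sensor readings",
    owner_profile={"name": "Example Person", "department": "Example Department"},
)
record.log("Example Person", "uploaded")
record.log("Example Person", "annotated")
```

Keeping the context attached to the record, rather than in a separate system, is what should stop the data drifting away from its meaning over time.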

On January 20th, Dr. Mansur Darlington from the ERIM and REDm-MED projects came to Lincoln to discuss his work in relation to the Orbital project. Mansur has a consultancy role on the Orbital project and will be joining us again later on in the year, to help us evaluate our progress. It was a very useful and interesting meeting for all of the Orbital Team and the Engineering Researchers working with us. What became clear to us is that while ERIM offers the Orbital project a great deal of the underlying research and analysis of how Engineers work with data, Orbital can reciprocally feed back observations and issues arising from ERIM’s recommendations, which are theoretically robust but have not yet been tested in implementation. Similarly, with the REDm-MED project, which finishes in May/June, I hope that we can take the outputs of that prototyping work and build on them in the development of Orbital.

Here are Mansur’s slides from the meeting and below that, my notes.

  1. Purpose of the meeting
  2. Introductions: Bev, Annalisa, Bingo, Chunmei, Joss, Stuart, Lee, Mark, Nick, Paul, Mansur. Apologies, Chris Leach.
  3. Engineers: Bingo, Chunmei, Stuart
  4. See slides. ERIM research offers good spread of Engineering research data. Industry collaboration is vitally important.
  5. MRD in general:

* Need to find out which RCs the funding into the Engineering School comes from, and in what proportion (%).
* All institutions have to put together a roadmap for RDM by May 2012 for EPSRC.
* Siemens/Lincoln spend a lot of effort in discovery of existing data to base investigations on.
* No national, dedicated Engineering data archive
* Need to look at API integration with DMP Online (DCC)
* Orbital as tool for managing research projects?
* Ask DCC to visit Lincoln for Policy development and training.
* Reporting to DCC is a formal requirement.
* Include costs of MRD in the university overhead when bidding for funds.
* Datasets as an outcome of research projects. More ‘efficient’ to deal with RDM as part of project.
* ‘Market’ for data. Expectation of costs and benefits of MRD

  6. The Nature of Engineering Research Data:

* ERIM: Engineering Research Information Management: Research activity data as well
* Problems with terminology. Need for definition. Both theoretical and practical/empirical outputs from the project.
* Good slides for terminology and understanding domain
* How does Orbital fit into the VRE puzzle?
* Transparent logging and capture of as much activity data as possible.
* Knowing the context is vital for understanding data. Orbital needs to concentrate on contextual data as much as ‘research data’.
* Orbital supports research lifecycle from bidding to completion?
* ‘Engineering research data’ covers pretty much all types of data.
* Need to identify other types of Engineering users to broaden scope of ‘Engineering data’
* Look outside Engineering for variety of data types/activity. Look beyond Engineering. Generalisable.
* Data types is one thing; methodologies and the data they produce are another.
* We manage data so that it can be RE-USED (by someone)
* Must not add to bureaucracy of research

Part of the Orbital project governance is that I report to the university’s Research, Innovation and Enterprise Committee. The Committee meets every three months and I send a short report to each meeting and attend every other meeting. Here’s my report for the February committee meeting.

The Orbital Project

Progress report to the Research, Innovation and Enterprise Committee

30th January 2012

Author: Joss Winn, PI/PM.

Progress since the last update to the RIEC on 13th December:

  1. The short-term focus for the project continues to be the development of the technical infrastructure for managing research data, while being mindful of the long-term requirements to develop policy and a supportive environment for research staff.
  2. Software development has begun. We have finished setting up the development environment for the Orbital system. This is a major software development project for the university and we have spent some time designing the server architecture and quality assurance procedures for development.
  3. Orbital will make use of ‘cloud computing’ and is working with ICT as a pilot project for integrating cloud computing into our local infrastructure. A meeting took place with Eduserv, a non-profit provider of cloud computing to the HE sector (running on Janet) and a further meeting is taking place with Rackspace, a major commercial provider of cloud computing services. This work sits alongside ICT’s need to refresh their server infrastructure next year and will provide ICT with a real opportunity to investigate the business case for cloud computing as well as issues around actual implementation.
  4. A full-time post for a Web Developer has been advertised and we expect the post to begin late March/early April. This is the second full-time Web Developer post on the Orbital project.
  5. We are pleased that Dr. Ling from the School of Engineering and his PhD student, Chunmei Qing, will work closely with the Orbital project in the development of the software, policy and training materials. Similarly, we are working with Prof. Chris Bingham and Stuart Watson (Siemens), and have recently joined their fortnightly research meetings, which are extremely useful to the Orbital project. At this stage, we welcome involvement from any Researcher in the School of Engineering and further into the project intend to broaden our use cases to other research disciplines.
  6. A meeting has been held with Dr. Mansur Darlington from the University of Bath. Dr. Darlington led the JISC-funded ERIM project, which studied the Research Data Management (RDM) issues for the discipline of Engineering.[1] The meeting was very useful for the Orbital team, including partners at Siemens and Researchers in the School of Engineering, who attended. The ERIM project provides a very robust, theoretical basis, which Orbital will attempt to build upon and implement. Similarly, a follow-up to the ERIM project will provide prototype tools, which we hope to build on for Orbital.[2] This is a key external relationship for the Orbital project.
  7. One issue flagged by Dr. Darlington concerned national funding bodies’ RDM policies. Each funding body has an RDM policy which requires universities to have effective methods in place for managing, preserving and disseminating research data.[3] The EPSRC has told all universities that we must provide them with an RDM roadmap by 1st May 2012 and must be compliant with these expectations by 1st May 2015.[4]
  8. The Orbital project is required by JISC to produce an RDM Policy for the institution. A national meeting is being organised by JISC to assist with the development of such policy in March. Following this, I suggest that a workshop is held in March where the Orbital project and other key staff from the Library and Research and Enterprise Office begin to draft this Policy and the required EPSRC roadmap. This can then be presented to the RIEC for discussion and approval prior to submission to the EPSRC.
  9. A meeting has been arranged for March 7th, 9.30-12pm, to discuss the Business Case for Open licenses. This discussion will be of interest for anyone concerned with licensing research outputs (‘Open Access’), software development projects (‘Open Source’), and teaching and learning resources (‘Open Educational Resources’). Staff from the JISC-funded OSS Watch, University of Oxford, will present at this meeting. Andrew Hunter and James Murray will attend and members of the RIEC are also welcome. Please RSVP to Joss Winn by end of February.
  10. Joss is working with JISC to organise a national event focussing on issues around software development for Research Data Management, which will be held in May.

This week sees the formal two-day launch event for the JISC Managing Research Data programme 2011–2013 (the programme which is funding Orbital). It’s being held in the National College for School Leadership, next to the University of Nottingham’s Jubilee Campus.

Unfortunately, after schlepping it from the furthest fringes of Lincolnshire (and then having to go back home for the evening), I was only able to attend a couple of hours of day 1. But it was worth it.

I arrived just in time for a workshop about a number of research data management tools developed/provided by the Digital Curation Centre (DCC). Dr Mansur Darlington, who’s acting as external assessor/consultant to the Orbital project, was also in this workshop and contributed greatly to the discussions. (My Orbital colleagues Joss Winn and Nick Jackson attended the [parallel] workshop on various JANET, Eduserv and UMF SaaS/cloud storage services.)


Paul, Nick and I had a great meeting with the two principal Engineering users last week, where we set out our broad objectives and discussed their involvement on the Orbital Project. It’s always been our intention to work with three types of user: academic staff, a commercial research partner and a PhD student. This morning, we met Chris Bingham, Prof. of Energy Conversion, and Stuart Watson, Head of Remote Monitoring and Diagnostics at Siemens. Later this week, we’ll be meeting Reader in Optimisation and Symbolic Dynamics, Dr Wing-Kuen Ling and one of his PhD students, who are also interested in contributing to the Orbital project.

At last week’s meeting we discussed Chris and Stuart’s current practice of working on sensor data from Siemens’ turbines, which involves a combination of physically secured machines, secure web services and sanitised data. As is common practice when working with commercial partners, the resulting research papers go through an approval process with the commercial partner prior to being submitted for publication and data is routinely abstracted so that confidential and commercially sensitive data isn’t made public.

We discussed how these current practices might be improved over and above the ‘baseline’ method used now. Chris and Stuart both felt that improvements could be made around physical access to the data (possibly PKI card integration) and a system that does not encourage copies being made of the data. There should be no need for Engineers to take data away with them; rather, it should always be available from a single data store. We also discussed the use of the Cloud for storing data and both Stuart and Chris acknowledged that attitudes towards Cloud Computing were changing and that it’s worth considering it.

Their measure of success of our research data infrastructure is whether it increases productivity and overcomes some of the barriers to access and sharing of the data that currently exist. They also expressed an interest in how the infrastructure can help manage related artefacts, such as presentations and research papers. Ideally, they want something that helps manage all aspects of their research environment rather than fragment it into disparate systems.

Actions from the meeting were to introduce the project at the next all staff meeting of the School of Engineering (done), arrange to meet the Developer of Siemens’ in-house software and, as mentioned above, speak to Dr Ling about his involvement as a user on the project, recognising that his area of research is different to Chris and Stuart’s and presents us with a different type of data and workflow. Finally, we also agreed to invite Dr. Mansur Darlington of the ERIM project to hold an extended meeting in late January with Engineering staff to discuss the outcomes of the ERIM project.