A JISC-funded Managing Research Data project


On January 20th, Dr. Mansur Darlington from the ERIM and REDm-MED projects came to Lincoln to discuss his work in relation to the Orbital project. Mansur has a consultancy role on the Orbital project and will be joining us again later in the year to help us evaluate our progress. It was a very useful and interesting meeting for all of the Orbital team and the Engineering researchers working with us. What became clear to us is that while ERIM offers the Orbital project a great deal of the underlying research and analysis of how engineers work with data, Orbital can in return feed back observations and issues arising from ERIM’s recommendations, which are theoretically robust but have not yet been tested in implementation. Similarly, with the REDm-MED project, which finishes in May/June, I hope that we can take the outputs of that prototyping work and build on them in the development of Orbital.

Here are Mansur’s slides from the meeting and below that, my notes.

  1. Purpose of the meeting
  2. Introductions: Bev, Annalisa, Bingo, Chunmei, Joss, Stuart, Lee, Mark, Nick, Paul, Mansur. Apologies, Chris Leach.
  3. Engineers: Bingo, Chunmei, Stuart
  4. See slides. ERIM research offers a good spread of Engineering research data. Industry collaboration is vitally important.
  5. MRD in general:

* Need to find out which Research Councils (RCs) the Engineering School’s funding comes from, and in what percentages.
* All institutions have to put together a roadmap for RDM by May 2012 for EPSRC.
* Siemens/Lincoln spend a lot of effort in discovery of existing data to base investigations on.
* No national, dedicated Engineering data archive
* Need to look at API integration with DMPonline (DCC)
* Orbital as tool for managing research projects?
* Ask DCC to visit Lincoln for Policy development and training.
* Reporting to DCC is a formal requirement.
* Include costs of MRD in the university overhead when bidding for funds.
* Datasets as an outcome of research projects. More ‘efficient’ to deal with RDM as part of project.
* ‘Market’ for data. Expectation of costs and benefits of MRD

  6. The Nature of Engineering Research Data:

* ERIM: Engineering Research Information Management: Research activity data as well
* Problems with terminology. Need for definition. Both theoretical and practical/empirical outputs from the project.
* Good slides for terminology and understanding domain
* How does Orbital fit into the VRE puzzle?
* Transparent logging and capture of as much activity data as possible.
* Knowing the context is vital for understanding data. Orbital needs to concentrate on contextual data as much as ‘research data’.
* Orbital supports research lifecycle from bidding to completion?
* ‘Engineering research data’ covers pretty much all types of data.
* Need to identify other types of Engineering users to broaden scope of ‘Engineering data’
* Look beyond Engineering for variety of data types/activity, so that the approach is generalisable.
* Data types is one thing; methodologies and the data they produce are another.
* We manage data so that it can be RE-USED (by someone)
* Must not add to bureaucracy of research

One of the interesting things about Orbital is its use of an API-driven development approach. In traditional, API-less applications your end-to-end system would look something like this:

The only way to interact with this application is to either be a user, or pretend to be one.

This is all well and good if the only thing you want to be able to interact with your application is a real user, but it’s increasingly a bad idea. Users can interact with your application as intended, but should a machine want to get at your data (which may happen for any one of a hundred reasons) it has to muck about pretending to be a user and scraping data. Everybody is building with APIs nowadays, and if you aren’t then you’re going to be left behind, cold and frightened, in a world which no longer subscribes to the notion that monolithic software can stand on its own and provide useful functionality.

So the next step is to bolt on an API.

APIs like this are notorious for only exposing part of the functionality of an application.

This is the most common form of API around, and consists of a ‘second view’ on the data and functionality of an application. This is a massive step forwards and makes lives much, much easier in most cases. The only downside is that it’s very easy for this kind of API to provide a ‘bare bones’ functionality, such as only providing a list of items when the ‘real’ user interface lets you not only view the list but also edit its contents. It’s better than nothing but not ideal, which is why Orbital is taking the next step:

In an API-driven model the API is the only way to interface with the application

Under this design the API is the only way to interface with the data and functionality of the system. If a user wants to access it they must go through an intermediary to translate their wishes into API calls, and the results back into a nicely human readable form. The plus side is that any other consumer of the service is free to interact with the application on exactly the same terms as the ‘official’ frontend, providing that it has been granted those permissions. As far as Orbital Core (our actual application) is concerned there is no functional difference between Orbital Manager (our frontend) and an application that a researcher has hacked together to give themselves an easier time inputting data — they are subject to the exact same access controls, restrictions, sanity checking and limitations.
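To make the idea concrete, here is a minimal sketch in Python of what "everything goes through the API" might look like. It is not Orbital's actual code: the `Core` and `Client` classes, the `PermissionDenied` exception, the `read`/`write` permission names and the example dataset title are all hypothetical, invented purely for illustration. The point is only that the official frontend and a researcher's script call the same methods and hit the same checks.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when a client lacks the permission a call requires."""


@dataclass
class Client:
    # Any consumer of the API: the official frontend or a researcher's script.
    name: str
    permissions: set = field(default_factory=set)


class Core:
    """Hypothetical stand-in for the core application: the only holder of the data."""

    def __init__(self):
        self._datasets = {}

    def _require(self, client, permission):
        # Every call passes through the same access control, whoever the caller is.
        if permission not in client.permissions:
            raise PermissionDenied(f"{client.name} lacks '{permission}'")

    def list_datasets(self, client):
        self._require(client, "read")
        return sorted(self._datasets)

    def add_dataset(self, client, title, rows):
        self._require(client, "write")
        # Sanity checking also lives here, so no client can bypass it.
        if not title.strip():
            raise ValueError("title must not be empty")
        self._datasets[title] = list(rows)
        return title


core = Core()
manager = Client("manager-frontend", {"read", "write"})  # the 'official' frontend
script = Client("researcher-script", {"read"})           # a hacked-together tool

core.add_dataset(manager, "Turbine run 1", [1, 2, 3])
print(core.list_datasets(script))

try:
    core.add_dataset(script, "Turbine run 2", [])
except PermissionDenied as err:
    print(err)
```

Because the frontend holds no special route to the data, granting the script the `write` permission would be the only change needed for it to do everything the frontend can.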

This means that every time we want to build user-facing functionality we have to stop, look at our APIs and work out where the functionality belongs. This also has the added benefit of making it essential to fully document our APIs for our own sanity, as well as ensuring that we have lightweight data transfer and rock-solid error handling baked right in.
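As a rough illustration of what "error handling baked right in" could mean in practice, the sketch below wraps every handler in one uniform, compact JSON envelope, so a client always knows where to look for either the data or the error. The envelope shape (`ok`, `data`, `error`), the error codes, and the `get_project` handler with its `PROJECTS` store are assumptions made up for this example, not a description of Orbital's real responses.

```python
import json

# Hypothetical in-memory store, standing in for whatever the real backend is.
PROJECTS = {"p1": {"title": "Example project"}}


def get_project(project_id):
    """Hypothetical handler: raises KeyError if the project does not exist."""
    return PROJECTS[project_id]


def respond(handler, *args):
    """Run a handler and wrap its outcome in one uniform JSON envelope."""
    try:
        body = {"ok": True, "data": handler(*args)}
    except KeyError as err:
        body = {
            "ok": False,
            "error": {"code": "not_found", "message": f"no such item: {err.args[0]}"},
        }
    except ValueError as err:
        body = {
            "ok": False,
            "error": {"code": "invalid_request", "message": str(err)},
        }
    # Compact separators keep the payload small on the wire.
    return json.dumps(body, separators=(",", ":"))


print(respond(get_project, "p1"))
print(respond(get_project, "p9"))
```

Since the frontend consumes exactly these responses, any vague or missing error case shows up in our own development first, rather than being discovered later by a third-party client.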

The downside is that we have to double up on some bits of development, writing both the Core and Manager sides. It can also lead to the usual frustrations you get when trying to communicate with APIs, but on the plus side we have the ability to change both ends for the better.

Know of any other API-driven development projects in the fields of higher education or research data management? We’d love to hear about them, so that we can try to make our APIs as compatible as possible and improve interoperability. Drop us a note in the comments.

One of the cool things about Orbital from my point of view is that I’m not just responsible for putting together a bit of software that runs on a web server, but also for designing the reference platform which you run those bits of software on.

At this point I could digress into discussing exactly what boxes we’re running Orbital on top of, but that doesn’t really matter. What is more interesting is how the various servers click together into building the complete Orbital platform, and how those servers can help us scale and provide a resilient service.

You’re probably used to thinking of most web applications like this:

A 'traditional' server model

It’s simple. You install what you need to run your application on a server, hook it up to the internet, and off you go. Everything is contained on a single box which gives you epic simplicity benefits and is often a lot more cost efficient, but you lose scalability. If one day your application has a traffic spike your Serv-O-Matic 100 may not be able to cope. The solution is to make your server bigger!

Throw more power at it!

This is all well and good, until you start to factor in resiliency as well. Your Serv-O-Matic 500 may be sporting 16 processor cores and 96GB of RAM, but it’s only doing its job until the OS decides it’s going to fall over, or your network card gives up, or somebody knocks the Big Red Switch.
