APIs first!

After testing the new SWORD2 endpoint for our new ePrints 3.3 instance, we found that the SWORD library needed a significant change. The minor changes were the endpoint, which became …/id/contents instead of …/sword-app/deposit/inbox, and the structure of the XML, which changed from <eprint> tags to <entry> tags. The main change was to how the XML was posted. Our existing method posted the XML string, but the endpoint read it as a file rather than as metadata, so the dataset ended up with the XML attached to it as a file and no metadata of its own. To fix this, we forked the SWORD library swordappv2-php-library from its GitHub repository and added to it so that it posts a string of XML metadata rather than a file. With that change, the dataset is given its metadata on deposit instead of having the XML attached as a separate file.
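To make the difference concrete, here is a minimal sketch in Python (not the PHP library we actually changed) of how a metadata deposit and a file deposit can differ under SWORD v2. It assumes the endpoint tells them apart by the request headers: an Atom entry content type for metadata, and a Content-Disposition header for a file. The function names, the example URL and the choice of dcterms:abstract are all hypothetical, and nothing is sent over the network.

```python
import xml.etree.ElementTree as ET
from urllib.request import Request

ATOM_NS = "http://www.w3.org/2005/Atom"
DCTERMS_NS = "http://purl.org/dc/terms/"


def build_entry(title, abstract):
    """Serialise a minimal Atom <entry> holding the dataset metadata."""
    ET.register_namespace("", ATOM_NS)
    ET.register_namespace("dcterms", DCTERMS_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    # Assumption: the repository maps Dublin Core terms onto its own fields.
    ET.SubElement(entry, f"{{{DCTERMS_NS}}}abstract").text = abstract
    return ET.tostring(entry, encoding="utf-8")


def metadata_request(collection_url, xml_bytes):
    """Build (but do not send) a deposit that should be read as metadata."""
    return Request(
        collection_url,
        data=xml_bytes,
        method="POST",
        headers={"Content-Type": "application/atom+xml;type=entry"},
    )


def file_request(collection_url, xml_bytes, filename="metadata.xml"):
    """Build (but do not send) a deposit that should be read as a file."""
    return Request(
        collection_url,
        data=xml_bytes,
        method="POST",
        headers={
            "Content-Type": "application/xml",
            "Content-Disposition": f"attachment; filename={filename}",
        },
    )


# Placeholder URL for illustration only; substitute your own collection URL.
EXAMPLE_URL = "https://example.org/id/contents"
entry_xml = build_entry("Test dataset", "A short abstract")
as_metadata = metadata_request(EXAMPLE_URL, entry_xml)
as_file = file_request(EXAMPLE_URL, entry_xml)
```

The body is identical in both requests. Only the headers differ, which is why the same XML can end up either as the dataset's metadata or as a file attached to it.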

Now here's the main problem. The dataset gets posted to ePrints in a deposited state, which ePrints classes as ‘in review’. ePrints requires a minimal set of metadata before a dataset can be ‘in review’, but only if the dataset is created manually within ePrints, not via the API. Over the API, you can post a dataset straight to ‘in review’ without the mandatory set of metadata. Which brings me to the title of this post: APIs first! With API-driven development the API is built first and everything else sits on top of it, so the same rules apply however a dataset arrives and this kind of situation is avoided.
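As a rough illustration of what "APIs first" buys you here, the sketch below puts the metadata check in one place and has both the API and the web form go through it. This is not how ePrints is implemented. The required fields, function names and status value are all made up for the example.

```python
# Hypothetical minimal metadata set; the real ePrints requirements
# are not listed in this post.
REQUIRED_FIELDS = ("title", "creators", "abstract")


class ValidationError(Exception):
    """Raised when a dataset lacks the mandatory metadata."""


def missing_fields(metadata):
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not metadata.get(name)]


def move_to_review(metadata):
    """The single place where a dataset can become 'in review'."""
    missing = missing_fields(metadata)
    if missing:
        raise ValidationError("missing required metadata: " + ", ".join(missing))
    return {**metadata, "status": "in review"}


def api_deposit(metadata):
    """API route: goes through the same check as the web form."""
    return move_to_review(metadata)


def web_form_submit(form_values):
    """Web form route: a thin layer over the same function."""
    return move_to_review(form_values)
```

Because both routes call move_to_review, a deposit with only a title is rejected whether it comes from the form or over the API.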

Another problem we came across during the change was that the test account we used for testing deposits no longer existed, because the migration of user accounts had skipped it. That in itself is fine, as an attempted deposit should simply receive an unauthorised response. This was not the case: we got an ‘Invalid XML’ response instead. This was puzzling, as the XML was valid, and everything we tried was to no avail. It was only by chance that we found the solution, when we switched to an account we knew existed and the deposit worked as planned. What had happened was that the deposit had failed because the account did not exist, but the wrong error message was sent back.
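The behaviour we expected can be sketched as a deposit handler that checks the account before it looks at the XML, and returns a different status for each failure. This is a hypothetical handler written for illustration, not ePrints code. The user list and messages are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of accounts that survived the migration.
KNOWN_USERS = {"depositor"}


def handle_deposit(username, body):
    """Return an (HTTP status, message) pair for a deposit attempt."""
    # Check the account first, so a missing user is never reported
    # as a problem with the XML.
    if username not in KNOWN_USERS:
        return 401, "Unauthorized: unknown account"
    try:
        ET.fromstring(body)
    except ET.ParseError:
        return 400, "Invalid XML"
    return 201, "Created"
```

With the checks in this order, valid XML sent from a missing account gets a 401, and ‘Invalid XML’ is only ever returned when the XML really does fail to parse.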

So I reiterate: APIs first. Knowing what the response will be, and that the functionality of the application works, has to come first; it is the most important aspect of said application.