Open Resources and Open Standards

The Orbital project is about a lot more than just developing a cool bit of software. In fact, the majority of the project impact is to do with policy and training rather than development. However, we think there are some good practices in software development which apply equally to the development of documentation around policy and training. Specifically, revision control.

Throughout the day as we make changes to the source code which makes up Orbital Bridge we record significant states in the development against our revision control software (specifically Git). We can then rewind the state of the entire codebase to any one of these conditions, compare differences between the two, and even pick and choose specific changes to move between states on a line-by-line basis. We can create diverging versions to test new features in isolation and merge them together again with no fear of messing up the working version.

Given that we’re planning to release all of our RDM policy and documentation under an Open licence (specifically CC-BY) it made a lot of sense to use a platform for revision control which makes the most of the community and both allows and encourages people to view our stuff, take it, make changes and even propose changes back to us. Enter GitHub, the most popular source code sharing site in the world. GitHub provides us with a ready to go Git hosting platform, as well as a load of really easy to use tools to help us and other people make the most of our resources.

At the University of Lincoln we already use GitHub for Open Source software projects from both the Online Services Team and the LNCD development group, so it made sense to use it for our RDM documentation as well. The definitive copy of our RDM policy and training materials can now be viewed in the state it was at any given point in time, branched, merged and so-on — but there’s a problem with making documents the Old Fashioned Way that people in the University may be used to. Namely, using Microsoft Word to store a document will cause all kinds of problems for revision management in that Word doesn’t just keep the text, but a whole load of other stuff which is then compressed down into a single binary blob. Using Word would mean that although technically the main features of revision control (versions, branching etc) would work we’d lose some of the more elegant solutions to problems such as line-by-line comparisons of versions and merging of different branches.

A better solution was needed for writing documents, and we ended up with a shortlist of three potential plain-text markup standards. These are ways of marking up a plain text document (such as you’d write in Notepad) with semantic structure and styling so that we can take the document and re-render it in a number of different places. Our three contenders were LaTeX, Markdown and reStructuredText. All three have pros and cons, but have the same basic idea behind the scenes – plain text is surrounded with bits of other plain text that give it meaning. All three result in a document that is fundamentally human readable without the need for any proprietary software, and all three allow for the document to be re-rendered in a form appropriate for the audience.

LaTex is by far the most powerful of the three, having a background in typesetting complex scientific academic papers. It would allow for policy documents to be rendered for both the web and print, but has the downside of being the most complex to use and having a less user-friendly syntax. We want the policy to be as accessible as possible, without needing to understand what a set of tags means.

Markdown and reStructuredText both take a much simpler approach, and use almost identical syntax for most things. However, reStructuredText has a bundle of other markup which mades it better suited to long, structured documents with nested lists. reStructuredText would be ideal if we ever decided to convert the University’s Regulations to a plain text format, but for a simple document such as the RDM policy doesn’t really have any advantage over Markdown.

The tipping point for our decision then lay in the technical implementation of Markdown over reStructuredText. Fortunately this was an easy call, as reStructuredText is very tightly linked into the Python ecosystem whereas Bridge is built entirely in PHP. We could easily drop a PHP library to do Markdown rendering into Bridge, whereas reStructuredText would need additional work to call an external Python library to do the best job of rendering. Should we decide in the future that we need the extra capability of reStructuredText then the migration as far as the document is concerned is virtually non-existent.

You can view our current draft RDM policy in Markdown in our RDM repository on GitHub, as well as fork it and submit pull requests if you want to use it as a basis for your own or propose changes. We will be moving all our training presentations to use a Markdown based in-browser format in the near future.

The Toolchain: First Pass

Today I’ve been kicking around the ICT office with Alex, figuring out how to make Jenkins (our wonderful CI server) build and publish the latest version of the CWD with all the bells and whistles like compilation of CSS using LESS, minification, validation of code and so-on. As part of this we managed to fix a couple of bits and pieces which had been bugging me for a while, namely the fact that GitHub commit notifications weren’t working properly (fixed by changing the repository URI in the configuration) and the fact that Campfire integration wasn’t working (fixed by hitting it repeatedly with a hammer).

This brought me to thinking about how our various things tie in together, so I set about charting a few of them up. After a while I realised the chart had basically expanded into a complete flowchart of the various tools and processes that hang together to keep the code flowing in a steady stream from my brain – via my fingers – into an actual deployment on the development server. Since it may be of interest to some of you, here’s a pretty picture:

This is (approximately) the toolchain I currently use for Orbital, including rough details of what is being passed around

The beauty of this is that the vast majority of the lines happen completely by themselves — I get to spend my days living in the small bubble of my local development server and dipping in and out of Pivotal Tracker to update stories. The rest is magically happening as I work, and the constant feedback through all our monitoring and planning systems (take a look at SplendidBacon for an epic high-level overview) means that the rest of the project team and any project clients can see what’s going on at any time.