Last week Joss and I had a chat with one of our engineering researchers about the kinds of data he handles in his research. This was an incredibly useful meeting, leading to a whole bunch of notes on data types, requirements and workflows. The one I’m taking a look at today is the flow that data takes from its source, through storage and processing, and into a useful research conclusion.
The existing workflow looks something like that shown above. Source data is manually transferred (often using ‘in-the-clear’ methods) from its point of origin to local storage on a researcher’s machine, where it will reside on the hard disk until it’s used. From there the data is processed (Engineering love using MATLAB, as do a lot of other science disciplines, so that’s the example here) and potentially the results of that analysis are recombined with the local storage for further work. At some point the processing will arrive at conclusions for the data, and from those an output can be drawn.