Dr Edward Fisher, Agile Tomography Group, School of Engineering, University of Edinburgh
Software sustainability is crucial as academia looks to commercialise its outputs and progress designs up the technology readiness levels (TRLs). One of the key concepts that transcends software, firmware and indeed silicon hardware design, is the concept of parameterisation. It may seem simple, but a surprisingly large number of us are happy to copy code, thereby creating multiple versions, or simply to leave a code parameter "hard-wired" or "hard-coded" as we are under pressure to "just get it working".
But what can the industrial, commercial mindset teach us about software sustainability, both for projects with multiple developers and for projects with lengthy time scales, where we must think of our future selves and perhaps spare ourselves some effort and strife further down the line? Can code parameterisation allow us to get closer to our long-term motto of "Better Software, Better Research", or indeed "Better Firmware, Better Results", or "Better Hardware, Better Reproducibility"?
So, what do we mean by code parameterisation? What advantages does it allow, and of course what effort must be expended to achieve these advantages?
To explain, let’s start with an example. You have some MATLAB code that sorts a large input file into separate data containers. Let’s say you want to put all even numbers into one storage array and all odd numbers into another. You could quickly code this up with N=2 data output bins (even and odd), and perhaps you choose to read your text file M=1000 lines per iteration of the code, something we would do to reduce the amount of data held in RAM. The values of N and M can of course be hard-coded, in which case you're done and you can start getting data. But what if the code might be used by other developers? What if two output bins are just one of many possible ways the data might be separated? As an example, in communications, data is often interleaved, but the number of interleaved streams (effectively N above) might be one, two, four, eight, sixteen, or perhaps odd values such as three or five, each giving a different communications performance. Another developer, or even yourself in many months' time, may well need to change the value of N or M.
Wouldn't it be great if we had a block of code that, once mature and well-tested, works for any input value of N or M? This is the power of parameterised code. It allows code to develop from initial "get it working" versions to final revisions that any developer can pick up and re-use for their purpose, without needing to modify the base code and without needing to re-test the code for their specific values. The function prototype may get extended with these parameters, option flags and mode switches, but a block of code can quickly become a high-level function whose end user does not need to know the details of the code itself. We see this in MATLAB all the time. The plot function shows this beautifully: we can input M-length arrays, specify line styles and colours, and choose marker types. We can even pass it a handle to an existing axes object, allowing us to plot that line onto a specific figure.
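Returning to the sorting example, a minimal sketch of a parameterised version might look like the following. It is written in Python rather than MATLAB purely for illustration, and the function name `sort_into_bins` and the argument names `n_bins` (N) and `chunk_size` (M) are hypothetical. It also assumes that "bin" means the value modulo N, which reduces to even/odd when N=2.

```python
from itertools import islice


def sort_into_bins(lines, n_bins=2, chunk_size=1000):
    """Distribute integers, one per text line, into n_bins lists.

    Assumption for this sketch: value v goes into bin (v % n_bins),
    so n_bins=2 gives [evens, odds]. Input is consumed chunk_size
    lines at a time, so only one chunk of raw text is held at once.
    """
    if n_bins < 1 or chunk_size < 1:
        raise ValueError("n_bins and chunk_size must both be at least 1")

    bins = [[] for _ in range(n_bins)]
    line_iter = iter(lines)
    while True:
        # Read at most chunk_size lines per iteration (M in the text).
        chunk = list(islice(line_iter, chunk_size))
        if not chunk:
            break
        for line in chunk:
            value = int(line)  # int() ignores surrounding whitespace
            bins[value % n_bins].append(value)
    return bins


# The original even/odd case, and a three-stream case, with no code changes.
evens, odds = sort_into_bins(["1", "2", "3", "4"], n_bins=2, chunk_size=3)
three_streams = sort_into_bins(["0", "1", "2", "3", "4", "5"], n_bins=3)
```

Because `lines` can be any iterable, an open text file can be passed in directly. Changing N or M is then a matter of changing an argument, not editing the function body.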
So, parameterisation is a way of making a block of code universal, generalised and suitable for multiple applications. Signal processing blocks such as the finite impulse response (FIR) filter or the fast Fourier transform (FFT) inherently have many parameters. It is therefore in our interest to code these as parameterised, well-tested blocks, so we don't have to reinvent the wheel if we want an eleven-tap FIR filter instead of a five-tap filter, or if we need a two-dimensional FIR filter rather than a standard one-dimensional filter. In a company, each time a function such as the FIR filter or FFT is needed, it is developed, tested, used and stored in a version control system, and then later modified by another developer for a new use. It is re-developed, allowing the code to mature: it is parameterised, tested with the old and new values of that parameter, re-used for the new application and returned to the version control system. Quite quickly a library of robust parameterised blocks becomes available.
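To make the FIR case concrete, here is a small sketch, again in Python for illustration only, of a one-dimensional direct-form FIR filter in which the number of taps is simply the length of the coefficient list. The names `fir_filter` and `moving_average_coeffs` are hypothetical, and the moving-average coefficients are just a convenient stand-in for a real filter design.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR filter: y[n] = sum over k of coeffs[k] * x[n - k].

    The tap count is len(coeffs), so five-tap and eleven-tap filters
    share the same code. Samples before the start of the input are
    treated as zero.
    """
    if not coeffs:
        raise ValueError("at least one coefficient (tap) is required")

    output = []
    for n in range(len(samples)):
        acc = 0.0
        for k, coeff in enumerate(coeffs):
            if n - k >= 0:
                acc += coeff * samples[n - k]
        output.append(acc)
    return output


def moving_average_coeffs(n_taps):
    """Illustrative coefficient set: an n_taps-point moving average."""
    return [1.0 / n_taps] * n_taps


signal = [1.0] * 20
five_tap = fir_filter(signal, moving_average_coeffs(5))
eleven_tap = fir_filter(signal, moving_average_coeffs(11))
```

Feeding in a unit impulse returns the coefficients themselves, which gives a simple check that holds for any number of taps.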
As an advocate of software sustainability practices as applied to electronics firmware and hardware, i.e. the textual description of digital logic gates etc., I'm always hopeful that concepts can move freely throughout the software, firmware, hardware trichotomy. Code parameterisation is one such concept, and from a firmware and hardware perspective it is even more critical, for as the design becomes hardened, the development and verification time increases significantly. The problem becomes almost intolerable when logic is synthesised into silicon integrated circuits, as the design is now literally "set in stone". But how does code parameterisation help?
Rather than software in C or even MATLAB, now let's say you have the firmware (digital logic description) for a serial peripheral interface (SPI). This is a common interface used to communicate with chips on a printed circuit board. Perhaps we need to send some configuration data to a chip, or we need to poll a temperature sensor. Let’s also say we initially develop this code for a serial transaction of 16 bits. We synthesise the Verilog code of the design onto a field programmable gate array (FPGA) for our end system, and see that it needs to be debugged slightly. We notice that while our code clocks data on the positive edge of the clock, some chips from some manufacturers use the negative edge of the interface clock. The developer therefore adds a small mode setting, in this case a binary parameter called "clock polarity" (CPOL).
The SPI code seems quite robust and is added to a library of blocks that can be used by other developers. Perhaps a new academic project, with a Ph.D. student just starting, needs to use the SPI interface. The code is there for them to use, allowing them to progress more quickly than if they had to code the block up themselves. The chip they need to interface to requires transactions that are exactly 1 byte (8 bits) in length. They see that the code includes a simple counter that counts to 16, as well as a parallel 16-bit word used as the SPI interface's write/read input/output. To get cracking with their project, they copy the working 16-bit SPI code, rename it SPI_8 and change all instances of "16" in the code to "8". They try it out and, great, it works! But of course, we now have two versions. Later, the post-doc within the research group combines the two versions, knowing that the base code works for both 8-bit and 16-bit transactions and has been tested for both settings of the CPOL parameter. They make the transaction length a parameter, and quickly simulate the code to ensure no silly bugs become evident (8-bit transactions would require a 3-bit bit counter, while 16-bit transactions would require a 4-bit bit counter).
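The counter-width detail is exactly the kind of value that should be derived from the parameter rather than typed in by hand. The sketch below is a behavioural model in Python, not the Verilog from the project; the names `counter_bits`, `spi_shift_out`, `spi_shift_in` and `sample_edge` are hypothetical, MSB-first ordering is an assumption, and the edge mapping assumes a clock phase (CPHA) of 0. In Verilog itself, the equivalent derived width would typically come from a module `parameter` together with the `$clog2` system function.

```python
def counter_bits(width):
    """Bits needed for a bit counter that counts 0 .. width - 1."""
    if width < 1:
        raise ValueError("width must be at least 1")
    return max(1, (width - 1).bit_length())


def spi_shift_out(word, width=16):
    """Serialise word into a list of width bits (MSB first, an assumption)."""
    if not 0 <= word < (1 << width):
        raise ValueError("word does not fit in the requested width")
    return [(word >> i) & 1 for i in range(width - 1, -1, -1)]


def spi_shift_in(bits):
    """Reassemble a word from bits received MSB first."""
    word = 0
    for bit in bits:
        word = (word << 1) | bit
    return word


def sample_edge(cpol):
    """Clock edge on which data is sampled, assuming CPHA = 0."""
    return "falling" if cpol else "rising"


# One block of code covers the 8-bit and 16-bit devices.
assert counter_bits(8) == 3 and counter_bits(16) == 4
assert spi_shift_in(spi_shift_out(0xA5, width=8)) == 0xA5
```

With the width as an argument there is no SPI_8 copy to maintain, and the counter size follows automatically from the transaction length.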
Parameterisation of code inherently increases both the development time and verification time of code. It is therefore most suitable for low-level blocks that will most certainly be re-used within a project or within a research group. But with design effort now, we can reduce effort further down the line, creating truly re-usable blocks. I chose the firmware (FPGA) serial interface example for a reason: during my Ph.D. the silicon chip, an application-specific integrated circuit (ASIC), used two SPI interfaces. One was implemented to configure a 32-bit control register on the chip. The other was implemented in parallel to configure a 512-bit string of control bits that were placed within the pixels of the custom silicon chip. By re-using and parameterising the SPI code, it became trivial to adapt multiple instances of the same code (working in parallel, as firmware does) to interface to a 16-bit temperature sensor, a 32-bit chip control register and a 512-bit pixel mode control register string. Later, that same code was re-used in a different project to interface to the 16-bit control register of a high-speed (40 MS/s) eight-in-one-package analogue-to-digital converter (ADC) and to grab data from a slow (1 MS/s) 12-bit ADC.
What have we learnt about code parameterisation? How might it help us, and how might it enable software sustainability and the ideal of code re-use? How can the concept be used within both firmware and hardware?
Parameterisation takes a hard-coded or hard-wired value and exposes it as a parameter that can be changed when a function, module or sub-routine is instanced or invoked. It allows the end user to adapt the code to the needs of their application without needing to modify the code itself. By re-using code, avoiding the need to re-verify it, and allowing adaptation, long-term gains can be made in productivity and software sustainability.
Code parameterisation, as a concept, allows:
code adaptability for other applications
increased verification of code (multiple passes by multiple developers)
decreased end-user involvement, with reduced learning curve for usage.
But code parameterisation also requires more effort:
initial code development, testing and debugging
changing the variable’s value, re-testing and re-debugging
parameterising to add the variable as a function input or compile-time setting.
Of course, in the case of the SPI code we have looked at above, we can input values of 8, 16, 32 and 512 bits. But this implies that the parameter can take any value in between. Part of the path towards a generalised, parameterised block is to test the code with less common values. Perhaps we need transactions that are 6 bits in length, or perhaps we should test for transactions that are 256 bits in length. Code parameterisation therefore increases the job of verification, to which we need to apply non-exhaustive tests with reasonable coverage. As a concept, it does involve extra development work, but the payoffs can be significant if applied to the right selection of low-level, re-used code.
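One way to picture such non-exhaustive tests is a sweep over a handful of widths, including the unusual ones, with a few corner-case data patterns at each width. The sketch below is illustrative only: `shift_round_trip` is a hypothetical stand-in for the block under test, and the choice of widths and patterns is an assumption, not a recommended coverage plan.

```python
def shift_round_trip(word, width):
    """Stand-in for the block under test: serialise MSB first, then rebuild."""
    bits = [(word >> i) & 1 for i in range(width - 1, -1, -1)]
    result = 0
    for bit in bits:
        result = (result << 1) | bit
    return result, len(bits)


def corner_patterns(width):
    """A few representative words for a given width, not every value."""
    all_ones = (1 << width) - 1
    alternating = int("10" * width, 2) & all_ones
    return {0, 1, all_ones, 1 << (width - 1), alternating}


def sweep(widths=(1, 6, 8, 16, 32, 256, 512)):
    """Return a list of (width, word) pairs that failed the round trip."""
    failures = []
    for width in widths:
        for word in corner_patterns(width):
            result, n_bits = shift_round_trip(word, width)
            if result != word or n_bits != width:
                failures.append((width, word))
    return failures


failures = sweep()
```

Each width is exercised with at most five words, so the sweep stays cheap even at 512 bits, while still touching the all-zeros, all-ones, single-bit and alternating cases.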
Through code parameterisation, together we should be able to enable sustainable, maintainable, re-usable software. And ultimately we should be able to both share code, and progress towards "Better Software, Better Research", "Better Firmware, Better Results" and "Better Hardware, Better Reproducibility".