Starting to write academic code can be a bit intimidating, especially if you don’t come from a coding background - there are a lot of myths and preconceptions around it. This guide is for those who are just starting to write scientific code, and gives a few clear tips for getting started. Hopefully, you will leave more confident not just in your ability to write clear, helpful academic code, but also in the benefits of doing so.
Whilst most researchers write code, most of us haven’t had much formal training in it - if we’re lucky, maybe a module or two at undergrad. This can make branching out from spreadsheets to ‘real’ programming a bit of a mystery. Here are a few tips for getting started:
1. Write easy-to-read code
Code has a reputation for being incomprehensible, but it really doesn’t have to be! Common scientific languages like Python can make it very easy to write clear, simple code that anyone can read. You might see code that looks like:
for m in ms:
But it doesn’t have to look that way! If we use clear, descriptive variable names instead of short ones, suddenly it’s a lot clearer what’s going on:
for month in months:
    total_rainfall = total_rainfall + process_rainfall(
When you try to learn from cryptic, poorly named code, it’s easy to get the idea that writing code is harder than it really is. This goes both ways too: make sure you write code that looks like the kind you’d want to read, to help everyone else out!
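To see the descriptive version run from start to finish, here is a minimal sketch. The rainfall readings are made up, and because the snippet above is cut off, both the body of process_rainfall and the argument passed to it are assumptions for illustration only.

```python
# Hypothetical data: rainfall readings in millimetres for each month.
months = ["January", "February", "March"]
rainfall_readings_mm = {
    "January": [12.0, 3.5, 0.0],
    "February": [8.0, 1.5],
    "March": [20.0, 4.0, 6.0],
}


def process_rainfall(month):
    """Return the total rainfall in mm for one month (a stand-in for real processing)."""
    return sum(rainfall_readings_mm[month])


# Add up the rainfall month by month, using names that say what they hold.
total_rainfall = 0.0
for month in months:
    total_rainfall = total_rainfall + process_rainfall(month)

print(f"Total rainfall: {total_rainfall} mm")
```

Every name here says what it holds, so someone skimming the loop can follow it without needing to look anything up.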
2. Share your code!
Lots of researchers really don’t like sharing their code with others. Usually, they’re worried that it’s not up to scratch, and that other people will judge them for it. This is just another angle on the common academic problem of Impostor Syndrome. Many of us are convinced everyone else is doing it right, apart from us.
Nobody started off writing perfect code, and nobody writes perfect code now - every time I look at something I wrote a few years ago I see things I’d have changed. So you won’t be expected to write perfect code either. Getting extra eyes on yours is the best, fastest way to learn about bugs, possible improvements, and even just ways you can write your code to make it clearer.
3. Don’t stress too much about languages
If you don’t know much about coding, you’ll see a lot of arguments about which language is best - each has passionate advocates, and you might worry that picking the wrong one will really set you back. Each language has its own strengths and weaknesses, but there’s no single ‘best language’. Picking up a new language is also a lot easier than learning your first one, so don’t be afraid to try something new if you run up against the limits of your first language. There are lots of courses that can introduce you to new ones!
4. Make your code re-usable
If you’re writing a script to produce a figure for a paper, it can be tempting to just rush it: write something incomprehensible (just like the example in point 1!), save it in a file called figure.py, run it once, and forget about it. Unfortunately, it never actually works out that way: you can guarantee that Reviewer 2 will ask for the axis labels to be made larger, or for the line colours to be changed.
Suddenly, the keystrokes you saved by writing nc instead of negative_charge end up costing you hours of frustration, sifting through a folder of loose scripts trying to figure out which one actually made that figure and how to use it. That’s a lot of effort that can be saved by clearer code and a few lines of comment at the top of a script!
# This script generates the comparative spectra plot for the paper on supernovae with Jones et al
# python figure_compare_spectra.py [path to simulated spectrum] [path to real spectrum]
You’ll end up handing the code to others more than you expect too, like new PhD students so they can duplicate your plots, or to potential collaborators to reproduce your analysis on their data. Plus, more and more journals subscribe to the Open Science model, and encourage sharing of the code used to write the paper.
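One way to make a figure script easier to come back to is sketched below: a description at the top, plus command-line options for the things a reviewer is likely to ask you to change. This is only a sketch under some assumptions. The option names (--label-size, --line-colour, --output), the axis labels, and the two-column text format for the spectra are all hypothetical, and the plotting function assumes matplotlib is installed.

```python
"""Generate a comparison plot of a simulated and a real spectrum.

Usage:
    python figure_compare_spectra.py SIMULATED REAL [--label-size N]
        [--line-colour COLOUR] [--output FILE]
"""
import argparse


def parse_arguments(argument_list=None):
    """Read the file paths and plot settings from the command line."""
    parser = argparse.ArgumentParser(
        description="Plot a simulated spectrum against a real one."
    )
    parser.add_argument("simulated_path", help="path to simulated spectrum")
    parser.add_argument("real_path", help="path to real spectrum")
    parser.add_argument("--label-size", type=int, default=12,
                        help="axis label font size in points")
    parser.add_argument("--line-colour", default="black",
                        help="colour of the simulated spectrum line")
    parser.add_argument("--output", default="compare_spectra.png",
                        help="file to save the figure to")
    return parser.parse_args(argument_list)


def load_spectrum(path):
    """Load a spectrum, assuming two whitespace-separated columns per line."""
    wavelengths, fluxes = [], []
    with open(path) as spectrum_file:
        for line in spectrum_file:
            # Skip blank lines and comment lines.
            if not line.strip() or line.startswith("#"):
                continue
            wavelength, flux = line.split()[:2]
            wavelengths.append(float(wavelength))
            fluxes.append(float(flux))
    return wavelengths, fluxes


def plot_spectra(settings):
    """Draw both spectra on one set of axes and save the figure."""
    # Imported here so the argument parsing above works without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(*load_spectrum(settings.simulated_path),
            color=settings.line_colour, label="Simulated")
    ax.plot(*load_spectrum(settings.real_path), label="Real")
    ax.set_xlabel("Wavelength", fontsize=settings.label_size)
    ax.set_ylabel("Flux", fontsize=settings.label_size)
    ax.legend()
    fig.savefig(settings.output)


# Example: larger axis labels are a command-line option, not a code change.
example_settings = parse_arguments(
    ["simulated.txt", "real.txt", "--label-size", "16"]
)
print(example_settings.label_size, example_settings.line_colour)
```

In a real script you would finish with plot_spectra(parse_arguments()) under the usual if __name__ == "__main__": guard, so that running the file from the command line produces the figure.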
5. Code is Science
It’s tempting to assume that your code isn’t really science. Nobody’s interested, so you can just give it a brief mention and move on. But that’s short-selling yourself, and the field! If you read an experimental paper with a methods section that just said “We analysed these samples in our lab with chemicals”, you’d be a little concerned - you can’t really reproduce the paper without that detail, and if it’s not reproducible then is it really science?
New developments like executable papers really drive this home - letting readers reproduce the paper’s analysis directly themselves. Websites like Zenodo also let researchers get DOIs for specific versions of their code, allowing you to cite exactly how you generated your results, as well as making it easy for you to get citations from other people using your code. The more code is shared and reused, the faster your field can progress - standing on each other’s shoulders.
Code doesn’t have to be complicated! No matter how complicated the science behind the code is, you can always write the code that executes it in a clear, straightforward way that’s friendly to you and your collaborators - and that’ll make it easier for them to use it (and cite you). It’s easy to become a part of an open community of researchers who share code with each other, to the benefit of everyone.