If you’re at least midway through your PhD, you’ve probably already gained a sense of how quickly data can accumulate. And if proactive data management steps aren’t taken early on, it’s all too easy to find your research in a confusing mess. At best this results in a headache and at worst it could necessitate starting some of your experiments over from scratch.
Scary stuff!
Thankfully this can all be avoided by putting good data management practices in place. In this post I’ll run through why data management is such a crucial skill for researchers and share the techniques I used during my own PhD, so you’ll be well equipped to avoid any research data nightmares in future.
Note – While this post was originally written for those working towards a PhD, the techniques are equally applicable to anyone working with research data.
Why Data Management in Research Is Important
Data management is critically important for researchers for three key reasons:
1. To set yourself up for success
You really don’t want to find yourself in a position where you’re at risk of misplacing, or misidentifying either:
- raw data collected from experiments, or
- your accompanying notes which detail the experimental setup and are themselves a source of data.
One of the best ways to ensure that your PhD is pain- and stress-free is to stay organised with all sources of data. It isn’t the most interesting use of your time, but it is a critical one to set yourself up for success.
2. It allows you to accurately recollect experiments over long periods of time
Since PhD projects span a number of years, there can sometimes be long periods (months or even years!) between conducting experiments and using the resulting data for things such as papers or your thesis.
Now, your memory may be better than mine, but I think we can all agree that it’s easy to forget details over such long periods, either in terms of which data files relate to which experiments, or exact details of how you conducted the experiment. This is particularly problematic given you’ll have likely run a number of similar experiments in the meantime. The different experiments can end up blurred in your mind faster than you expect.
Good data management allows you to easily find your original data and accurately describe what took place.
3. For your own sanity!
It’s only too easy to forget what conditions you tested things under and this can be a particular problem when you want to reproduce the results. This is extremely difficult if you don’t know exactly what you did!
Having detailed records accompanying the raw experimental data to refer back to will give you the best chance of seeing the same outcome. And should the results differ, it will help you to eliminate a number of possible causes.
If you don’t manage your data effectively you’ll quickly find yourself at risk of losing or misidentifying experimental results which can jeopardise whole portions of your PhD. For your own sanity please make efforts to manage your data effectively!
How to Manage Your Research Data
So hopefully you’re convinced that it’s important to manage your data, but how do you go about doing so?
Well, at a high level it’s having a system for:
- Storing raw data in a way that makes logical sense to you.
- Capturing the experimental setup at the time you’re conducting the experiment. Alongside details of how you conduct the experiments, I also strongly suggest writing down anything noteworthy that happens during the experiment. Examples could be an unexpected reaction, low cell viability, or temperature levels higher than normal.
- Tracking the series of experiments you’re doing and having a system for ensuring you don’t lose track of the pairing between experimental data and experiment setup. For example, writing the filename for your raw data files in your notes of the experimental setup.

It’s worth saying that while in an ideal world this system would be set up from the start of your PhD, it’s never too late to adopt better methods. So if you know you’re guilty of poor data management make sure to set some time aside in the coming week to start getting on top of things.
I’ll next touch on some practical steps you can take to establish your data management process.
1. Take digital files off shared equipment quickly
Using any shared equipment for your experiments, such as for mechanical testing, analysis, imaging etc? Make sure to copy any digital files off it straight away, so you can be sure that they don’t get deleted or lost.
By doing so immediately you’re also most likely to know exactly which experiment they relate to. If you leave it for days, weeks, or months you may lose track. Which brings us nicely on to…
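If you’re comfortable with a little Python, the copying step can be made less error-prone by checking that each copy matches its original. The following is only a minimal sketch: the function names `sha256_of` and `copy_and_verify` and the folder names are made up for illustration, and it assumes the equipment’s export folder is reachable as an ordinary directory.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source_dir: Path, dest_dir: Path) -> list[Path]:
    """Copy every file under source_dir into dest_dir and check each copy."""
    copied = []
    # Build the full file list first so new copies are never picked up mid-loop.
    for source in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        target = dest_dir / source.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also keeps the modification time
        if sha256_of(source) != sha256_of(target):
            raise IOError(f"Copy of {source} does not match the original")
        copied.append(target)
    return copied
```

A call such as `copy_and_verify(Path("E:/export"), Path("my_project/raw"))` (both paths hypothetical) would leave the originals untouched on the shared machine, so anything is only deleted there once you’re happy with the copies.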
2. Store raw data systematically
To stand any chance of staying sane during your PhD I’d highly recommend storing your digital files in a way that makes logical sense to you. This is one of the most important things to get in place early on, as a disorganised collection of files only becomes more of a nightmare as time goes by.
During my PhD I used the following folder structure:
Subprojects > Dates of experiments > Experiments that day > Equipment used > Data files
In practice this would mean for a given piece of work I’d end up with a folder of all the dates that I conducted experiments on. Then multiple subfolders if I conducted more than one experiment that day. Then for each experiment there would be subfolders if different equipment was used, then finally the actual data files. Each data file would be named in an appropriate way.
Example data file naming convention:
[date:DD/MM/YYYY]_[sample_number]_[experiment_number].csv
A more complicated naming structure may be required for certain experiments. It’s extremely important to make sure that the experiment number in the filename is consistent with whatever you have written in your lab book. Likewise, ensure that any physical samples are well labelled.
Alongside data files, I would usually have a Word document in the overall subproject folder to help me keep track of the different experiments conducted. For instance, which dates I did a certain group of experiments.
I’m not prescribing that you follow exactly this structure yourself. If you have something else which works for you, stick with that. Just make sure that you have a system in place and stay consistent. Definitely make sure you don’t simply end up with lots of files on your Desktop or in your Downloads folder which are not well labelled!
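To make the folder structure and naming convention above concrete, here is a small sketch of how the path for one data file could be generated consistently. Everything named here is an assumption for illustration: the function `data_file_path`, the `experiment_02`-style folder names, and in particular the ISO date format (YYYY-MM-DD), which is swapped in because slashes can’t appear in a filename.

```python
from datetime import date
from pathlib import Path


def data_file_path(root: Path, subproject: str, day: date, experiment: int,
                   equipment: str, sample: int) -> Path:
    """Build Subproject > Date > Experiment > Equipment > data file."""
    # Assumption: ISO dates (e.g. 2021-03-05), since "/" cannot be used in filenames.
    stamp = day.isoformat()
    folder = root / subproject / stamp / f"experiment_{experiment:02d}" / equipment
    # Filename follows date_sample_experiment, matching the convention above.
    filename = f"{stamp}_sample{sample:02d}_exp{experiment:02d}.csv"
    return folder / filename


# Hypothetical example values
path = data_file_path(Path("data"), "knee_study", date(2021, 3, 5),
                      experiment=2, equipment="microscope", sample=7)
print(path.name)  # 2021-03-05_sample07_exp02.csv
```

A side benefit of putting the year first is that folders and files then sort chronologically in a file browser. Whatever format you pick, the experiment number should still match the one in your lab book.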
3. Back up everything
A lesson I’m sure most of us have learned the hard way is to keep everything backed up. This is now easier than ever, with universities almost always having storage available for you, either on the university network or as cloud resources. The University of Bristol, for example, has OneDrive (Microsoft), and while I was at Imperial we had access to Box.

If these don’t provide enough storage then speak to your supervisor. In any case, ensure stuff is backed up. I’ve known of people who had hard drives fail or computers get lost and they lost months of work. Don’t let this be you!
During my PhD I set up my entire Documents folder to be synced to the cloud, which meant that I didn’t have to faff with moving content to another folder or manually backing it up. More than a few times it also proved very useful being able to access files from another computer.
Similarly, if you work on a project involving coding, make sure you’re regularly pushing things to GitHub or a similar hosted version control service. Usually I’ll do this every day or two. Perhaps I’m overly paranoid, but I’d rather be safe than sorry.
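If you ever copy data to a backup location by hand, it can be reassuring to check that nothing was missed. Below is one possible sketch using Python’s standard library; the function name `missing_from_backup` is made up for this example, and it assumes the backup is visible as a normal folder (such as a synced cloud folder or an external drive).

```python
import filecmp
from pathlib import Path


def missing_from_backup(original: Path, backup: Path) -> list[Path]:
    """List files (as relative paths) that are absent from the backup or differ from it."""
    problems = []
    for source in sorted(p for p in original.rglob("*") if p.is_file()):
        relative = source.relative_to(original)
        copy = backup / relative
        # shallow=False compares file contents rather than just size and timestamp
        if not copy.is_file() or not filecmp.cmp(source, copy, shallow=False):
            problems.append(relative)
    return problems
```

An empty list means every file in the original folder has an identical copy in the backup. It doesn’t say anything about extra files that exist only in the backup.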
4. Keep detailed notes
When it comes to noting down what’s happened during an experiment, it’s often a case of the more detail the better. You want to make a note of as much as would be necessary to completely reproduce the experimental setup.
If you always do something the same way then of course you don’t need to note it down, but make sure you’re focussing on everything which changes between experiments.
However, as a minimum I would suggest:
- Date;
- All settings/parameters/variables of the system. For example, how much of a chemical you’ve used, how long you kept something in the oven for, what settings you used on the analysis equipment;
- Noteworthy external factors outside of your control. I’ve known of experiments which were heavily influenced by the humidity of the air in the lab. If this is an impactful variable, I suggest recording it if you can;
- Anything weird which happened;
- What filename(s) the experiment relates to.
If you have all of those aspects written down you should be in a good position in the future to reproduce the experiment if necessary. Furthermore it puts you in a great position to analyse your results in a more informed way.
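If you keep some of your notes digitally, the minimum list above could be captured as a small structured record saved alongside the raw data. This is just one way to sketch it: the class `ExperimentRecord`, its field names and the `_notes.json` filename are all illustrative assumptions rather than any kind of standard.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class ExperimentRecord:
    """Minimum details to note for one experiment (field names are illustrative)."""
    date: str                      # e.g. "2021-03-05"
    experiment_number: int         # should match the number in your lab book
    settings: dict[str, str]       # parameters that change between experiments
    external_factors: dict[str, str] = field(default_factory=dict)  # e.g. lab humidity
    observations: list[str] = field(default_factory=list)           # anything weird
    data_files: list[str] = field(default_factory=list)             # raw data filenames


def save_record(record: ExperimentRecord, folder: Path) -> Path:
    """Write the record as JSON into the given folder and return the file path."""
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"experiment_{record.experiment_number:02d}_notes.json"
    target.write_text(json.dumps(asdict(record), indent=2), encoding="utf-8")
    return target
```

Because the record lists the data filenames, the pairing between notes and raw data is written down in one place, which is the same idea as noting the filename in your lab book.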


I strongly suggest also taking pictures of the setup.

More than once during my PhD having pictures of an experiment saved the day and helped me to both reproduce the setup better than if I’d just had my notes, and enabled me to better understand the results.

5. Digitise paper notes
Paper notes are fragile but they can sometimes be the only option (e.g. in a wet lab). If you are in a position where your notes are on paper, consider whether to make a digital copy, even just by taking an image of them on your phone.
This is especially true of environments where you can’t even have your normal lab book with you. During my PhD I sometimes did work in a biohazard lab where you couldn’t bring in anything, which sometimes resulted in having to temporarily write notes on blue roll…

What to Do if You End Up in a Muddle
Nobody’s perfect and even those with the best intentions can have a few days where best practice falls by the wayside.
I personally had an issue once where I had to figure out which samples were which for a paper. Although I thought I’d kept great notes, I’d forgotten this key detail even though I’d gone to the effort of printing out a diagram of the anatomy (a knee) and marking which samples were which.
I had taken a few pictures, but that still wasn’t enough.

Thankfully I was able to work backwards by piecing together different notes about each sample to figure out which was which. But I got lucky: I could easily have been unable to figure it out.
If you find yourself in a position where there’s some confusion around your data I’d suggest doing the following:
- Relax. Panicking likely isn’t going to help your memory.
- Be methodical. It’s up to you how you do this. If there are certain experiments you can confirm the details of by cross-referencing with your lab book or from other files, note these down. You could consider starting with a new folder system and transfer across files as you confirm which experiments they belong to.
- Use this as a learning exercise and motivation to implement a better system going forwards.
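One way to start being methodical is to list every file you have, with its size and last-modified time, so you can tick files off as you match them against your lab book. The sketch below is only a suggestion: the function name `write_inventory` and the column names are invented for this example. Bear in mind that modification times are only a hint, since copying or editing a file can change them.

```python
import csv
from datetime import datetime
from pathlib import Path


def write_inventory(data_root: Path, inventory_csv: Path) -> int:
    """Write a CSV listing every file under data_root; return the number of files."""
    rows = []
    for path in sorted(p for p in data_root.rglob("*") if p.is_file()):
        info = path.stat()
        modified = datetime.fromtimestamp(info.st_mtime).isoformat(timespec="seconds")
        # The last column is left blank to fill in once an experiment is confirmed.
        rows.append([str(path.relative_to(data_root)), info.st_size, modified, ""])
    with inventory_csv.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "bytes", "last_modified", "confirmed_experiment"])
        writer.writerows(rows)
    return len(rows)
```

Saving the inventory outside the data folder keeps it from appearing in its own listing the next time it is run.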
Summary: How to Master Data Management in Research
As a quick recap, below are the main takeaways from this post.
- Effective data management when conducting research is crucial for success, preventing setbacks and allowing you to maintain your sanity.
- It’s important to establish clear systems for storing and documenting data, using logical folder structures and doing regular backups.
- Make sure to digitise any paper notes and to photograph your experimental setups.
- If you get in a muddle stay cool, calm and collected and organise your data in a methodical manner.
I hope you’ve found this post helpful. While research data management is not a particularly exciting topic it’s a vital skill to have and will set you up for success both for your PhD and beyond.
For further tips on best practice when running experiments you may wish to download my experiment checklist which is part of the free resource library. You can find out more here.

Do you have any other tips you’d like to share for how to master data management in research? Let me know in the comments!
