Be wise about what you put online

21 03 2022

While you have little choice these days about posting your data and code online when you publish, there are some things to consider before putting potentially sensitive data online (modified excerpt from The Effective Scientist).

One aspect of making your data publicly available is the prickly issue of whether your data contain sensitive information.

Of course, there are many different types of ‘sensitive’ information that might accompany the more basic quantitative measurements of your datasets, with perhaps the most common being personal details of any human subjects. For example, if you are a medical researcher and your data are derived primarily from living human beings undergoing some procedure, trial, or intervention, then clearly you are bound by your human ethics approvals not to publish information like names, addresses, or anything that could be used to identify the subjects in your sample. In fact, human ethics approvals generally prohibit any sort of public access to medical data that include personal information; thus, the scientists concerned are being pulled in two different directions: keeping their subjects’ personal information out of the hands of the public, while still making the data available to other scientists.

There are ways around this, such as publishing only generic information online (i.e., by excluding personal identifiers) that could then be linked to the more sensitive data via unique identifiers. In these cases, any other researcher requiring the additional information would have to seek specific permission from the primary researchers, pending additional human-ethics approvals.
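
As a rough illustration of that approach, here is a minimal Python sketch that splits each record into a public part (personal identifiers removed) and a restricted part, with the two linked only by a randomly generated identifier. The function name `split_records`, the `link_id` key, and the example field names and values are all hypothetical; which fields count as sensitive is something your ethics approval decides, not the code.

```python
import secrets


def split_records(records, sensitive_fields):
    """Split records into public and restricted parts linked by a random ID.

    `sensitive_fields` is an assumption of this sketch: the set of keys
    that must not appear in the publicly posted file.
    """
    public, restricted = [], []
    for record in records:
        # A random token reveals nothing about the subject it labels.
        link_id = secrets.token_hex(8)
        public_part = {k: v for k, v in record.items() if k not in sensitive_fields}
        restricted_part = {k: v for k, v in record.items() if k in sensitive_fields}
        public.append({"link_id": link_id, **public_part})
        restricted.append({"link_id": link_id, **restricted_part})
    return public, restricted


# Made-up example data, for illustration only.
records = [
    {"name": "A. Example", "address": "1 Sample St", "age_group": "40-49", "response": 3.2},
    {"name": "B. Example", "address": "2 Sample St", "age_group": "50-59", "response": 2.7},
]

public, restricted = split_records(records, {"name", "address"})
```

Only `public` would go online; `restricted` stays with the primary researchers and is released on request once the relevant approvals are in place. Note that stripping names and addresses is not always enough on its own, because combinations of the remaining fields can sometimes still identify a person.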

