One thing that ICH-GCP emphasises is collecting “high-quality” data. It then goes on to spell out the considerable efforts required to achieve this, focusing on having complete data and on having that data verified against the patient’s medical notes. This is usually done by the trial sponsor sending a trained person called a monitor (aka CRA, Clinical Research Associate) to the site, say a hospital, to track down missing data and to compare the data collected in the trial with the original data recorded in the patient’s notes. These visits happen every few months, so, not surprisingly, monitoring is a major cost component of the trial, along with all of the other checking of data involved. Again, it sounds sensible, doesn’t it? But it’s not.
The reason it’s not sensible is to do with the unique value of randomisation. As you can read more about here, proper random allocation, ideally with blinding, ensures that any difference seen between the treatment groups is the result of the treatments being tested and not caused by other factors. A non-randomised study does not have this unique advantage in assessing treatments: any difference seen between treatments might be due to the treatments, but could also be due to other factors, called confounding. Some of these confounders, like age and gender, are known and can therefore be taken into account in the statistical analysis, but other confounders are unknown and might, rather than the treatments being tested, be the reason for the difference observed.
So, getting back to “high-quality” data, not having a value for every data point might not matter: thanks to randomisation, the proportion of missing data will be the same in both treatment groups, so it might not affect either the reliability of the result or the safety of the participants. If data that is missing is important to either of these, it would be followed up and corrected, possibly remotely or at a monitoring visit. So having complete data, that is, not having any missing data, is not as important as it first seemed.
The second part of “high-quality” data that GCP emphasises is verification, what GCP calls SDV (Source Document Verification), which involves confirming that the values collected for the trial are correct by comparing them with a source document, usually the patient’s medical notes. You frequently see “100% SDV” done in trials, meaning that every piece of data collected in the trial has been checked against an original record for the participant at the site, and this is presented as a badge of “high-quality” data. There are two problems with this. The first is the time, effort and cost that goes into SDV: rather than mindlessly doing 100% SDV, it would be better to determine what data is important to the reliability of the result or the safety of the participants and focus on verifying that. The second is how we do trials now compared with twenty years ago when GCP was written. Most trials now collect data directly into computers, with software that checks the values as they are entered and asks for data-entry errors to be corrected if, for example, the value entered is not possible. If data entered in this way is only collected as part of the trial, then there is no source document in the patient’s notes (although, because of ICH-GCP, trialists might have to do silly workarounds like printing out a copy of the data collected on the computer and putting that in the patient’s notes, or a study nurse having to write the values in the patient’s notes after the visit).
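To make the idea of entry-time checks concrete, here is a minimal sketch of the kind of “edit check” that electronic data capture software runs as values are typed in. The field names and plausibility limits below are invented for illustration; they are not taken from any real trial system.

```python
# Toy "edit check": flag values outside a plausible range at entry time.
# Field names and limits are hypothetical, chosen only for this example.

def check_value(field, value, rules):
    """Return a list of problems with the entered value (empty list = accepted)."""
    problems = []
    rule = rules.get(field)
    if rule is None:
        return problems  # no check defined for this field
    low, high = rule
    if not (low <= value <= high):
        problems.append(
            f"{field}={value} is outside the plausible range {low}-{high}; "
            "please correct or confirm"
        )
    return problems

# Hypothetical plausibility limits for two fields
RULES = {
    "systolic_bp_mmHg": (60, 250),
    "age_years": (18, 110),
}

impossible = check_value("systolic_bp_mmHg", 500, RULES)  # flagged at entry
plausible = check_value("age_years", 45, RULES)           # accepted, no query
```

The point is that a check like this catches an impossible value at the moment of entry, before it ever needs a monitor to spot it months later during SDV.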
In summary, when GCP emphasises “high-quality” data it focuses on having complete (i.e. no missing) data and source document verified (i.e. 100% SDV) data. The problem is that, for the reasons above, this does not directly lead to a “high-quality” trial, which is one that reliably addresses an important question and keeps participants safe. High-quality data might be neither complete nor totally verified, but if trialists have carefully considered what data is important, they will tailor activities like monitoring visits to the data that matters for the reliability of the trial result or the safety of the trial participants. The outcome is a high-quality trial. GCP promotes bad trials by emphasising this notion of “high-quality” data: trialists then run trials to collect complete and fully verified data while forgetting to reliably answer the question the trial was meant to answer. If it sounds as if GCP stops people thinking, it does. If it sounds as if GCP allows bad trials that fail to reliably address the original question, and disguises these as good trials, it does.
If you want to read more about this, my colleague Martin Landray has worked with CTTI over the last few years to promote a new trial concept, “Quality by Design” (QbD), that tries to overcome some of these problems with ICH-GCP. Martin summarises QbD like this: “identify what errors matter, plan to avoid them, and monitor and respond to them accordingly.” In one word, think!
If you have any thoughts or comments, let us know below.
Cartoon courtesy of Gap in the Void