As I’ve previously discussed, I’ve been running a side-by-side trial of the CGM systems currently available in the UK. It focused on those that are, or will very soon be, on the NHS prescription tariff, and should therefore be available via your UK GP.
I also added in the Medtrum Nano, an alternative whose manufacturer has so far made no mention of doing the same thing, and the Dexcom G6, which is used to drive my closed loop.
During the course of this process, the following CGMs were in use:
- Dexcom G6
- Dexcom ONE
- GlucoRX Aidex
- Glucomen Day
- Medtrum Nano
This is the reference document that is currently being provided within the NHS. As you can see, we are missing the Libre3, Dexcom G7 and Medtronic offerings in this test:
The intention was to use the same reference data for all the sensors to provide a comparison.
Unfortunately, during the course of the two weeks, I lost sensors.
Libre2 fell off quickly, and the replacement didn’t arrive in time to participate. As a result, a previous set of data using the same fingerprick device has been used to provide an error grid.
Glucomen Day fell off after 4 days, so a spare was applied and the trial continued.
The first set of error grids shows all the data collected during the period for each of the devices. I’ll also show the MARD vs fingerpricks (MARDf).
I’ve included a MARDf value excluding Day 1 as well, as our experience with Libre and Dexcom tells us that Day 1 is generally worse.
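For anyone wanting to reproduce the Day 1 exclusion, a minimal sketch looks like this. It assumes matched (timestamp, sensor, fingerprick) tuples and a 24-hour cutoff from sensor start; both the data layout and the cutoff are my assumptions for illustration, not a description of the exact processing used here.

```python
from datetime import datetime, timedelta

def exclude_day1(pairs, sensor_start):
    """Drop matched pairs recorded within 24 h of sensor start.

    pairs: list of (timestamp, sensor_mmol, fingerprick_mmol) tuples.
    Returns (sensor, fingerprick) pairs from Day 2 onwards.
    """
    cutoff = sensor_start + timedelta(hours=24)  # assumed Day-1 boundary
    return [(s, f) for t, s, f in pairs if t >= cutoff]

start = datetime(2022, 9, 1, 12, 0)
pairs = [
    (datetime(2022, 9, 1, 18, 0), 6.0, 5.5),  # Day 1 -> excluded
    (datetime(2022, 9, 3, 9, 0), 7.2, 7.0),   # kept
]
print(exclude_day1(pairs, start))
```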
Full dataset error grids
This initial error grid overlays data from all the sensors to try and give an idea of how the different sensors compare.
What it tends to highlight is that the three newcomers (Medtrum, Glucomen and GlucoRX) showed more instances of dispersed readings at high and low glucose levels than the incumbents (Dexcom and Libre). It also shows that, in general, most of the sensors appeared to read slightly higher than the fingersticks.
The individual plots make it easier to see what’s going on.
As mentioned, the Libre2 data is from a different dataset so shouldn’t be considered representative for this test, and is included for illustration only.
What’s clear from all the individual plots is just how widely the data points are dispersed on pretty much every sensor other than those from the big providers.
What’s perhaps more concerning is the number of points in Zone C, where the blood glucose level is much lower than the sensor-indicated reading, which could present a significant risk to the user.
This is much higher for the newcomers.
Suffice to say, these error grids clearly show the differences found in using the sensors.
MARD vs Fingerpricks
Based on the data captured during the 10 or 14 days of wear for each of the sensors, I’ve calculated a MARD value versus the fingerprick data from a Contour Next One meter. As mentioned previously, this is described as MARDf. Also included in the table are the manufacturers’ stated MARD values, which compare the sensor against a Yellow Springs Instruments analyzer.
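For those curious about the arithmetic, MARD is simply the mean of the absolute relative differences between paired readings, expressed as a percentage. A minimal sketch, assuming the sensor readings have already been matched to their nearest-in-time fingerpricks (this is illustrative, not the exact script used here):

```python
def mard(pairs):
    """Mean Absolute Relative Difference, as a percentage.

    pairs: list of (sensor_mmol, fingerprick_mmol) tuples, where the
    fingerprick is treated as the reference value.
    """
    if not pairs:
        raise ValueError("no matched pairs")
    return 100 * sum(abs(s - f) / f for s, f in pairs) / len(pairs)

# Example: three matched readings in mmol/l
pairs = [(6.2, 5.8), (9.1, 8.5), (3.6, 4.0)]
print(round(mard(pairs), 1))  # prints 8.0
```

The same function produces MARDf when the reference values are fingerpricks, and a lab-style MARD when they come from an analyzer.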
As can be seen here, the dispersion shown in the error grids is reflected, unsurprisingly, in the MARDf values that came about from this test.
It’s worth noting that the Dexcom 9.0% value (as reported in their accuracy study of a factory calibrated sensor) is a composite number made up of both adult outcomes (9.8%) and paediatric outcomes (7.7%). It’s also worth noting that the same data was used in both the G6 and ONE manuals, further cementing the point that the glucose sensing and translation components are identical.
Previously, the values were:
Based on this, I don’t consider the numbers this time around to be an aberration; in fact, they’re very much in line with previous experiments. In many ways, this is disappointing.
All of this analysis comes with the caveat that this is n=1, and that we’re talking about only one or two units of each of these sensors. On this individual, even where nsensor>1, some of them really aren’t very accurate.
Conclusions so far, and additional analysis
Given previous experience of many of these sensors, I was unsurprised, if a little underwhelmed, by the performance I saw when I put them head to head. While there are six sensors, I think describing them as “Six of the best” may be a little generous to some of the offerings.
Before we draw final conclusions on the accuracy and safety of these devices, I’ll have a further dig into the data and produce a view of the results showing what percentage of paired readings fell within the FDA iCGM standard: within 15mg/dl of the reference when the reference value was below 3.9mmol/l, and within 15% above that, for each sensor. That will give a better idea of how they perform against the FDA standard for use with closed loops.
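The per-pair agreement check described above can be sketched as follows. Note this implements only the two-band simplification stated here (±15mg/dl below the 3.9mmol/l threshold, ±15% above it), not the full set of iCGM criteria, and works in mg/dl for convenience:

```python
def within_icgm_15(sensor_mgdl, ref_mgdl):
    """Two-band agreement check: ±15 mg/dl when the reference is below
    70 mg/dl (3.9 mmol/l), ±15% otherwise. A simplification of the full
    iCGM criteria, used here for illustration."""
    if ref_mgdl < 70:
        return abs(sensor_mgdl - ref_mgdl) <= 15
    return abs(sensor_mgdl - ref_mgdl) <= 0.15 * ref_mgdl

def pct_within(pairs):
    """Percentage of (sensor, reference) pairs meeting the check."""
    hits = sum(within_icgm_15(s, r) for s, r in pairs)
    return 100 * hits / len(pairs)

# Example: one pair in band, one out, one borderline-in
pairs = [(180, 160), (65, 85), (100, 102)]
print(round(pct_within(pairs), 1))  # prints 66.7
```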
Thus far, I think we can conclude that while all of these devices provide CGM capabilities, all CGMs are not created equal. However many features the software covers off, if the key sensing component isn’t particularly effective, is it really worth the money?