Why is the cheese sandwich the downfall of AI carb counting, even with “3D scanning”?

A slightly rhetorical question, I’ll admit, but a pertinent one. Once again, I have been trying out the latest app that someone thinks should work well for carb counting. By “latest”, read “most heavily pushed on Facebook diabetes groups by its founder”, to the extent that using your mum as an advocate (or your mum’s account) is really taking it a little too far.

This time we have Carbetic, whose flagship “3D scanning” claims to provide carb counts for food, backed by a social media marketing campaign that was very much “in your face”. And what do we do when someone does that? We try it out and see what the results are…

And boy is that a claim. So to prove that it’s as good as they say it is, there are two studies planned to test accuracy.

Two clinical accuracy studies are currently in preparation: with Dr. Martin Tauschmann, PhD at MedUni Vienna (children) and with Dr. Katharina Vafiadis-Maruszczak, MSc, PhD at Klinik Wien Landstraße (adults). Carbetic is, however, not a medical device — it is a tool for estimating carbohydrate content.

Which is always a useful way of checking that these things work. Now the other thing that’s important is the replicability of results, as we’ve previously discussed on Diabettech, so I hope that the studies also include something referencing this.

But onto Carbetic. There’s not much information on how it works, other than that it’s AI-based and uses some form of size analysis, “3D scanning” your food from three photos. So we thought we’d do what we always do at Diabettech, and put it to the test to see whether it’s as good as the marketing says.

Testing Carbetic. The method

This is not a particularly in-depth or scientific examination of Carbetic’s capabilities. Most of it stems from making food, following the model provided and seeing how well it analyses what it has been given. It shouldn’t be that hard. We did a small amount of repetition to see if we got the same results from multiple attempts, but that wasn’t really the core of the testing. The core was to see whether the 3D scanning produced decent results.

The photos that were provided included cheese and ham sandwiches on plain bread and on baguettes, because I love a sandwich, and AI carb counting doesn’t seem to be very good at estimating the size of one. They also included various cakes, biscuits, sausage rolls and a couple of cooked meals. And of course, for each thing I ate, I had to take the three photos.

I also explicitly didn’t add info to the photos, again, because what we’re testing here is the machine’s ability to interpret photos, and because we already have enough to do with the three photos, so adding more stuff feels like overkill.

But what you really care about is… So, what happened?

Carbetic. The results.

As mentioned, the cheese sandwich got its place in the sun, along with the following items, and an estimate of the carb counts for each of them based on weighing the items and the content involved. But first, here’s a little hint on the cheese sandwich…

Multiple images of a cheese sandwich in Carbetic with Carb Counts

Below is a table of the foods eaten, the weighed portion sizes of those foods and the carb estimates as evaluated by Carbetic, alongside human-calculated carbs based on weighing the items and using “typical sizing” or the pack nutritional information. I’m not going to claim that the human carb counts are perfect. However, the portion sizes are a good guide as to whether Carbetic’s 3D scanning technology works.

As the table shows, in this small subset of tests, the 3D scanning regularly underestimated the portion sizes of the items presented to it. In fact, on average, Carbetic underestimated the portion size by 17% and the carb content by a whopping 38.5%.
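For anyone wanting to run the same sums on their own tests, here is a rough sketch of how averages like these can be worked out. It assumes the percentage figure is the mean of the per-item percentage errors, which is only one way of doing it, and the function name and the numbers in `example` are made-up placeholders rather than the results from the table.

```python
from statistics import mean, stdev


def error_summary(pairs):
    """Summarise app error from (app_estimate_g, reference_g) carb pairs.

    Negative values mean the app underestimated.
    """
    gram_errors = [est - ref for est, ref in pairs]
    pct_errors = [100 * (est - ref) / ref for est, ref in pairs]
    return {
        "mean_pct_error": mean(pct_errors),    # mean of per-item % errors
        "mean_gram_error": mean(gram_errors),  # mean error in grams
        "sd_gram_error": stdev(gram_errors),   # sample standard deviation
    }


# Placeholder values for illustration only, NOT the measured results.
example = [(20, 40), (30, 40), (45, 50)]
summary = error_summary(example)
print(summary)
```

Averaging the per-item percentages treats a small biscuit and a large meal equally. Dividing total estimated carbs by total reference carbs would weight the bigger meals more heavily and can give a noticeably different figure.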

In terms of “Within Image consistency” (as we discussed here) on our favoured cheese sandwich, it was asked five times to assess the meal. The spread between the lowest and highest of those five estimates was 6g of carbs, which turns out to be 37.5% of the lowest value it estimated and 27.2% of the highest value it estimated.
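As a quick check on that arithmetic, here is a minimal sketch. The lowest and highest values of 16g and 22g are not quoted directly above. They are inferred from the 6g gap and the two percentages, on the assumption that the app reports whole grams, and `spread` is just an illustrative helper name.

```python
def spread(lowest_g, highest_g):
    """Return the gap between two estimates, plus that gap as a
    percentage of the lowest and of the highest estimate."""
    gap = highest_g - lowest_g
    return gap, 100 * gap / lowest_g, 100 * gap / highest_g


# Inferred endpoints (assumption): 16g lowest, 22g highest.
rng, pct_low, pct_high = spread(16, 22)
print(rng, round(pct_low, 1), round(pct_high, 1))
```

With those inferred endpoints the gap comes out at 37.5% of the lowest estimate and roughly 27.3% of the highest, in line with the figures quoted once rounding is allowed for.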

What are your thoughts?

For an app that has been so heavily pushed on diabetes Facebook groups, its performance was disappointing. Whilst this is only a small sample of food tests, a number of things became clear.

  • Taking three photos of your food is, in terms of sitting down for meals with family and friends, intrusive.
  • The major breakthrough of 3D scanning your food with three photos taken from three different angles isn’t as good as I’d hoped. In terms of both portion size and carb count, it underestimated a lot.
  • The carb counting left a lot to be desired. A mean error of -38.5%, or -17g +/- 13.6g, is worse than human performance in trials and also not that different from various generic AI tests (1, 2).
  • The tell of generic AI being involved (the repeatability issue that I’ve previously raised) is present here again, with the same cheese sandwich being photographed multiple times and submitted in quick succession, generating significant differences between carb counts.
  • This is another AI tool that can’t portion size a sandwich. Much like the previous experiments, both the cheese sandwich and the ham and cheese sandwich were significantly underestimated. Given that the last time we saw this, both Claude and Gemini ended up going down that rabbit hole, it suggests that one of those two is responsible for carb counting in this case.

So while it seems like a nice idea, the “3D scan” model deployed here doesn’t seem to have significantly improved the performance of the underlying models, and the carb counting error across this admittedly small sample appears to be worse than in the experiment that we have previously run. It doesn’t appear to be vastly different to Snaq (without LiDAR) in its evaluations of food.

Now the original question was “Why is the cheese sandwich the downfall of AI?”, and the answer appears to be the bread. Across the multiple tests of different AI systems, portion sizing on bread seems to be incredibly difficult, in spite of it being immensely simple to judge from a human perspective. Carbetic seems to fall into the same trap.

But would I pay for the subscription for this? No, I wouldn’t. It has proven not to work very well on foods where I already know what to expect, which would raise questions about every other food I took three photos of.

When AI can recognise, and correctly carb count, a cheese sandwich, then maybe I’ll consider buying in.

