Inside the world of chocolate judging

Why judging chocolate is still a human art

Words by Spencer Hyman

At a time when AI is swallowing up everything in its path, it’s refreshing to reflect on one of the few areas – JUDGING AND TASTING – that, for now at least, seems relatively safe. Tasting happens in bodies, in conversation, and in the moment, which makes it stubbornly resistant to being turned into clean digital data points by the relentless rise of Large Language Models.

There are some obvious reasons for this. Taste, texture and (even more) flavour are problematic for AI. For starters, there is no standard language for these systems to consume and then regurgitate – no Pantone-equivalent for flavour. More than that, language doesn’t just describe flavour — it shapes it. The moment someone says “red fruit” or “citrus”, others begin to find it too — not because it wasn’t there, but because attention has been articulated and directed. And then there is the fact that people literally taste differently: genetics, “oral microbiome”, mood and much more all affect perception (for some, even the “time of the month” or the “state of the stars” is thought to be critical). See here for more on this.

As we’re planning our first tasting evening in partnership with the Academy of Chocolate this May, it seems a good moment to linger on some of the more HUMAN aspects of awards and judging. This tasting will share a selection of award-winning bars – including some we’ve sourced from makers we don’t otherwise stock. We’ve already managed to get our hands on bars from New Zealand and Jamaica exclusively for this event! We’ll be discussing what we look for when we’re judging, and why these particular bars achieved their accolades.

The logistics (and politics) of judging

First and foremost, some congrats and thanks. Judging requires a MASSIVE logistical exercise to ensure it’s as accurate and fair as possible. Hats off to Katie and the AoC team for arranging to gather and then redistribute, multiple times, to multiple different people, literally thousands of bars.

The human role in tasting

Above all, though, it’s very HUMAN. Judging always involves very HUMAN preferences and impressions. It also involves lots of HUMAN interactions – a delightfully complex, often messy, interactive process; it’s the antithesis of punching numbers into a calculator or questions into a chatbot. It’s about personal opinions and discussions.

While the end result of a tasting may look like exam grades neatly spat out by a machine, that’s not how it feels as you judge. One super-experienced judge from the Great Taste Awards once gave a wonderfully human way to interpret the different stars: one star means it’s something you’d be happy to purchase for your own cupboard; two stars means you’d not only purchase it but also be delighted to gift it; and three stars means you’d be super grateful to anyone who introduced you to the bar/condiment/jam/cheese/whatever. Wine judging adds a further dimension by factoring in price and “typicity” – assessing how “true” a wine is to being, say, a Bordeaux, Burgundy or Australian Shiraz. Underlying all of this is a slightly uncomfortable truth: what you most enjoy is not always what is objectively “best”, and one of the hardest disciplines in judging is separating personal pleasure from perceived quality and from what is “expected”. That’s why, on our wave, we ask you to assess what, and how much, you’ve ENJOYED about a bar separately from its BLIC (balance, length, intensity, complexity).

Chocolate, typicity and frameworks

Chocolate doesn’t have “typicity” in quite the same way that wine, coffee, cheese, hams or mustards do. In many ways, this opens up huge opportunities for craft chocolate makers to express their creativity. At the same time, it can be frustrating for consumers trying to find their “favourite” origins in the way they might have a favourite coffee origin, tea, or grape variety.

Chocolate judging can – and does – borrow from other worlds: common terminologies, flavour journeys and concepts like BLIC (balance, length, intensity and complexity). We all do this slightly differently – for example, here is the framework we (Cocoa Runners) use internally. The approach used by the Academy of Chocolate is structurally similar, and frameworks like these are invaluable. Part of what they are trying to do is bring some structure to something inherently unstable — because even our memory of flavour is fleeting, and judging often involves comparing what is in your mouth now with what is already fading from a few moments ago. The easiest way to evaluate is to compare. Trying two or three bars side by side is FAR more revealing than focusing on one in isolation.

At the same time, palate fatigue is VERY real. Most wine competitions cap the number of wines tasted per day (e.g., 70–80 for IWSC), and even then tasters talk about the need for constant “calibration” so that wine number 60 is judged as fairly as wine number 1. And I was pretty drained after the most recent round of AoC Golden Bean judging (a sort of ‘grand jury’ to pick out a winner from the best of the best), which involved “only” just over 30 bars.

Blindness, bias and different formats

The really fun – and challenging – part comes when individual impressions are brought together. There are LOTS of different ways of doing this. They all start in the same way: with you personally tasting. Almost always you taste “blind” (we try to do this even when tasting new bars in our office). But even then, we are never completely blind — texture, colour, snap and style all provide subtle cues, and with them expectations inevitably creep in.

In addition, you have to try to put aside your personal “preferences” (Gen has a penchant for some floral notes in her bars; I’m more partial to bright berry notes and love astringency). This is where BLIC and the flavour “wave” are invaluable. Even though a bar may not be exactly to your taste, you can (and should) still judge how well it’s been grown and crafted in terms of balance, length, intensity and complexity.

Beyond that, there are many ways to structure a tasting:

  • Sometimes you taste a single bar (or wine, or jam) and then immediately discuss it; other times you wait until you’ve evaluated a whole “flight”.
  • “Flights” themselves can vary: only dark bars, or a mixture of different styles; lined up randomly, or in order of percentage or origin.
  • Then there’s the scoring: is it simple arithmetic, or do you look for consensus on a shortlist of bars and then start again?

Each approach has trade-offs. Some formats encourage independent impressions before discussion; others lean more heavily on conversation and consensus. All of them, though, depend on humans staying present, attentive and honest about their own biases.

Etiquette, influence and practice

One other important note: it takes practice. Like swimming, running or learning a new language, tasting is a skill that develops with attention, vocabulary and repetition. You can get better at noticing senses that are “instinctive” – for example, separating sour from bitter, appreciating texture, parsing acidity, understanding astringency (and how this differs from bitterness…). For flavours (and smells) in particular, this really is like learning a new language (see here for more on our flavour wave).

In all the chocolate tastings and judgings I’ve been part of, what makes a great session is some very basic etiquette (and common sense):

  • Be friendly.
  • Be respectful.
  • Listen to others.

Or, put more bluntly: don’t be too noisy, don’t be too performative, don’t be too “opinionated” or insistent. And yes, don’t wear strong perfume and try not to cycle super fast if you are going to arrive hot and sweaty. One of the quiet risks in group judging is influence — a confident voice early on can tilt the room. The best tastings are those where independent impressions are allowed to form before any consensus begins to emerge.

I’m particularly grateful for a piece of advice given to me by Sarah Jane Evans at one AoC Grand Jury tasting, where she spotted a non-verbal “tell” I have when I’m really enjoying a bar. In case I forget this advice and inadvertently perform this “give-away” sign again, I’m not going to tell you what it is! But her point was clear: try not to broadcast your reaction too loudly, especially early on.

As I was once told at a tasting of sake by a super quiet but super smart Japanese sake expert:

能ある鷹は爪を隠す
(Nō aru taka wa tsume o kakusu)

Or, to put it in English:
“A smart hawk doesn’t show off its claws.”

Why this matters now

Above all, judging should be fun and HUMAN. It’s a great way to meet people, try some amazing bars, and learn from others. It is also a quiet act of resistance to the idea that everything important about food and pleasure can be captured, scored and optimised by machines.

Tasting is social. It’s a conversation — between palate and memory, between people around a table, between makers and judges who may never meet. It’s precisely because taste is unstable, embodied and socially negotiated that these shared experiences matter so much.

And back to AI … for now, it’s not “eating our lunch” (or bars). Amid all the AI tasting models, data-driven ‘flavour maps’ and predictions of massive change, these judgings remind us that taste remains a HUMAN dialogue, not a digital dataset.

I hope to see many of you at our upcoming tasting. We’re super excited as we’ve managed to source some really special bars, from all corners of the globe, from makers we can’t otherwise buy from. It’s going to be a unique chocolate adventure!