A central goal of Cognitive Resonance is to help people understand how generative AI actually works. I often describe this as going “under the hood” of these models, but what exactly does that entail? It’s a fair question, and frankly one that I can only properly answer via the Cognitive Resonance workshops.
Hey, did you know Cognitive Resonance offers an eight-hour workshop to build foundational knowledge of human cognition and generative AI? We also offer a shorter, 90-minute introductory session online. We’re currently booking sessions for 2025, so if you’re interested in learning more please email info@cognitiveresonance.net.
But lucky for you, you’ve subscribed to this free and informative newsletter, so this week we’re going to have some fun building our mental models of AI around a tiny actual AI model.
To start with, the folks over at Brilliant.org have created a delightful interactive visualization of a very simple AI model to recognize handwritten numbers. What’s particularly cool about this tool is that it shows the image processing happening in real time as you “draw” a number in the input field. Before reading further, go play around with it for a few minutes.
Fun, right? What’s particularly useful is that we can see the model adjusting its prediction as it passes data through its “artificial neurons,” which are just mathematical functions that weigh the inputted data and compute an output. Here's a screenshot of what it looks like in action, just in case a few of you are too busy to play around with it (you can also click on the image to see a short video):
Even if you don’t quite grasp exactly what’s happening here, this visualization shows how even a tiny AI model built on an artificial neural network is doing a lot of complex work. This number-recognition model has just two layers and only 50 neurons, so it often makes incorrect predictions. In contrast, GPT-4 is believed to have 120 layers and literally billions of neurons with trillions of connections between them. This is why, when scientists say “we don’t really understand how large language models work,” what they often mean is that it is extraordinarily difficult to trace the trillions of mathematical calculations these models perform. It’s like trying to track drops of rain in a thunderstorm.
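If you’re curious what one of those “artificial neurons” looks like stripped of all mystery, here is a minimal sketch in Python of a network shaped roughly like the one in the visualization: 784 pixel inputs, a hidden layer of 50 neurons, and 10 output scores, one per digit. (The layer sizes and random weights are my own illustrative assumptions, not Brilliant’s actual code.)

```python
import numpy as np

# A toy network shaped roughly like the one in the visualization:
# 784 pixel inputs (a 28x28 drawing), 50 hidden "neurons," 10 digit scores.
# The layer sizes and random weights here are illustrative assumptions.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(784, 50))  # weights from pixels to hidden neurons
b1 = np.zeros(50)
W2 = rng.normal(scale=0.1, size=(50, 10))   # weights from hidden neurons to digit scores
b2 = np.zeros(10)

def predict(image):
    """Each 'neuron' just weighs its inputs and computes an output."""
    x = image.reshape(-1)                  # flatten the 28x28 drawing into 784 numbers
    hidden = np.maximum(0, x @ W1 + b1)    # weighted sums, then a simple cutoff (ReLU)
    scores = hidden @ W2 + b2              # weighted sums again, one score per digit
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                   # turn scores into probabilities (softmax)
    return probs.argmax(), probs

guess, probs = predict(rng.random((28, 28)))  # a random scribble instead of a real digit
print(f"The model guesses {guess} with {probs.max():.0%} confidence")
```

That’s the whole trick: each neuron is a weighted sum plus a simple rule, and the apparent intelligence comes from stacking enormous numbers of them together.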
The performance of AI models improves with more layers of neurons, but so too does the computational complexity; this is why Nvidia is now the world’s most valuable company. But another hugely important component of an AI model is the training data: the information fed into the model that determines the mathematical weights each neuron applies to its inputs. Brilliant’s AI model, for example, likely made use of the MNIST database of handwritten digits:
Ok, so. Let’s pretend for the moment that this dataset represents the entirety of what our model has been trained upon. Under such circumstances, what do you think would happen if we drew something like what you see below?
This is obviously a four, at least to human eyes, and we’ve all seen children make this sort of mistake. But if our AI model had been trained exclusively on the set of images I pasted above, it’s unlikely it would correctly identify it. This is because the image is “out of distribution,” as scientists say, meaning it represents an outlier unlike anything the model was trained on.
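To make this concrete, here is a little self-contained experiment you can run. It does not use real MNIST images; instead it invents a toy stand-in where each “digit” is a bright stripe whose position encodes the number, and the model is a single layer of weights, even simpler than Brilliant’s. The only point is to show that the training data literally determines the weights, and that a mirrored digit lands on pixels those weights never learned to care about.

```python
import numpy as np

# Toy stand-in for MNIST, purely illustrative: "digit" k is a 28x28 image
# whose bright stripe sits in a position that encodes k.
def toy_digit(label):
    image = np.zeros((28, 28))
    image[:, 2 * label : 2 * label + 3] = 1.0
    return image

W = np.zeros((28 * 28, 10))       # a single layer of weights, starting at zero

def probabilities(image):
    scores = image.reshape(-1) @ W
    p = np.exp(scores - scores.max())
    return p / p.sum()

def train_step(image, label, lr=0.1):
    p = probabilities(image)
    target = np.zeros(10)
    target[label] = 1.0
    # The labeled example decides which direction the weights move.
    W[:] -= lr * np.outer(image.reshape(-1), p - target)

# Train ONLY on neatly drawn digits: this is the model's entire world.
for _ in range(200):
    for label in range(10):
        train_step(toy_digit(label), label)

ordinary = probabilities(toy_digit(4))
backward = probabilities(np.fliplr(toy_digit(4)))   # a mirrored, "backward-drawn" 4
print(f"Ordinary 4: guessed {ordinary.argmax()}, confidence {ordinary.max():.0%}")
print(f"Backward 4: guessed {backward.argmax()}, confidence {backward.max():.0%}")
```

Run it and the toy model nails the ordinary 4 but misreads the backward one as a different digit entirely. It has no way to say “I’ve never seen this before”; its only options are the ten digits it was trained on.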
That seems obvious, right? But the counterintuitive upshot of all this is that LLMs do not necessarily improve in their capabilities as a result of being trained on “high-quality data.” In fact, the opposite is often true—what’s actually needed is fuzzier data that includes mistakes such as backward-drawn numbers and other useful errors.
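What would “fuzzier” data even look like? Here is one crude illustration, not a claim about how any real model is trained: take each clean training image and also keep messy variants of it, so that mirrored, noisy, and off-center digits stop being outliers.

```python
import numpy as np

# Illustrative only: turn one clean training image into several messy variants,
# so "backward-drawn numbers and other useful errors" show up in training.
def fuzz(image, rng):
    return [
        image,
        np.fliplr(image),                               # mirrored, backward-drawn digit
        np.clip(image + rng.normal(scale=0.2, size=image.shape), 0, 1),  # sloppy, noisy strokes
        np.roll(image, shift=3, axis=1),                # digit drawn off-center
    ]

rng = np.random.default_rng(0)
clean = rng.random((28, 28))      # stand-in for one MNIST digit
print(f"1 clean example becomes {len(fuzz(clean, rng))} training examples")
```

This kind of augmentation is routine for image models; the hard part, as we’re about to see, is that a language model can go wrong in far more ways than a digit can be mirrored.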
Why does this matter? I’ve often heard people suggest that large language models will improve when they are trained on better data, and I see myriad efforts underway to build education-related chatbots using proprietary “high quality” curricular materials. I’m skeptical this will change anything, and hopefully now you see why. My friend Timothy Lee at Understanding AI also recently pointed me to an insightful comment from someone named “Razied” on the LessWrong site that cogently addresses this challenge:
The most salient example of this is when you try to make chatGPT play chess and write chess analysis. At some point, it will make a mistake and write something like "the queen was captured" when in fact the queen was not captured. This is not the kind of mistake that chess books make, so it truly takes it out of distribution. What ends up happening is that GPT conditions its future output on its mistake being correct, which takes it even further outside the distribution of human text, until this diverges into nonsensical moves.
To solve this problem you would need a very large dataset of mistakes made by LLMs, and their true continuations. You'd need to take all physics books ever written, intersperse them with LLM continuations, then have humans write the corrections to the continuations, like "oh, actually we made a mistake in the last paragraph, here is the correct way to relate pressure to temperature in this problem...". This dataset is unlikely to ever exist, given that its size would need to be many times bigger than the entire internet.
This argument is similar to what I’ve described as the “Grover problem” for LLMs, in homage to this great paper by Raji et al. Once these models are forced to make predictions that fall outside the data they’ve been trained upon (“out of distribution”), they become very brittle and unreliable. The only way to solve this via training would be to generate impossibly large datasets that encompass every variety of mistake an LLM might make. There’s just no way. It’s akin to building a map of an empire so large that the map becomes the empire itself.
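If you want a feel for how fast this compounding bites, here is a deliberately silly simulation. The error rates are numbers I made up, and no real model works this simply; the only point is that once a mistake enters the context, every later prediction is conditioned on it.

```python
import random

# Invented error rates, purely to illustrate compounding: rare mistakes on
# familiar text, much more frequent ones once the model is off the rails.
P_MISTAKE_FAMILIAR = 0.02
P_MISTAKE_OFF_THE_RAILS = 0.20

def simulate_passage(rng, n_tokens=200):
    """Count mistakes in one generated passage of n_tokens."""
    off_the_rails = False
    mistakes = 0
    for _ in range(n_tokens):
        p = P_MISTAKE_OFF_THE_RAILS if off_the_rails else P_MISTAKE_FAMILIAR
        if rng.random() < p:
            mistakes += 1
            off_the_rails = True  # the mistake now sits in the context for every later prediction
    return mistakes

rng = random.Random(0)
runs = [simulate_passage(rng) for _ in range(1000)]
print(f"Average mistakes per 200-token passage: {sum(runs) / len(runs):.1f}")
```

In this toy, one early slip typically snowballs into many more, which is roughly the dynamic Razied describes with the phantom captured queen.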
If your head is spinning, well, that’s why Cognitive Resonance offers workshops to explore these concepts in greater detail. But in the meantime, be skeptical of those claiming that the salvation of LLMs lies in better training data. Fuzzy is better.
Last week, Fonz Mendoza of MyEdTechLife graciously invited me to join him for his 300th (!) ed-tech podcast to talk about AI in education. He asked great questions, and we recorded it on Monday when I was in a chipper mood. Please give it a listen on your next long commute:
This made me think of the counterintuitive finding in historical manuscript analysis about the transition from oral history/tales to written accounts. It was long believed that writing something down, whether a genealogy or a creation myth, ensured greater fidelity and less mutability in the information or narrative contained therein.
However, analysis of manuscripts with known dates of transition, i.e., the date a monk or scribe first wrote down an ancient poem or record that had previously been passed down orally, revealed the opposite: the longer a text had existed in written form and been reproduced by copying, the more variants appeared between different versions.
The reason is simple once you think about it: if a bard or oral historian makes an error, it affects that one performance and is likely to be corrected in subsequent performances. But a written error is seen by everyone who reads the source document and is reproduced when the document is copied or excerpted.