ChatGPT and the power of Artificial Imagination
If I give you a sentence with some words missing, can you fill in the blanks? Let’s try:
“We’re heading to the beach next weekend, and hoping for ________”
With no other context, I’m willing to bet you can quickly come up with a plausible word or phrase to fill in the blank. In fact, I bet you can come up with several! Now, maybe you actually are planning to go to the beach next weekend and are hoping for something (plenty of sun, perhaps). However, most of you probably are not. So while I suspect you can come up with a bunch of valid completions to that sentence, none of them makes the sentence true.
Does that make you a liar? No. Are you hallucinating? Probably not. You read the prompt I gave you, you inferred the context in which a person would say or write those words, and you anticipated what a person in that situation was going to say. Put another way, you imagined who the author might be and what their intent was. A lot of you probably jumped to hopes about the weather. But you may have also come up with other valid ideas – good waves for surfing, lots of nifty seashells, light traffic, easy parking, or perhaps for it not to be too crowded.
Now think about this: how did you and I come up with those answers? This is a complicated question, and the best experts in the relevant fields are still working toward a full understanding of the human mind, so I will defer to them on the details. But I think a fair, basic summary is:
It started with your mind having established the context of the situation: you’re reading this article, I said I was going to give you a sentence, and I asked if you could fill in the blank. You then read the sentence, anticipating that you were going to be asked to complete it. As you did, a bunch of neurons in your brain fired in response to what you saw on your screen. They sent messages via synapses to other neurons, activating them. Through a long series of neurons activating other neurons, your brain identified patterns in what you read: shapes you recognized as letters and words, words you recognized for their meanings and connections to concepts – and for their similarity to phrases you’ve heard and said. Even the context surrounding those memories likely lit up in your mind very briefly. In an instant, you associated those words with all kinds of ideas – perhaps even images, remembered or imagined. From all of that, you extrapolated what the rest of the sentence could be. You might even have considered which answers are more likely or common and which are less so.
That’s more or less how “AI” language models like GPT-3 and the variant behind ChatGPT work.
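To make that concrete, here’s a minimal sketch of the fill-in-the-blank machinery. GPT-3 itself isn’t publicly downloadable, so this uses its smaller, open cousin GPT-2 (via the Hugging Face transformers library) as a stand-in; the mechanics are the same. Given the prompt, the model produces a probability for every possible next word (token):

```python
# A minimal sketch of next-word prediction, with GPT-2 standing in for
# GPT-3 (which isn't publicly downloadable). Requires the Hugging Face
# `transformers` library and PyTorch.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "We're heading to the beach next weekend, and hoping for"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# The distribution over every possible next token, given the prompt.
probs = torch.softmax(logits[0, -1], dim=-1)

# Show the five tokens the model considers most plausible.
top = torch.topk(probs, k=5)
for p, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item())!r}: {p.item():.3f}")
```

The model doesn’t pick one “right” answer any more than you did – it assigns a plausibility score to every candidate continuation, weather-related or otherwise.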
Now, these models have no lived experience to draw on the way you do. They’ve never felt the sand or the sun. They’ve never tasted salt water or hurt from a sunburn. But they have read about such things. In fact, that’s really all they’ve ever done – read. They have read, read, and read some more. They’ve read more than any human could possibly read in a single lifetime. They’ve read countless accounts – factual and fictional – of going to the beach. And of endless other human experiences.
I’ve never been on a space station. I’ve never been part of an elite team of children recruited out of elementary school to go to space and learn to zip around in zero gravity – like that of the Battle School described in the novel Ender’s Game. But I can imagine it. In fact, I can imagine things that characters in those books would do in situations never described by the author. I’ve learned and internalized their patterns of behavior, and so I can predict how they’d respond based on their established character traits and what I “observed” by reading those novels.
Models like GPT-3 and friends do the same thing. They learn about things they’ve never done by reading the recorded or imagined experiences of others. That’s the power of language. The most powerful tool invented for programming the human mind is now becoming available to machines.
None of Ender’s Game is real – not the space station, not the characters, and certainly not my ideas about what those characters would do in situations I made up. But it is coherent. That’s the implied underlying objective of (virtually) all human writing. Not truth. Not correctness. Coherence. Plausibility. Ability to be accepted as a valid idea. This objective generalizes to fiction and non-fiction alike. It’s also the objective for which “Large Language Models” like the GPT-x family are trained.
These models take an input just like you did. We call it a “prompt”. They then process that prompt using a massive network of artificial neurons. Through the activation of these neurons they imagine words which could plausibly, coherently follow that prompt.
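You can watch that imagination at work by sampling from those probabilities rather than always taking the single most likely word. A small sketch, again with GPT-2 standing in for its larger cousins:

```python
# Sampling several plausible continuations of the same prompt, with
# GPT-2 standing in for the larger models discussed in this article.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "We're heading to the beach next weekend, and hoping for"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample three different endings instead of always taking the single
# most likely next word; top_p and temperature control how adventurous
# the "imagination" is allowed to be.
outputs = model.generate(
    **inputs,
    max_new_tokens=12,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
    num_return_sequences=3,
    pad_token_id=tokenizer.eos_token_id,
)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```

Run it a few times and you’ll get different, mostly plausible endings – the model is imagining, not retrieving.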
Now, I’m of course not saying that ChatGPT or its underlying LLM is “conscious”, or that its imagination works just like that of a human being. As far as I understand, we don’t yet know enough about how the human mind works to say exactly how similar LLM imagination and human imagination may or may not be. I think we can say that the human mind is capable of much more than imagining words that follow other words – one of a whole host of differences (including how we learn to perform that particular task). I’d argue many of those other capabilities are important for achieving intelligence, which is why I shy away from using that word to describe LLMs. But I’d also argue that imagination of this sort is likely a key ingredient for human-like intelligence. A critical step in the journey toward true AI.
So how do you get from “imagine what the next words might be” to an assistant like ChatGPT? That all comes back to the prompt. In the case of ChatGPT, the prompt is more than what you as a user type into the chat box. The actual prompt given to the model is something along the lines of:
“There is an assistant who can answer questions or fulfill requests posed by a person.
The person’s question or request is: <what you type>
The assistant’s answer is: _______”
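In code, that wrapping is nothing more than string templating. A toy sketch – the exact wording OpenAI uses isn’t public, so this template merely paraphrases the example above:

```python
def build_prompt(user_message: str) -> str:
    # Hypothetical template paraphrasing the example above; the real
    # wording ChatGPT wraps around user input is not public.
    return (
        "There is an assistant who can answer questions or fulfill "
        "requests posed by a person.\n"
        f"The person's question or request is: {user_message}\n"
        "The assistant's answer is:"
    )

print(build_prompt("Why is the sky blue?"))
```

Everything after “The assistant’s answer is:” is then filled in exactly the way the beach sentence was.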
The underlying model isn’t trained to answer questions. It’s trained to imagine words that make sense following other words, and to rank its imagined completions in terms of how likely or common (or positively reinforced by human feedback) each imagined completion might be. ChatGPT isn’t an AI bot programmed to answer your questions. ChatGPT is a figment of GPT-3’s imagination.
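Putting the pieces together, here’s what a bare-bones “assistant” of this kind might look like – a toy only, with GPT-2 once again standing in for the vastly larger, human-feedback-tuned model behind ChatGPT, so don’t expect good answers:

```python
# A toy "assistant" assembled from the pieces above: wrap the user's
# message in a template, then let the model imagine what comes next.
# GPT-2 stands in for the far larger, feedback-tuned model ChatGPT uses.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def ask(user_message: str) -> str:
    prompt = (
        "There is an assistant who can answer questions or fulfill "
        "requests posed by a person.\n"
        f"The person's question or request is: {user_message}\n"
        "The assistant's answer is:"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        **inputs,
        max_new_tokens=40,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Return only the imagined completion, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

print(ask("What should I pack for a weekend at the beach?"))
```

Everything this “assistant” says is simply the completion the model imagined as most plausible to follow the template – which is the whole point.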