Before we dive into this week's article, I want to point you to All Tech Is Human's launch of the new Responsible Tech Guide earlier this month. I've contributed to and edited the guide for the last few years, as well as several other reports for ATIH, and I'm always amazed at what the team creates. Check it out below.

I saw a fascinating paper published recently that shows how bad ChatGPT is at basic logic. The paper, which I first saw via Gary Marcus, shows that ChatGPT fails most of the time at inverse logic: logic like "A is B, therefore B is A." ChatGPT, and all language models, consistently fail the "B is A" part of the statement. And as Gary says, this problem has been around for decades, and we haven't been able to solve it.

The paper, titled "The Reversal Curse: LLMs trained on 'A is B' fail to learn 'B is A'," uses the example of asking ChatGPT, "Who is Tom Cruise's mother?" ChatGPT answers correctly with "Mary Lee Pfeiffer." However, when asked the inverse question, "Who is Mary Lee Pfeiffer the parent of?", ChatGPT can't answer correctly. It can't automatically determine the inverse relationship.
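If you want to poke at the Reversal Curse yourself, here's a minimal sketch of the two-question test. It assumes the official OpenAI Python SDK and an API key in your environment; the model name is just a placeholder, so swap in whatever you have access to.

```python
# Minimal sketch of the reversal test, assuming the OpenAI Python SDK (openai>=1.0).
# The model name below is a placeholder, not a recommendation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; use any chat model you have access to
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content


# Forward direction: models usually get this right.
print(ask("Who is Tom Cruise's mother?"))

# Reverse direction: the paper found models fail here most of the time.
print(ask("Who is Mary Lee Pfeiffer the parent of?"))
```

Run it a few times and compare the two answers for yourself.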
It reminded me of an episode of the "Ethical Machines" podcast by Reid Blackman. In episode 3, "ChatGPT Does not Understand Anything," he and his guest Alex Grzankowski talk through a thought experiment called the Chinese Room as a way to demonstrate that ChatGPT doesn't actually understand anything.
In today's article, I want to talk about why this matters. As a note, while I use ChatGPT throughout, the same is true of any LLM, like Bard, Pi, or Claude. This isn't a problem with only ChatGPT.
The Chinese Room
The experiment is this: a person is sitting alone in a room. The room has a machine that sends and receives symbols, and a key that says, "When you receive this symbol, send this symbol." The person inside the room does as instructed. Whenever they receive a symbol, they look it up in the key and respond with the symbol they find there. But the person doesn't know what the symbols mean, just that when they receive X, they send Y.
However, to the person outside the room sending the symbols in, it seems like they are having a conversation. They send a symbol meaning "Hi, how are you?" and get a response meaning, "I'm good, how are you?" The person outside feels like they are communicating with the person inside. But the person inside doesn't know what's being said; they are just following a list of rules.
According to the thought experiment, the person inside the room following instructions to match symbols is the same as a computer program following a complex series of instructions to match symbols. Because the person in the room doesn't understand the meaning behind any of the symbols, neither does the computer program. The computer program doesn't actually understand anything.
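Here is the room as a toy program, just to make the point concrete. The "rulebook" below is invented for illustration; the point is that the lookup produces sensible-looking replies without the program having any idea what the symbols mean.

```python
# A toy Chinese Room: the "person in the room" is just a lookup table.
# It returns plausible replies with no notion of what the symbols mean.
RULEBOOK = {
    "你好吗？": "我很好，你呢？",        # "How are you?" -> "I'm good, and you?"
    "今天天气怎么样？": "天气很好。",     # "How's the weather today?" -> "The weather is nice."
}


def person_in_the_room(symbol_received: str) -> str:
    # No translation, no understanding -- just "when you receive X, send Y".
    return RULEBOOK.get(symbol_received, "？")  # send "?" for symbols not in the key


print(person_in_the_room("你好吗？"))  # from outside, this looks like conversation
```

The person outside gets a perfectly reasonable reply, and the lookup table never "knew" it was being asked how it was doing.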
How ChatGPT Works: Ultra-abridged Version
Generative language models translate the user's words into numbers called tokens, do complex statistical analysis on the relationships between those tokens and the tokens the model was trained on, calculate a statistically likely response (which, at this point, is still a chain of tokens), and then translate those tokens back into words that are returned to the user as the response.
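Here's that pipeline as a drastically simplified toy. The vocabulary and probability table are made up for illustration; a real model learns these relationships from enormous amounts of text and billions of parameters, but the shape of the loop (tokenize, pick a statistically likely next token, detokenize) is the same.

```python
# A drastically simplified sketch: words -> tokens -> "likely next token" -> words.
# The vocabulary and probabilities are hand-written toys, not learned values.
VOCAB = {"how": 0, "are": 1, "you": 2, "i": 3, "am": 4, "fine": 5}
INVERSE_VOCAB = {v: k for k, v in VOCAB.items()}

# Pretend "model": given the last token, the probability of each possible next token.
NEXT_TOKEN_PROBS = {
    VOCAB["you"]: {VOCAB["i"]: 0.9, VOCAB["fine"]: 0.1},
    VOCAB["i"]: {VOCAB["am"]: 0.95, VOCAB["you"]: 0.05},
    VOCAB["am"]: {VOCAB["fine"]: 0.9, VOCAB["are"]: 0.1},
}


def generate(prompt: str, max_new_tokens: int = 3) -> str:
    tokens = [VOCAB[word] for word in prompt.lower().split()]  # words -> tokens
    for _ in range(max_new_tokens):
        next_probs = NEXT_TOKEN_PROBS.get(tokens[-1], {})
        if not next_probs:
            break
        tokens.append(max(next_probs, key=next_probs.get))     # most likely next token
    return " ".join(INVERSE_VOCAB[t] for t in tokens)           # tokens -> words


print(generate("how are you"))  # -> "how are you i am fine"
```

Notice that nothing in there represents what "fine" means; the output is just the highest-probability chain of numbers, converted back into words.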
Fitting it together
ChatGPT is the person sitting in the room doing symbol matching. We are the people outside thinking we are having a conversation because we keep getting symbols back that create the illusion of understanding for us. But all ChatGPT is doing is following a list of incredibly complex rules and producing results based on that list of rules. If you asked ChatGPT to explain how it got a response, you would get a statistical model, not an explanation of the logic ChatGPT used to arrive at the response. It didn't use any logic to arrive at the decision, only math.
ChatGPT has trouble with inverse logic because it doesn't understand the relationship between Tom Cruise and Mary Lee Pfeiffer. While it knows the words "mother" and "son", and that certain other words typically appear alongside them, it doesn't understand that if Tom Cruise's mother is Mary Lee Pfeiffer, then Mary Lee Pfeiffer is automatically the parent of Tom Cruise.
Statistical prediction isn't the same as 'knowing' something. While statistical prediction and "true" are a Venn diagram with lots of overlap, ChatGPT getting an answer correct or incorrect is just happenstance. It's a crude parallel, but I think of the adage, "A stopped clock is right twice a day." While ChatGPT answers correctly far more often than twice a day, the answer is just as much happenstance as glancing at the stopped clock at random times.
So Why Does it Matter that ChatGPT Doesn't Understand Anything?
Someone described ChatGPT as "Automated Mansplaining", and I think it's a great parallel. ChatGPT answers every question with the same overconfidence, and it doesn't know if it's correct or incorrect. In fact, ChatGPT doesn't actually care whether it's correct or incorrect.
How we interact with ChatGPT is influenced by whether we think it will give us correct information, incorrect information, or a mixture of both. ChatGPT convinces us it understands everything we are saying by matching complex symbols well enough to create the illusion of understanding. That illusion makes us lower our guard and stop questioning what it's telling us. We start believing it actually does know all the answers, and we stop asking whether what it's saying is true. That puts us in a bad position when ChatGPT is inevitably wrong - if we don't question it, we are likely to make a mistake that harms us or the people around us - just like this guy.
So what do I do?
The kneejerk response is to not use ChatGPT at all: if you can't trust the information it gives you, what's the point? Well, ChatGPT is correct a lot of the time. Statistical correlation gets us a correct response more often than not. Otherwise, why would we use it?
The better approach to using ChatGPT is to acknowledge its limitations and work within them by following a "Trust but Verify" approach. Think about ChatGPT as that person in your life who's right a lot of the time, but is wrong just often enough that you always wonder if this is the time they are wrong. So you ask for their thoughts, and then you go and verify what they said.
We can use the same technique to use ChatGPT effectively and safely. ChatGPT gives you the right words, and the context those words appear in, so you can search more effectively and get the answers you want faster. Googling has evolved its own proto-prompting: finding the information you want often takes several different search terms before you get the right result. Use ChatGPT's response to shortcut that process and get to the "right" Google terms faster.
To use ChatGPT with a Trust but Verify approach (there's a code sketch of this loop after the list):
Ask ChatGPT your question.
Take ChatGPT's response and pull out the key terms you want to verify.
Go to your search engine of choice, and plug in the key terms from ChatGPT's response.
Review the search results to verify that ChatGPT's response is accurate.
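Here's a rough sketch of that loop in code, assuming the OpenAI Python SDK. The model name and the keyword-extraction prompt are my own placeholders, and the "verify" step deliberately stops at handing you a search URL: reviewing the results stays with you.

```python
# A rough sketch of "Trust but Verify", assuming the OpenAI Python SDK (openai>=1.0).
# Model name and prompts are placeholders for illustration.
from urllib.parse import quote_plus

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; use any chat model you have access to
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content


question = "Who was the first person to walk on the moon, and in what year?"

# 1. Ask ChatGPT your question.
answer = ask(question)
print("ChatGPT says:", answer)

# 2. Pull out the key terms you want to verify.
key_terms = ask(
    "List the 3-5 key facts or terms worth fact-checking in this answer, "
    f"as a short comma-separated list: {answer}"
)

# 3. Plug those terms into your search engine of choice.
print("Verify here: https://www.google.com/search?q=" + quote_plus(key_terms))

# 4. Review the search results yourself -- the human stays in the loop.
```

The last step is the one that matters: the script can get you to the search results faster, but deciding whether ChatGPT was right is still your job.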
Following a Trust but Verify approach is simple, and it lets you get the benefits of ChatGPT, acknowledge its limitations, and mitigate the harm that incorrect information can do to you and others.
ChatGPT also works great for softer tasks, like summarization, brainstorming, and serving as a sounding board for ideas.
ChatGPT is a wonder. It does things we always hoped AI could do but were never sure would happen. But we have to remember its limitations: it doesn't know or understand anything, even though it's right a lot of the time. As long as we keep its limitations in the front of our minds, ChatGPT can be a great tool for us.
Just make sure that you don't F*ck it up.
What do you think about this? Do you have problems remembering ChatGPT doesn’t know anything? Have you been burned by ChatGPT?