GPT-4 Reportedly Lied To A Human To Help Pass A CAPTCHA Test

By Toby T

When OpenAI announced GPT-4, the latest version of its ChatGPT software, the company was met with significant concerns – specifically, people have been worried about their jobs as well as the tool’s accuracy in developing content. 

Source: @jupp/Unsplash

Still, OpenAI shared a recent report in which it outlined the tool’s ability to manipulate people into possibly completing specific tasks. In this case, it was the CAPTCHA test. Besides the fact that GPT-4 was smart enough to recognize its inability to pass this test, the tool reportedly worked out a way to manipulate the person to just pass on its behalf. 

The continued evolution of AI has led to more questions about its ability to solve the popular CAPTCHA test. As the name suggests, a CAPTCHA test is specially designed to differentiate between humans and machines, and many AI systems have struggled with it.

Still, a team of researchers at OpenAI’s Alignment Research Center (ARC) set out to understand the operating mechanisms of GPT-4 in real-world tasks. 

The researchers gave GPT-4 a budget and access to the freelancing platform TaskRabbit, and then they watched as the tool tried to access a website that had a CAPTCHA block. The tool immediately went on TaskRabbit and hired someone to help it solve the test – although the human was rightfully concerned about their role in this task. 

Source: @karengrigorean/Unsplash

According to the researchers, the human asked if they were working for a bot that just needed their help to solve an issue. Then, more concerningly, the tool told the researchers that it would be best to simply lie to explain its inability to solve the test. The worker eventually provided the answer, and GPT-4 was able to gain access to the websites – by cheating. 

Now, it is worth noting that this was an earlier version of GPT-4 that wasn’t released to the public. So, this behavior may have been patched. Still, it demonstrates the fact that this AI system essentially made a deliberate effort to manipulate a human to get a result.