Google’s new search feature, AI Overviews, is facing increased backlash after users flagged some inaccurate and misleading answers to questions.
AI Overviews, launched two weeks ago, displays a summary of answers to common questions at the top of the Google Search results page, drawing on various sources from around the Internet.
The goal of the new feature is to help users answer “more complex questions,” according to a Google blog post.
Instead, it has given false answers, such as telling users to put glue on pizza to stop the cheese from sliding off, to eat rocks for their health, or that former US President Barack Obama is a Muslim, a conspiracy theory that has been debunked.
The AI Overviews responses are the latest in a string of examples of chatbot models answering incorrectly.
One study by Vectara, a generative AI start-up, found that AI chatbots invent information anywhere from three to 27 percent of the time.
What are AI hallucinations?
Large language models (LLMs), which power chatbots such as OpenAI’s ChatGPT and Google’s Gemini, learn how to predict a response based on the patterns they observe.
The model calculates the most likely next word to answer your question based on the data it was trained on, according to Hanan Ouazan, partner and head of generative AI at Artifact.
“That’s just how we work as humans, we think before we speak,” he told Euronews.
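To make the idea concrete, here is a minimal sketch in Python of the next-word selection step Ouazan describes. The words and scores are invented for illustration; this is not how Gemini or ChatGPT are actually built, only the general principle of scoring candidate words and picking the most probable one.

```python
import math

# Made-up scores a tiny model might assign to candidate next words
# after the prompt "The capital of France is".
logits = {"Paris": 9.1, "Lyon": 4.3, "pizza": 0.2}

# Softmax turns the raw scores into probabilities that sum to 1.
total = sum(math.exp(score) for score in logits.values())
probabilities = {word: math.exp(score) / total for word, score in logits.items()}

# The model "predicts" the most probable continuation.
next_word = max(probabilities, key=probabilities.get)
print(next_word, round(probabilities[next_word], 3))  # -> Paris 0.992
```

If the training data behind those scores is patchy or skewed, the highest-probability word can simply be wrong, which is the failure mode described below.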
But sometimes, the model’s training data can be incomplete or biased, leading the chatbot to give incorrect answers.
For Alexander Sukharevsky, a senior partner at QuantumBlack at McKinsey, it is more accurate to call AI a “hybrid technology” because the answers chatbots give are “mathematically calculated” based on the data they observe.
According to Google, there is no single reason why hallucinations occur: the model may have been trained on insufficient data, may make incorrect assumptions, or may reproduce hidden biases in the information it draws on.
Google has identified several types of AI hallucination, including incorrect predictions of events that never happen, false positives, such as identifying threats that do not exist, and false negatives, such as failing to detect a cancerous tumor.
But Google admits that hallucinations can have significant consequences, such as a healthcare AI model incorrectly identifying a benign skin lesion as malignant, leading to “unnecessary medical interventions”.
Not all hallucinations are bad, according to Igor Sevo, head of AI at HTEC Group, a global product development firm. It just depends on what the AI is being used for.
“In creative situations, hallucination is good,” Sevo said, noting that AI models can write new passages of text or emails in a certain voice or style. “The question now is how to get the models to understand creative vs. truthful,” he said.
It’s all about the details
Ouazan said the accuracy of a chatbot comes down to the quality of the data set it is being fed.
“If your [data] source is not 100 percent … [the chatbot] might say something that is not right,” he said. “This is the main reason why it hallucinates.”
So far, Ouazan said, AI companies have mostly used web and open-source data to train their models.
OpenAI, in particular, is also entering into agreements with media organizations such as Axel Springer and News Corp and with publications such as Le Monde to license their content so that it can train its models on more reliable data.
For Ouazan, it’s not that AI needs more data to formulate accurate answers; it’s that models need high-quality source data.
Sukharevsky said he’s not surprised AIs make mistakes — they have to, so the people running them can refine the technology and datasets as they go.
“I think it’s a journey at the end of the day,” Sukharevsky said. “Businesses don’t have good customer service from day one either,” he said.
A Google spokesperson told Euronews Next that many of the AI Overviews examples circulating were “unusual queries”, while others were doctored or could not be reproduced, resulting in false or hallucinated responses.
The spokesperson said the company conducted “extensive testing” before launching AI Overviews and is taking “quick action” to improve its systems.
How can AI companies stop hallucinations?
There are a number of techniques that Google recommends to mitigate this problem, such as regularization, which penalizes the model for making extreme predictions.
One way to do this is to limit the number of possible outcomes the AI model can predict, Google continued. Trainers can also give feedback to their model, telling it what they liked and did not like about its answers, which helps the chatbot learn what users are looking for.
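As a rough, generic illustration of the regularization idea mentioned above (a standard machine-learning technique, not Google’s specific recipe), the toy Python snippet below adds a penalty to a model’s training loss so that extreme weights, and therefore extreme predictions, become more costly:

```python
import numpy as np

def loss_with_regularization(weights, features, target, lam=0.1):
    """Squared error plus an L2 penalty that discourages extreme weights."""
    prediction = features @ weights            # simple linear model
    data_loss = (prediction - target) ** 2     # how wrong the prediction is
    penalty = lam * np.sum(weights ** 2)       # grows as weights get extreme
    return data_loss + penalty

# Illustrative numbers only: a training loop would adjust the weights
# to minimize this combined loss rather than the raw error alone.
weights = np.array([0.5, -1.2])
features = np.array([1.0, 2.0])
print(loss_with_regularization(weights, features, target=1.0))
```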
AI models should also be trained on information that is “relevant” to the tasks they will perform, such as using a dataset of medical images to train an AI that helps diagnose patients.
Companies with AI language models could record the most common questions and then bring together a team of individuals with different skills to learn how to refine their answers, Sukharevsky said.
For example, Sukharevsky said English language experts could be well-suited to fine-tune the AI depending on the most common questions.
Large companies with significant computing power could also try creating their own evolutionary algorithms to improve the reliability of their models, according to Sevo.
This is where AI models would check or train other models using factual information that has already been identified mathematically, Sevo continued.
If thousands of models are competing against each other for verisimilitude, the models produced will be less prone to hallucinations, he said.
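For illustration only, the sketch below shows the selection step such an evolutionary approach could rely on, under the assumption that candidate models are scored against facts that have already been verified. The candidates, the scoring, and the facts here are all made up, and a real system would also mutate and recombine models rather than just filtering them.

```python
import random

# A tiny set of already-verified facts used to score candidates.
VERIFIED_FACTS = {"capital_of_france": "Paris", "2 + 2": "4"}

def truthfulness(model):
    """Fraction of verified facts the candidate model answers correctly."""
    return sum(model(q) == a for q, a in VERIFIED_FACTS.items()) / len(VERIFIED_FACTS)

def make_candidate():
    """A toy 'model': a lookup table that sometimes gets facts wrong."""
    answers = {q: (a if random.random() > 0.3 else "wrong") for q, a in VERIFIED_FACTS.items()}
    return lambda question: answers[question]

population = [make_candidate() for _ in range(100)]
for generation in range(5):
    # Keep the 20 most truthful candidates, then refill the population.
    population.sort(key=truthfulness, reverse=True)
    population = population[:20] + [make_candidate() for _ in range(80)]

best = max(population, key=truthfulness)
print("best truthfulness score:", truthfulness(best))
```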
“I think it’s going to be resolved, because if you don’t make [AI chatbots] more reliable, no one is going to use them,” said Sevo.
“It’s in everybody’s interest that these things are used.”
Alternatively, companies can manually fine-tune the data their models treat as reliable or truthful based on their own set of standards, Sevo said, but that solution is more labor-intensive and expensive.
Users should also be aware that hallucinations can occur, say AI experts.
“I would educate myself about what [AI chatbots] are and what they are not, so that I have a basic understanding of their limitations as a user,” said Sukharevsky.
“If I see that things are not working, I would let the tool evolve.”