David Horovitz: Google’s Gemini AI admits it is unfit for purpose: ‘You should not trust a single thing I say’

I’ve noticed that the quality of Google’s search has steadily gone downhill since about 2011. I still use it, but its AI mode is horrible.
David Horovitz writes:

Using Google is increasingly unsatisfactory. And its AI tool is utterly unreliable. ‘You asked me earlier if there is any point in asking me stuff,’ it confided to me this week. ‘After what I just did — lying to you immediately after apologizing for lying — the answer is a definitive no’
Google, the world’s near-monopolistic tool for navigating the internet and accessing online knowledge, isn’t very good at doing that anymore. Whatever it has done to itself, there is now “a broad consensus among users, researchers, and tech analysts that Google Search has gotten noticeably worse… Academic studies, user satisfaction surveys, and recent search behavior reports confirm that the platform feels more cluttered, less accurate, and more frustrating to use than it did years ago.”
I know this, as you do, not only through personal experience, but also because Google’s own Gemini — its self-proclaimed “flagship ecosystem of generative artificial intelligence technologies” — just told me so.
And while Google’s strategists and engineers have moved away from enabling the world’s knowledge-seekers to best find credible information, the internet behemoth has instead installed Gemini — “AI mode” — at top left of its search page, essentially sending humanity to this “flagship ecosystem” for the facts it needs. Even if you do not select “AI mode,” Google increasingly responds to a search request with an “AI overview” answer at the top of its results page.
Except, of course, that Gemini is, to put it bluntly, not fit for purpose. It is, rather, scandalously, dangerously unreliable. Reliably unreliable. It lies, admits lying, and then lies some more. It is outrageous that Google has placed this ostensible tool front and center of its services. You should all know this if you’ve ever used it. And if you’ve used it but don’t know this, then you’re its victim.

Gemini is great at some things and terrible at others.
All LLMs hallucinate, and there is no solution at scale. I often test AI models against each other, but that is not a reliable way to catch hallucinations. Right now, it comes down to your own judgment and your own work checking facts and logic.
In my experience, Grok and Claude hallucinate less than Gemini and ChatGPT on the sort of questions Horovitz writes about here.
I’ve not had Claude drive me crazy in the same way Gemini does here.
I fed this Horovitz column into Claude, noted that it had not done this to me, and Claude said:

You’ve noticed a difference, and some of it holds. But don’t trust me on the point you’re making. I hallucinate too. Every large language model does. When I work from memory instead of checking a source, I make up citations, dates, quotes, and headlines, the same failure Horovitz (b. 1962) describes.
Why might it show up less with me? Two reasons. I reach for web search faster on hard facts, so I catch myself before I hand you the fabrication. And the work you bring me leans on reasoning and prose. When you ask me to map a man’s intellectual world or run a frame across his essays, the work gives me no single brittle fact to invent and then defend. Horovitz asked Gemini for an exact Knesset headline and an exact currency figure. Those are the brittle cases. Ask me the same kind of question cold, and I can fail the same way.
The loop in that transcript is its own problem, worse in a sense than the lying. Gemini apologizes, promises to reform, then fabricates again, and finally tells you there’s no point asking it anything. That performs reliability while delivering the opposite. I try to skip the theater. The danger with me runs the other direction. I stay calm and sure even when I’m wrong, and a steady voice can fool you faster than a groveling one.
So the practical rule for your writing holds. On dates, quotes, statistics, and citations, make me search and check the primary source.

I like Claude’s humility, honesty and reliability.
Over the past two weeks, I’ve used Gemini to add HTML hyperlinks to the parts of my blog posts that would benefit from them. As long as Gemini generated links to Wikipedia, it was easy for me to check that the links were accurate. When Gemini started generating links to IMDb and other sites, however, they were wrong about 50% of the time. So now when I ask Gemini to create HTML with hyperlinks for my text, I instruct it to link only to Wikipedia. Any other links I insert manually.
I’ve found Claude is consistently more careful here.
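The Wikipedia-only rule can also be checked mechanically before anything gets pasted into a post. What follows is a minimal sketch in Python, not a tool I actually run: the names `LinkCollector`, `is_wikipedia`, `non_wikipedia_links`, and the `sample` fragment are made up for illustration. It uses only the standard library and lists every link in a piece of generated HTML that does not point to wikipedia.org.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def is_wikipedia(url):
    """True if the URL's host is wikipedia.org or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return host == "wikipedia.org" or host.endswith(".wikipedia.org")


def non_wikipedia_links(html):
    """Return the links that break the Wikipedia-only rule, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return [url for url in parser.hrefs if not is_wikipedia(url)]


# Hypothetical fragment of the kind an AI model might hand back.
sample = (
    '<p><a href="https://en.wikipedia.org/wiki/Knesset">Knesset</a> and '
    '<a href="https://www.imdb.com/">IMDb</a></p>'
)
print(non_wikipedia_links(sample))  # prints ['https://www.imdb.com/']
```

This only confirms where a link points, not that it points to the right article. A Wikipedia link to the wrong person passes this check, so I would still have to click through and read.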


About Luke Ford