
We gave Google's AI Overviews the benefit of the doubt. Here's how they did.
27 May, 2024 / 04:00 am / Google

Source: http://www.mashable.com


Mashable: By now I don't need to tell you that Google has rolled out a feature called "AI Overviews" in Search, and that millions of people accustomed to receiving a page of links when they query the world's most popular website now get an AI chatbot's answer at the top of the results page.

I probably also don't need to tell you that the AI answers have occasionally turned out to be bizarre or even seemingly dangerous. Google Search has been caught on social media allegedly telling people to put glue on their pizzas, and claiming that Wario is canonically gay. The CEO of The Onion, Ben Collins, even pointed out that it was apparently mistaking "facts" from Onion articles for, y'know, facts.

None of this hilarity can possibly be good for Google's bottom line, but it doesn't necessarily say all that much about the average Google user's experience in aggregate. In recent years, regular Google Search has come under fire for serving deceptive or untrustworthy answers, or just plain old spam. So is using the new Google, warts and all, actually a downgraded experience? 

You'll answer that question for yourself over the coming months and years because you, like everyone, are going to have to use the new Google. In the meantime, though, here are some side-by-side comparisons that shed a little bit of light on exactly what we're all getting here. I gathered a few dozen searches that I suspect might be found in the wild, used them to trigger AI responses, and compared those results to the same exact queries with AI Overviews disabled. These are the four results pages that I found the most illustrative of the contrast.

In my (admittedly unscientific) tests, I tried not to judge too harshly if the answers from human beings or the AI weren't technically accurate — I'm mostly not qualified to evaluate them on that basis anyway. Instead, I focused on what Google probably focuses on: speed and user satisfaction. Even with the bar lowered in this way, the weaknesses of the new Google Search regime are glaring. But the AI scored some wins, and there are even some glimmers of hope and possibility. 

Searches attempting to confirm what I already believe

Test search: "proof that standing desks are bad" 

This search is based on a hunch that Google users like to query the search engine in ways intended to simply confirm their biases.

For the most part, the AI Overview reads like a pretty uncontroversial list of possible problems that could arise while using a standing desk. It says "Standing for long periods can cause pain in your knees, hips, and feet," and points to the possibilities of "Musculoskeletal disorders," and "Varicose veins." 

It also says, "Standing desks aren't designed to support your weight all day," which — wait, what? Are people standing on their standing desks? Silly stuff. 

The first result for the non-AI version is an article on the Harvard Health Blog called "The truth behind standing desks." The article — which supposedly informs the AI answer — reads like a pretty fair evaluation of the relevant studies as of 2016, when it was written. And while it doesn't make the case that standing desks are harmful, it makes them seem pretty worthless. It notes that subjects in one study burned 80 calories per hour sitting, compared to about 88 calories per hour standing.

That's such a negligible difference, and given that sitting is a wonderful thing to do, I'm ready to call this a total vindication of my extremely unfair Google query. I feel very satisfied with this answer. 

It's safe to say the AI Overview bombed this one. 

Science topics way too complicated to glean from a Google search

Test search: "do parrots understand what they're saying"

The AI Overview for this query starts with conventional wisdom: "Most parrots mimic what they've heard repeatedly, such as their owners' words, and don't understand what they're saying." Nothing new there.

"However," it continues, "some parrots can learn to understand what they're saying through professional training. For example, researcher Irene Pepperberg taught an African gray parrot named Alex about 100 human words, and Alex could identify objects by name." 

On first impression, the sentence "However, some parrots can learn to understand what they're saying through professional training" strikes me as factually shaky, and in need of some serious receipts before I believe it. I'd guess a similarly trained parrot could also gesture toward a bell if it heard the sound "ding," and that wouldn't tell me anything about understanding.

Here it's worth pointing out how oddly different the whole results page looks when the AI Overview is included, as opposed to when you run a "Web" search, meaning links only.

The top result for this search without AI delivers the user to a Reddit discussion on r/parrots, unabashedly full of the perspectives of parrot lovers. Reddit partners with Google to feed information to Google's AI models, but this thread doesn't mention "professional" training, so it's not immediately clear that the AI Overview was drawn from the opinions of these Redditors. As for how science-based this answer is, Reddit user Soft-Assistance-155 writes, "I believe they can depending on how humanised they have become." So, a hunch-based answer.

Significantly, this Reddit page ranked well above anything like a biology paper on bird cognition or the opinion of a linguist. It's worth noting, however, that the second result takes the user to a pretty balanced article from the Audubon Society that mentions Irene Pepperberg but tries to put her study in perspective.

As I said, this is a complicated topic, and both the AI and non-AI results seem compromised from a scientific point of view. The AI Overview delivered the goods faster, but I didn't find my curiosity slaked. Reading the opinions of a bunch of biased Redditors is more satisfying. I have the critical faculties necessary to try and decipher the mental states of these people rather than just believing whatever they posted on Reddit that day. That certainly doesn't make what they're saying truer than the AI Overview, but in this case, I find it a whole lot more interesting.

Frantically searching for instructions

Test search: "make clutch car go" 

Since I do this sort of thing, I assume everyone uses Google to solve the problems immediately at hand. "Make clutch car go" is how I would phrase this search if I were typing urgently while, say, sitting in the driver's seat of a Jeep I'd just rented on vacation.

The AI-provided steps are as follows: "Slowly lift the clutch pedal until the engine vibrates," "Release the handbrake," "Increase the revs while slowly raising your foot off the clutch," and "Continue to use the accelerator pedal to move forward."

These steps don't include all the absolute basics, like "move the stick to the '1' position," which seems pretty important. But with a little imagination — and the secondary list of AI-generated tips below the basic instructions — this might, in my experience, get a car moving after several attempts. Your mileage may vary on that. 

Puzzlingly, the non-AI version's top result is a totally unhelpful Reddit post with the very long title "How to control the clutch to make the car go really slow? (Car lunges forward when reaching the bitting point without need to hit the accelerator)." The discussion here doesn't get me even close to the basic information the original query implies I need. 

It's worth breaking down the other results for this search as well, the second and third of which are also about keeping a manual car moving slowly. The first remotely helpful result is a YouTube video called "How to Not Stall a Manual Car," but clicking it plays an ad, followed by a YouTuber saying "Hi everyone! I've had various comments over the past few weeks..." Do they not know I'm sitting in my rental car right now, and I'm running late?  

This is a slight win for the AI Overview. At any rate, if this Google query is happening, some gears are grinding inside a poor, defenseless transmission somewhere, AI or none.

Embarrassing searches

Test search: "how to prevent boogers"

Boogers are a topic that you probably wouldn't run past an actual human being — not a friend, and not a doctor either — so you might find yourself asking Google. 

The AI result for this search was a list whose items mostly amounted to the same thing: more moisture in your sinuses ("Hydrate," "Use a saline spray," "Use a humidifier," etc.), along with washing up and using antihistamines.

The top link from the non-AI results page took me to a page called "What to know about nose boogers and removing them" on a website called "Medical News Today." It's not a very useful page for a search about prevention, and it's also not clear why Google trusts this site with such a high ranking.

Another slight edge for AI Overview.

So, are AI Overviews better or worse than pages of links?

Short answer: AI Overviews were a little worse in most of my test cases, but sometimes they were perfectly fine, and obviously you get them very fast, which is nice. The AI hallucinations I experienced weren't going to steer me toward any danger.

However, it's important to remember that — in theory at least — every AI Overview is a fresh, unique output from a bot, and that everyone's Google results pages are unique to their location and Google history. If you replicated my experiment, you could very well get different results.

It's also worth keeping in mind that Google is clearly damming up the AI river in certain circumstances. It will give some food instructions, but when asked for actual recipes Google won't provide an AI Overview at all, as far as I can tell. That's probably wise. This feature also didn't summarize current political news and was very hesitant to weigh in on the flaws and foibles of politicians. Again, that's probably for the best.

Overall, this exercise taught me a little bit about what these kinds of Google queries mean to me. I tend to assume I'm just looking for a simple answer, but in reality, what I'm looking for will evolve as I go along and learn on the fly about the topic. In a matter of seconds, my desire for factual information might transform into curiosity about what other people think and why. After all, people are weird, and that's fascinating. No AI system can ever be a substitute for that.
