Last month (Dec 2023) Google presented Gemini, their response to ChatGPT, to much fanfare. The race for superior generative AI models is certainly heating up. There’s a ridiculous amount of money to be made here.
But have a quick look at their demonstration video, if you haven’t done so already:
It would seem as though Gemini is reacting to the video sequences we are shown on the left side of the video, right? Surely!?
Nope! That video was faked. It turns out that Gemini was only shown still images accompanied by prompts. It’s all in the fine print, folks. In the video description on the YouTube page, if you click on “…more” you’ll see the following text:
Explore our prompting approaches here: https://goo.gle/how-its-made-gemini
Description on the video’s YouTube page
And on that page is where everything comes to the fore. Let’s take a look at some examples. (All images below are taken from that “documentation” page).
Examples
If you jump to timestamp 2:45 of the video you’ll see somebody playing Rock, Paper, Scissors with the AI. It seems as though Gemini is responding to this and jumping in with its response of: “I know what you’re doing! You’re playing Rock, Paper, Scissors!”
Nope! That’s definitely not what actually happened. Gemini was shown still images of rock, paper, and scissors and then a prompt that included “Hint: it’s a game”:
I mean, come on! That is fundamentally different to what the video shows! Gemini even gets a suggestion as to what its response should be.
Let’s jump to timestamp 4:26. It seems as though Gemini is shown three sticky notes with hand-drawn planets on them and then a voice prompt follows from the user: “Is this the correct order?”
Once again, not what happened. A single still image was shown to Gemini with a prompt:
The AI program got a hint again, too: “Consider the distance from the sun…” C’mon guys! Not even close to what you’re presenting in the video.
I’ll stop there with the examples, but if you want more just compare the video to that page I linked above.
Discussion
Two things need to be talked about now: how Google can get away with something like this, and the notion, so often repeated by me on my blog, that there is a reason AI is currently over-hyped.
So, firstly, how can Google get away with something like this? Like I said, it’s all in the fine print, isn’t it? Apart from the “documentation” page found in the information section of the video, there is also the text displayed at the beginning of the presentation:
We’ve been testing the capabilities of Gemini, our new multimodal AI. We’ve been capturing footage to test it on a wide range of challenges, showing it a series of images, and asking it to reason about what it sees. This video highlights some of our favourite interactions with Gemini.
Timeframe 0:00 to 0:15 of the Presentation
If you look closely with a magnifying glass you’ll notice Google saying that it showed Gemini “a series of images”. Don’t worry about the fact that it mentions that Gemini is a multimodal AI (meaning it can process input in the form of audio, video, images, and text), and that Google have been “capturing footage”, and that they state that their “favourite interactions” are forthcoming.
Oh no, it’s the fine print we have to focus on. And this is how corporations get away with it: with corporate and misdirecting language. Google wants us to believe one thing by clearly implying video interactions with their product, but legally their backs are covered.
And secondly, time and time again I’ve stated on this blog that AI is over-hyped, and one of the reasons for this is that companies that make money out of AI have a vested interest in maintaining and even fanning the flames of the hype. The Gemini presentation is a perfect example of this. (And then I have to deal with students in my classes who are starting to believe that AI actually understands what it is doing and that AI is going to take over the world. I’m not surprised fiction like this is beginning to pervade our society considering what we’re witnessing.)
There’s plenty more of this nonsense to come, however, because there’s just more and more money to be made here.
(Note: If this post is found on a site other than zbigatron.com, a bot has stolen it – it’s been happening a lot lately)
IBM used to be the big name in the IT industry, but it fell into irrelevance. It is still around, but it has been replaced by new tech giants like Google.
I feel that this is one of the first symptoms of Google falling into irrelevance. They are falling behind the competition, they’ve scaled their current business model to the maximum, and they struggle to innovate like they used to.
Certainly their search engine, which was their major product, is starting to lag behind. It’s getting harder and harder to find anything with it these days. They’re still giants, though. And moving to the cloud was a master move (GCP is a great product). Are they falling behind the competition? Microsoft and Amazon are certainly catching up in many ways. Ultimately, we’ll have to wait and see!