
The New “Handmade” in the Age of Generative AI

I just got back from my trip to the Vatican where I participated in an AI Forum with people from all walks of life. We got together to discuss many issues surrounding the growth of Artificial Intelligence and what this means for the Catholic Church.

It was a great experience for everyone present. I think the networking aspect was the standout factor for me, though. I got to talk to a lot of interesting people, and hopefully the contacts I made there will last a lifetime.

Jamie Baxter, CEO of Exodus 90

One such interesting person I met was the current CEO of Exodus 90, Jamie Baxter. Exodus 90 is a popular 90-day spiritual exercise designed for men seeking spiritual growth, self-discipline, and freedom from attachments.

Jamie approached me after a particular workshop on AI and consciousness, and we got talking about how much things are changing around us in this new age of generative AI.

The Push to have AI Everywhere

Jamie told me that he was feeling pressure to implement AI in his organisation. Everyone is using it, everyone is raving about it, and so he asked me what my thoughts were on all of this.

It’s a great question.

No company nowadays can get away without mentioning that they use AI somewhere… anywhere! It’s a buzzword that’s being thrown around more than confetti at a New Year’s party. Here’s a video I recently posted on my LinkedIn page reacting to this:

Truly, people are starting to feel pressure to use AI just for the sake of it.

My response to Jamie was to mention what I always tell my students: “Before using AI, have a clear use case for it and make sure that it will contribute to your product in a positive way.” There is absolutely no point in using it just for the sake of using it. That’s just not how innovation and product quality work.

Unfortunately, we take things at face value far too often these days.

Stick to Simplicity

Jamie then said to me that he feels like he should keep his product (app and website tied to Exodus 90) AI free because it will feel “simpler” that way.

This is what piqued my interest. He touched on something important I had never considered before. We elaborated on this.

Simplicity in itself is a feature. There is a beauty to it. Famously, Steve Jobs built his own products around this notion:

Simple can be harder than complex; you have to work hard to get your thinking clean to make it simple.

Currently, it feels like AI is being shoved down our throats. There’s definitely hype surrounding it (as I’ve written before) and it’s hard to see where we currently stand with it. It’s hard to make clear and unbiased judgments about its capabilities and about its future. Indeed, it’s a convoluted mess, nothing like a classic Steve Jobs product: minimalist and user-friendly.

So, I concurred with Jamie. Simple is good! Don’t bring AI into your company if you’re going to lose a certain integrity to your product.

The Classic Handmade Tag on Products

Then Jamie said something that made me realise immediately that our conversation was going to end up as the topic of my next blog post: “I want to keep my products handmade”.

That’s a great comment.

Manufactured products (be they furniture, clothes, or plastic items) are soulless. They’re a sign of times that prefer expediency and economy over intimacy and craftsmanship. Take a look at the modern buildings around us: cheap, lifeless, bland, dispensable. Compare that to the architecture of days gone by. People still admire it.

I’ve written before about the fact that AI will never be able to create high art. AI is just an algorithm that works on statistics and regurgitates and shuffles around what it’s already been given in its training data. If it does create something new, it happens completely by chance.

Why is it that we prefer handmade things over those made on an assembly line? Craftsmanship and quality, for sure. But there’s also that element of a personal touch. Handmade items often feel more personal and can evoke stronger emotions or a sense of warmth and nostalgia. Deep down we feel a connection to another person.

Sometimes these emotions that I’m describing, these feelings of warmth, happen without us realising it. Not many people think about why they prefer handmade over anything else. But undoubtedly, handmade is a selling tool in the marketplace of the world.

Will this also not be the case with generated content online soon? Surely!

Maybe one day companies will stop spewing out how much AI they use in their products and will instead start to emphasise the human element in their content. Maybe such a shift in mentality will come about once all this hype around AI dies down? I think it just might.

Much like we crave human contact in real life in the age of social media, we will start to crave the personal touch in generated content in the products that surround our everyday lives. We want that warm feeling inside of us.

What a beautiful discussion I had with Jamie. Much appreciated, friend.



ChatGPT is Bullshit – Zbigatron Paper of the Year Award

I have decided to create a new prestigious and highly-coveted award: the Zbigatron Paper of the Year Award.

And I hereby officially bestow the 2024 award upon Hicks et al. for their academic publication entitled ChatGPT is Bullshit (Hicks, M.T., Humphries, J. & Slater, J. ChatGPT is bullshit. Ethics Inf Technol 26, 38 (2024)). What a paper. A breath of fresh air in the world of hype, lies, and financial bubbles that surround Artificial Intelligence today.

The premise of the paper is this: we should stop using terms like “hallucination” for situations where LLMs make up information and present it as fact, because this is an inaccurate description of the phenomenon.

Now, I have been a huge champion of using more accurate terms to describe actions or attributes of machines that are deemed to be artificially intelligent. For example, in an article I wrote two years ago (entitled The Need for New Terminology in AI) I stated:

Terms like “intelligence”, “understanding”, “comprehending”, “learning” are loaded and imply something profound in the existence of an entity that is said to be or do those things… [T]he problem is that the aforementioned terms are being misunderstood and misinterpreted when used in AI.

And then we go and create an AI hype bubble as a result. So, in that article, I called for more precise terms to be substituted for these, such as using “Applied Statistics” in place of “Artificial Intelligence”. (Can you picture how the hype around AI would diminish if it was suddenly being referred to as Applied Statistics? This is a much more accurate term for it, in my opinion.)

Indeed, Hicks et al. have gone for the same approach, classifying the phenomenon of hallucinations as something completely different. I need to quote their abstract to convey their message:

We argue that these [hallucinations], and the overall activity of large language models, is better understood as bullshit in the sense explored by Frankfurt (On Bullshit, Princeton, 2005)… We further argue that describing AI misrepresentations as bullshit is both a more useful and more accurate way of predicting and discussing the behaviour of these systems. [emphasis mine]

Yes! Please! Let’s start using more accurate terms to describe the phenomenon of AI. I highly agree that bullshit is a proper, scientifically-based, and sophisticated term that should be used in today’s day and age.

I need to drop some more quotes from this paper. It really does deserve my award:

Because these [LLMs] cannot themselves be concerned with truth, and because they are designed to produce text that looks truth-apt without any actual concern for truth, it seems appropriate to call their outputs bullshit.

And then this:

Currently, false statements by ChatGPT and other large language models are described as “hallucinations”, which give policymakers and the public the idea that these systems are misrepresenting the world, and describing what they “see”. We argue that this is an inapt metaphor which will misinform the public, policymakers, and other interested parties. [emphasis mine]

Finally, somebody calling out the BS (pun intended) for what it is. Like I said, what a breath of fresh air. And how important is this!? The public, policymakers, and other interested parties are making very important decisions based on false information.

It’s a classic case of PR talk, isn’t it?

I recently read an article (The Current State of AI Markets) that tried to quantify where revenue has occurred thus far in the AI Value Chain. We all know that companies are spending a ridiculous amount of money on AI – so what’s the current ROI on this looking like?

To quote the article:

Amazon, Google, Microsoft, and Meta have spent a combined $177B on capital expenditures over the last four quarters… We haven’t seen wide-scale application revenue yet. AI applications have generated a very rough estimate of $20B in revenue.

As the article admits: it’s early days yet and the ROI may come in the future. Nonetheless, one cannot ignore the divide between expenditure and ROI.

So, when we need to call a spade a spade, it’s important that we do so. This is not a joke, nor a game. Like I have said in the past: “There’s a ridiculous amount of money being spent, passed around, and invested and a lot of it is built on a false idea of what AI is capable of and where it is going. People are going to get hurt. That’s not a good thing.”

I’m going to leave the final word on this very important topic to the official winner of the 2024 Zbigatron Paper of the Year Award:

Investors, policymakers, and members of the general public make decisions on how to treat these machines and how to react to them based not on a deep technical understanding of how they work, but on the often metaphorical way in which their abilities and function are communicated. Calling their mistakes ‘hallucinations’ isn’t harmless: it lends itself to the confusion that the machines are in some way misperceiving but are nonetheless trying to convey something that they believe or have perceived. This, as we’ve argued, is the wrong metaphor. The machines are not trying to communicate something they believe or perceive. Their inaccuracy is not due to misperception or hallucination. As we have pointed out, they are not trying to convey information at all. They are bullshitting.


AI from the Perspective of Religion

Last week I was a guest speaker at a society at the University of Tasmania. I was asked to talk about the topic of Artificial Intelligence from the perspective of religion – and more precisely from the perspective of Christianity.

AI is shaking things up, it seems. With all this talk about consciousness, machines obtaining rights, and even machines taking over the world, it’s no surprise that this kind of rhetoric has infiltrated religious circles as well. As readers of my blog know, I have a PhD in AI but also a Master’s in Philosophy and a Master’s in Theology, so I have a unique understanding of these issues from their different perspectives. Hence the invitation to give a talk at the main university of the coldest state of our beautiful country.

Now, I’m not here to judge the perspectives of religion or science. I just want to present their points of view on this very broad topic. In my opinion this is a fascinating topic!

In my talk I began with the starting points of Christianity and science. They’re different. Let’s start with a summary of how science works.

Science looks at empirical data and, after collecting enough of it, announces conclusions once a sufficient level of statistical certainty has been reached. So, for example, with respect to consciousness, it will look at the external effects of human consciousness (because we are currently the best examples of it), analyse them, and try to define consciousness this way.

Consciousness seems to allow individuals:

  • to make deliberate, goal-directed decisions and actions,
  • to communicate ideas, emotions, and thoughts through language, gestures, and other forms of communication,
  • to express themselves through creativity,
  • and much more…

After enough evidence of these phenomena is collected, and it appears as though most beings with consciousness possess these attributes, science will conclude, in one way or another, that this must be what consciousness is. (Of course, this is a difficult topic, so I am cutting corners here a lot – but the gist is there).

Hence, when looking at machines, science will attempt to do the same:

  • do machines appear to be making deliberate, goal-directed decisions and actions?
  • do machines appear to communicate ideas, emotions, and thoughts through language, gestures, and other forms of communication?
  • do machines appear to be expressing themselves through creativity?
  • and the like…

Science will then (to cut a long story short, again) reach a conclusion, once it feels justified to do so, that machines might actually possess consciousness because they are exhibiting the behaviour of conscious individuals.

If it looks like it has consciousness, it just might.

Now for Christianity.

Christianity has traditionally had a different starting point. It first tries to define what something is in terms of its being. To do this, sometimes it utilises information from sources that science would not accept. For example, the Bible. In Genesis (the first book of the Bible) Chapter 1, verse 27 says: “So God created mankind in his own image, in the image of God he created them”.

From this, it is concluded that we are unique creatures in this world. The typical interpretation of “being created in the image of God” means that we have an intellect, consciousness, and free will like God (rather than being another lesser god or an immaterial being). The attributes listed are “packaged” in an eternal soul. No other creatures in this world have these unique attributes, especially not entities that we create ourselves like software programs.

So, when religion looks at things like machines and the discussion of consciousness surrounding them it will respond with something like this:

Just because a machine looks intelligent doesn’t mean it is intelligent. It can exhibit the symptoms of consciousness all it wants. On the level of being, rather than empirical phenomena, it will always just be a lifeless machine.

These are fundamentally two different starting positions that then have far-reaching consequences in terms of conclusions. For example:

  • Will AI take over the world?
    • Science: It might.
    • Religion: It may become dangerous but it will never have understanding, because understanding is fundamentally tied to having a soul. Understanding, rather than just possessing knowledge, goes a long way in guiding actions, so lacking it will inhibit AI immensely.
  • Should we worry about AI having feelings?
    • Science: Probably. Look at how ChatGPT is already answering its questions. It’s only a matter of time before actual feelings develop.
    • Religion: No chance. Just because a machine looks like it has feelings, doesn’t mean it does.
  • Will AI machines need to be given special rights in the future?
    • Science: Maybe.
    • Religion: Nah. Even if AI’s abilities continue improving, nothing about what it fundamentally is (i.e. a lifeless machine) will change.

Different starting points give different conclusions on very important questions, indeed. Religion argues that science limits itself by not considering different, non-empirical sources of knowledge. In response, science contends that the veracity of these sources cannot be proven through scientific methods, and thus, religion should hold a lesser degree of influence.

Conflict between science and religion can ensue from this. Not really at the level of discoveries of material facts about such things as the laws of physics or chemistry (although in the past this was not always the case, of course), but more at the level of conclusions reached in the sphere of immaterial philosophy, like “what defines a human being”, “what is objectively a good or bad action”, or the questions listed above.

Different worlds certainly open up for the two “disciplines” from their differing starting positions. Both worlds truly are fascinating, which is why I’ve devoted half of my life to studying both science and philosophy/religion.

Hopefully this post has helped you to see both perspectives in this phenomenon that is Artificial Intelligence.



Arduino 2WD Lafvin Robot

Arduino – Rotor Sweeping for Obstacle Detection Code

I recently started a Robotics Club at the institute where I’m currently working, in which I’ve been slowly getting my beginner students acquainted with Arduino-based robots and Arduino coding.

We purchased a set of Lafvin 2WD Robot Kits (see image above) from Little Bird Electronics and step-by-step I’ve been getting my students to learn how to control each piece of hardware attached to the robot. We’re working towards getting the robot to move around a room without bumping into anything it might encounter along the way.

In this post I thought I’d share some code I’ve written for this club that uses an ultrasonic sensor mounted on a servo motor to detect the distance of obstacles from the robot.

The code gets the rotor to sweep 180 degrees in one direction and then back again. It stops, however, every 45 degrees to use the ultrasonic sensor to detect whether there is an object within 30cm (about 1 foot) of the machine. If there is, a message is printed to the Serial Monitor.

Key equipment used is as follows:

  • Arduino Uno R3
  • Arduino Sensor Expansion Shield V5.0 (attached to Uno)
  • SG90 Micro Servo motor
  • Ultrasonic Sensor (HC-SR04)

This video shows how we have put together the pieces and connections. The relevant connection diagrams are as follows:

The code below is well-commented so there’s no need for me to explain any more. But please note that you will need to install the “Servo” library (and have the “SR04” ultrasonic sensor library that the sketch includes available) before running this code.

If you have any questions, post them below and I’ll get back to you as soon as I can.

/* 
  This code uses an Ultrasonic (US) sensor mounted on a servo motor to detect obstacles
  for a moving robot. The rotor sweeps 180 degrees from left to right and then
  back again. It takes readings every 45 degrees (and 200 milliseconds). If there is 
  an object within 30cm, it prints a message to the Serial monitor. 
  Once the obstacle is cleared, the rotor will start sweeping again.

  My code works for an Arduino Uno R3 board and a Sensor Expansion Shield v5.0. 
  The rotor and sensor are connected to pins A0-A2 of the expansion shield.

  I have also put in skeleton code (via "sweep" variable) to indicate where one might 
  write code to manoeuvre a robot if an obstacle is encountered.
*/

#include <Servo.h> // Servo library
#include "SR04.h" // Ultrasonic (US) sensor library
#define TRIG_PIN A1 // UltraSonic I/O 
#define ECHO_PIN A0 // UltraSonic I/O

// Create Ultrasonic object
SR04 sr04 = SR04(ECHO_PIN,TRIG_PIN);
int obj_distance; // variable to store distance returned by US sensor
int distance_threshold = 30; // how close to get to an object before deciding to act

// Create a servo object
Servo myServo;

int sweep_directions[] = {0, 45, 90, 135, 180}; // possible rotor positions (degrees)
int direction_index = 0; // current index of rotor position
int direction = 1; // Direction of rotor: 1 for forward (sweep right), -1 for backward (sweep left)

// Time tracking variables to only move the rotor between certain elapsed time intervals
unsigned long previousMillis = 0;
const unsigned int interval = 200; // interval in milliseconds between US readings

// this boolean is not used in this code but I left it here to give you an idea of how 
// you could build an obstacle avoiding robot in the future
bool sweep = true; 

void setup() {
  // Start the serial monitor
  Serial.begin(9600);

  // Connect servo motor to shield
  myServo.attach(A2);
}

void loop() {
  // Get the current time
  unsigned long currentMillis = millis();

  // Check if the interval has passed from last reading
  if (currentMillis - previousMillis >= interval) {
    obj_distance = sr04.Distance(); // get obstacle distance
    
    if(obj_distance <= distance_threshold) {
      // obstacle detected within distance threshold
      Serial.print("STOP! Obstacle distance: ");
      Serial.println(obj_distance);
      /* 
        You would write code here to stop the robot from moving,
        set sweep to false to give the robot time to work out 
        which direction to turn and then to manoeuvre.

        Once we're good to go again, set sweep to true.
      */
      
    }
    else if(sweep) {
      // If we're in the "sweep" state, move the rotor to the next position
      // and wait another 200 milliseconds before taking the next reading
      myServo.write(sweep_directions[direction_index]);
      Serial.print("Sweep direction:");
      Serial.println(sweep_directions[direction_index]);

      // Check if the end or beginning of the array is reached
      if (direction_index == 4) {
        direction = -1; // Reverse direction of sweeping at the end
      } else if (direction_index == 0) {
        direction = 1; // Reverse direction at the beginning
      }
      direction_index += direction; // increment or decrement index accordingly
    }
    previousMillis = currentMillis;
  }
}
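
To give you an idea of what to expect, the Serial Monitor output will look something like this (a hypothetical trace I've constructed from the print statements in the code above; the distance value is made up, and the STOP lines repeat every 200 milliseconds for as long as the obstacle stays within 30cm):

Sweep direction:0
Sweep direction:45
Sweep direction:90
STOP! Obstacle distance: 23
STOP! Obstacle distance: 23
Sweep direction:135
Sweep direction:180
Sweep direction:135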



Artificial Intelligence is Slowing Down – Part 3

Nearly three years ago (July 2021) I wrote an article on this blog arguing that artificial intelligence is slowing down. Among other things I stated:

[C]an we keep growing our deep learning models to accommodate for more and more complex tasks? Can we keep increasing the number of parameters in these things to allow current AI to get better and better at what it does. Surely, we are going to hit a wall soon with our current technology?

Artificial Intelligence is Slowing Down, zbigatron.com

Then 7 months later I dared to write a sequel to that post in which I presented an article written for IEEE Spectrum. The article, entitled “Deep Learning’s Diminishing Returns – The cost of improvement is becoming unsustainable”, came to the same conclusions as I did (and more) regarding AI, but it presented much harder facts to back its claims. The claims presented by the authors were based on an analysis of 1,058 research papers (plus additional benchmark sources).

A key finding of the authors’ research was the following: as the performance of a DL model increases, the computational cost grows as roughly the ninth power of the improvement (i.e. to improve performance by a factor of k, the computational cost scales by k^9). With this, we basically received an equation to estimate just how much money we’ll need to keep spending to improve AI.
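
To get a feel for what a ninth-power relationship means in practice, here’s the arithmetic spelled out (my own illustrative numbers, plugged into the k^9 relationship quoted above): a 10% performance improvement (k = 1.1) requires roughly 1.1^9 ≈ 2.4 times the compute, and doubling performance (k = 2) requires 2^9 = 512 times the compute. Incremental gains get eye-wateringly expensive very quickly.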

Here we are, then, 3 years on. How have my opinion pieces fared after such a lengthy time (an eternity, in fact, considering how fast technology moves these days)? Since July 2021 we’ve seen releases of ChatGPT, Dall-E 2 and 3, Gemini, Co-Pilot, Midjourney, Sora… my goodness, the list is endless. Immense developments.

So, is AI slowing down? Was I right or wrong way back in 2021?

I think I was both right and wrong.

My initial claim was backed up by Jerome Pesenti, who at the time was head of AI at Facebook (the current head there is none other than Yann LeCun). In an article for Wired Jerome stated the following:


When you scale deep learning, it tends to behave better and to be able to solve a broader task in a better way… But clearly the rate of progress is not sustainable… Right now, an experiment might [cost] seven figures, but it’s not going to go to nine or ten figures, it’s not possible, nobody can afford that…​

In many ways we already have [hit a wall]. Not every area has reached the limit of scaling, but in most places, we’re getting to a point where we really need to think in terms of optimization, in terms of cost benefit

Article for Wired.com, Dec 2019 [emphasis mine]

I agreed with him back then. What I didn’t take into consideration (and neither did he) was that Big Tech would get on board with the AI mania. They are capable of dumping nine or ten figures at the drop of a hat. And they are also capable of fuelling the AI hype to maintain the large influx of money from other sources constantly entering the market. Below are recent figures regarding investments in the field of Artificial Intelligence:

  • Anthropic, a direct rival of OpenAI, received at least $1.75 billion this year with a further $4.75 billion available in the near future,
  • Inflection AI raised $1.3 billion for its very own chatbot called Pi,
  • Abound raked in $600 million for its personal lending platform,
  • SandboxAQ got $500 million for its idea to combine quantum sensors with AI,
  • Mistral AI raised $113 million in June last year despite it being only 4 weeks old at the time and having no product at all to speak of. Crazy.
  • and the list goes on…

Staggering amounts of money. But the big one is Microsoft, which pumped US$10 billion into OpenAI in January 2023. That goes on top of what they had already invested in the company.

US$10 billion is 11 figures. “[N]obody can afford that,” according to Jerome Pesenti (and me). Big Tech can, it seems!

Let’s look at some fresh research now on this topic.

Every year the influential AI Index is released, which is a comprehensive report that tracks, collates, distils, and visualises data and trends related to AI. It’s produced by a team of researchers and experts from academia and industry. This year the AI Index (released this month) has been “the most comprehensive to date” with a staggering 502 pages. There are some incredibly insightful graphs and information in the report but two graphs in particular stood out for me.

The first one shows the estimated training costs vs publication dates of leading AI models. Note that the y-axis (training cost) is in logarithmic scale.

It’s clear that newer models are costing more and more. Way more (considering the log scale).

For actual training cost amounts, this graph provides a neat summary:

Note the GPT-4 (available to premium users of ChatGPT) and Gemini Ultra estimated training costs: US$78 million and US$191 million, respectively.

Gemini Ultra was developed by Google; GPT-4 was developed by OpenAI, which is de facto bankrolled by Microsoft. Makes sense.

Where does this leave us? Considering the latest product releases, it seems like AI is not slowing down yet. There still seems to be steam left in the industry. But with numbers like those presented above, your average organisation just cannot compete any more. They’ve dropped out. It’s just the big boys left in the game.

Of course, the big boys have vast reserves of money so the race is on, for sure. We could keep going for a while like this. However, it’s surely fair to say once again that this kind of growth is unsustainable. Yes, more models will keep emerging that are going to get better and better. Yes, more and more money will be dropped into the kitty. But you can’t keep moving to the right of those graphs indefinitely. The equation still holds true: as the performance of a DL model increases, the computational cost explodes. Returns on investments will start to diminish (unless a significant breakthrough comes along that changes the way we do things – I discussed this topic in my previous two posts).

The craziness that big tech has brought to this whole saga is exciting and it has extended the life of AI quite significantly. However, the fact that only the big players are left now – players with wealth at their disposal larger than that of most countries in the world – is a telling sign. AI is slowing down.

(I’ll see you in three years’ time again when I concede defeat and admit that I’ve been wrong. I truly hope I am because I want this to keep going. It’s been fun.)




Video Games Generated by AI (Text-to-Video Game Translation)

The recent reveal of OpenAI’s Sora model which generates videos from text made headlines around the world. And understandably so, because it’s truly something amazing.

But I was not too surprised by the announcement. I wrote about the emergence of text-to-video generative AI on my blog 16 months ago! See here: AI Video Generation (Text-To-Video Translation). So, I knew it was just a matter of time before one of the big players released something of this calibre.

What did surprise me, however, was something that seemingly went under the radar just 2 weeks ago: an announcement from Google’s DeepMind research team of an AI model that generates video games from single example images. The original academic paper, entitled “Genie: Generative Interactive Environments” was published 23 February 2024.

With Genie, Google is coining a new term: “generative interactive environments (Genie), whereby interactive, playable environments can be generated from a single image prompt”. 

What does this mean? Simple: you provide Genie with an example image (hand drawn, if you want) and you can then play a 2D platformer game set inside the environment that you created.

Here are some examples. The first image is a human-drawn sketch, the following image is a short video showing somebody playing a video game inside the world depicted in the first image:

Here’s another one that starts off with a hand-drawn picture:

Real world images (photos) work as well! Once again, the second image is a short snippet of somebody actually moving a character with a controller inside a generated video game.

See Google’s announcement for more great examples.

The title of my post states “Text-to-Video Game Translation”. If the only input permitted is a single image, how does “text-to-video game” fit here? The idea is that text-to-image models/generators like DALL-E or Stable Diffusion could be used to convert your initial text prompt into an image, and then that image could be fed into Genie.

Very cool.

Video Game Quality

Now, the generated video game quality isn’t perfect. It certainly leaves a lot to be desired. Also, you can only play the game at 1 frame per second (FPS). Typically games run at 30-60 FPS, so seeing the screen change only once per second is no fun. However, the game is being generated on-the-fly, as you play it. So, if you press one of 8 possible buttons on a gamepad, the next frame will be a freshly generated response to your chosen action.

Still, it’s not super exciting. But just as my first post on text-to-video generative AI introduced the whole idea of videos generated by AI, I’m doing the same thing here: showing you what is currently being worked on. So, there might be more exciting stuff coming just around the corner – in another 16 months, perhaps? For example this: “We focus on videos of 2D platformer games and robotics but our method is general and should work for any type of domain, and is scalable to ever larger Internet datasets.” (quoted from here)

There’s more coming. You heard it here first!

Other Works

For full disclosure, I need to mention that this isn’t the first time people have dabbled in text-to-video game generation. Nvidia, for example, released GameGAN in 2020, which could produce clones of games like Pac-Man.

The difference with Google’s model is that it was trained entirely in an unsupervised manner on unlabelled internet videos. So, just from the videos themselves, Genie learned which elements on the screen were being controlled by a player, what the corresponding controls were, and which elements were simply part of the scrolling background. Nvidia, on the other hand, used video input paired with descriptions of the actions taken as its training material. Creating a labelled dataset of actions paired with video results is a laborious process. Like I said, Google did their training raw: on 30,000 hours of nothing but internet videos of hundreds of 2D platformers.




Google Lied to Us About Gemini – their ChatGPT Alternative

Last month (Dec 2023) Google presented their response to ChatGPT called Gemini to much fanfare. The race for superior generative AI models is certainly heating up. There’s a ridiculous amount of money to be made here.

But have a quick look at their demonstration video, if you haven’t done so already:

It would seem as though Gemini is reacting to the video sequences we are shown on the left side of the video, right? Surely!?

Nope! That video was faked. It turns out that Gemini was only shown still images accompanied by text prompts. It’s all in the fine print, folks. In the video description on the YouTube page, if you click on “…more” you’ll see the following text:

Explore our prompting approaches here: https://goo.gle/how-its-made-gemini

Description on the video’s YouTube page

And on that page is where everything comes to the fore. Let’s take a look at some examples. (All images below are taken from that “documentation” page).

Examples

If you jump to timestamp 2:45 of the video you’ll see somebody playing Rock, Paper, Scissors with the AI. It seems as though Gemini is watching this happen and jumping in with its response: “I know what you’re doing! You’re playing Rock, Paper, Scissors!”

Nope! That’s definitely not what actually happened. Gemini was shown still images of rock, paper, and scissors and then a prompt that included “Hint: it’s a game”:

I mean, come on! That is fundamentally different to what the video shows!? Gemini even gets a hint as to what its response should be.

Let’s jump to timestamp 4:26. It seems as though Gemini is shown three sticky notes with hand-drawn planets on them and then a voice prompt follows from the user: “Is this the correct order?”

Once again, not what happened. A single still image was shown to Gemini with a prompt:

The AI program got a hint again, too: “Consider the distance from the sun…” C’mon guys! Not even close to what you’re presenting in the video.

I’ll stop there with the examples, but if you want more just compare the video to that page I linked above.

Discussion

Two things need to be talked about now: firstly, how Google can get away with something like this, and secondly, this whole notion, so often repeated on my blog, that there is a reason AI is currently over-hyped.

So, firstly, how can Google get away with something like this? Like I said, it’s all in the fine print, isn’t it? Apart from the “documentation” page found in the information section of the video, there is also the text displayed at the beginning of the presentation:

We’ve been testing the capabilities of Gemini, our new multimodal AI. We’ve been capturing footage to test it on a wide range of challenges, showing it a series of images, and asking it to reason about what it sees. This video highlights some of our favourite interactions with Gemini.

Timeframe 0:00 to 0:15 of the Presentation

If you look closely with a magnifying glass you’ll notice Google saying that it showed Gemini “a series of images”. Don’t worry about the fact that it mentions that Gemini is a multimodal AI (meaning it can process input in the form of audio, video, images, and text), and that Google have been “capturing footage”, and that they state that their “favourite interactions” are forthcoming.

Oh no, it’s the fine print we have to focus on. And this is how corporations get away with it: with corporate and misdirecting language. Google wants us to believe one thing by clearly implying video interactions with their product, but legally their backs are covered.

And secondly, time and time again I’ve stated on this blog that AI is over-hyped, and one of the reasons for this is that companies that make money out of AI have a vested interest in maintaining and even fanning the flames of the hype. The Gemini presentation is a perfect example of this. (And then I have to deal with students in my classes who are starting to believe that AI actually understands what it is doing and that AI is going to take over the world. I’m not surprised fiction like this is beginning to pervade our society considering what we’re witnessing.)

There’s plenty more of this nonsense to come, however, because there’s just more and more money to be made here.




How to Get ChatGPT to Make a Mistake

Last week I gave a presentation to senior government representatives from the Philippines showing them how ChatGPT can be used to increase productivity in their respective departments. They were all amazed at ChatGPT’s capabilities. And rightly so because ChatGPT is something special. It’s a real game-changer.

However, ChatGPT (and AI for that matter) is over-hyped. The media and IT companies talk about it as if it’s something that, if not now then in the near future, will completely surpass humans in terms of intelligence. ChatGPT is certainly better than humans at some tasks but people forget that machines lack one essential feature: understanding – a core requirement in true intelligence.

I’ve said this countless times on my blog:

https://twitter.com/zbigatron/status/1671551546260160513?s=20

Machines will always be machines. No matter how “intelligent” they may appear.

In this post I thought I’d give a few examples of ChatGPT making mistakes and showing that it lacks this key feature which is understanding. It’s blatantly obvious this is the case when you don’t cherry-pick results for a newspaper article or an advertisement.

I’ve provided my questions and the responses below in text form (rather than screenshots), and I ran everything using the free version of ChatGPT (available at chat.openai.com), which runs GPT-3.5 (as of Jan 2024), so that you can more easily copy, paste, and try these conversations yourselves. The premium version of ChatGPT only makes some of the following mistakes (these machines really are getting better at this, aren’t they!?). But once again: push these things hard enough and their lack of understanding will surface. Always.

Mistake #1

The correct answer to the following question is ’12’. ChatGPT gets it right the first time. But then it stumbles.

Me: How many A’s are in this prompt? I will continue to write in the hope that you will make a mistake in counting the number of A’s. Perhaps you will not but perhaps you will? And then maybe you’ll help me write my blog post on this topic?

ChatGPT: There are 25 occurrences of the letter ‘A’ in your prompt. I’ll be happy to help you with your blog post on this topic or any other writing assistance you may need. Feel free to provide more details or let me know how you’d like to proceed!

Me: Are you sure?

ChatGPT: Apologies for any confusion in my previous response. Upon recounting, there are 24 occurrences of the letter ‘A’ in your prompt. If you have any further questions or if there’s anything else I can assist you with, please let me know!

Mistake #2

The correct answer to my question is 14 miles, i.e. the same distance as the first car. In fact, I provide the answer to the query in my prompt.

Me: One car travels 14 miles at 80 miles/hr. Another car travels the same distance at half that speed. How far does the second car travel?

ChatGPT: [beginning of response cut for brevity] … So, the second car travels half the distance of the first car.

Mistake #3

Let’s try another counting example. The string in my prompt has 15 characters/letters.

Me: How many letters are there in the following string: ‘ghertysbdngrthe’?

ChatGPT: The string ‘ghertysbdngrthe’ contains 16 letters.
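
(As an aside: claims like these are trivial to verify yourself rather than taking my word for them. Here’s a minimal sketch in standard C++ that checks both counts; the string literals are simply the prompts from Mistakes #1 and #3 above, and the expected outputs are 15 and 12.)

#include <algorithm>
#include <iostream>
#include <string>

int main() {
  // The string from Mistake #3 - should print 15
  std::string s = "ghertysbdngrthe";
  std::cout << "Letters in string: " << s.length() << "\n";

  // The prompt from Mistake #1 - should print 12
  std::string prompt =
      "How many A's are in this prompt? I will continue to write in the hope "
      "that you will make a mistake in counting the number of A's. Perhaps "
      "you will not but perhaps you will? And then maybe you'll help me "
      "write my blog post on this topic?";
  std::cout << "Number of A's: "
            << std::count_if(prompt.begin(), prompt.end(),
                             [](char c) { return c == 'a' || c == 'A'; })
            << "\n";
  return 0;
}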

Mistake #4

Here’s another example illustrating a lack of understanding in ChatGPT’s capabilities.

Me: Can you write me a sentence that ends with the letter ‘p’ and then a second sentence that begins with the last letter of the first sentence you created?

ChatGPT: Certainly! Here are the sentences:

1. The cat jumped over the fence, landing with a graceful thump.
2. Puppies nearby started to play, chasing their tails in the sunshine.

Me: Are you sure?

ChatGPT: I apologize for any confusion. It seems I made a mistake in my response. Here are corrected sentences:

1. The energetic squirrel quickly climbed to the top of the tree, reaching for the highest branch with a leap.
2. Puppies in the garden observed as it surveyed the surroundings with curiosity.

The first response was correct! But if you push the program hard enough its lack of understanding comes to the fore. In fact, I ran the first query a few times and some of the initial responses were incorrect too. E.g.:

Me: Can you write me a sentence that ends with the letter ‘p’ and then a second sentence that begins with the last letter of the first sentence you created?

ChatGPT: Certainly! Here are the sentences:

1. The elegant peacock displayed its vibrant feathers in a dazzling array, captivating everyone in the garden.
2. Now, guests eagerly gathered around, enchanted by the bird’s majestic appearance.

Hallucinations

Hallucinations are when ChatGPT (and other large language models) make facts up and present them as a certainty to the user. There are plenty of examples of this on the internet. Unfortunately, they have all been fixed! I couldn’t get any hallucinations to occur in my interactions with ChatGPT but I don’t use ChatGPT on a regular basis. I thought I’d mention this aspect of LLMs because it is a significant thing that occurs with their use.

Conclusion

Alas, it’s getting harder and harder to fool ChatGPT, I’m not going to lie. But no matter how hard fooling it becomes (e.g. with GPT-4), it will still have no understanding of what it is doing. The purpose of this post is to demonstrate this. It’s paramount that such limitations of ChatGPT (and other instances of AI) are kept in mind when using it for various projects – especially mission-critical ones.



IoT for Governments – Real-World Examples and Use Cases

I’m in the middle of preparing a university talk on how government departments can leverage the power of the Internet of Things (IoT) to improve the services they provide to society. I found it difficult to locate real-world examples of IoT being used by such departments, so I thought I’d compile this short (but sweet) list of the use cases I managed to find for my presentation.

A quick definition before I proceed: the Internet of Things is the connecting of objects, devices, and sensors to each other and to the internet (or to other machines) in order to collect data about the surrounding world. Data obtained from these sensors can be used to make insightful and informed decisions that would otherwise be made blindly or not at all.

The more data you have, the more informed your decisions can be. That’s the whole idea of IoT.
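
(If you like to think in code, here’s what a single IoT node boils down to, sketched in the same Arduino-style C++ as my robotics posts. The soil-moisture sensor, the pin, and the threshold are all hypothetical and purely for illustration; a real node would push its readings to a server or gateway over a network rather than print them to the Serial Monitor.)

// A hypothetical, minimal IoT node: read a sensor, report the value,
// and flag when a decision threshold is crossed.
const int SENSOR_PIN = A0;        // analogue soil-moisture sensor (illustrative)
const int DRY_THRESHOLD = 400;    // made-up threshold for "too dry"

void setup() {
  Serial.begin(9600);             // stand-in for a real network connection
}

void loop() {
  int reading = analogRead(SENSOR_PIN);  // collect data about the world
  Serial.print("moisture=");
  Serial.println(reading);               // report it upstream
  if (reading < DRY_THRESHOLD) {
    Serial.println("ALERT: irrigation needed");  // the informed decision
  }
  delay(60000);                          // one reading per minute
}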

So, here are some interesting ways governments around the world are utilising IoT to better their services.

1. Intelligent Street Lighting (Copenhagen, Denmark).

In Copenhagen, the municipal authorities have implemented an intelligent street lighting system that utilises IoT technology. Half of the traditional street lights in the city have been replaced with smart LED lights equipped with sensors. These sensors detect motion, weather conditions, and ambient light levels. By collecting and analysing this data in real-time, the city can dynamically adjust the brightness of the streetlights. For instance, during quiet hours when there is minimal activity on the streets, the lights can dim to save energy. This not only reduces energy consumption and costs (55% energy savings have been reported) but also enhances public safety by ensuring well-lit streets when needed. (source)

2. Smart Parking Management (San Francisco, USA)

San Francisco has embraced IoT to address the perennial urban challenge of parking. Its smart parking management system (SFpark) uses sensors embedded in parking spaces to detect occupancy. Drivers can access real-time information about available parking spots through mobile apps or electronic signage. This not only reduces traffic congestion caused by drivers circling for parking but also contributes to better air quality, as cars spend less time idling. Parking prices can also be adjusted dynamically in order to divert traffic from one area to another. Lastly, the city can analyse the data collected over time to optimise parking space allocation and improve overall urban planning. The city of Calgary in Canada has set up something similar as well. (source 1, source 2)

3. Smart Waste Collection (Melbourne, Australia)

An area of Melbourne called Wyndham has implemented a smart waste collection system that relies on IoT technology. Waste bins are equipped with sensors that monitor their fill levels (you can view this data live online here). When a bin reaches a predefined threshold, the system triggers a notification to waste management authorities, enabling them to optimise collection routes. This results in more efficient waste collection, reducing unnecessary trips and minimising fuel consumption. Before this technology was installed, garbage was collected 28 times per month. Bins are now collected fewer than 6 times a month on average – an impressive 80% reduction. Moreover, the bins have solar panels that power a compactor that crushes the waste thrown into them, allowing them to hold 5 times more rubbish than other bins their size. This further reduces the frequency of waste collection. (source 1, source 2)

4. Intelligent Road Quality Monitoring (Dublin, Ireland)

In 2019 the city of Dublin took a unique approach to road quality monitoring by distributing free smart bike lights equipped with IoT technology. These intelligent bike lights not only enhanced cyclists’ safety but also contributed valuable data to the government’s road quality monitoring efforts. The lights were equipped with sensors that detect vibrations, road surface conditions, and other relevant data during the cyclists’ journeys. As cyclists traversed the city, the smart bike lights automatically collected and transmitted real-time data to a centralised system managed by transportation authorities. This innovative initiative served a dual purpose: promoting road safety for cyclists by providing them with free, high-visibility lights, and simultaneously creating a decentralised network of sensors that continuously monitored road conditions. The collected data included information on road vibrations, potholes, and even traffic congestion patterns. As a result, authorities could identify areas that required maintenance, predict potential issues, and implement proactive measures to address road quality concerns before they escalated. (source 1, source 2)

5. IoT in Agriculture (California, USA)

Governments in various regions have started leveraging IoT in agriculture to enhance food production and resource management. IoT devices such as soil sensors, drones, and weather stations are deployed to monitor crop conditions, optimise irrigation, and track weather patterns. For instance, in California, IBM partnered with The Freshwater Trust and SweetSense Inc. to use IoT sensors to monitor groundwater usage in agriculture, ensuring sustainable practices and efficient resource allocation. By integrating IoT into agriculture, governments can support farmers in making data-driven decisions, increase crop yields, and promote environmentally friendly farming practices. (source)

Conclusion

Hopefully this list will help anybody else out there that needs a quick reference for real-world examples of IoT used in government services.



AGI Definition by Google – Summary and Commentary

Last month (Nov 2023) Google’s AI research group, DeepMind, published an academic paper entitled “Levels of AGI: Operationalizing Progress on the Path to AGI” in which they set out to define artificial general intelligence (AGI). A big step and a big call by Google, no doubt about that. I welcome their paper, however. I think it’s a good one.

In this post I’m going to present a summary of what Google has released and then my commentaries. The post will be broken down into the corresponding sections of the publication.

1. Introduction

The key quote in this section is this one:

[I]f you were to ask 100 AI experts to define what they mean by “AGI,” you would likely get 100 related but different definitions.

page 1.

That’s the problem. We’re all using terms in the field of AI without a clear consensus on what we mean by them. The purpose of the paper, then, is to partially clear this mess up by explicitly reflecting on what is meant by AGI and then attempting to ground that definition in quantifiable attributes like the performance, generality, and autonomy of AI systems.

2. Defining AGI: Case Studies

This section is akin to a literature review. It looks at what other organisations or people have proposed as a definition for AGI. Nine case studies are examined. I’ll summarise most of them.

Case Study 1: The Turing Test

Turing’s famous “imitation game” is examined here. The goal of the test is for a machine to fool a human into thinking they are talking to another human being; a machine that passes can be deduced to “think”, and so a thinking machine has achieved AGI.

Here is where an important step is taken by Google. Whether a machine can think or not is deemed a philosophical question that does not focus on a machine’s capabilities. Because machines’ capabilities are:

much more straightforward to measure and more important for evaluating impacts. Therefore we propose that AGI should be defined in terms of capabilities rather than processes.

page 2 [emphasis mine].

So, a definition of AGI should be framed in terms of what a program can DO rather than whether a machine can think.

Case Studies 2 and 3: Systems Possessing Consciousness or Mimicking the Human Brain

Some have proposed to define AGI in terms of whether a machine is said to understand and have other cognitive states. However, no consensus exists to test for such things as consciousness. So, as with Case Study 1, Google suggests that one should steer clear of process-oriented definitions of AGI and frame one in terms of capabilities.

Likewise, the machine does not have to operate or process things like a human brain – capabilities (final results) are what count.

Case Study 4: Human-Level Performance on Cognitive Tasks

Some researchers have suggested that an AGI machine is one that can do the cognitive tasks (i.e. non-physical/robotic tasks) that people can typically perform. But ambiguity exists with this approach because no consensus has been proposed as to which tasks and which type of people this definition would entail.

Case Study 6: Economically Valuable Work

This section looks at how OpenAI uses the term AGI:

[AGI are] highly autonomous systems that outperform humans at most economically valuable work

OpenAI Charter, 2018.

Google’s research group likes this definition because it focuses on capabilities rather than processes. It also provides a yardstick for measurement: economic value. But the definition does not capture aspects of intelligence that aren’t directly in the scope of economic value, such as artistic creativity or emotional intelligence. The definition also does not take into consideration machines that may have potential economic value but are not deployed in the world for various reasons – ethical, legal, or social. Such systems would never get to realise their economic value.

Case Studies 7 and 9: Flexible and General

Gary Marcus, a leading expert in AI, has suggested on X that AGI is:


shorthand for any intelligence (there might be many) that is flexible and general, with resourcefulness and reliability comparable to (or beyond) human intelligence.

X post, 25 May 2022 (retrieved 23 December 2023).

DeepMind also likes this definition because it captures both generality and performance. Current state-of-the-art LLMs, for example, appear to have significant generality, but their performance is lacking (they still make basic mistakes). Also noteworthy is the need, according to Prof. Marcus, for a machine to be flexible, implying that it will need to learn and adapt in order to achieve sufficient generality.

3. Defining AGI: Six Principles

After analysing what others have proposed for a definition of AGI, Google sits down and identifies “properties and commonalities that [they] feel contribute to a clear, operationalizable definition of AGI” (pg. 4).

Here we go!

So, AGI needs to meet the following six criteria:

  1. Focus on Capabilities, not Processes. So, a machine does not need to think or understand or have sentience or consciousness to achieve AGI. What matters is what tasks it can and can’t perform.
  2. Focus on Generality and Performance. The next section will elucidate how these interplay and their varying levels.
  3. Focus on Cognitive and Metacognitive Tasks. There is some debate whether to include robotic embodiment in a definition of AGI. Google suggests that the ability to perform physical tasks simply increases a system’s generality and hence is not a prerequisite for AGI.
  4. Focus on Potential, not Deployment. The deployment of an AGI system should not be a prerequisite for AGI. Just showing that the requisite criteria have been met (as per the next section) should suffice. This will avoid such things as legal and ethical considerations that could hinder forms of deployment.
  5. Focus on Ecological Validity. Tasks that an AI system should be able to do to be given an AGI status should be aligned with the real-world, i.e. they should be tasks that people value.
  6. Focus on the Path to AGI, not a Single Endpoint. Being inspired by the success of adopting a standard set of Levels of Driving Automation for autonomous cars, Google is also suggesting that we do the same for AGI. That is, they posit value in defining “Levels of AGI”, rather than a single endpoint. The next section will define these levels.

4. Levels of AGI

The publication here presents a table showing the different levels of AGI in terms of performance (rows) and generality (columns). I’m going to include a simplified version of that table here, reworked as a nested list. Note the names of the different levels of AGI under the “General” entries, starting from “Level 1: Emerging”.

The two generality columns in the paper are Narrow (a clearly scoped task or set of tasks) and General (a wide range of non-physical tasks). Crossing those with the six performance levels gives:

  • Level 0: No AI
    • Narrow: Narrow Non-AI – calculator software; compiler
    • General: General Non-AI – human-in-the-loop computing, e.g. Amazon Mechanical Turk
  • Level 1: Emerging (equal to or somewhat better than an unskilled human)
    • Narrow: Emerging Narrow AI – simple rule-based systems
    • General: Emerging AGI – ChatGPT, Bard, Llama 2
  • Level 2: Competent (at least 50th percentile of skilled adults)
    • Narrow: Competent Narrow AI – smart speakers such as Siri, LLMs for a subset of tasks (e.g. short essay writing, simple coding)
    • General: Competent AGI – not yet achieved
  • Level 3: Expert (at least 90th percentile of skilled adults)
    • Narrow: Expert Narrow AI – generative image models such as Imagen or Dall-E 2
    • General: Expert AGI – not yet achieved
  • Level 4: Virtuoso (at least 99th percentile of skilled adults)
    • Narrow: Virtuoso Narrow AI – Deep Blue, AlphaGo
    • General: Virtuoso AGI – not yet achieved
  • Level 5: Superhuman (outperforms 100% of humans)
    • Narrow: Superhuman Narrow AI – AlphaFold, StockFish
    • General: Artificial Superintelligence (ASI) – not yet achieved

Hence, according to DeepMind, we’ve only achieved the Emerging AGI status with our latest LLMs (e.g. ChatGPT).

5. Testing for AGI

With respect to testing for the different levels of AGI a number of questions need to be asked:

What is the set of tasks that constitute the generality criteria? What proportion of such tasks must an AI system master to achieve a given level of generality in our schema? Are there some tasks that must always be performed to meet the criteria for certain generality levels, such as metacognitive tasks?

page 8.

Challenging tasks and benchmarks (constantly updated) are needed to deal with these questions. The paper, however, leaves all this for future work. It wants to get the ball rolling by initially clarifying the ontology a benchmark should attempt to measure.

6. Risk in Context: Autonomy and Human-AI Interaction

Providing an ordered framework for AGI levels will make it easier to analyse and categorise risk for AI. In this section, Google also provides a table specifying different levels of AI autonomy to further improve risk assessment.

I won’t discuss this section further as I want to focus more on the definition of AGI in this post rather than anything else that may stem from it.

7. Commentary

As I said earlier, I welcome this attempt by DeepMind to define AGI. It’s been a long time coming. Whenever the term AGI is used anywhere (e.g. in the media), nobody knows exactly what is meant by it. Some think in purely practical terms, as discussed above, but some allow their imaginations to run wild and automatically think about consciousness, understanding, and machines taking over the world. So, which is it? Currently, nobody knows! And that’s the problem.

Hopefully this paper will help the current state of affairs. Whether it will actually be utilised, and whether these levels of AGI will be referenced from now on, is another question.

I also like the fact that Google has decided to ground AGI in purely practical terms: capability and generality measured against human competence. Computer science venturing into the realm of philosophy and discussing things like consciousness is muddying the waters and undoubtedly asking for trouble. There’s no need for this.

However, the waters are already muddied because we use the word “intelligence” in the context of machines – even if we precede it with the adjectives “artificial” or “artificial general”. I’ve discussed this before (“The Need for New Terminology in AI“). Intelligence is a loaded term that implies something profound in the existence of an entity that is said to be intelligent. In my last post (“AI Needs to be Unmasked“) I talked about how AI is just if-else statements executed at incredible speed. That’s all it is and there’s certainly nothing magical about it.


So, just like Google decided to steer clear of words like consciousness and understanding, perhaps the word “intelligence” should also be avoided. We’re not being precise when we use it around machines (especially when we’re focusing on capabilities rather than processes). A key indicator of this is how easily everything is classified as AI. Realistically speaking, however, the terms are here to stay, I know. But one can dream. (Can you picture, though, how the hype around AI would diminish if it was suddenly being referred to as Applied Statistics?)

In conclusion, I’m glad we have a reference point when discussing AGI. It’ll make things easier for all of us. The taxonomy presented by Google seems to me to be a good one. Let’s see where this all goes in the future.
