(Update: part 2 of this post has been posted here and part 3 here.)
Over the last few months here at Carnegie Mellon University (Australia campus) I’ve been giving a set of talks on AI and the great leaps it has made in the last 5 or so years. I focus on disruptive technologies and give examples ranging from smart fridges and jackets to autonomous cars, robots, and drones. The title of one of my talks is “AI and the 4th Industrial Revolution”.
Indeed, we are living in the 4th industrial revolution – a significant time in the history of mankind. The first revolution occurred in the 18th century with the advent of mechanisation and steam power; the second came about 100 years later with the harnessing of electricity (among other things); and the big one, the 3rd industrial revolution, occurred roughly another 100 years after that (around the 1970s) with things like nuclear energy, space expeditions, electronics, and telecommunications coming to the fore.
So, yes, we are living in a significant time. The internet, IoT devices, robotics, 3D printing, virtual reality: these technologies are drastically “revolutionising” our way of life. And behind the aforementioned technologies of the 4th industrial revolution sits artificial intelligence. AI is the engine that is pushing more boundaries than we could have possibly imagined 10 years ago. Machines are doing more work “intelligently” for us at an unprecedented level. Science fiction writers of days gone by would be proud of what we have achieved (although, of course, we have fallen well short of the technological predictions made at the advent of AI in the middle of the 20th century).
The current push in AI is being driven by data. “Data is the new oil” is a phrase I keep repeating in my conference talks. Why? Because if you have (clean) data, you have facts, and with facts you can make insightful decisions or judgments. The more data you have, the more facts you have, and therefore the more insightful your decisions can potentially be. And with insightful decisions comes the possibility to make more money. If you want to see how powerful data can be, watch the film “The Social Dilemma”, which shows how every little thing we do on social media (e.g. where we click, what we hover our mouse over) is harvested and converted into facts about us that drive algorithms to keep us addicted to these platforms or to shape our opinions on important matters. It truly is scary. But we’re talking here about loads and loads and loads of data – or “big data”, as it is now called.
Once again: the more data you have, the more facts you have, and therefore the more insightful your decisions can be. The logic is simple. But why haven’t we put this logic into practice earlier? Why only now are we able to unleash the power of data? The answer is two-fold. Firstly, we only now have the means to store big data cheaply: hard drive capacities have skyrocketed while their prices have remained stable – and then there is cloud storage on top of that.
The bottom line is that vast storage capacity is accessible to everybody.
The second answer to why the power of big data is only now being harnessed is that we finally have the means to process it and extract those precious facts/insights. A decade ago, machine learning could not handle big data. Algorithms like support vector machines (SVMs) just couldn’t deal with data that had too many features (i.e. was too complex). They could only deal with relatively simple data – and not a lot of it, for that matter. They couldn’t find the patterns in big data that now, for example, drive the social media algorithms mentioned above, nor could they deal with things like language, image or video processing.
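To give a feel for the scaling problem, here is a minimal sketch (my own illustration, not from the post) using scikit-learn on synthetic data: the kernel SVM’s training time grows super-linearly with the number of samples, while a simple SGD-trained linear model grows roughly linearly per pass. The dataset sizes and features are arbitrary assumptions.

```python
# Rough, self-contained illustration with assumed synthetic data:
# kernel SVM training cost blows up with sample count, SGD does not.
import time
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

for n in (1_000, 5_000, 20_000):               # growing dataset sizes
    X = rng.normal(size=(n, 50))                # 50 synthetic features
    y = (X[:, 0] + X[:, 1] > 0).astype(int)     # simple synthetic labels

    t0 = time.time()
    SVC(kernel="rbf").fit(X, y)                 # kernel SVM
    t1 = time.time()
    SGDClassifier().fit(X, y)                   # stochastic gradient descent
    t2 = time.time()

    print(f"n={n:6d}  SVM: {t1 - t0:6.2f}s   SGD: {t2 - t1:6.2f}s")
```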
But then there came a breakthrough in 2012: deep learning (DL). I won’t describe here how deep learning works or why it has been so revolutionary (I have already done so in this post) but the important thing is that DL has allowed us to process extremely complex data using models that can have millions or even billions of parameters rather than just hundreds or thousands.
It’s fair to say that all the artificial intelligence you see today has a deep learning engine behind it. Whether it be autonomous cars, drones, business intelligence, chatbots, fraud detection, visual recognition, recommendation engines – chances are that DL is powering all of these. It truly was a breakthrough. An amazing one at that.
Moreover, the fantastic thing about DL models is that they are scalable: if you have too much data for your current model to handle, you can, theoretically, just increase its size (that is, its number of parameters). This is where the old adage – the more data you have, the more facts you have, and therefore the more insightful your decisions can be – comes to the fore. If you have more data, you simply grow your model.
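As a concrete (and admittedly toy) illustration of what “just increase its size” means, here is a small PyTorch sketch; the function name, dimensions and layer sizes are my own assumptions, not anything from the post. The same architecture is simply made wider and deeper, and the parameter count grows accordingly.

```python
# Toy sketch: scaling a model by turning up its width and depth knobs.
import torch.nn as nn

def make_mlp(in_dim: int, hidden: int, depth: int, out_dim: int) -> nn.Module:
    """Build a simple fully connected network; 'hidden' and 'depth' scale it up."""
    layers = [nn.Linear(in_dim, hidden), nn.ReLU()]
    for _ in range(depth - 1):
        layers += [nn.Linear(hidden, hidden), nn.ReLU()]
    layers.append(nn.Linear(hidden, out_dim))
    return nn.Sequential(*layers)

for hidden, depth in [(128, 2), (512, 4), (2048, 8)]:
    model = make_mlp(in_dim=100, hidden=hidden, depth=depth, out_dim=10)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"hidden={hidden:5d} depth={depth}  ->  {n_params:,} parameters")
```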
Deep learning truly was a huge breakthrough.
There is a slight problem, however, in all of this. DL has an Achilles heel, or a major weakness, let’s say. This weakness is its training time. Processing big data, that is, training these DL models, is a laborious task that can take days, weeks or even months! The larger and more complex the model, the more training time is required.
Let’s discuss, for example, the GPT-3 language model that I talked about in my last blog post. At its release last year, GPT-3 was the largest and most powerful natural language processing model. If you were to train GPT-3 yourself on a single decent machine, it is estimated that it would take you around 355 years to do so. Astonishing, isn’t it? Of course, GPT-3 was trained on state-of-the-art clusters of GPUs, but even so it would have taken a significant amount of time.
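For the curious, here is the rough back-of-envelope arithmetic behind a figure like that. Both numbers below are commonly cited estimates that I am assuming, not figures given in the post: GPT-3 is thought to require roughly 3.14×10²³ floating-point operations to train, and a single good GPU sustains on the order of 28 TFLOPS.

```python
# Back-of-envelope only; both figures below are assumed estimates.
total_flops = 3.14e23          # estimated total training compute for GPT-3
gpu_flops_per_sec = 28e12      # assumed sustained throughput of one good GPU
seconds_per_year = 365 * 24 * 3600

years = total_flops / gpu_flops_per_sec / seconds_per_year
print(f"~{years:.0f} years on a single GPU")   # on the order of 355 years
```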
But what about the cost of these training runs? It is estimated that OpenAI spent US$4.6 million to train the GPT-3 model. And that is counting only a single training run. What about all the failed attempts? What about all the fine-tuning of the model that must have taken place? Goodness knows how many iterations the GPT-3 model went through before OpenAI reached their final (brilliant) product.
We’re talking about a lot of money here. And who has this amount of money? Not many people.
Hence, can we keep growing our deep learning models to accommodate more and more complex tasks? Can we keep increasing the number of parameters in these things to allow current AI to get better and better at what it does? Surely we are going to hit a wall soon with our current technology? Surely the current growth of AI is unsustainable? We’re already spending months training some state-of-the-art products, and millions upon millions of dollars on top of that.
Don’t believe me that AI is slowing down and reaching a plateau? How about a higher authority on this topic? Let’s listen to what Jerome Pesenti, the current head of AI at Facebook, has to say on this (original article here):
When you scale deep learning, it tends to behave better and to be able to solve a broader task in a better way… But clearly the rate of progress is not sustainable… Right now, an experiment might [cost] seven figures, but it’s not going to go to nine or ten figures, it’s not possible, nobody can afford that…
In many ways we already have [hit a wall]. Not every area has reached the limit of scaling, but in most places, we’re getting to a point where we really need to think in terms of optimization, in terms of cost benefit
This is all true, folks. The current growth of AI is unsustainable. Sure, there is research in progress to optimise the training process, to improve the hardware being used, and to devise more efficient ways of reusing already-trained models in other contexts (i.e. transfer learning). But at the end of the day, the current engine that powers today’s AI is reaching its top speed. Unless that engine is replaced with something bigger and better, i.e. another astonishing breakthrough, we’re going to be stuck with what we have.
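To make the “reusing already-trained models” idea concrete, here is a minimal transfer-learning sketch (my own example; the 5-class task is an assumption, not anything described in the post): a network pretrained on ImageNet is reused, its expensive layers are frozen, and only a small new classification head is trained.

```python
# Minimal transfer-learning sketch; the 5-class task is an assumption.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)        # reuse weights someone else already paid to train
for param in model.parameters():
    param.requires_grad = False                 # freeze the expensive pretrained layers

model.fc = nn.Linear(model.fc.in_features, 5)   # new, small head for a 5-class problem

# Only the new head gets trained, so the cost is a tiny fraction of
# training the whole network from scratch.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```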
Will another breakthrough happen? It’s possible. Highly likely, in fact. But when that will be is anybody’s guess. It could be next year, it could be at the end of the decade, or it could be at the end of the century. Nobody knows when such breakthroughs come along. They require inspiration, a moment of brilliance, usually coupled with luck. And inspiration and luck don’t arrive on demand; these things just happen. History attests to this.
So, to conclude, AI is slowing down. There is ample evidence to back this claim. We’ve achieved a lot with what we’ve had – truly amazing things. And new uses of DL will undoubtedly appear. But DL itself is slowly reaching its top speed.
It’s hard to break this kind of news to people who think that AI will just continue growing exponentially until the end of time. It’s just not going to happen. And besides, that’s never been the case in the history of AI anyway. There have always been AI winters followed by hype cycles. ALWAYS. Perhaps we’re heading for an AI winter now? It’s definitely possible.