This is a bit of a fun post about a “dataset” I stumbled upon a few days ago… a dataset of cat videos.
As I’ve mentioned numerous times in various posts of mine, the deep learning revolution that is driving the recent advancements in AI around the world needs data. Lots and lots of data. For example, the famous image classification models that are able to tell you, with better precision than humans, what objects are in an image, are trained on datasets containing sometimes millions of images (e.g. ImageNet). Large datasets have basically become essential fuel for the AI boom of recent years.
So, I had a really good laugh when a few days ago I found this innocuous and unexposed YouTube channel owned by a Japanese man who has been posting a few videos every day of him feeding stray cats. Since he has been doing this for the past 9 years, he has managed to accumulate over 19,000 cat videos on his channel. And in doing so he has most probably and inadvertently created the largest cat video dataset in the world. My goodness!
Technically speaking, unless you’re a die-hard cat lover (like me!), these videos aren’t all that interesting. They’re simply of stray cats having a decent feed or drink with their good-hearted caretaker on occasion uttering a few sentences here and there. Here’s one, for example, of two cats eating out of a bowl:
Or here’s one of two cats enjoying a good ol’ scratch behind the ears:
On average these videos are about 30-60 seconds in length. And they’re all titled by the default name given by his cameras (e.g. MVI 3985, etc.). Hence, nothing about these clips is designed in order for them be found by anybody out there.
However, despite all this mundanity, to the computer vision community (that needs datasets to survive like humans need oxygen), these videos could come in handy… one day. I’m not sure how just yet, but I’m sure somebody out there could find a use for them. I mean, there’s over 19,000 cat videos just sitting there. This is just too good to pass up.
So, if there are any academics out there: please, please, please use this “dataset” in your publishable studies. It would make my year, for sure! The cats would be proud, too.
Oh, and one more thing. I found this guy’s twitter account (@niiyan1216). And, you guessed it: it is full of pictures of cats.
To be informed when new content like this is posted, subscribe to the mailing list: