This article was written by Richard Lawler. Richard’s been tech-obsessed since first laying hands on an Atari joystick.
Millions of images and YouTube videos, linked and tagged to teach computers what a spoon is.
It seems like we hear about a new machine learning breakthrough nearly every day, but getting there isn’t easy. To fine-tune algorithms that recognize and predict patterns in data, you need to feed them massive amounts of already-tagged information to test and learn from. For researchers, that’s where two recently released archives from Google come in. Joining other high-quality datasets, Open Images and YouTube-8M provide millions of annotated links for researchers to train their models on.
The Open Images set comes from a collaboration between Google, Carnegie Mellon and Cornell, with 9 million entries that were tagged by computers first and then verified and corrected by humans. The Google Research team says it has enough images to train a neural network “from scratch,” so if you’d like to try your hand at a DeepDream-style project, a better version of Google Photos or the next Prisma, it’s ready to go.