Today we’ll be talking about perception, specifically how we animals perceive the world around us, and why it’s necessary to bring together computer scientists and neuroscientists, not just to understand machine learning, but to understand our own brains as well.
The future that science fiction has promised us: houses that understand us, autonomous cars, computers that contribute ideas. It all boils down to how well a machine can perceive the world around it.
The key is perception. Without perception, it’s not possible to make a correct decision, whether you’re an animal or a machine. But first, let’s try it out. Everyone in this audience has a fantastically capable sensory system, and we’re going to test it right now.
What did everyone see on the screen? Right. So we can all agree it’s a fox. Now that, in and of itself, is fantastic. I didn’t tell you where to focus your attention. I didn’t tell you that it was going to be an animal. I didn’t tell you how to look or how to respond.
Now for a computer, even locating the screen in the room presents non-trivial technical challenges. But with all of us agreeing that’s a fox, those of you who are familiar with machine learning research know that there are algorithms out there that can do that same type of task, and can do it very well.
For example, here’s what Clarifai, an online machine learning service, has to say about the fox. Of course it recognizes it as well. But here’s the question: is the fox cute? Of course it is. And the algorithm actually recognizes that as well. But here’s the difference. That question gets at the fundamental difference between machine learning and actual intelligence. Yes, we can go ahead and train a computer to separate cute animals from not-cute animals by showing it thousands upon thousands of images of cute animals and thousands upon thousands of images of less traditionally cute animals.
But that’s not how we decided that fox was cute. We learned the concept of cute through a unique combination of personal experience, natural instincts, cultural influences, and perceived context. In the machine learning world, we call this “the ground truth”. In other words, it’s the absolute or correct answer, as we understand it, that we tell the machine it is.
And the great challenge in machine learning today is to program algorithms that agree with your ground truth. Success in that case means an algorithm that is as good at a task as you are. But that’s not the end goal. We want to create a future in which artificial systems extend and augment our abilities, and in which artificial systems can also help us to create and to imagine. That is what many people would call artificial intelligence.
And to get there, we have to learn something entirely new, and the best way to learn something new is to watch it being done. So, this is a photo of my son Ari and his cat brother Maxwell. Ari was born very early and he spent ten weeks in the neonatal intensive care unit before he came home.
Now when he did come home, he was a lot like a newborn, but he also had a life experience beyond that of a typical term baby. The parents in this audience who have preemies will understand exactly what I’m talking about. For those of you who don’t, it’s enough to say that it’s a unique and complicated experience. What isn’t complicated was Ari’s reaction to Maxwell. Ari loved Maxwell immediately.
He was reaching out to touch him before he knew how to smile or crawl. Maxwell was friendly to Ari immediately. Let’s break that down in a little bit of detail. Ari had seen cats during his lifetime exactly zero times. Maxwell’s a rescue. He’d seen children before, and, how do I put this nicely: he’s domesticated, but safe around children? Definitely not.
We brought them together very carefully, after Ari had been home just a day or two, watching each of them closely to see how they would react, and ready to snatch one or the other of them away from either teeth and claws or the baby hands of doom.
And what happened? They were fine together. Somehow, Maxwell perceived that Ari was different from other humans, particularly other children, and Ari was not scared of a predator that was three times his body weight at the time.
Somehow the algorithms running in their brains perceived that they were safe together. Maybe it was the way we were standing when we brought them together. Maybe Maxwell could smell that Ari was somehow family. Sadly neither of them are speaking yet so I can’t ask them how they figured it out.
Now, in the machine learning world, we call this zero-shot learning. In other words, on the first exposure, with no training examples, Ari recognized that Maxwell was something he should reach out and touch, and Maxwell recognized that Ari was not a threat, or at least not a threat he should do anything about.
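For readers following along in code, here is a toy sketch of the zero-shot idea. All the attribute names and numbers are invented for illustration: instead of training on labeled examples of each class, we describe never-before-seen classes by their attributes and match a new observation to the nearest description.

```python
import numpy as np

# Hypothetical attribute descriptions of classes the system has never
# seen an example of: [furry, four_legged, small, moves_fast].
class_attributes = {
    "cat":  np.array([1.0, 1.0, 1.0, 1.0]),
    "baby": np.array([0.0, 0.0, 1.0, 0.0]),
    "car":  np.array([0.0, 0.0, 0.0, 1.0]),
}

def zero_shot_classify(observation):
    """Pick the class whose attribute description is closest to the observation."""
    return min(class_attributes,
               key=lambda name: np.linalg.norm(observation - class_attributes[name]))

# A new observation: furry, four-legged, small, fast. It is matched with
# zero training examples of any class, only their descriptions.
prediction = zero_shot_classify(np.array([0.9, 1.0, 0.8, 0.7]))
print(prediction)  # -> cat
```

The point of the sketch is only the shape of the problem: recognition on first exposure comes from prior knowledge (the descriptions), not from labeled examples.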
This stands in sharp contrast to how machine learning is done today. Today, if we want to train a machine, we show it thousands upon thousands of correctly labeled training examples. We feed them through an algorithm, compare its answers to the correct labels, and send an error signal back through to adjust the algorithm, so that the next time it sees something similar to something it’s seen before, it’s more likely to get it correct.
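That training loop can be sketched in a few lines. This is a minimal illustration on invented toy data, using a tiny logistic-regression model as the “algorithm”; the feedback signal that adjusts it is exactly the kind of loop described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented toy data: 1000 labeled examples. Points above the line y = x
# are labeled 1.0, points below are labeled 0.0.
X = rng.uniform(-1, 1, size=(1000, 2))
y = (X[:, 1] > X[:, 0]).astype(float)

w = np.zeros(2)   # the adjustable parts of the "algorithm"
b = 0.0
lr = 0.5

for _ in range(200):  # show it the labeled examples again and again
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # the algorithm's current answers
    err = p - y                             # error signal goes back through...
    w -= lr * (X.T @ err) / len(X)          # ...and adjusts the algorithm
    b -= lr * err.mean()

predictions = (1.0 / (1.0 + np.exp(-(X @ w + b)))) > 0.5
accuracy = (predictions == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

After enough labeled examples and adjustments, the model answers correctly on things similar to what it has seen, which is precisely the contrast with Ari and Maxwell’s single, unlabeled exposure.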
Now, you probably expect the rest of this talk to be me telling you how we invented some fantastic zero-shot algorithm that’s going to save the world, right? It’s not going to be. That kind of behavior, in which your machine is correct based on instinct, context, and minimal training, that’s the future we’re trying to reach. So how do we get there?
Well, we all know that animals are amazing at handling whatever nature can throw at them. They can find food and mates; they can identify friend and foe. They do all of this in a complex and diverse environment. Animals have to be good at this: the animals that aren’t get eaten before they reproduce, and they don’t get to pass on their genes. Not a great strategy for continuing the species. They also have to be efficient at this. If you burn up all your food trying to do mental computations, you won’t make it to your next meal.
So, to create the algorithms of the future we are using neuroscience to constrain the world of machine learning by working in the intersection between the two fields. We know that good solutions exist for our problems because we can see nature demonstrating them.
So here’s a thought experiment to explain what I mean. If you were running a machine learning algorithm in your head, how long would it take you to recognize a face? Say, the face of the person next to you. How long does it take, do you think, to go, “Hey, look at that, it’s a face”?
Well, the scientists who do this experiment for a living, seriously, tell us it takes about 150 milliseconds, or just a little bit longer than a tenth of a second. Now, during that tenth of a second, the electrical pulses in your brain, spikes, have to travel from your retina to your deep visual cortex via a series of about ten neurons and the gaps between them.
We know from the neuroscience and the anatomy that it takes about 10 milliseconds for one of those spikes to cross a gap. So if we have ten gaps, 10 milliseconds times ten gaps is 100 milliseconds, and that tells us that your brain is doing the facial recognition task just about as fast as it’s physically able to.
Now, hold that thought, because we’re going to compare this to a machine learning algorithm. Modern machine learning algorithms use modeled neurons, essentially just little calculators, arranged in large sets of layers. The connections between those layers are just like the gaps between the neurons we just talked about. But here’s the trick: calculators require a number, and a spike isn’t a number. To get a good estimate of the number we want to enter into those calculators, we need to count a few spikes, say ten or so. So what does a machine learning algorithm actually look like?
Here’s an example of a very popular deep learning algorithm. This is way too complicated, so we’re going to clear away everything we don’t need to think about right now. So, to decide whether or not something is a face, the information has to flow through about eight of these calculators, across eight gaps. Now, we just said we need to count about ten spikes per calculator.
So ten spikes, times ten milliseconds per spike in your head, that’s 100 milliseconds per layer; times the eight layers, that’s 800 milliseconds. That means that if you were running AlexNet in your head, it would take you close to a second to recognize a face. Kind of awkward at parties.
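The back-of-the-envelope arithmetic from the last few paragraphs can be written out directly, using only the numbers already quoted in the talk:

```python
# Brain side: ~10 ms for a spike to cross one gap, ~10 gaps from
# retina to deep visual cortex.
ms_per_gap = 10
brain_gaps = 10
brain_latency = ms_per_gap * brain_gaps
print(f"brain face recognition: ~{brain_latency} ms")       # ~100 ms

# AlexNet-in-your-head side: count ~10 spikes per calculator to get a
# number, and flow through ~8 layers of calculators.
spikes_per_estimate = 10
alexnet_layers = 8
alexnet_latency = spikes_per_estimate * ms_per_gap * alexnet_layers
print(f"AlexNet 'in your head': ~{alexnet_latency} ms")     # ~800 ms
```

The same 10-millisecond gap crossing appears in both estimates; the factor-of-eight difference comes entirely from needing to count spikes at every layer rather than passing single spikes along.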
So do we throw away AlexNet? Of course not. It’s a valid way to solve this problem, and as an artificial system it solves a challenging problem very well. But the thought experiment serves to show that our brain, that nature, has a very different way of solving this type of problem.
And that’s exactly why we want to bring together the machine learning experts and the neuroscientists. Experimental observation of nature shows us how good nature can be at the tasks we’re interested in solving, while the machine learning algorithms show us what can be accomplished in artificial systems. Bringing them together opens the door to what I call biologically feasible computing. So here’s a concrete example. At the Space and Naval Warfare Systems Center Pacific, a Navy research lab where I work, we’re very interested in understanding the digital language spoken by radio transmissions.
Now we tried regular machine learning to solve that problem, and we were about as good at solving it as flipping a coin. Not good enough. So we drew inspiration from the visual system, the same one we’ve been talking about today, and we constrained our model to look for the same things that your eyes do. We then pointed our newly constrained model at our transmissions and it made essentially no mistakes from that point on.
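The talk doesn’t spell out what the constraint was, but as an illustration of the general idea of making a model “look for the same things your eyes do”, here is a sketch of one classic approach: fixing a model’s first-stage filters to Gabor filters, the oriented edge detectors found in primary visual cortex, instead of learning them from data. All sizes and frequencies here are invented for illustration.

```python
import numpy as np

def gabor_filter(size, theta, wavelength, sigma):
    """An oriented Gabor filter, like the receptive fields in primary visual cortex."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    xr = xs * np.cos(theta) + ys * np.sin(theta)    # rotate coordinates by theta
    envelope = np.exp(-(xs**2 + ys**2) / (2 * sigma**2))  # Gaussian window
    carrier = np.cos(2 * np.pi * xr / wavelength)         # oriented stripes
    return envelope * carrier

# A fixed, biologically inspired filter bank: four orientations.
bank = [gabor_filter(size=9, theta=t, wavelength=4.0, sigma=2.5)
        for t in np.linspace(0, np.pi, 4, endpoint=False)]

def first_layer(patch):
    """The constrained first layer: correlate the input with each FIXED filter,
    rather than learning the filters from thousands of labeled examples."""
    return np.array([float(np.sum(patch * f)) for f in bank])

patch = np.zeros((9, 9))
patch[:, 4] = 1.0                 # a vertical bar in the input
responses = first_layer(patch)
print("strongest orientation index:", int(np.argmax(np.abs(responses))))  # -> 0 (vertical)
```

The design choice is the point: by fixing the front end to features the visual system is known to extract, the rest of the model has far less to learn, which is one way a biologically motivated constraint can turn a coin-flip model into a reliable one.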
We took the machine learning model, and by constraining it, made it closer to a biological simulation. In other words, by better mimicking nature, we got a dramatic performance increase in our algorithm. So if simple constraints can reap rewards, what’s the next step? Places like the University of Manchester, Brain Corporation, IBM, Intel, and others are developing special computers that can mimic and simulate neurons in biophysical detail. My research team is collaborating with those groups with a simple goal in mind.
We want to create algorithms that compute in biologically feasible ways. In other words, we want to develop algorithms that could potentially be running in your head, and if we’re able to do that, we kick off an important feedback loop. By developing the principles for biologically feasible computing, we demonstrate features that can be used to develop better brain models. Those better brain models in turn lead to better simulations, the better simulations lead to new neuroscience hypotheses, new hypotheses lead to new experiments, and the new experiments illuminate new principles that can be used for biologically feasible computing.
By bringing machine learning and neuroscience together, we make them greater than the sum of their parts. Now, I can’t tell you how many times we have to go around this loop before our algorithms are as good as Ari’s or Maxwell’s, but I can tell you that this is the road we take towards creating that future.