Harvard neuroscientist Sam Anthony says self-driving vehicles must learn to see and interpret the world like humans before they can become part of urban life.
Mr. Anthony, one in three Americans say they are scared of self-driving cars. Another study found that 55 percent of people would not get into an autonomous taxi. What makes it so hard for humans to accept their new companions on the road?
Sam Anthony: The bar is very high for how well these vehicles have to perform to be accepted. To see an autonomous car on the highway is one thing, but if a self-driving vehicle in a city behaves in ways that deviate inexplicably from human behaviour, people are going to be soured on it for a long time. I drive, but I also walk and bike a lot in cities, and I have been thinking about how autonomous vehicles will fit in from both a personal and a scientific angle. What I realized is that these vehicles lack a basic ability that's effortless for humans: they are not able to play well with others in shared spaces.
How smart or how dumb are autonomous vehicles — what are the things they are good at, and what are they horribly bad at?
Anthony: In short, they can’t mimic that most natural of human faculties: intuition. But let’s look at the positive side first. An autonomous vehicle system doesn't get tired, stressed out or angry. It doesn’t become inattentive or impaired by substances. It doesn't lose precision in its steering or braking, and it has an attentional field that covers 360 degrees of the roadway at all times. All of those are huge pluses, but the things that are easiest for humans are hardest for computers — that’s true for all of artificial intelligence.
The things that are easiest for humans are hardest for computers — that’s true for all of artificial intelligence.
What, on the other hand, are the capabilities at which humans excel?
Anthony: People are unbeatable at understanding the visual world. We have an intuitive sense for how the objects around us will and should behave, and whether they're dangerous. Humans, in fact, are absolutely the gold standard at understanding other humans. And it’s a facility completely independent of whether I have driven before. You can ask four-year-olds to perform very complex tasks where they have to make judgements about what's going on in somebody's head, and they're able to do it. It comes to us so effortlessly that we don't even realize how integral it is to the task of driving. This is what I see as the biggest remaining problem for self-driving cars.
Meaning it’s a skill set often overlooked when we discuss the progress of machine vision and machine learning?
Anthony: Exactly. If you ask someone what skills are necessary for driving, they'll say you have to be able to read road signs, you need to know the rules and where the edges of the road are, plus you have to pay attention, and perhaps wear glasses if your vision is poor. Way down the list, if ever, would they say you have to know whether somebody wants to cross the street. Our intuition works so automatically that it doesn't really reach the level of conscious awareness. We're so good at it that it’s an accepted background part of driving. Our systems of insurance fault are designed around this facility, and our roadways are designed around the assumption that everyone has it.
Do humans and machines stand a chance to get along better on the road?
Anthony: Tough question. If self-driving cars don’t get better at mind reading, we run the risk of perceiving them as jerks on the road. Some human drivers are unbelievably timid, while others will just cut you off and act like they didn't even know you were there. Current autonomous vehicles manage to be both of those drivers at once — incredibly slow and timid, yet still cutting you off.
If self-driving cars don’t get better at mind reading, we run the risk of perceiving them as jerks on the road.
You actually ran experiments and taped how people interact in traffic. What did you learn?
Anthony: I took a camera from my lab at Harvard and recorded an intersection — not a terribly busy street; it doesn’t even have a traffic signal. When you watch a 30-second clip, you see a lot of complex things happening: about 45 interactions of one person trying to read another person’s mind. These powerful and accurate judgements happen all the time, in fractions of a second. Behavioural scientists call this concept ‘Theory of Mind’.
Give us an example, please.
Anthony: A scooter pulls up to the intersection and wants to drive through it. The rider is waiting to see if the oncoming cars will let him pass. Someone in a car coming from the opposite direction wants to turn left. The scooter rider sees this, and he actually backs up slightly. By backing up, he's very clearly communicating to that car: “Oh, I see that you're there, and I understand that you have the right of way, and if you're going to go left I want to be out of your way so you can make that turn.” The scooter moving back four inches or so contains this very rich signal about what goes on in the head of its rider. If that left-turning car were an autonomous vehicle, it would go into a failure mode and be stuck there forever.
Is there hope that an algorithm will ever be able to approach the fluid calculations that go on in our heads without us even knowing it?
Anthony: The good news is that we have a very big toolbox we can use to get people to answer questions for which they don’t consciously know the answer. It’s called behavioural science. Experts have spent decades trying to figure out how you measure people doing things they don’t know they’re doing. For instance, when humans are looking at the visual world, the human eye is moving. It constantly makes ballistic movements called ‘saccades’, several times a second. People might just say they are looking at an object, but in fact their eyes are traversing a very sophisticated pattern. They’re looking at different aspects of a thing: the periphery, the centre, features that are brighter or look like a face. If you measure the locations their eyes land on, the so-called fixation points, you can draw a map of what people care about in the visual world. You can use that map to reverse-engineer the features that compel our attention before we even know our attention is being compelled.
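As a concrete sketch of that last step, fixation points recorded by an eye tracker can be binned into a coarse grid to show where attention concentrates in a scene. Everything below is illustrative: the coordinates, grid size and scene dimensions are hypothetical stand-ins, not data from this interview.

```python
# Sketch: turning eye-tracking fixation points into a coarse attention map.
# All values are hypothetical; a real study would record fixations with an
# eye tracker while subjects view a driving scene.
from collections import Counter

def attention_map(fixations, grid_size=4, width=640, height=480):
    """Bin (x, y) fixation points into a grid_size x grid_size grid.

    The cells with the most fixations are the regions of the visual
    scene that people care about most.
    """
    counts = Counter()
    for x, y in fixations:
        col = min(int(x * grid_size / width), grid_size - 1)
        row = min(int(y * grid_size / height), grid_size - 1)
        counts[(row, col)] += 1
    return counts

# Hypothetical fixations, mostly clustered near a pedestrian at ~(500, 100)
fixations = [(505, 98), (498, 105), (510, 110), (90, 400), (502, 101)]
hotspots = attention_map(fixations).most_common(1)
# hotspots -> [((0, 3), 4)]: the grid cell containing the pedestrian
```

In practice such a map would feed the reverse-engineering Anthony describes: the high-count cells tell you which image features drew attention before viewers were aware of it.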
That’s the science. But how exactly can we teach machines to have intuition and figure out human intent?
Anthony: My co-founders and I started a company called Perceptive Automata to apply these insights from behavioural science to AV systems. We find a problem, a situation or task where humans excel, and then use techniques from behavioural science to build an extremely sophisticated characterization of how humans behave. The challenge is to turn this into a machine-learning model. That's the higher-level goal of a lot of machine learning, but traditional approaches went about it differently. You usually try to determine a verifiable fact about the world — for instance, whether an image contains a tree — and then do as good a job as possible of nailing down that fact. The classic machine-learning approach will give a system 30,000 examples of trees and 30,000 examples of not-trees, and the algorithm has to figure out the difference. It’s a great technique, but it runs into problems when you’re dealing with questions that are ambiguous or contingent: is this person going to walk into the crosswalk, or will she wait? So instead of pushing the model towards figuring out patterns in ground-truth data about things in the world, we want software to look at a problem like a human does. You could say we treat people like a black box and carefully measure the output — that’s how we understand the intuitive capabilities we want to mimic.
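A minimal sketch of the contrast Anthony describes, under an assumed annotation setup: instead of a single 0/1 ground-truth label, many human judgments about the same scene are collapsed into a probability, and the model is scored against that distribution rather than a hard label. The aggregation and loss below are illustrative assumptions; the interview does not describe Perceptive Automata's actual training objective.

```python
# Sketch: soft targets from human judgments vs. binary ground truth.
# The annotator responses are hypothetical.
import math

def soft_label(judgments):
    """Aggregate many human yes/no judgments about one scene
    ("will this pedestrian cross?") into a probability target."""
    return sum(judgments) / len(judgments)

def cross_entropy(target, predicted, eps=1e-12):
    """Binary cross-entropy against a soft target: the loss is lowest
    when the model's probability matches the human consensus, not when
    it outputs a confident 0 or 1 on an ambiguous scene."""
    predicted = min(max(predicted, eps), 1 - eps)
    return -(target * math.log(predicted)
             + (1 - target) * math.log(1 - predicted))

# 10 hypothetical annotators view the same scene: 7 think "will cross".
target = soft_label([1, 1, 1, 0, 1, 1, 0, 1, 1, 0])  # 0.7
# A calibrated 0.7 prediction is penalised less than a confident 1.0:
ambiguity_aware = cross_entropy(target, 0.7)
overconfident = cross_entropy(target, 0.99)
```

The design point is that an ambiguous scene gets an ambiguous target, so the model is rewarded for reproducing human uncertainty instead of forcing a verdict.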
We want software to look at a problem like a human does.
When will we get to the point where autonomous vehicles possess intuitive capabilities on par with humans?
Anthony: They will get good enough in the relatively near future. You probably don’t need to exactly match human behaviour for autonomous vehicles to do a good job in cities, in part because of all the other strengths they have.
Why don’t we design autonomous cars that signal more clearly to the surrounding humans how limited they are, with lights or sounds, instead of expecting the robots to become smarter anytime soon?
Anthony: Communication channels between an AV and its environment, as well as visual signals like light displays, are important. But if you make autonomous vehicles signal that they’re a little ‘dumb,’ it opens them up to being gamed by humans. Driving is not a zero-sum activity, but navigating dense urban environments means competition. There are times when people are trying to get an advantage in traffic or choose an optimal route that involves others staying out of their way. To gain that type of advantage, you could fake out an AV and force it to make an emergency stop. You could end up in a situation where these cars are mistreated, and the people riding in them have a very negative experience. So we need more than just signalling. The quickest way to integrate these vehicles in dense, mixed environments is to have them behave like humans do.
Autonomous vehicles can play a big part in making cities friendlier places for multimodal transport.
What’s the worst-case scenario if we don’t give autonomous systems intuition as they become more widespread?
Anthony: There are a couple of ways these vehicles could fail to live up to their promise. One scenario is that they just never work right in mixed environments, which would leave us with cars that have very cool cruise control but not much more. To truly transform mobility, these vehicles have to drive on city streets. Another bad outcome I could imagine is that there’s enough investment and excitement to put them on urban streets, but they don’t play well with others. That would be a pity, since we are still suffering from the consequences of a transportation monoculture centred around the car. A lot of cities around the world, especially in Europe and in parts of the US, have been moving away from that, since they have realized it doesn’t accomplish the goal of a liveable city. I think autonomous vehicles can play a big part in making cities friendlier places for multimodal transport. But if they can’t solve this problem, they could reverse all the progress we’ve made.
Sam Anthony is a neuroscientist and computer scientist who did his graduate work at Harvard University’s Vision Sciences Laboratory. In late 2014 he co-founded Perceptive Automata with two academic colleagues. The tech start-up in Cambridge, Massachusetts, is developing, in stealth mode, software that aims to give autonomous vehicles the ability to understand what’s in the minds of the humans and human-driven vehicles around them.