Daimler’s AI and machine learning experts Rigel Smiroldo and Timo Rehfeld explain how autonomous cars learn to see and behave well in traffic.
Artificial intelligence is a hot topic, but what exactly can AI and machine learning contribute to make autonomous driving a reality?
Rigel Smiroldo: At its very core, an autonomous vehicle has to make decisions. That involves understanding two questions. First, what is the environment I’m in right now? That is complicated enough. The second question is about taking action, such as turning or increasing the throttle. And how will that action influence my perception of the environment? Both of those aspects intersect with the realm of AI and machine learning.
Timo Rehfeld: AI refers to artificial intelligence as a whole. In the past, if you had a set of rules, you would call it AI. The first chess computer was no more than a set of rules that were hard-coded. It made decisions, but those were pre-programmed. Machine learning is an improvement that gives a computer the tools to learn decisions and complex patterns based on examples we provide — for instance how to detect a pedestrian in a camera image. We use all the different sensors you can think of: radar, cameras, and LIDAR, which is a key sensor because it has very precise distance measurement.
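The distinction Rehfeld draws, between decisions that are hard-coded and decisions that are learned from examples, can be sketched in a toy Python snippet. Nothing here reflects production code: the features, thresholds, and data are invented for illustration.

```python
# Toy contrast: a hand-coded rule vs. a decision learned from labeled examples.
# All features, numbers, and data below are invented for illustration.

def rule_based_is_pedestrian(height_m, speed_mps):
    """Hard-coded 'AI' in the old sense: the decision logic is written by hand."""
    return 1.0 <= height_m <= 2.2 and speed_mps < 4.0

def train_threshold(examples):
    """Tiny 'machine learning': search for the height threshold that best
    separates human-labeled examples (label 1 = pedestrian)."""
    best_t, best_correct = 0.0, -1
    for t in [x / 10 for x in range(0, 31)]:   # candidate thresholds 0.0 .. 3.0
        correct = sum((h >= t) == bool(label) for h, label in examples)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

# Labeled training data: (height_m, is_pedestrian)
data = [(1.7, 1), (1.5, 1), (1.9, 1), (0.4, 0), (0.6, 0), (0.3, 0)]
threshold = train_threshold(data)
learned_is_pedestrian = lambda h: h >= threshold
```

The first function encodes the decision directly; the second derives it from annotated data, which is the shift machine learning represents. A real pedestrian detector learns millions of parameters from camera, radar, and LIDAR data rather than one threshold, but the principle is the same.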
You recently attended an AI conference in Southern California and left most of the highway driving to a Mercedes-Benz E-class. How hard was that feat for the car to accomplish?
Smiroldo: When you use Distronic Plus with distance keeping and lane keeping, you can be relatively lazy on that long, long stretch of freeway going from the Bay Area to Los Angeles. Which goes to show that certain aspects of autonomy are not 10 or 15 years out but are already part of your car today. When you look at the advanced driver assistance and safety systems (ADAS) of a Mercedes-Benz today, they give you some level of autonomy. But what we’re working on is the whole picture of autonomous driving.
And how do we get from driver assistance to autonomy?
Smiroldo: A lot of people think that someday there will be this magic switch when they wake up and suddenly the whole world has changed. In reality, it’s more of a gradual shift.
Rehfeld: You also have to distinguish between different use cases. Urban mobility services are an entirely different scenario from me driving on a highway. They don’t drive that fast, so the computer can take more time to process each frame. Those vehicles are part of a fleet, so you can afford more hardware and more sensors. The algorithms change, too, if you’re driving in the city. I think those technologies will co-exist for quite a long time. Eventually, they might converge.
What are the things that cars can do today, and what are the things that still stump them?
Smiroldo: It’s mostly about reasoning in a complex environment, which means the psychological factors. As a vehicle, you want to anticipate whether a pedestrian is going to cross the street, so you need to make sure that he sees you. If the person looks at the vehicle, I know he or she will probably stop. If he’s looking at his phone, you’d rather brake. There are subtle things, too, like a pedestrian waving you through although the traffic light is red. Those are all corner cases that are really hard for a self-driving car to handle properly, because it’s not a physics modeling problem but a human behavior modeling problem.
What about cars interacting with other autonomous vehicles?
Rehfeld: The mixed environment is a challenge. If you had car-to-car communication and every single vehicle could communicate with every other vehicle, the problem would basically be solved. But that’s not the case in real life, and it probably won’t be for decades to come. So you have to enable your car to drive in a mixed environment, and that involves teaching the car.
Smiroldo: Think about what it takes for a person to gather that cultural knowledge. You have 18 years until you’re an adult. Imagine how much data it takes for a car to get there. But there’s one key difference compared to humans: once you have the data, you can roll it out to all the other cars.
OK, so you gather a lot of training data instead of hard-coding rules. How exactly does machine learning work?
Rehfeld: Let’s say we want to detect two pedestrians in an image. The first thing we need to do is have the data annotated by a human to give the computer the desired outcome: a human marks all the pedestrians in the image so that the computer knows what it needs to detect. Machine learning, in the end, is an optimization problem. Eventually we will arrive at unsupervised learning, where the computer doesn’t need human annotation anymore, but right now we are heavily dependent on good annotations.
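Stripped to its essentials, "machine learning as an optimization problem" can be sketched in a few lines of Python. This is a toy illustration, not Daimler’s pipeline: the feature values and labels are invented, and a real detector would train a deep network on full annotated images rather than a single logistic weight on one number.

```python
import math

# Supervised learning as optimization (toy numbers, invented data):
# human-annotated labels y, a single scalar feature x per example, and a
# logistic model fit by gradient descent on the cross-entropy loss.

# Annotated data: x is some image statistic; y = 1 means a human annotator
# marked the image crop as containing a pedestrian.
xs = [0.9, 1.1, 1.3, -0.8, -1.0, -1.2]
ys = [1, 1, 1, 0, 0, 0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b, lr = 0.0, 0.0, 0.5
for _ in range(200):                      # gradient-descent steps
    gw = gb = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)            # model's predicted probability
        gw += (p - y) * x                 # gradient of cross-entropy w.r.t. w
        gb += (p - y)                     # gradient w.r.t. b
    w -= lr * gw / len(xs)
    b -= lr * gb / len(xs)

predict = lambda x: sigmoid(w * x + b) > 0.5
```

The annotations supply the desired outcome; the loop adjusts the model’s parameters to minimize the mismatch between predictions and labels. That mismatch-minimization is the optimization Rehfeld refers to.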
Will humans have to change their behavior as more and more autonomous vehicles hit the road?
Smiroldo: As with any other emerging technology, when new things become possible, the culture shifts around them. There will be an evolution in the way pedestrians behave once they get used to autonomous vehicles. They’ll make their movements more, let’s say, obvious, just like we talk differently to voice assistants. That makes it easier for the car. That being said, cars can also become more obvious about their intentions. We can build external user interfaces with visual cues, acoustic cues, lights. One of the nice things is that a vehicle doesn’t have an ego. It doesn’t have that angry New Yorker attitude, insisting that it’s “my turn.”
What about the experience for people inside the vehicle? What do you show them to build trust?
Rehfeld: We need to think about building a holistic system where the AI that interacts with the customer is connected to the AI driving the car. You don’t want the customer asking: “Do you see that pedestrian in front of us?” and the system responding: “Well, I don’t know, because I’m not connected to the autonomous driving (AD) system.” We need to expose the answers from the core of the AD system to build trust, so that humans feel comfortable letting go and letting the car drive. People who grow up around and inside autonomous vehicles will have faith in this technology by the time they are adults.
Rigel Smiroldo is a senior principal machine learning engineer in the data and artificial intelligence department of Mercedes-Benz Research & Development North America (MBRDNA) in Silicon Valley. The University of California, Berkeley graduate leads a small team of researchers to develop novel machine learning techniques for the automotive space. He joined MBRDNA in 2010 and has worked on various projects at the intersection of man and machine, including the new Mercedes-Benz User Experience (MBUX) in the A-Class.
Timo Rehfeld is a principal engineer in the sensor fusion team. His focus is on combining information from cameras, laser scanners and radars into a holistic representation of the environment that the algorithms inside autonomous vehicles can use. Before joining MBRDNA in 2015, he conducted research as a Daimler-internal PhD candidate, working on automotive applications of computer vision and machine learning. He is a co-author of the Cityscapes dataset, which has spurred research interest in computer vision for autonomous driving.