Tesla AI Day Preview


Tesla AI Day is coming. This summer we finally get to see what this company has been up to with its artificial intelligence division. And it’s safe to say they have been up to a lot. Elon has been hinting for a while now that we need to think of Tesla not just as a car company and an energy company, but as an AI company as well. And not just any AI company, but one of the biggest in the world. Elon has conceded that Tesla might not be at the top of the game (he says that Google is still way ahead of them), but it’s probably very fair to say that Tesla is way ahead of most others. Especially other car makers, that’s for sure. Today we’re setting up a preview of what Tesla will be showing us at their AI Day this summer.


So, we were very fortunate recently to get a lengthy presentation from Tesla’s head of artificial intelligence, Andrej Karpathy, at the 2021 Conference on Computer Vision and Pattern Recognition. During his presentation, Andrej provided some great insights into Tesla’s AI projects, like their pure-vision approach, but he stopped short of revealing any new information that we weren’t already at least somewhat aware of. For example, he did mention Tesla’s AI training supercomputer, Dojo, but he chose not to get into any details about it. I’m guessing that AI Day is going to be all about the things that Karpathy didn’t want to get into during the conference.


Who is Andrej Karpathy anyways? He doesn’t get nearly as much publicity as Elon Musk, but he’s just as important to the company, so I think he deserves a bit of backstory here. Andrej is only 34 years old, born in Slovakia, and immigrated to Canada as a teenager. He scored two university degrees up here in the North before moving down to California to complete his PhD at Stanford, focusing on the intersection of natural language processing and computer vision. In 2016, Andrej joined the research group OpenAI, which was co-founded by Elon Musk - which is, I assume, where Elon found him and brought Andrej on to head up Tesla’s artificial intelligence program in 2017. It’s good to know who Andrej is, because when AI Day rolls around, I think we’re going to be seeing and hearing a lot of him. I wouldn’t be surprised if Andrej does more talking than Elon.


The problem that Andrej Karpathy and his team are trying to solve is a pretty big one, but I like the way that he describes it. Basically, 99 percent of cars on the road today are being driven by meat computers. Tesla is aiming to replace as many of those with silicon as possible.


And the way we get to self-driving cars is through computer vision. Karpathy says that the only effective and scalable way to reach an autonomous vehicle future is through a vision-based system. No other technology is going to be able to get us there on a global scale. So far, the main competition to Tesla’s vision system has been lidar. Other autonomous vehicle companies, like Waymo, use humongous lidar arrays on the top of their vehicles. The biggest problem with lidar is that it can’t handle the job on its own. Waymo’s lidar needs to be paired with high-definition mapping of the city streets that the vehicles operate on. The obvious downside there is that the mapping needs to happen before autonomous driving is possible. They’re basically geofenced. It’s almost like taking a train: it can only go places where the track has already been laid down.


So far, Tesla has been able to avoid mapping by using a combination of radar and cameras in its self-driving features. Up until recently, it was the radar sensor’s job to judge the depth, velocity, and acceleration of objects on the road, while the cameras were in charge of identifying those objects and deciding whether they’re a car or a person or a horse or whatever.


Over time, what Tesla figured out was that the vision sensors were actually able to do the majority of the heavy lifting and were leaving the radar sensors in the dust in terms of signal resolution. For their vision system, Tesla uses 8 digital cameras that surround the vehicle and provide a 360-degree field of view. Each camera records at a resolution of 1280 by 960 pixels and a rate of 36 frames per second. Andrej says that the data collection from those cameras is so robust that the other sensors on board were just becoming crutches. So they’ve decided to delete radar entirely from all newly built Model 3 and Model Y vehicles.
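To put those camera numbers in perspective, here’s a quick back-of-the-envelope calculation. The bytes-per-pixel figure is my assumption for a rough estimate, not a Tesla spec:

```python
# Rough estimate of the raw (uncompressed) pixel throughput of an
# 8-camera setup, using the numbers Karpathy quoted.
cameras = 8
width, height = 1280, 960
fps = 36
bytes_per_pixel = 1  # assumption, for order-of-magnitude purposes only

pixels_per_second = cameras * width * height * fps
gb_per_second = pixels_per_second * bytes_per_pixel / 1e9

print(f"{pixels_per_second:,} pixels/s")          # 353,894,400 pixels/s
print(f"~{gb_per_second:.2f} GB/s uncompressed")  # ~0.35 GB/s
```

Over a third of a gigabyte of raw imagery every second, per car - which is why the cars can’t simply stream everything home.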


Karpathy gives a few reasons for the decision to drop radar from the system. Chief among them was that vision had reached a point where it was so good that radar was actually starting to hold the system back. When the car hits a situation where radar and vision disagree, which one does it believe?


Tesla did a whole lot of testing where they compared distance readings between radar and vision. They did some very scientific experiments with graphs and lines and the whole bit. What they learned was that radar is super accurate for judging speed and distance as long as it has a solid lock on the subject it is tracking. But as soon as there is a sudden change in the target, like a hard braking event, radar actually tends to get disrupted, lose the target completely, and then suddenly reacquire it. So you end up with a very jagged line on the graph. At the same time, they found that the vision-based sensors were able to track the speed and distance of the target vehicle just as well as radar under normal circumstances, and during the hard braking events, vision actually tracked the subject through the entire emergency brake and produced a much smoother line on the graph.


Basically, what Tesla found was that more sensors does not always equal more signal; radar was actually contributing more noise than signal, and on the whole, vision was more reliable. At that point it becomes a question of resources: why bother spending time and money on radar when vision is already better? So they removed the radar sensors completely from the Model 3 and Model Y. And so far it’s worked out just fine. Karpathy says that 1.7 million miles of radar-free Autopilot had been driven at the time of his presentation, with no accidents to report. On average, Tesla has 1 crash for every 4.2 million miles driven on Autopilot. Of course, you never get a news story when things work.
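As a quick sanity check on those two figures, here is the arithmetic: at the fleet-average rate of one crash per 4.2 million Autopilot miles, how many crashes would we expect in 1.7 million radar-free miles?

```python
# Expected crash count in the radar-free sample, assuming the
# fleet-average Autopilot crash rate Karpathy quoted.
no_radar_miles = 1.7e6   # miles driven without radar at presentation time
miles_per_crash = 4.2e6  # fleet average: 1 crash per 4.2 million miles

expected_crashes = no_radar_miles / miles_per_crash
print(f"~{expected_crashes:.2f} expected crashes")  # ~0.40
```

In other words, zero crashes so far is encouraging, but the sample is still small enough that we’d expect fewer than one crash either way; it will take a lot more miles to show the radar-free system is actually safer.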


Karpathy says that vision is the only solution for an autonomous vehicle that is scalable, meaning it can actually be rolled out to the general public, to millions of cars around the world. Any other system that still relies on HD mapping can never succeed on a global scale; it’s just too much work to keep the maps up to date. Solving vision isn’t exactly easy either - it takes a lot more brainpower than just driving a mapping truck around town. But with a vision-based system, a Tesla can self-drive on any road in any place. And that’s something that only Tesla will be able to do.


And there are two things that make this all possible. One is the neural network training supercomputer located at Tesla’s headquarters; Karpathy says it is currently the fifth largest supercomputer in the entire world. And two is the Tesla fleet: over 1 million cars around the world, all contributing the data that feeds into this massive computer system.


Basically, what the computer is doing with all of this data is labelling it. For every frame of video, it’s finding objects, assigning them a label, and storing that information in the brain of the self-driving computer. You can think of the car a bit like a young child, like a toddler: they can see everything that’s around them, but they don’t know what they’re looking at until you teach them. And obviously we need some humans to do that teaching, but once human labelling reaches a certain level of detail, the teaching computer can take over and start auto-labelling on its own. That machine learning is essential, because we’re talking about millions of video clips; Tesla would have to hire tens of thousands of people to keep up with manual labelling for every frame of video. And Karpathy said that Tesla’s AI division only employs about 20 people. They run very lean.
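The human-seeded auto-labelling pattern described above looks roughly like this. To be clear, this is a generic bootstrapping sketch, not Tesla’s actual pipeline, and every function name here is hypothetical:

```python
# Generic human-seeded auto-labelling loop: humans label a seed set, a
# model is trained on it, and the model then labels new frames on its
# own, falling back to humans when it isn't confident enough.
def auto_label_loop(frames, human_label, train, predict, threshold=0.9):
    # Humans label the first chunk by hand (the "teaching" phase).
    labelled = [(f, human_label(f)) for f in frames[:100]]
    model = train(labelled)
    for frame in frames[100:]:
        label, confidence = predict(model, frame)
        if confidence >= threshold:
            labelled.append((frame, label))               # machine auto-label
        else:
            labelled.append((frame, human_label(frame)))  # human fallback
    return labelled
```

The key point is the leverage: a small team’s manual labels train a model that then labels millions of frames, which is how roughly 20 people can keep up with a million-car fleet.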


Now, the network doesn’t process every second of video from every drive in every Tesla… that’s just too much. Karpathy says that they have established 221 triggers that will tell the vehicle to record 10 seconds of video and upload it to headquarters for processing. What Karpathy and the team are really looking for are edge cases. They need a large, clean dataset, but they also need to record as many fringe cases as possible: weird driving situations that don’t happen to everybody, every day. One of the main triggers is any time the vision system and the radar disagree, or any time the self-driving computer and the human driver disagree - obviously they’re interested in figuring out what happened there. But there are hundreds of other scenarios that they need more data on, like steep hills or dips, sharp corners, stop-and-go traffic, tunnels, cars with weird stuff on the roof like a canoe or something… Tesla needs all of this weird stuff to fully and truly create a vision-based system that can function with zero driver interventions.
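The trigger mechanism can be sketched as a rolling buffer plus a set of conditions. This is an illustrative toy, not Tesla’s firmware, and the trigger names are made up stand-ins for their actual 221:

```python
# Toy sketch of trigger-based clip collection: the car keeps a rolling
# buffer of recent frames, and when any trigger fires, the last 10
# seconds are snapshotted and queued for upload.
from collections import deque

FPS = 36          # camera frame rate from Karpathy's talk
CLIP_SECONDS = 10 # length of each uploaded clip

class ClipRecorder:
    def __init__(self):
        self.buffer = deque(maxlen=FPS * CLIP_SECONDS)  # rolling 10 s window
        self.uploads = []

    def on_frame(self, frame, signals):
        self.buffer.append(frame)
        if self.any_trigger(signals) and len(self.buffer) == self.buffer.maxlen:
            self.uploads.append(list(self.buffer))  # snapshot the clip

    @staticmethod
    def any_trigger(signals):
        # Hypothetical trigger names, for illustration only.
        return (
            signals.get("vision_radar_disagree", False)
            or signals.get("driver_intervention", False)
            or signals.get("hard_braking", False)
        )
```

The rolling buffer is the important trick: because the last 10 seconds are always in memory, the clip captures what led up to the weird event, not just what happened after the trigger fired.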


It’s an absolute ton of data that needs to be processed, though. And to really get over the hump, Tesla needs more than just the fifth most powerful supercomputer; they need the GOAT of supercomputers. And that’s Dojo, the thing that Andrej doesn’t want to talk about yet, but undoubtedly will be talking about when AI Day rolls around.


And the last point that we need to keep in mind is that Tesla is doing all of this itself; there are no third-party systems involved in its self-driving operations. They collect and own all of their own data. They write their own code. They develop and build their own hardware; they even make their own computer chips. And they do all this with fewer than two dozen very, very smart people. This is a world-class example of vertical integration and efficiency. And this is the only reasonable way that we are going to reach a future of safe, self-driving cars on a global scale.


Seth Hoffman

Seth is the Owner & Creative Director at Known Creative.

http://beknown.nyc
