Tesla Published A Patent ‘Estimating Object Properties Using Image Data’

Autonomous driving systems typically rely on numerous mounted sensors, usually a mix of vision sensors and emitting distance sensors such as radar, lidar, and ultrasonic sensors. The data captured by each sensor is then gathered to help understand the vehicle’s surrounding environment and to determine how to control the vehicle. But as the number and types of sensors increase, so do the complexity and cost of the system.

For example, emitting distance sensors such as lidar are often costly to include in a mass market vehicle. Moreover, each additional sensor increases the input bandwidth requirements (the volume of sensor data that must be ingested and processed per unit of time) for the autonomous driving system.

Therefore, there exists a need to find the optimal configuration of sensors on a vehicle. The configuration should limit the total number of sensors without limiting the amount and type of data captured to accurately describe the surrounding environment and safely control the vehicle.

Tesla has published a patent, ‘Estimating Object Properties Using Image Data’, which discloses a machine learning training technique for generating highly accurate machine learning results from vision data.


FIG. 1 is a block diagram illustrating an embodiment of a deep learning system for autonomous driving.
Source: Tesla patent

Auxiliary sensor data, such as radar and lidar results, is associated with objects identified from the vision data to accurately estimate object properties such as object distance.

The collection and association of auxiliary data with vision data is done automatically and requires little, if any, human intervention. For example, objects identified using vision techniques do not need to be manually labeled, significantly improving the efficiency of machine learning training. Instead, the training data can be automatically generated and used to train a machine learning model to predict object properties with a high degree of accuracy.
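The automatic labeling idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed in the patent: the `VisionObject` and `RadarDetection` types, the `auto_label` function, and the assumption that radar returns have already been projected to a horizontal pixel position are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class VisionObject:
    # Horizontal extent of a detected object's bounding box, in pixels
    # (hypothetical format, chosen for illustration).
    x_min: float
    x_max: float


@dataclass
class RadarDetection:
    # Horizontal pixel position of the radar return after projection into
    # the camera image. The projection step is assumed to happen elsewhere.
    image_x: float
    distance_m: float


def auto_label(
    objects: List[VisionObject],
    detections: List[RadarDetection],
) -> List[Tuple[VisionObject, Optional[float]]]:
    """Attach a radar distance to each vision object without manual labeling.

    A radar return is a candidate for an object if it projects inside the
    object's bounding box. Among candidates, the one closest to the box
    center is used. Objects with no candidate get None.
    """
    labeled = []
    for obj in objects:
        inside = [d for d in detections if obj.x_min <= d.image_x <= obj.x_max]
        if not inside:
            labeled.append((obj, None))
            continue
        center = (obj.x_min + obj.x_max) / 2
        best = min(inside, key=lambda d: abs(d.image_x - center))
        labeled.append((obj, best.distance_m))
    return labeled


objects = [VisionObject(100, 200), VisionObject(300, 400)]
detections = [
    RadarDetection(140, 25.0),
    RadarDetection(195, 40.0),
    RadarDetection(500, 10.0),
]
labels = auto_label(objects, detections)
```

Each resulting pair of vision object and distance could then serve as one training example, with the radar distance acting as the ground-truth label.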

For example, the data may be collected automatically from a fleet of vehicles by capturing snapshots of the vision data together with related data, such as radar data. The fusion data collected from the fleet is then used to train neural nets to mimic the captured data.
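To make "training a model to mimic the captured data" concrete, here is a deliberately tiny stand-in. The patent describes neural networks; this sketch instead fits a single parameter by least squares, under the simplifying assumption that an object's apparent height in pixels is inversely proportional to its distance. The function names and the sample values are hypothetical.

```python
from typing import List, Tuple


def fit_inverse_height_model(samples: List[Tuple[float, float]]) -> float:
    """Fit distance = k / box_height_px by least squares through the origin.

    Each sample is (box_height_px, radar_distance_m), i.e. a vision feature
    paired with the distance an auxiliary sensor reported for that object.
    """
    xs = [1.0 / height for height, _ in samples]
    ys = [distance for _, distance in samples]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def predict_distance(k: float, box_height_px: float) -> float:
    """Estimate distance from vision data alone, using the fitted constant."""
    return k / box_height_px


# Hypothetical snapshots: taller boxes correspond to closer objects.
samples = [(100.0, 10.0), (50.0, 20.0), (25.0, 40.0)]
k = fit_inverse_height_model(samples)
estimate = predict_distance(k, 40.0)
```

Once `k` is fitted from the sensor-labeled snapshots, `predict_distance` needs only the image-derived box height, which mirrors the idea of a model that no longer needs the auxiliary sensor at inference time.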


A diagram illustrating an example of capturing auxiliary sensor data for training a machine learning network.
Source: Tesla patent

The trained machine learning model can be deployed to vehicles for accurately predicting object properties, such as distance, direction, and velocity, using only vision data.

For example, once the machine learning model has been trained to be able to determine an object distance using images of a camera without a need of a dedicated distance sensor, it may become no longer necessary to include a dedicated distance sensor in an autonomous driving vehicle …

When used in conjunction with a dedicated distance sensor, this machine learning model can be used as a redundant or a secondary distance data source to improve accuracy and/or provide fault tolerance.
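One way the redundancy described above might look in code is sketched below. This is an assumption-laden illustration, not a design from the patent: the `select_distance` function, its status strings, the 5 m disagreement threshold, and the choice to fall back to the shorter distance when the two sources disagree are all hypothetical.

```python
from typing import Optional, Tuple


def select_distance(
    sensor_m: Optional[float],
    vision_m: Optional[float],
    max_disagreement_m: float = 5.0,
) -> Tuple[Optional[float], str]:
    """Combine a dedicated sensor reading with a vision-based estimate.

    Returns (distance_in_meters, status). None means the source is missing.
    """
    if sensor_m is None and vision_m is None:
        return None, "unavailable"
    if sensor_m is None:
        # Fault tolerance: the vision model covers for a failed sensor.
        return vision_m, "vision_only"
    if vision_m is None:
        return sensor_m, "sensor_only"
    if abs(sensor_m - vision_m) > max_disagreement_m:
        # Conservative choice (an assumption): trust the closer reading.
        return min(sensor_m, vision_m), "disagreement"
    # Both sources agree within tolerance: average them.
    return (sensor_m + vision_m) / 2, "fused"


fallback = select_distance(None, 30.0)
fused = select_distance(20.0, 22.0)
conflict = select_distance(20.0, 40.0)
```

Averaging is the simplest possible fusion rule; a production system would more likely weight each source by its estimated uncertainty.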

The identified objects and corresponding properties can be used to implement autonomous driving features such as self-driving or driver-assisted operation of a vehicle.

Source: tesmanian