Tesla has managed to attract the best artificial intelligence specialists to its Autopilot team, and they are committed to developing software that makes full self-driving possible. The company recently published two patents that relate to improvements in this area.
Tesla filed a patent application for ‘Machine learning models operating at different frequencies for autonomous vehicles’ on December 3, 2019, and it was published on June 4, 2020. This application relates generally to the machine vision field, and more specifically to enhanced object detection from a vehicle.
In the field of machine vision for autonomous vehicles, automotive image sensors (e.g., cameras) are typically capable of high frame rates of 30 frames per second (fps) or more. However, deep learning based image processing algorithms may be unable to keep up with the high camera frame rates without significantly reducing accuracy, range, or both.
Such algorithms may be run at 20 fps or less. This may result in a waste of the additional camera information available, which may thus be unused in image processing and object detection tasks.
Typically, slower machine learning models (e.g., object detectors), which run at frame rates lower than the cameras’ frame rates, may have high accuracy but long latencies, meaning that they take longer to produce an output. The output may therefore be stale by the time it is produced. For example, a slower machine learning model may take 200 milliseconds to detect an object in an image. In the 200 milliseconds it takes the model to output the detection, the object has likely moved.
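To put the 200-millisecond example in numbers, here is a minimal sketch of the arithmetic. The 30 fps camera rate comes from the text above; the 10 m/s relative speed is purely an illustrative assumption and does not appear in the patent.

```python
def frames_elapsed(latency_ms: float, camera_fps: float) -> float:
    """Camera frames captured while one inference of the given latency runs."""
    return latency_ms * camera_fps / 1000.0


def displacement_m(latency_ms: float, relative_speed_mps: float) -> float:
    """Distance an object moves relative to the camera during the latency."""
    return latency_ms * relative_speed_mps / 1000.0


if __name__ == "__main__":
    # 200 ms latency with a 30 fps camera (figures taken from the article).
    print(frames_elapsed(200, 30))
    # Assumed relative speed of 10 m/s, chosen only for illustration.
    print(displacement_m(200, 10.0))
```

Under these assumptions, six new camera frames arrive while the slow model is still working on one, and the object ends up about two metres from where the output says it is.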
To resolve this, a faster machine learning model may be employed. However, the faster machine learning model may be less accurate, and lower accuracy may result in a higher likelihood of false negatives and false positives. For automotive applications, for example, a false negative may represent vehicles in an image that the machine learning model fails to detect, while a false positive may represent a machine learning model predicting a vehicle in a location of the image when there is no vehicle present.
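The two error types can be illustrated with a small sketch. The grid-cell labels used as vehicle locations are hypothetical stand-ins for image regions, not anything defined in the patent.

```python
from typing import Set, Tuple


def detection_errors(
    ground_truth: Set[str], predicted: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (false_negatives, false_positives) for one image.

    Each element is a hypothetical label for an image location
    that contains (or is predicted to contain) a vehicle.
    """
    false_negatives = ground_truth - predicted  # real vehicles that were missed
    false_positives = predicted - ground_truth  # vehicles reported where none exist
    return false_negatives, false_positives


if __name__ == "__main__":
    actual = {"A1", "B3", "C2"}       # locations that really contain a vehicle
    model_output = {"A1", "C2", "D4"}  # locations the model flagged
    missed, phantom = detection_errors(actual, model_output)
    print("false negatives:", missed)
    print("false positives:", phantom)
```

Here the vehicle at `B3` goes undetected (a false negative), while the model reports a vehicle at `D4` where there is none (a false positive).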
Embodiments of the present invention relate to techniques for autonomous driving or navigation by a vehicle. As described, one or more image sensors (e.g., cameras) may be positioned about a vehicle. The image sensors may obtain images at one or more threshold frequencies, such as 30 frames per second, 60 frames per second, and so on.
FIG. 1 is a schematic representation of an example object detection system according to one embodiment.
Source: Tesla patent
The obtained images may depict a real-world setting in which the vehicle is located. As an example, the real-world setting may include other vehicles, pedestrians, road hazards and the like located proximate to the vehicle. The vehicle may therefore leverage the captured images to ensure that the vehicle is safely driven.
For example, the vehicle may generate alerts for viewing by a driver. In this example, an alert may indicate that a pedestrian is crossing a crosswalk. As another example, the vehicle may use the images to inform autonomous, or semi-autonomous, driving and/or navigation of the vehicle.
The machine learning models may be implemented via a system of one or more processors, application-specific integrated circuits (ASICs), and so on.
According to the present invention, an object detection method may include:
- receiving a first frame from a camera;
- processing the first frame with a first image processing engine;
- receiving a second frame from the camera while the first frame is being processed;
- sending the processed output of the first frame to a second image processing engine with a faster processing speed than the first image processing engine; and
- combining the processed output of the first frame with the second frame to generate an object detection result for the first frame.
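The steps above can be sketched as a small simulation. This is an illustration under stated assumptions, not Tesla's implementation: the one-dimensional "frames", the exhaustive-search `slow_engine`, the windowed-search `fast_engine`, and the latency of two frames are all hypothetical stand-ins chosen to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Frame = List[int]  # hypothetical 1-D "image": one intensity value per pixel


@dataclass
class SlowOutput:
    frame_index: int  # frame the slow engine analysed
    position: int     # pixel where it found the object


@dataclass
class CombinedResult:
    slow_frame_index: int  # frame behind the slow engine's output
    fast_frame_index: int  # newer frame the fast engine combined it with
    position: int


def make_frame(position: int, width: int) -> Frame:
    """Build a frame containing a single bright pixel (the 'object')."""
    frame = [0] * width
    frame[position] = 255
    return frame


def slow_engine(frame_index: int, frame: Frame) -> SlowOutput:
    """Stand-in for the accurate engine: searches the whole frame."""
    position = max(range(len(frame)), key=lambda i: frame[i])
    return SlowOutput(frame_index, position)


def fast_engine(prior: SlowOutput, frame_index: int, frame: Frame,
                window: int = 3) -> CombinedResult:
    """Stand-in for the faster engine: searches only near the slow output."""
    lo = max(0, prior.position - window)
    hi = min(len(frame), prior.position + window + 1)
    position = max(range(lo, hi), key=lambda i: frame[i])
    return CombinedResult(prior.frame_index, frame_index, position)


def run_pipeline(frames: List[Frame],
                 slow_latency_frames: int = 2) -> List[CombinedResult]:
    """Simulate the slow engine finishing `slow_latency_frames` frames late."""
    results: List[CombinedResult] = []
    pending: Optional[Tuple[int, SlowOutput]] = None  # (ready_at, output)
    latest: Optional[SlowOutput] = None
    for index, frame in enumerate(frames):
        # The slow output becomes available once its simulated latency passes.
        if pending is not None and index >= pending[0]:
            latest, pending = pending[1], None
        # Combine the most recent slow output with the frame that just arrived.
        if latest is not None:
            results.append(fast_engine(latest, index, frame))
        # Start the slow engine on the current frame if it is idle.
        if pending is None:
            pending = (index + slow_latency_frames, slow_engine(index, frame))
    return results


if __name__ == "__main__":
    # The object moves one pixel per frame, starting at pixel 2.
    frames = [make_frame(k + 2, 12) for k in range(6)]
    for result in run_pipeline(frames):
        print(result)
```

In this toy setup the slow engine only produces an output every second frame, yet a result is emitted for every frame once the first slow output is ready. The sketch also shows the limitation of such a design: if the object moves outside the fast engine's search window before the next slow output arrives, the combined result is wrong.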
FIG. 2 is a flowchart of an example process for object detection according to one embodiment.
Source: Tesla patent
The method provides an image processing system that combines multiple image processing engines to deliver object detection outputs at a frame rate and accuracy much higher than either image processing engine would achieve on its own.
Systems and methods include machine learning models operating at different frequencies. The method disclosed in the patent includes obtaining images at a threshold frequency from one or more image sensors positioned about the vehicle. Location information associated with objects classified in the images is determined based on the images. The images are analyzed by a first machine learning model at the threshold frequency.