
The detection pipeline has three major steps:

1. First, each input frame is converted from RGB to HSV color space to better separate cars from the background. HOG features and color histograms are then extracted from the image and concatenated into a single feature vector. Since the magnitudes of the different features can vary widely, the combined vector is normalized.

2. A linear SVM classifier is trained on car and non-car image data.

3. For each video frame, I run a sliding-window search to identify cars. Because this produces duplicate detections and false positives, I build a heat map to draw tighter bounding boxes and reject false positives. I also use a queue to store the detections from the previous 10 frames and average them to smooth the results.

STEP 1. Feature Extraction
STEP 2. Train Classifier
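
The training step can be sketched as follows. Again this is an assumed implementation: stacking car and non-car feature vectors, fitting a scaler, holding out a test split, and training scikit-learn's `LinearSVC`. The 80/20 split and variable names are my choices, not necessarily the original ones.

```python
# Sketch of Step 2: scale the combined feature vectors, train a linear SVM.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

def train_classifier(car_features, notcar_features):
    """Fit a per-feature scaler and a linear SVM on car / non-car vectors."""
    X = np.vstack([car_features, notcar_features]).astype(np.float64)
    y = np.hstack([np.ones(len(car_features)),
                   np.zeros(len(notcar_features))])

    # Normalize each feature dimension to zero mean and unit variance,
    # since HOG and histogram magnitudes differ widely.
    scaler = StandardScaler().fit(X)
    X_scaled = scaler.transform(X)

    X_train, X_test, y_train, y_test = train_test_split(
        X_scaled, y, test_size=0.2, random_state=0)

    clf = LinearSVC()
    clf.fit(X_train, y_train)
    return clf, scaler, clf.score(X_test, y_test)
```

The scaler must be returned along with the classifier because every window evaluated at detection time has to be transformed with the same statistics.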
STEP 3. Sliding window search and heatmap
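
The heat-map and frame-averaging logic described in step 3 can be sketched as below. This is a simplified stand-in: the threshold value, the module-level `deque`, and the use of `scipy.ndimage.label` for connected components are assumptions, not the original code.

```python
# Sketch of Step 3: accumulate window detections into a heat map, average
# over the last 10 frames, threshold, and label connected regions as cars.
from collections import deque
import numpy as np
from scipy.ndimage import label

HISTORY = deque(maxlen=10)   # heat maps from the previous 10 frames

def heatmap_boxes(frame_shape, hot_windows, threshold=2):
    """Turn raw sliding-window hits into one tight box per detected car."""
    heat = np.zeros(frame_shape[:2], dtype=np.float64)
    for (x1, y1), (x2, y2) in hot_windows:   # each positive window adds +1
        heat[y1:y2, x1:x2] += 1.0

    HISTORY.append(heat)
    avg_heat = np.mean(HISTORY, axis=0)      # smooth over recent frames
    avg_heat[avg_heat < threshold] = 0       # reject sparse false positives

    labels, n_cars = label(avg_heat)         # connected components = cars
    boxes = []
    for car in range(1, n_cars + 1):
        ys, xs = np.nonzero(labels == car)
        boxes.append(((xs.min(), ys.min()), (xs.max(), ys.max())))
    return boxes
```

Averaging the heat maps rather than the boxes themselves has the side effect that a car must be detected in several consecutive frames before it clears the threshold, which is what suppresses one-frame false positives.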
More Thoughts:

The HOG+SVM approach does a good job of vehicle detection, but a state-of-the-art deep network (e.g. YOLO or SSD) can do much better today and achieve real-time performance on a GPU. Below is the output when I apply a Single Shot MultiBox Detector (SSD) to the same project video.
