A Brief Introduction to Smart Video Surveillance Technology
One of the core technologies of smart video surveillance is the automatic tracking of specific objects. Target tracking can be divided into five steps: motion detection, object classification, object tracking (initially at the class level), behavior analysis, and finally individual object tracking. For example, tracking a person involves the following steps: first, detecting moving objects in the real-time image sequence (i.e. the video); second, distinguishing people from other moving objects; third, tracking each person's motion trajectory; fourth, analyzing behavior to single out anyone acting abnormally, such as a person who abandons baggage or parcels at a depot or airport; and fifth, continuously tracking that individual.
Motion detection extracts the moving regions from the background in an image sequence. Effective segmentation of the moving regions greatly reduces the computation required in the stages that follow. However, disturbances such as shadows, lighting changes, slow movement, and objects that become stationary make accurate motion detection extremely difficult.
Basic motion detection can be implemented in two ways. The first directly uses an intermediate result of the video compression algorithm (the motion vectors generated during MPEG-4 or H.264 encoding), so that motion detection and video compression are performed simultaneously. The second detects motion independently of the encoding process.
Motion detection algorithms can be categorized in various ways. For example, the Institute of Automation, Chinese Academy of Sciences, groups them into three classes: background subtraction, optical flow, and temporal difference.
Background subtraction and temporal difference can both be regarded as image-difference methods. Background subtraction, one of the most frequently used methods for motion segmentation, detects moving regions from the difference between the current image and a background image. Temporal difference extracts moving regions by taking the pixel-wise difference between two or three neighboring frames of a continuous image sequence and thresholding the result.
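The two image-difference methods can be illustrated with a minimal NumPy sketch. The function names, the 8x8 synthetic frames, and the threshold of 25 are assumptions chosen for illustration; a real system would also need to build and maintain the background image, which is not shown here.

```python
import numpy as np

def difference_mask(image_a, image_b, threshold=25):
    """Return a boolean mask of pixels whose absolute difference exceeds threshold."""
    # Widen to a signed type first so that uint8 subtraction cannot wrap around.
    diff = np.abs(image_a.astype(np.int16) - image_b.astype(np.int16))
    return diff > threshold

def background_subtraction(frame, background, threshold=25):
    """Compare the current frame against a fixed background image."""
    return difference_mask(frame, background, threshold)

def temporal_difference(prev_frame, frame, threshold=25):
    """Compare the current frame against the neighboring (previous) frame."""
    return difference_mask(frame, prev_frame, threshold)

# Synthetic grayscale scene (assumed data): a dark background and a bright
# 2x2 object that moves one pixel to the right between two frames.
background = np.full((8, 8), 10, dtype=np.uint8)
frame1 = background.copy()
frame1[2:4, 2:4] = 200  # object occupies columns 2-3
frame2 = background.copy()
frame2[2:4, 3:5] = 200  # object occupies columns 3-4

bg_mask = background_subtraction(frame2, background)
td_mask = temporal_difference(frame1, frame2)
```

In this toy scene, background subtraction marks the whole object, while temporal difference marks only its trailing and leading edges, because the column where the object overlaps itself does not change between the two frames.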
Motion detection based on optical flow uses the optical flow properties of moving objects, which vary over time. It calculates the displacement vectors of the optical flow field and uses them to initialize a contour-based tracking algorithm, so that moving objects can be extracted and tracked effectively. The advantage of this method is that independently moving objects can be detected even while the camera itself is moving.
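As a rough illustration of how a displacement vector can be computed from two frames, the sketch below applies a Lucas-Kanade style least-squares estimate over a single window. The article does not say which optical flow algorithm is meant, so the choice of Lucas-Kanade, the function names, and the synthetic Gaussian blob are all assumptions; the contour-based tracking step is not shown.

```python
import numpy as np

def lucas_kanade_global(frame_a, frame_b):
    """Estimate one (u, v) displacement for the whole window by least squares.

    Uses the brightness-constancy constraint Ix*u + Iy*v + It = 0, where
    u is the displacement along columns (x) and v along rows (y).
    """
    a = frame_a.astype(np.float64)
    b = frame_b.astype(np.float64)
    # Spatial gradients of the averaged frame; np.gradient returns the
    # derivative along rows first, then along columns.
    iy, ix = np.gradient(0.5 * (a + b))
    it = b - a  # temporal derivative
    normal_matrix = np.array([
        [np.sum(ix * ix), np.sum(ix * iy)],
        [np.sum(ix * iy), np.sum(iy * iy)],
    ])
    rhs = -np.array([np.sum(ix * it), np.sum(iy * it)])
    u, v = np.linalg.solve(normal_matrix, rhs)
    return u, v

def gaussian_blob(cx, cy, size=41, sigma=5.0):
    """Synthetic smooth 'object' centered at (cx, cy); assumed test data."""
    ys, xs = np.mgrid[0:size, 0:size]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

frame_a = gaussian_blob(20.0, 20.0)
frame_b = gaussian_blob(20.5, 20.3)  # blob shifted by (0.5, 0.3) pixels
u, v = lucas_kanade_global(frame_a, frame_b)
```

For this smooth, sub-pixel shift the estimate (u, v) should land close to (0.5, 0.3). Larger displacements generally need a coarse-to-fine scheme, and a dense flow field would repeat the estimate over many small windows.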
The goal of object categorization is to extract, from the moving regions that have been detected, those belonging to objects of a specific class. Depending on the information used, categorization methods fall into two groups: those based on motion properties and those based on shape properties. Motion-based categorization identifies an object by the periodicity of its motion, a cue that is not susceptible to errors caused by changes in color and lighting. Shape-based categorization matches the shape features of the moving regions against templates or statistical data.
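One simple way to test a motion signal for periodicity is to look for a strong peak in its autocorrelation. The sketch below is only an illustration under assumptions: the input is a hypothetical per-frame measurement (for example, the width of a person's bounding box while walking), and the function name, minimum lag, and threshold of 0.5 are not taken from the article.

```python
import numpy as np

def dominant_period(signal, min_lag=2, threshold=0.5):
    """Return the lag (in frames) of the strongest autocorrelation peak,
    or None if no peak exceeds the threshold (signal treated as non-periodic)."""
    x = np.asarray(signal, dtype=np.float64)
    x = x - x.mean()
    energy = np.dot(x, x)
    if energy == 0.0:
        return None  # constant signal carries no motion information
    n = len(x)
    # Keep non-negative lags only and normalize so that acf[0] == 1.
    acf = np.correlate(x, x, mode="full")[n - 1:] / energy
    best_lag, best_val = None, threshold
    for lag in range(min_lag, n // 2):
        is_peak = acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]
        if is_peak and acf[lag] > best_val:
            best_lag, best_val = lag, acf[lag]
    return best_lag

frames = np.arange(200)
walking_like = np.sin(2 * np.pi * frames / 20)  # repeats every 20 frames
drifting = frames.astype(np.float64)            # steady drift, no repetition

period_walking = dominant_period(walking_like)
period_drifting = dominant_period(drifting)
```

Here the oscillating signal yields a period of 20 frames, while the steadily drifting one yields None. Because the test looks only at how the measurement repeats over time, it does not depend on the object's color or brightness.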
Object tracking establishes the location of the target across successive image frames by matching features such as position, velocity, shape, texture, and color. Tracking techniques can be categorized as model-based, region-based, contour-based, and feature-based methods.
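To make frame-to-frame matching concrete, the sketch below uses only two of the features listed above, position and velocity: each track predicts its next position with a constant-velocity assumption and is greedily matched to the nearest detected centroid. The track dictionary layout, the function names, and the 20-pixel gate are assumptions for illustration, not a method described in the article.

```python
import math

def predict(track):
    """Constant-velocity prediction of the track's next position."""
    (x, y), (vx, vy) = track["pos"], track["vel"]
    return (x + vx, y + vy)

def update_tracks(tracks, detections, max_dist=20.0):
    """Greedily assign each track the nearest unused detection.

    Tracks are updated in place. Returns the detections that matched no
    track, which a fuller system might use to start new tracks.
    """
    unused = list(detections)
    for track in tracks:
        if not unused:
            break
        px, py = predict(track)
        nearest = min(unused, key=lambda d: math.hypot(d[0] - px, d[1] - py))
        if math.hypot(nearest[0] - px, nearest[1] - py) <= max_dist:
            x, y = track["pos"]
            track["vel"] = (nearest[0] - x, nearest[1] - y)
            track["pos"] = nearest
            unused.remove(nearest)
    return unused

# Two objects moving toward each other along the x axis (assumed data).
tracks = [
    {"pos": (0, 0), "vel": (5, 0)},
    {"pos": (100, 0), "vel": (-5, 0)},
]
detections = [(94, 1), (6, 0)]  # centroids found in the next frame
unmatched = update_tracks(tracks, detections)
```

After the update, the first track sits at (6, 0) and the second at (94, 1), with no detections left over. Greedy matching is order-dependent, so crowded scenes usually call for a global assignment and for additional features such as shape or color.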
Joint tracking and categorization is an emerging research field within information fusion. The basic idea is to exchange information in both directions between the object tracker and the object categorizer, so that tracking precision and categorization performance improve together.
In special cases, tracking must be refined from classes of objects to individual objects. This requires analyzing and understanding object behavior. The key issues in behavior understanding are how to obtain reference behavior sequences from learning samples, and how to handle, during matching, the slight variations in spatial and temporal scale between the features of similar motion patterns.
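One common way to tolerate differences in temporal scale when matching an observed sequence against a reference is dynamic time warping (DTW). The article does not name a matching technique, so DTW is an assumption here, as are the function name and the one-dimensional feature sequences used as examples.

```python
def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two 1-D feature sequences.

    Elements may be repeated along either sequence, so the same pattern
    performed faster or slower can still align at low cost.
    """
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] is the best alignment cost of seq_a[:i] with seq_b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # seq_b element repeated
                cost[i][j - 1],      # seq_a element repeated
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]

reference = [0, 1, 2, 3, 2, 1, 0]         # hypothetical reference behavior
slower = [0, 0, 1, 1, 2, 3, 3, 2, 1, 1, 0]  # same pattern, stretched in time
different = [3, 3, 3, 0, 0, 0, 3]         # unrelated pattern

dist_slower = dtw_distance(reference, slower)
dist_different = dtw_distance(reference, different)
```

The stretched sequence aligns with the reference at zero cost, while the unrelated one does not, which is the kind of tolerance to temporal variation that the matching step needs. Variation in spatial scale would have to be handled separately, for example by normalizing the features beforehand.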