Surveillance systems are widely used to monitor public and private locations. Real-time detection of humans, vehicles, objects, attributes, and events requires analysis of 24×7 video streams. Surveillance is still primarily manual: security personnel watch live CCTV footage in a control room. As a result, the reliability and efficacy of such systems depend on the capabilities of the security staff. Artificial intelligence-based surveillance systems developed in the last few years address these limitations. They can process thousands of camera feeds and detect events in real time. Current approaches detect simple events such as a person wearing a mask, intrusion, collapse, and loitering. The real challenge, however, lies in handling complex scenarios such as human actions, anomalies, violence, attributes, and crowds. These remain open research problems, and Vehant’s ongoing research focuses on them.
Pedestrian attributes are human-searchable semantic descriptions that can be used as soft biometrics in visual surveillance, with applications such as pedestrian retrieval, person re-identification, and pedestrian tracking. Given an image of a person, Pedestrian Attribute Recognition (PAR) aims to predict a set of attributes that describe the person from a predefined attribute list, for example gender, clothing type, color, accessories, and pose. Many algorithms have been proposed in the last few years, but challenges such as occlusion, shadow, blur, multiple viewpoints, illumination changes, and low resolution persist and degrade performance. The research at Vehant focuses on handling these challenges and on designing techniques that exploit the relationships among attributes and recognize them jointly. This will also help with re-identification and reduce identity switches in pedestrian tracking.
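As a minimal sketch of the PAR task described above, the attribute list can be treated as a multi-label problem: a model emits one score per attribute, and each score is thresholded independently. The attribute names, the score values, and the 0.5 threshold are illustrative assumptions rather than Vehant's method, and the snippet deliberately ignores the relationships among attributes that the research aims to exploit.

```python
import math

# Hypothetical predefined attribute list (illustrative only).
ATTRIBUTES = ["male", "long_sleeves", "backpack", "hat", "facing_front"]

def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def decode_attributes(logits, threshold=0.5):
    """Return the attributes whose independent probability exceeds the threshold."""
    if len(logits) != len(ATTRIBUTES):
        raise ValueError("expected one logit per attribute")
    return [name for name, z in zip(ATTRIBUTES, logits) if sigmoid(z) > threshold]

# Made-up logits standing in for the output of a trained network.
example_logits = [2.1, -0.7, 1.3, -3.0, 0.2]
predicted = decode_attributes(example_logits)
```

Because each attribute is decided on its own here, a joint approach would replace the independent thresholding step with a model that scores attribute combinations together.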
Crowd detection and analysis have a broad range of applications in video surveillance. Today, crowd detection is no longer limited to counting people. It also includes analyzing complex crowd scenarios such as long queues, sudden crowd movement, crowd cluster detection, crowd motion analysis and tracking, and crowd density map generation. Our research focuses on developing solutions that go beyond counting and address the existing challenges.
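To illustrate what a crowd density map is, the sketch below uses a common construction: each annotated head position contributes a small Gaussian blob whose total mass is 1, so summing the map recovers the people count. The head coordinates, grid size, and `sigma` value are hypothetical, and this is a generic illustration rather than the approach used at Vehant.

```python
import math

def density_map(points, height, width, sigma=1.5):
    """Build a density map in which each head contributes a total mass of 1.

    Assumes every point lies inside the grid, so its kernel mass is nonzero.
    """
    grid = [[0.0] * width for _ in range(height)]
    for (py, px) in points:
        # Unnormalized Gaussian weights centered on the head position.
        weights = [[math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2))
                    for x in range(width)] for y in range(height)]
        total = sum(map(sum, weights))
        # Normalize over the image so truncation at the border loses no mass.
        for y in range(height):
            for x in range(width):
                grid[y][x] += weights[y][x] / total
    return grid

def estimated_count(grid):
    """The crowd count is the sum of the density map."""
    return sum(map(sum, grid))

heads = [(2, 3), (2, 4), (10, 12)]  # hypothetical (row, col) head annotations
dmap = density_map(heads, height=16, width=16)
count = estimated_count(dmap)
```

Beyond the total count, the same map shows where people concentrate, which is what makes it useful for cluster and queue analysis.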
Human Activity Recognition (HAR) refers to detecting a person's activity based on motion and on human-to-human, human-to-object, and human-to-environment interactions. Over the past few years, extensive research has been carried out using sensor-based or vision-based features. Because of challenges such as occlusion, background clutter, and long-term temporal dependencies, an efficient solution is still far off. Our research will focus on designing solutions that use visual temporal features for AI-based surveillance systems.
Surveillance videos capture a variety of realistic anomalies such as running, crowding, person collapse, wrong-direction movement, and violence involving people or objects. Detecting these anomalies typically requires a problem-specific algorithm for each one. The research focuses on designing a view-independent, general anomaly detection algorithm that exploits short- and long-term trajectories from normal videos and models motion patterns or instances using unsupervised or weakly supervised approaches.
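The idea of learning only from normal videos can be sketched with a deliberately simple stand-in: summarize the speed of normal trajectories, then flag any new trajectory whose speed deviates strongly from that summary. The trajectories, the speed-only feature, and the z-score threshold are all assumptions made for illustration. This is not the view-independent algorithm described above, which would need richer motion features.

```python
import math
import statistics

def mean_speed(trajectory):
    """Average per-frame displacement of a trajectory given as (x, y) points."""
    steps = [math.dist(a, b) for a, b in zip(trajectory, trajectory[1:])]
    return statistics.fmean(steps)

def fit_normal_model(normal_trajectories):
    """Summarize normal motion by the mean and standard deviation of speed."""
    speeds = [mean_speed(t) for t in normal_trajectories]
    return statistics.fmean(speeds), statistics.stdev(speeds)

def is_anomalous(trajectory, model, z_threshold=3.0):
    """Flag a trajectory whose speed deviates strongly from the normal model.

    Assumes the normal speeds are not all identical (nonzero deviation).
    """
    mu, sigma = model
    return abs(mean_speed(trajectory) - mu) / sigma > z_threshold

# Hypothetical normal trajectories: roughly 1 to 1.2 units per frame.
normal = [
    [(0, 0), (1, 0), (2, 0), (3, 0)],
    [(0, 0), (0, 1.1), (0, 2.2), (0, 3.3)],
    [(5, 5), (6, 5), (7.2, 5), (8.4, 5)],
    [(2, 2), (3, 2), (4.1, 2), (5, 2)],
]
model = fit_normal_model(normal)
walking = [(0, 0), (1.05, 0), (2.1, 0)]
running = [(0, 0), (4, 0), (8, 0)]
```

Here `is_anomalous(running, model)` flags the fast trajectory while `is_anomalous(walking, model)` does not. No anomalous examples are needed for fitting, which is the appeal of the unsupervised setting.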