Stereo vision refers to obtaining a depth image of a scene from images captured by two cameras positioned in parallel. A depth image helps in extracting 3D information, which can be further used for tasks such as 3D visualization and 3D object detection. Sensing 3D information eases visualization and mitigates the occlusion and overlap problems among different objects that are often encountered in 2D images. Vehant's NuvoScan is an automated under-vehicle scanning system (UVSS) based on stereo vision. It simultaneously captures the left and right views of the underside of a vehicle in order to identify any possible threats hidden underneath. Furthermore, point-cloud-based 3D visualization makes hard-to-view or occluded areas of the vehicle image easy to inspect. Another important feature of the system is its ability to scan vehicles of all sizes and shapes travelling at different speeds.
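For a parallel (rectified) stereo pair, depth follows from the standard relation Z = f·B/d, where f is the focal length in pixels, B is the baseline between the cameras, and d is the per-pixel disparity. A minimal sketch of this conversion, assuming a precomputed disparity map (the function name and parameters are illustrative, not part of the NuvoScan system):

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert a disparity map (pixels) to a depth map (metres).

    Uses the rectified-stereo relation Z = f * B / d; pixels with
    non-positive disparity are left at zero (no depth estimate).
    """
    depth = np.zeros_like(disparity, dtype=np.float64)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Example: focal length 700 px, baseline 0.1 m, uniform disparity of 10 px
disp = np.full((2, 2), 10.0)
print(disparity_to_depth(disp, 700.0, 0.1))  # every pixel maps to 7.0 m
```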
3D data offers a better understanding of scenes than 2D images: two objects may overlap and occlude each other in 2D, but their spatial locations are typically separable in 3D. These properties enable numerous research problems based on 3D data, namely 3D shape classification, 3D object detection, 3D point cloud segmentation, and 3D point cloud registration. 3D data can be represented in different formats, including depth images, point clouds, meshes, and volumetric grids. The point cloud is the most commonly used format; it is simply a collection of points in three dimensions that together describe an object's shape. It preserves the original geometric information in 3D space without any discretization and is therefore the preferred representation for our under-vehicle scanning system as well. The objective of acquiring such a rich representation is to perform vehicle inspections that help detect threat objects underneath vehicles and to register different images of the same car.
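A depth image can be back-projected into a point cloud using the pinhole camera model: each pixel (u, v) with depth z maps to X = (u − cx)·z/fx, Y = (v − cy)·z/fy, Z = z. A minimal sketch, assuming known intrinsics (the function name is illustrative):

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image into an (N, 3) point cloud.

    Inverts the pinhole projection for every pixel with positive depth.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # keep only valid depths

# Example: a single pixel at the principal point, 2 m away
cloud = depth_to_point_cloud(np.array([[2.0]]), fx=1.0, fy=1.0, cx=0.0, cy=0.0)
print(cloud)  # [[0. 0. 2.]]
```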
Novel view generation, or image-based rendering, is a classic problem in which the goal is to render a novel view of a scene from a sparse set of sampled points, such as a point cloud. 3D warping uses the point cloud to capture the geometry of the object, which can be freely rotated into the desired pose and then projected to generate a new image.
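The core of 3D warping is a rigid transform of the point cloud into the novel camera's frame followed by a pinhole projection. A minimal sketch under those assumptions (splatting and hole filling, which a full renderer would need, are omitted; all names are illustrative):

```python
import numpy as np

def warp_to_novel_view(points, R, t, fx, fy, cx, cy):
    """Project an (N, 3) point cloud into a novel view.

    points are transformed into the novel camera frame by the rigid
    motion (R, t) and then projected with pinhole intrinsics.
    Returns (M, 2) pixel coordinates for points in front of the camera.
    """
    cam = points @ R.T + t            # rotate/translate into camera frame
    cam = cam[cam[:, 2] > 0]          # drop points behind the camera
    u = fx * cam[:, 0] / cam[:, 2] + cx
    v = fy * cam[:, 1] / cam[:, 2] + cy
    return np.stack([u, v], axis=-1)

# Example: identity pose, a point on the optical axis lands at the principal point
pts = np.array([[0.0, 0.0, 2.0]])
px = warp_to_novel_view(pts, np.eye(3), np.zeros(3), 100.0, 100.0, 50.0, 50.0)
print(px)  # [[50. 50.]]
```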
Along with 2D images, 3D point clouds are also available due to the recent surge in outdoor LiDAR scanning technologies. The 3D point cloud is an important geometric data structure with an irregular format. Unlike 2D images, which are stored in a structured manner that provides explicit neighborhood relations, point clouds in the general case have only implicit neighborhood relations and are therefore considered unstructured data. Due to this lack of explicit neighborhood relations, point cloud processing requires extensive effort and research. The recent proliferation of multimodal (2D and 3D) data fusion techniques has produced strong results in various applications such as object detection, tracking, and segmentation.
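The practical consequence of implicit neighborhood relations is that, where an image pixel's neighbors come for free from the grid, a point's neighbors in a cloud must be searched for. A brute-force k-nearest-neighbor sketch illustrates this (real pipelines would use an accelerated structure such as a k-d tree; the function name is illustrative):

```python
import numpy as np

def k_nearest_neighbors(points, query, k=3):
    """Return indices of the k points in (N, 3) `points` closest to `query`.

    O(N) per query: unlike a pixel grid, a point cloud gives no
    explicit neighbors, so they must be found by explicit search.
    """
    d2 = np.sum((points - query) ** 2, axis=1)  # squared distances
    return np.argsort(d2)[:k]

# Example: the two closest points to the origin
cloud = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                  [5.0, 5.0, 5.0], [0.1, 0.0, 0.0]])
print(k_nearest_neighbors(cloud, np.zeros(3), k=2))  # [0 3]
```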
The stereo camera system is an effective instrument for depth estimation, capturing two images of a scene. The true 3D coordinates of a point visible in both images can be obtained by triangulation when both cameras are calibrated. Camera calibration is the process of estimating the intrinsic and extrinsic parameters of a camera. Although the calibration problem is largely solved for simple camera models such as the pinhole model, the solution is non-trivial when distortions are present. Our research focuses on solving the stereo calibration problem for a camera model that accounts for various distortions such as fisheye, de-centering, and thin-prism distortion.
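Given two calibrated cameras with 3×4 projection matrices P1 and P2 and a correspondence (x1, x2) in normalized image coordinates, triangulation can be done linearly via the standard DLT formulation, solving A·X = 0 with an SVD. A minimal sketch (distortion handling is omitted; the function name is illustrative):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two calibrated views.

    Each observation contributes two rows of the homogeneous system
    A X = 0; the solution is the right singular vector of A with the
    smallest singular value, dehomogenized to 3D.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Example: two identity-intrinsics cameras 0.5 m apart along x,
# observing a point 5 m straight ahead of the first camera.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])
print(triangulate(P1, P2, (0.0, 0.0), (-0.1, 0.0)))  # approx [0. 0. 5.]
```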