Tuesday, September 25, 2012

Computer Vision Research

My graduate studies at KAIST concentrated on computer vision and machine learning research. Here I present some of the research projects and implementations I worked on at the Visual Communications Lab, KAIST.



Pseudo-stereo Detection via Epipolar Geometry Estimation

Detecting reversed 3D images for a fail-proof stereoscopic video processing system
Jun. 2009 ~ Feb. 2010, @KAIST

The inverted-stereo effect is a type of visual fatigue caused by reversed stereoscopy, i.e., the left-view image is delivered to the right eye and the right-view image to the left eye. Because the human visual system has no mechanism for recognizing inverted stereo, the brain struggles to resolve the inconsistency and the viewer ends up feeling strong visual discomfort.

We proposed a feature-based method for estimating the relative positions of stereo images. The proposed method is based on the properties of epipolar geometry and finds the relative positions from a pair of stereo images without prior knowledge of camera configurations. Furthermore, its computational complexity is relatively low because most of the required computation can be conducted by well-known linear algebraic operations.

The proposed algorithm detects the configuration of stereoscopic images based on epipolar geometry.
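
As a rough illustration of the kind of linear-algebra computation involved, the sketch below estimates a fundamental matrix from point correspondences with the textbook normalized eight-point algorithm. It is not necessarily the exact procedure used in the papers listed below; the function names are illustrative, and the feature matches are assumed to be given as (N, 2) arrays of pixel coordinates.

```python
import numpy as np

def normalize_points(pts):
    """Shift points to zero mean and scale them so the mean distance to the origin is sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0.0, -scale * mean[0]],
                  [0.0, scale, -scale * mean[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T, T

def estimate_fundamental(pts1, pts2):
    """Normalized eight-point estimate of F such that x2^T F x1 = 0 (needs >= 8 matches)."""
    x1, T1 = normalize_points(np.asarray(pts1, dtype=float))
    x2, T2 = normalize_points(np.asarray(pts2, dtype=float))
    # Each correspondence contributes one linear equation in the 9 entries of F.
    A = np.column_stack([
        x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
        x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
        x1[:, 0], x1[:, 1], np.ones(len(x1))])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    F = U @ np.diag(S) @ Vt
    # Undo the normalization and fix the overall scale.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)
```

The epipoles are the null vectors of F and of its transpose; how the published method turns the estimated geometry into a left/right decision is described in the papers.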

Publications
[1] W. Kim, S. Leigh, N. Hur, J. Choi, “Feature-based detection of inverted-stereo for stereoscopic 3D viewing comfort,” IEEE Trans. on Broadcasting, June 2012 [PDF]
[2] W. Kim, S. Leigh, G. Lee, N. Hur, J. Kim, “Automatic detection of pseudo-stereo,” International Conference on 3D Systems and Applications, General Academy Center, Tokyo, Japan, May 19-21, 2010 [PDF]
[3] W. Kim, S. Leigh, S. Kim, N. Hur, J. Kim, “Pseudo-stereo Detection via Feature Extraction,” 2010 Image Processing and Image Understanding Workshop, Jan. 2010 [PDF]



Unified Approach on Batch/Sequential Estimation of 3-D Structure and Motion

Expectation-Maximization based estimation of Structure and Motion
Master's Thesis Research
Mar. 2008 ~ Dec. 2009, @KAIST

I presented a unified analysis of the batch and sequential algorithms for structure from motion (SfM) and devised a framework based on the expectation-maximization (EM) algorithm. The proposed method is not restricted to any specific motion model or input data type; moreover, the framework is flexible enough to be adapted to various implementations.

The proposed method was tested on both virtual and real image data, and its performance was compared with conventional optimization-based algorithms. Even with a simplistic sub-procedure for each computation step, the proposed method gives results comparable to well-known conventional methods. The sequential algorithm also converges robustly to a reliable solution without any assumption about the camera movement.

The probabilistic model of an estimated 3D point is defined as above.
 
The proposed method then iteratively estimates the 3D position of the point and the camera position/orientation.
 
In the sequential algorithm, the accuracy of the 3D structure improves as the number of input images increases.
 
Publications
[1] S. Leigh, “Unified Algorithm for Batch/Sequential Estimation of 3-D Structure and Motion,” Master’s Thesis, Feb. 2010 [PDF (kor)], [PDF (eng manuscript)]
[2] S. Leigh, S. Kim, “2D-to-2D Homography Based Camera Pose Estimation Method Using EM Algorithm,” 2009 Proc. of Institute of Electronics Engineering of Korea, Jul. 2009 [PDF]



ISP Advanced Features

Developing Efficient and Enhanced Image Processing Algorithms for Embedded ISP Module
Jan. 1, 2009 ~ Dec. 31, 2009, @Visual Communications Lab, KAIST
Supported by Institute of Information Technology Advancement (IITA)
 
Left: Original Image, Center: Enhanced Image, Right: Extremely Enhanced Image

Left: Original Image, Center: Enhanced Image, Right: Extremely Enhanced Image
 
In this project, I proposed and implemented an algorithm for enhancing the dynamic range of digital images. The primary goal of this research was to create a computationally efficient algorithm for embedded Linux systems (for digital camera modules). The algorithm enhances an image through pixel-wise texture analysis, which requires complicated computation steps. However, the proposed method effectively lowered the computational burden by adopting novel image processing and algorithmic techniques. In the first image, you can see that the color of the sky does not change much, while the contrast of the bottom half of the image is enhanced significantly.
 
 
This is an overview of the whole system. Details are omitted from the diagram for legal reasons.
 
Patents
"APPARATUS FOR ENHANCING IMAGE AND METHOD THEREFOR", KR, 10-2009-0110743, 1010734970000
 


3D Structure Acquisition from Multiple Images


As a member of the Visual Communications Lab, I contributed significantly to the initial setup for 3D reconstruction research, including building a multi-camera rig and hardware infrastructure for 3D reconstruction and acquiring multi-view test data. The picture above depicts a sample reconstruction based on our lab's SFS method.



ISP Tuning Tool

Implementation of a tool that automatically sets ISP module parameters based on image analysis
Mar. 1, 2008 ~ Dec. 31, 2008, @Visual Communications Lab, KAIST
Supported by Institute of Information Technology Advancement (IITA)



Odometry by Planar Homography in Catadioptric Images and Comparison with IMU Data

Finding camera motion parameters from catadioptric images via planar homography estimation
Undergraduate Thesis Research
Jul. 2007 ~ Dec. 2007, @Robotics Computer Vision Lab, KAIST
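
The homography estimation step can be sketched with the standard direct linear transform (DLT). This is a minimal illustration assuming ordinary perspective point coordinates; the catadioptric projection model used in the thesis is not handled here, and the function name is hypothetical.

```python
import numpy as np

def estimate_homography(src, dst):
    """DLT: find the 3x3 matrix H with dst ~ H @ src from at least 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear constraints per correspondence, from u = (h1 . p) / (h3 . p)
        # and v = (h2 . p) / (h3 . p) with p = (x, y, 1).
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = Vt[-1].reshape(3, 3)
    # Fix the scale; this assumes H[2, 2] is not (close to) zero.
    return H / H[2, 2]
```

With noisy real data, the point coordinates are usually normalized first and the estimate wrapped in a robust scheme such as RANSAC; both are omitted here for brevity.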



SOME MORE IMPLEMENTATIONS

[1] FFT for Image Rotation
Rotating an image using computations in the frequency domain, based on the duality of the Fourier transform.


[2] Temporal difference based moving object detection
System for detecting moving objects in front of a static RGB camera. The temporal differences of three consecutive frames are computed, and the object region is determined from them.
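
A minimal sketch of three-frame differencing, assuming the frames have already been converted to 8-bit grayscale NumPy arrays. The function name and the threshold value are illustrative choices, not taken from the original system.

```python
import numpy as np

def three_frame_difference(prev, curr, nxt, threshold=25):
    """Return a boolean mask of pixels that changed in both consecutive frame pairs."""
    # Cast to a signed type so the subtraction cannot wrap around in uint8.
    d1 = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    d2 = np.abs(nxt.astype(np.int16) - curr.astype(np.int16))
    # A pixel belongs to the moving object in the middle frame only if it
    # differs from both the previous and the next frame.
    return (d1 > threshold) & (d2 > threshold)
```

Combining the two difference images with a logical AND suppresses the "ghost" left behind at the object's previous position, which a single frame difference would also mark as motion.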


[3] Image-based localization - Nearest image recognition + Camera Resectioning
Given an image collection in which only a portion of the images have GPS data, the system automatically computes the geo-location of the images without GPS data.

[4] Digital image deblurring - RL deconvolution based kernel estimation
Given a blurred image and a low-resolution preview image, we can obtain a deblurred image using Richardson-Lucy (RL) deconvolution. The blur kernel is estimated iteratively from the small preview image and the full-size blurred image.
  • Richardson, William Hadley, Bayesian-Based Iterative Method of Image Restoration, JOSA, 1972
(1) Blurred Image, (2) Low-resolution Preview Image, (3) Deblurred Image, (4) Estimated Kernel
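
The deconvolution step alone can be sketched as follows. This minimal version assumes the blur kernel is already known and uses FFT-based convolution with circular boundaries; the kernel estimation from the preview image is not shown, and the function name and defaults are illustrative.

```python
import numpy as np

def richardson_lucy(blurred, kernel, iterations=30, eps=1e-12):
    """Richardson-Lucy deconvolution of a non-negative 2D image with a known kernel."""
    blurred = np.asarray(blurred, dtype=float)
    kh, kw = kernel.shape
    # Zero-pad the normalized kernel to image size and move its center to (0, 0)
    # so that FFT multiplication implements a centered circular convolution.
    padded = np.zeros(blurred.shape)
    padded[:kh, :kw] = kernel / kernel.sum()
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    otf = np.fft.rfft2(padded)

    def convolve(img, transfer):
        return np.fft.irfft2(np.fft.rfft2(img) * transfer, s=img.shape)

    # Start from a flat image with the same mean intensity as the observation.
    estimate = np.full(blurred.shape, blurred.mean())
    for _ in range(iterations):
        reblurred = np.maximum(convolve(estimate, otf), eps)
        ratio = blurred / reblurred
        # Multiplying by conj(otf) correlates with the kernel, i.e. it
        # convolves with the flipped kernel, as the RL update requires.
        estimate = estimate * convolve(ratio, np.conj(otf))
    return estimate
```

More iterations sharpen the result further but also amplify noise, so the iteration count acts as an informal regularization parameter.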


[5] MSER + SIFT feature
MSER regions are detected in a given image, and SIFT descriptors are used to describe these salient regions.
  • J. Matas, O. Chum, M. Urban, and T. Pajdla, Robust wide baseline stereo from maximally stable extremal regions, BMVC, 2002
Detected extremal regions

Matching result using SIFT descriptors, based on nearest-neighbor search.
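
The nearest-neighbor matching step can be sketched as below, assuming the descriptors of the two images are already available as (N, D) and (M, D) NumPy arrays (D = 128 for SIFT). Region detection and descriptor extraction are not shown, and the function name is illustrative.

```python
import numpy as np

def match_nearest_neighbor(desc_a, desc_b):
    """For each row of desc_a, return the index of the closest row of desc_b and the distance."""
    # Squared Euclidean distances via |a - b|^2 = |a|^2 - 2 a.b + |b|^2.
    d2 = (np.sum(desc_a ** 2, axis=1)[:, None]
          - 2.0 * desc_a @ desc_b.T
          + np.sum(desc_b ** 2, axis=1)[None, :])
    idx = np.argmin(d2, axis=1)
    # Clamp tiny negative values caused by floating-point round-off.
    dist = np.sqrt(np.maximum(d2[np.arange(len(desc_a)), idx], 0.0))
    return idx, dist
```

Plain nearest-neighbor matching keeps every match, including ambiguous ones; a distance threshold or a ratio test against the second-nearest neighbor is a common way to filter them.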


[6] Video Compass for camera intrinsic parameter estimation
Self-calibration of a camera from a single picture taken in an urban environment.
  • Kosecka J., Wei Zhang, Video Compass, ECCV, LNCS 2350, Springer Verlag, 2002


[7] Vocabulary Tree for image searching
Automatically classifying and recognizing objects based on a vocabulary tree. Using the extracted SURF feature points, the system recognizes objects in images by quickly finding the most probable match in the database.
  • D. Nister, H. Stewenius, Scalable recognition with a vocabulary tree, CVPR, 2006
  • H. Bay, T. Tuytelaars, L. Van Gool, SURF: Speeded up robust features, LNCS, 2006


[8] Stereo Matching based on Belief Propagation
Two state-of-the-art stereo matching algorithms were implemented and compared. Both are based on belief propagation; the latter additionally introduces an occlusion map to improve the accuracy of the estimated disparity.
  • P. F. Felzenszwalb, D. P. Huttenlocher, Efficient Belief Propagation for Early Vision, CVPR 2005
  • J. Sun et al., Symmetric Stereo Matching for Occlusion Handling, CVPR 2005



[9] Face recognition
A face recognition system. Well-known algorithms, including PCA, FLD, and a neural network, were implemented and tested to compare their applicability and performance.
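
Of the three approaches, the PCA variant (often called eigenfaces) is the simplest to sketch. The example below assumes each face image has been flattened into one row of a NumPy array and classifies a query by its nearest training sample in the PCA subspace; the function names are illustrative and this is not the original implementation.

```python
import numpy as np

def fit_pca(train, n_components):
    """Return the mean face and the top principal directions of the training set."""
    mean = train.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(x, mean, components):
    """Project one sample (or a stack of samples) into the PCA subspace."""
    return (x - mean) @ components.T

def recognize(query, train, labels, mean, components):
    """Label a query by its nearest training sample in the PCA subspace."""
    train_coords = project(train, mean, components)
    query_coords = project(query, mean, components)
    nearest = np.argmin(np.linalg.norm(train_coords - query_coords, axis=1))
    return labels[nearest]
```

FLD would replace the PCA directions with ones that maximize between-class relative to within-class scatter, while the rest of the pipeline can stay the same.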




[10] MATLAB modules/library for computer vision
For my Master's thesis research, I implemented all of the MATLAB function modules needed for 3D computer vision research. The library contains most of the well-known, stable algorithms for 3D computer vision computation tasks. The modules are not very well organized at the moment, but I plan to share the source code publicly once it has been reorganized. :)
 
