Computer Vision

Face Detection and Recognition :

Face Detection dataset
Face Recognition dataset
Face Recognition Home Page : points to important papers, datasets, algorithms and conferences

Content based Image Retrieval :

Zurich Building dataset
UCID dataset
UKBench dataset
INRIA Holidays dataset
Link to publications from Cordelia Schmid's group (INRIA)
INRIA CBIR demo
List of CBIR Engines
Publications page of Malcolm Slaney : Malcolm is a research scientist at Yahoo

Video Content Analysis :

Work on Crowd analysis by Mikel Rodriguez

Image Segmentation :

Segmentation evaluation database

Visual Perception :

Publications of Pawan Sinha

Computer Vision Blogs :

Shervin Emami's Blog : Shervin Emami is a very active OpenCV developer. On his blog you can find compact explanations, along with source code, of important algorithms such as blob detection, face detection and face recognition.
Shubhendu Trivedi's Blog

Computer Vision News :

Computer Vision Central

Will add more later ....

The KMeans clustering algorithm is one of the most popular techniques in pattern recognition, data mining and unsupervised learning. Although it offers no theoretical guarantee of accuracy, its speed and simplicity make it very appealing for practical applications. KMeans is highly sensitive to the initial selection of cluster centers. The algorithm usually begins with k arbitrary centers, typically chosen uniformly at random from the data points. To reduce this sensitivity, KMeans is often run multiple times and only the run that minimizes the sum of squared distances is kept. This performs better but still does not guarantee accuracy: there are many datasets on which KMeans produces arbitrarily bad clusterings [1]. I assume that, like me, most of you have struggled with the initialization of KMeans.
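The restart strategy described above can be sketched in plain Python. This is only an illustration under my own assumptions: the function names (`sq_dist`, `lloyd`, `kmeans_restarts`), the tuple representation of points, and the default of 10 restarts are hypothetical choices, not taken from any particular library.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def lloyd(points, centers, max_iter=100):
    """Run standard KMeans (Lloyd) iterations from the given centers.

    Returns (centers, sse), where sse is the sum of squared distances
    from every point to its closest center.
    """
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: attach each point to its closest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center (a simplifying assumption).
        new_centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


def kmeans_restarts(points, k, n_restarts=10, seed=0):
    """Run KMeans n_restarts times from uniform random initial centers
    and keep the run with the smallest sum of squared distances."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        init = rng.sample(points, k)  # k data points chosen uniformly at random
        centers, sse = lloyd(points, init)
        if best is None or sse < best[1]:
            best = (centers, sse)
    return best
```

Calling `kmeans_restarts(points, k=2)` on a list of coordinate tuples returns the best centers found and their sum of squared distances. More restarts can only lower (or keep) that value, but none of them removes the dependence on lucky initial centers.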

The KMeans++ algorithm overcomes this weakness and makes KMeans even more effective. It proposes a simple probabilistic means of initialization for KMeans clustering that not only has the best known theoretical guarantees on expected outcome quality (the expected cost is within an O(log k) factor of the optimal clustering [1]) but also reportedly works very well in practice.

Algorithm :

The KMeans++ algorithm uses a simple probabilistic method to generate the initial centers for KMeans from a set of points X. At any given time, let D(x) denote the shortest distance from a data point x to the closest center we have already chosen. The algorithm is then defined as follows :

Sample the first center c[1] from a uniform distribution over X.
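For k = 2 to K : sample the k-th center c[k] from X, choosing each point x with probability D(x)^2 / (sum of D(x')^2 over all x' in X).
Run standard KMeans starting from the centers c[1], ..., c[K].

Note : The probability of selecting x as the next cluster center is proportional to D(x)^2.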
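The seeding steps above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the names `sq_dist` and `kmeanspp_init`, the tuple representation of points, and the uniform fallback when every remaining point already coincides with a chosen center are my own assumptions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeanspp_init(points, k, rng=None):
    """Choose k initial centers from points using D(x)^2 weighting."""
    rng = rng or random.Random()
    # Step 1: the first center is drawn uniformly at random from X.
    centers = [rng.choice(points)]
    # d2[i] holds D(x_i)^2, the squared distance from point i
    # to the closest center chosen so far.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center, so the weights
            # are all zero; fall back to a uniform draw (an assumption).
            nxt = rng.choice(points)
        else:
            # Step 2: draw the next center with probability
            # proportional to D(x)^2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Only the new center can shrink D(x), so update incrementally.
        d2 = [min(d, sq_dist(p, nxt)) for d, p in zip(d2, points)]
    return centers
```

The returned centers are then passed to an ordinary KMeans routine as its starting point. Because a point that is already a center has D(x) = 0, it is never drawn again while other candidates remain, which is what spreads the initial centers out.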

Code :

A C implementation of KMeans++ is provided by the authors : Code1 Code2
Java Implementation
Matlab Code

KMeans++ in OpenCV :

To use KMeans++ in OpenCV, follow this link.

References :

[1] David Arthur and Sergei Vassilvitskii. k-means++: The advantages of careful seeding. In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2007.

Links :

Computer Vision

Monday, July 25, 2011

Links to Vision Resources

Tuesday, October 19, 2010

KMeans++