Basic Modules for Computer Vision Jitendra Malik December 9, 2009

Important modules for computer vision
1. Contours (gPb)
2. Regions (gPb-owt-ucm)
3. Computing descriptors on points, regions, or windows
4. Vector quantization of descriptors (k-means)
5. Nearest-neighbors for high dimensional descriptors
6. Training of SVMs (linear, additive, rbf)
7. Evaluation of SVMs (linear, additive, rbf)
8. Hough transform voting
9. Optical Flow
10. Tracking objects/humans
Semantic Segmentation
Object detection by multi-scale scanning
Ask this question repeatedly, varying position, scale, category…
Paradigm introduced by Rowley, Baluja & Kanade 96 for face detection.
Viola & Jones 01, Dalal & Triggs 05, Felzenszwalb, McAllester, Ramanan 08
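The scanning loop described on this slide can be sketched as follows. This is a minimal illustration, not any of the cited detectors: the names `downscale`, `scan` and `mean_brightness` are hypothetical, the resizing is plain nearest-neighbour sampling, and the window scorer is a toy stand-in for a trained classifier.

```python
import numpy as np

def downscale(image, factor):
    """Nearest-neighbour resize of a 2-D array by `factor` (0 < factor <= 1)."""
    h, w = image.shape
    new_h, new_w = max(1, int(h * factor)), max(1, int(w * factor))
    rows = (np.arange(new_h) / factor).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / factor).astype(int).clip(0, w - 1)
    return image[np.ix_(rows, cols)]

def scan(image, score_window, win=(8, 8), stride=4, scale_step=0.5, threshold=0.5):
    """Score a fixed-size window at every position of every pyramid level.

    Returns (x, y, width, height, score) tuples in original image coordinates.
    """
    detections = []
    factor = 1.0
    while True:
        level = downscale(image, factor)
        if level.shape[0] < win[0] or level.shape[1] < win[1]:
            break  # the window no longer fits at this scale
        for y in range(0, level.shape[0] - win[0] + 1, stride):
            for x in range(0, level.shape[1] - win[1] + 1, stride):
                s = score_window(level[y:y + win[0], x:x + win[1]])
                if s > threshold:
                    # Map the window back to the original resolution.
                    detections.append((x / factor, y / factor,
                                       win[1] / factor, win[0] / factor, s))
        factor *= scale_step
    return detections

def mean_brightness(window):
    """Toy scorer (an assumption for this example): mean pixel value."""
    return float(window.mean())

# Synthetic image with one bright 16x16 square.
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0
boxes = scan(image, mean_brightness, threshold=0.9)
```

The cost is one classifier evaluation per position per scale (and per category, if several are scanned), which is why fast classifier evaluation matters in this paradigm.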
UC Berkeley
Computer Vision Group
PASCAL VOC 2009 Detection
AP = 0.16
Challenges
• Sub-categories
• Aspects
• Occlusion
Addressed by Poselets (Bourdev & Malik, '09)
AP = 0.394
Contours & Regions
(Arbelaez, Maire, Fowlkes)
Descriptors
• SIFT, HOG, GB ..
– Typically high dimensional, 100-1000
– Computed on points, regions, or windows
• Used in different ways
– Evaluate using SVM
– Vector Quantize to a "word" and then use "Bag of Words" models
• Computational problems
– Vector quantization of descriptors (k-means)
– Nearest-neighbors for high dimensional descriptors
– Training of SVMs (linear, additive, rbf)
– Evaluation of SVMs (linear, additive, rbf)
[Figure: SIFT descriptor on region]
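As a rough sketch of the "vector quantize to a word, then bag of words" route, the following uses plain Lloyd's k-means on synthetic 128-dimensional vectors standing in for real descriptors. The function names (`kmeans`, `quantize`, `bag_of_words`) and all parameter values are assumptions made for illustration. The brute-force nearest-centre search in `quantize` is the high-dimensional nearest-neighbor step listed under the computational problems.

```python
import numpy as np

def quantize(descriptors, centers):
    """Index of the nearest centre ("visual word") for each descriptor."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(descriptors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; centres start at k random descriptors."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[start].copy()
    for _ in range(iterations):
        labels = quantize(descriptors, centers)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # keep the old centre if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def bag_of_words(descriptors, centers):
    """Normalised histogram of visual-word counts for one image or region."""
    labels = quantize(descriptors, centers)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()

# Synthetic stand-ins for descriptors (not real SIFT output).
rng = np.random.default_rng(1)
training = rng.random((500, 128))
vocabulary = kmeans(training, k=8)
image_descriptors = rng.random((60, 128))
hist = bag_of_words(image_descriptors, vocabulary)
```

The resulting histogram is the kind of input that the histogram-based kernels on the following slides operate on.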
Efficient Training of Additive Classifiers (Maji & Berg)
• SVMs with additive kernels are additive classifiers
• Histogram-based kernels
– Histogram intersection, chi-squared kernel
– Pyramid Match Kernel (Grauman & Darrell, ICCV '05)
– Spatial Pyramid Match Kernel (Lazebnik, Schmid & Ponce, CVPR '06)
• Intersection kernel SVMs (IKSVMs) can be efficiently evaluated at runtime (CVPR '08)
• New result: these classifiers can be trained up to two orders of magnitude faster than a kernel SVM, without loss in accuracy
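To make the first and fourth bullets concrete: with the histogram intersection kernel, the SVM decision function separates into a sum of one-dimensional functions, one per feature, and each of those can be tabulated once and looked up at runtime. The sketch below uses random numbers in place of a trained model; the support vectors, coefficients, bias, grid size and function names are all assumptions for illustration, and the lookup table is only one possible way to realise fast evaluation.

```python
import numpy as np

def decision_kernel_form(x, support_vectors, coeffs, bias):
    """f(x) = sum_j c_j * K(x, s_j) + b with K(x, z) = sum_i min(x_i, z_i)."""
    return sum(c * np.minimum(x, sv).sum()
               for c, sv in zip(coeffs, support_vectors)) + bias

def decision_additive_form(x, support_vectors, coeffs, bias):
    """Same value written as sum_i h_i(x_i), with
    h_i(x_i) = sum_j c_j * min(x_i, s_ji)."""
    h = (coeffs[:, None] * np.minimum(x[None, :], support_vectors)).sum(axis=0)
    return h.sum() + bias

def build_table(support_vectors, coeffs, grid):
    """table[i, g] = h_i(grid[g]), computed once after training."""
    return (coeffs[:, None, None]
            * np.minimum(grid[None, None, :], support_vectors[:, :, None])).sum(axis=0)

def decision_table_form(x, grid, table, bias):
    """Approximate f(x) by interpolating each tabulated h_i."""
    return sum(np.interp(x[i], grid, table[i]) for i in range(len(x))) + bias

# Random stand-ins for a trained model (not real training output).
rng = np.random.default_rng(0)
support_vectors = rng.random((20, 16))   # 20 support vectors, 16 features
coeffs = rng.uniform(-1.0, 1.0, 20)      # plays the role of alpha_j * y_j
bias = 0.1
x = rng.random(16)

grid = np.linspace(0.0, 1.0, 1001)
table = build_table(support_vectors, coeffs, grid)

f_kernel = decision_kernel_form(x, support_vectors, coeffs, bias)
f_additive = decision_additive_form(x, support_vectors, coeffs, bias)
f_table = decision_table_form(x, grid, table, bias)
```

The kernel and additive forms agree exactly. The table form costs one lookup per feature, independent of the number of support vectors, at the price of a small interpolation error.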
Efficient Training of Additive Classifiers
• Approximate classifiers where h is piecewise linear
• Use standard linear SVM techniques to solve
– Encourages smooth functions
– Closely approximates min kernel SVM
– Custom solver: PWLSGD (see paper)
• Trains classifiers up to two orders of magnitude faster than a kernel SVM, without loss in accuracy
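One way to read the first two bullets: if each feature is re-encoded as interpolation weights over a fixed set of knots, any linear function of the encoding is piecewise linear in the original feature, so an ordinary linear solver can learn it. The sketch below is an illustration under that reading, not the PWLSGD solver from the paper. The encoding `encode`, the Pegasos-style hinge-loss SGD in `train_linear_sgd`, the plain L2 regulariser (rather than a smoothness-encouraging one) and all parameter values are assumptions.

```python
import numpy as np

def encode(X, n_bins=10):
    """Map features in [0, 1] to hat-function weights on n_bins + 1 uniform knots.

    A linear function w . encode(x) is then piecewise linear in each feature.
    """
    X = np.clip(np.asarray(X, dtype=float), 0.0, 1.0)
    n, d = X.shape
    pos = X * n_bins
    left = np.minimum(pos.astype(int), n_bins - 1)  # index of the knot to the left
    frac = pos - left
    phi = np.zeros((n, d, n_bins + 1))
    rows, cols = np.indices((n, d))
    phi[rows, cols, left] = 1.0 - frac
    phi[rows, cols, left + 1] = frac
    return phi.reshape(n, d * (n_bins + 1))

def train_linear_sgd(Phi, y, lam=1e-2, epochs=100, seed=0):
    """Pegasos-style SGD on the L2-regularised hinge loss (no bias term)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(Phi.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * (Phi[i] @ w)
            w *= 1.0 - eta * lam          # shrink step from the regulariser
            if margin < 1.0:              # hinge loss is active for this sample
                w += eta * y[i] * Phi[i]
    return w

# Toy 1-D problem that no linear function of raw x can solve:
# positives lie in the band 0.25 < x < 0.75.
rng = np.random.default_rng(0)
X = rng.random((200, 1))
y = np.where(np.abs(X[:, 0] - 0.5) < 0.25, 1.0, -1.0)

Phi = encode(X)
w = train_linear_sgd(Phi, y)
accuracy = float(np.mean(np.sign(Phi @ w) == y))
```

Because the weights of each encoded feature sum to one, a constant offset is absorbed into `w`, which is why this sketch carries no separate bias.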
Max-Margin Hough Transform (Maji)
1. Local parts vote for object pose
2. Complexity: # parts * # votes
Can be significantly lower than brute-force search over pose (e.g. sliding window detectors)
3. Learn weights for the votes in a max-margin framework to optimize detection
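The voting step in items 1 and 2 can be sketched as below, with pose reduced to a 2-D object centre. The codewords, offsets and weight values are made up for illustration. In particular, the "learned" weights are hand-picked numbers standing in for the output of the max-margin training in item 3, which is not implemented here.

```python
import numpy as np

def hough_vote(parts, offsets, weights, grid_shape):
    """Accumulate weighted votes for the object centre.

    parts:   list of (x, y, codeword) detections of local parts
    offsets: offsets[codeword] is a list of (dx, dy) displacements to the centre
    weights: weights[codeword] scales every vote cast by that codeword
    The double loop makes the cost # parts * # votes per part.
    """
    acc = np.zeros(grid_shape)
    for x, y, codeword in parts:
        for dx, dy in offsets[codeword]:
            cx, cy = x + dx, y + dy
            if 0 <= cy < grid_shape[0] and 0 <= cx < grid_shape[1]:
                acc[cy, cx] += weights[codeword]
    return acc

# Codeword 0: an informative part seen left or right of the centre.
# Codeword 1: a clutter-like structure that votes for its own location.
offsets = {0: [(3, -2), (-3, -2)], 1: [(0, 0)]}
parts = [(7, 12, 0), (13, 12, 0), (2, 2, 1), (2, 2, 1), (2, 2, 1)]

uniform = {0: 1.0, 1: 1.0}
reweighted = {0: 1.0, 1: 0.2}   # illustrative values, not learned

acc_uniform = hough_vote(parts, offsets, uniform, (20, 20))
acc_weighted = hough_vote(parts, offsets, reweighted, (20, 20))

# Peaks are reported as (row, column) = (y, x).
peak_uniform = np.unravel_index(acc_uniform.argmax(), acc_uniform.shape)
peak_weighted = np.unravel_index(acc_weighted.argmax(), acc_weighted.shape)
```

With uniform weights the repeated clutter part wins the vote at (2, 2). Down-weighting that codeword moves the peak to (10, 10), where the two informative parts agree.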
Learned Weights (ETHZ shape)
[Figure: learned weights, color scale blue (low) to dark red (high)]
Naïve Bayes: influenced by clutter (rare structures)
Max-Margin: important parts
Region based Detection (Gu, Lim, Arbelaez)
Method              Det. rate at 0.3 FPPI
Hough baseline [1]  31.0%
kAS [1]             62.4%
Shape [2]           67.2%
Ours                87.1±2.8%
[1] Ferrari et al. PAMI 2008. [2] Ferrari, Jurie, Schmid. CVPR 2007