Point distribution model

The point distribution model is a model for representing the mean geometry of a shape and some statistical modes of geometric variation inferred from a training set of shapes.

Background

The point distribution model was developed by Cootes,[1] Taylor et al.[2] and became a standard in computer vision for the statistical study of shape[3] and for the segmentation of medical images,[2] where shape priors help in interpreting noisy, low-contrast pixels or voxels. This use in segmentation led to active shape models (ASM) and active appearance models (AAM).

Point distribution models rely on landmark points. A landmark is an annotation point placed by an expert (for example, an anatomist) at a given locus on every shape instance in the training set. For instance, the same landmark designates the tip of the index finger throughout a training set of 2D hand outlines. Principal component analysis (PCA) is a suitable tool for studying correlated movement among groups of landmarks across the training set. Typically, it might detect that all the landmarks located along the same finger move together (their displacements are strongly correlated) across training examples that show different finger spacing in a collection of flat-posed hands.

Details

First, a set of training images is manually annotated with enough corresponding landmarks to approximate the geometry of the original shapes sufficiently well. These landmarks are aligned using generalized Procrustes analysis, which removes differences in translation, rotation and scale by minimizing the sum of squared distances between corresponding points.
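The alignment step can be illustrated with a minimal NumPy sketch of generalized Procrustes analysis. It is only one possible formulation and makes several assumptions not stated above: shapes are stored as an array of size (n, k, 2), each shape is normalized to unit Frobenius norm, reflections are excluded, and the function names `align_to` and `generalized_procrustes` are hypothetical.

```python
import numpy as np

def align_to(shape, reference):
    """Rotate a centred (k, 2) shape onto the reference (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(shape.T @ reference)
    r = u @ vt
    if np.linalg.det(r) < 0:      # assumption: allow rotations only, no reflections
        u[:, -1] *= -1
        r = u @ vt
    return shape @ r

def generalized_procrustes(shapes, n_iter=10, tol=1e-10):
    """Align an (n, k, 2) array of shapes; return the aligned shapes and their mean."""
    shapes = np.asarray(shapes, dtype=float)
    # Remove translation (centre each shape) and scale (unit Frobenius norm).
    shapes = shapes - shapes.mean(axis=1, keepdims=True)
    shapes = shapes / np.linalg.norm(shapes, axis=(1, 2), keepdims=True)
    mean = shapes[0]              # assumption: the first shape is the initial reference
    for _ in range(n_iter):
        # Rotate every shape onto the current mean, then re-estimate the mean.
        shapes = np.stack([align_to(s, mean) for s in shapes])
        new_mean = shapes.mean(axis=0)
        new_mean /= np.linalg.norm(new_mean)
        converged = np.linalg.norm(new_mean - mean) < tol
        mean = new_mean
        if converged:
            break
    return shapes, mean
```

Under the storage convention assumed here, `aligned.reshape(len(aligned), -1)` flattens each aligned shape into a vector ordered as x1, y1, ..., xk, yk.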

[math]\displaystyle{ k }[/math] aligned landmarks in two dimensions are given as

[math]\displaystyle{ \mathbf{X} = (x_1, y_1, \ldots, x_k, y_k) }[/math].

Each landmark [math]\displaystyle{ i \in \lbrace 1, \ldots, k \rbrace }[/math] should represent the same anatomical location in every shape. For example, landmark 3, [math]\displaystyle{ (x_3, y_3) }[/math], might represent the tip of the ring finger across all training images.

The shape outlines are now reduced to sequences of [math]\displaystyle{ k }[/math] landmarks, so that a given training shape is defined as the vector [math]\displaystyle{ \mathbf{X} \in \mathbb{R}^{2k} }[/math]. Assuming the shapes follow a Gaussian distribution in this space, PCA is used to compute the normalized eigenvectors and the eigenvalues of the covariance matrix across all training shapes. The matrix of the top [math]\displaystyle{ d }[/math] eigenvectors (those with the largest eigenvalues) is given as [math]\displaystyle{ \mathbf{P} \in \mathbb{R}^{2k \times d} }[/math], and each eigenvector describes a principal mode of variation across the training set.
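As an illustration, the PCA step might be sketched in NumPy as follows. The function name `fit_pdm` and the (n, 2k) row-per-shape layout are assumptions made for this example; the use of the sample covariance (dividing by n - 1) is also a choice of the sketch rather than something fixed by the model.

```python
import numpy as np

def fit_pdm(X, d):
    """Fit a point distribution model to aligned shape vectors.

    X: (n, 2k) array, one aligned shape vector per row (hypothetical layout).
    Returns the mean shape (2k,), P (2k, d) and the d largest eigenvalues.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)        # (2k, 2k) sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # symmetric matrix, ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:d]       # indices of the d largest eigenvalues
    # Clip tiny negative eigenvalues caused by floating-point round-off.
    return mean, eigvecs[:, order], np.clip(eigvals[order], 0.0, None)
```

`np.linalg.eigh` returns unit-length, mutually orthogonal eigenvectors, so the columns of the returned matrix are already normalized.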

Finally, a linear combination of the eigenvectors is used to define a new shape [math]\displaystyle{ \mathbf{X}' }[/math], mathematically defined as:

[math]\displaystyle{ \mathbf{X}' = \overline{\mathbf{X}} + \mathbf{P} \mathbf{b} }[/math]

where [math]\displaystyle{ \overline{\mathbf{X}} }[/math] is defined as the mean shape across all training images, and [math]\displaystyle{ \mathbf{b} }[/math] is a vector of scaling values for each principal component. Therefore, by modifying the variable [math]\displaystyle{ \mathbf{b} }[/math] an infinite number of shapes can be defined. To ensure that the new shapes are all within the variation seen in the training set, it is common to only allow each element of [math]\displaystyle{ \mathbf{b} }[/math] to be within [math]\displaystyle{ \pm }[/math]3 standard deviations, where the standard deviation of a given principal component is defined as the square root of its corresponding eigenvalue.
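The shape-generation formula and the ±3 standard deviation limit can be sketched as follows. The function name `synthesize` and its argument names are hypothetical; `eigvals` is assumed to hold the eigenvalues matching the columns of `P`.

```python
import numpy as np

def synthesize(mean, P, eigvals, b, limit=3.0):
    """Return mean + P @ b with each b[i] clamped to +/- limit * sqrt(eigvals[i])."""
    bound = limit * np.sqrt(np.asarray(eigvals, dtype=float))
    b = np.clip(np.asarray(b, dtype=float), -bound, bound)   # element-wise clamp
    return mean + P @ b

# Toy model with k = 2 landmarks and d = 2 modes (values chosen for illustration).
mean = np.zeros(4)
P = np.eye(4)[:, :2]
eigvals = np.array([4.0, 1.0])                # standard deviations 2 and 1
new_shape = synthesize(mean, P, eigvals, b=[10.0, -0.5])
# The first coefficient is clamped from 10 to 3 * 2 = 6; the second is within bounds.
```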

PDMs can be extended to any number of dimensions, but are typically used in 2D image and 3D volume applications (where each landmark point is in [math]\displaystyle{ \mathbb{R}^2 }[/math] or [math]\displaystyle{ \mathbb{R}^3 }[/math]).

Discussion

An eigenvector, interpreted in Euclidean space, can be seen as a sequence of [math]\displaystyle{ k }[/math] Euclidean vectors, one per landmark, which together describe a compound move of the whole shape. Global nonlinear variation is usually handled well, provided it is kept to a reasonable level. A twisting nematode worm, whose shape variation is strongly nonlinear, is a typical teaching example for kernel PCA-based methods.
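This per-landmark reading of an eigenvector can be made concrete with a short sketch. It assumes 2D landmarks stored in the order x1, y1, ..., xk, yk, and the helper names `mode_displacements` and `move_along_mode` are hypothetical.

```python
import numpy as np

def mode_displacements(P, j):
    """Return mode j of P (shape (2k, d)) as a (k, 2) array of per-landmark vectors."""
    return P[:, j].reshape(-1, 2)

def move_along_mode(mean, P, eigvals, j, n_std):
    """Mean-shape landmarks (k, 2) moved n_std standard deviations along mode j."""
    offset = n_std * np.sqrt(eigvals[j]) * mode_displacements(P, j)
    return mean.reshape(-1, 2) + offset
```

Plotting `move_along_mode` for a few values of `n_std` between -3 and 3 is a common way to inspect visually what a given mode does to the shape.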

Because of the properties of PCA, the eigenvectors are mutually orthogonal, form a basis of the training set cloud in shape space, and intersect at the origin of this space, which represents the mean shape. PCA is also a traditional way of fitting a closed ellipsoid to a Gaussian cloud of points (whatever their dimension), which suggests the concept of bounded variation.

The idea behind PDMs is that eigenvectors can be linearly combined to create an infinite number of new shape instances that 'look like' those in the training set. The coefficients are bounded according to the corresponding eigenvalues, so that the generated 2k- or 3k-dimensional point remains inside the allowed hyper-ellipsoidal domain, known as the allowable shape domain (ASD).[2]
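One way to keep a coefficient vector inside such a hyper-ellipsoid is to bound its Mahalanobis distance from the mean shape and rescale the vector when the bound is exceeded. The sketch below illustrates this idea only: the threshold `d_max = 3.0` and the names `constrain_to_asd` and `fit_shape` are assumptions of the example, and all eigenvalues are assumed to be strictly positive.

```python
import numpy as np

def constrain_to_asd(b, eigvals, d_max=3.0):
    """Rescale b so that sqrt(sum(b_i**2 / eigvals_i)) does not exceed d_max."""
    b = np.asarray(b, dtype=float)
    dist = np.sqrt(np.sum(b**2 / eigvals))   # Mahalanobis distance from the mean shape
    if dist > d_max:
        b = b * (d_max / dist)               # pull b back onto the ellipsoid surface
    return b

def fit_shape(x, mean, P, eigvals, d_max=3.0):
    """Project shape vector x onto the model and constrain the result to the ASD."""
    b = P.T @ (x - mean)                     # coefficients, since P has orthonormal columns
    b = constrain_to_asd(b, eigvals, d_max)
    return mean + P @ b, b
```

Unlike clamping each coefficient separately, which yields a box-shaped domain, this rescaling keeps the coefficients inside an ellipsoid whose axes are proportional to the standard deviations of the modes.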

References

  1. T. F. Cootes (May 2004), Statistical models of appearance for computer vision, http://www.face-rec.org/algorithms/AAM/app_models.pdf 
  2. D.H. Cooper; T.F. Cootes; C.J. Taylor; J. Graham (1995), "Active shape models - their training and application", Computer Vision and Image Understanding 61: 38–59 
  3. Rhodri H. Davies; Carole J. Twining; P. Daniel Allen; Tim F. Cootes; Chris J. Taylor (2003), "Shape discrimination in the Hippocampus using an MDL Model", IPMI, http://www2.wiau.man.ac.uk/caws/Conferences/10/proceedings/8/papers/133/rhhd_ipmi03%2Epdf, retrieved 2007-07-27 
