Orientation (computer vision)
In computer vision and image processing a common assumption is that sufficiently small image regions can be characterized as locally one-dimensional, e.g., in terms of lines or edges. For natural images this assumption is usually correct except at specific points, e.g., corners or line junctions or crossings, or in regions of high frequency textures. However, what size the regions have to be in order to appear as one-dimensional varies both between images and within an image. Also, in practice a local region is never exactly one-dimensional but can be so to a sufficient degree of approximation.
Image regions which are one-dimensional are also referred to as simple or intrinsic one-dimensional (i1D).
Given an image of dimension d (d = 2 for ordinary images), a mathematical representation of a local i1D image region is
[math]\displaystyle{ f(\mathbf{x}) = g(\mathbf{x} \cdot \hat{\mathbf n}) }[/math]
where [math]\displaystyle{ f }[/math] is the image intensity function which varies over a local image coordinate [math]\displaystyle{ \mathbf{x} }[/math] (a d-dimensional vector), [math]\displaystyle{ g }[/math] is a one-variable function, and [math]\displaystyle{ \hat{\mathbf n} }[/math] is a unit vector.
The intensity function [math]\displaystyle{ f }[/math] is constant in all directions which are perpendicular to [math]\displaystyle{ \hat{\mathbf n} }[/math]. Intuitively, the orientation of an i1D-region is therefore represented by the vector [math]\displaystyle{ \hat{\mathbf n} }[/math]. However, for a given [math]\displaystyle{ f }[/math], [math]\displaystyle{ \hat{\mathbf n} }[/math] is not uniquely determined. If
[math]\displaystyle{ \tilde{\mathbf n} = - \hat{\mathbf n} }[/math]
[math]\displaystyle{ \tilde{g}(x) = g(-x) }[/math]
then [math]\displaystyle{ f }[/math] can be written as
[math]\displaystyle{ f(\mathbf{x}) = \tilde{g}(\mathbf{x} \cdot \tilde{\mathbf n}) }[/math]
which implies that [math]\displaystyle{ \tilde{\mathbf n} = - \hat{\mathbf n} }[/math] also is a valid representation of the local orientation.
In order to avoid this ambiguity in the representation of local orientation two representations have been proposed
- The double angle representation
- The tensor representation
The double angle representation is only valid for 2D images (d=2), but the tensor representation can be defined for arbitrary dimensions d of the image data.
Relation to direction
A line between two points p1 and p2 has no given direction, but has a well-defined orientation. However, if one of the points p1 is used as a reference or origin, then the other point p2 can be described in terms of a vector which points in the direction to p2. Intuitively, orientation can be thought of as a direction without sign. Formally, this relates to projective spaces where the orientation of a vector corresponds to the equivalence class of vectors which are scaled versions of the vector.
For an image edge, we may talk of its direction which can be defined in terms of the gradient, pointing in the direction of maximum image intensity increase (from dark to bright). This implies that two edges can have the same orientation but the corresponding image gradients point in opposite directions if the edges go in different directions.
Relation to gradients
In image processing, the computation of the local image gradient is a common operation, e.g., for edge detection. If [math]\displaystyle{ f }[/math] above is an edge, then its gradient is parallel to [math]\displaystyle{ \hat{\mathbf n} }[/math]. As is already discussed above the gradient is not a unique representation of orientation. Also, in the case of a local region which is centered on a line, the image gradient is approximately zero. However, in this case the vector [math]\displaystyle{ \hat{\mathbf n} }[/math] is still well-defined except for its sign. Therefore, [math]\displaystyle{ \hat{\mathbf n} }[/math] is a more appropriate starting point for defining local orientation than the image gradient.
Estimation of local image orientation
A number of methods have been proposed for computing or estimating an orientation representation from image data. These include
- Quadrature filter based methods
- The structure tensor
- Using a local polynomial approximation
- The Energy tensor[1]
- The Boundary tensor[2]
The first approach can be used both for the double angle representation (only 2D images) and the tensor representation, and the other methods compute a tensor representation of local orientation.
Application of local image orientation
Given that a local image orientation representation has been computed for some image data, this formation can be used for solving the following tasks:
- Estimation of line or edge consistency
- Estimation of curvature information
- Detection of corner points[3]
- Adaptive or anisotropic noise reduction
- Motion estimation
References
- ↑ Felsberg, Michael, and Gösta Granlund. "POI detection using channel clustering and the 2D energy tensor." Joint Pattern Recognition Symposium. Springer, Berlin, Heidelberg, 2004.
- ↑ Köthe, Ullrich. "Integrated Edge and Junction Detection with the Boundary Tensor." ICCV. Vol. 3. 2003.
- ↑ Chabat, François, Guang-Zhong Yang, and David M. Hansell. "A corner orientation detector." Image and Vision Computing 17.10 (1999): 761-769.