Inverse depth parametrization

Figure: In inverse depth parametrization, a point is identified by its inverse depth [math]\displaystyle{ \rho = \frac{1}{\left\Vert \mathbf{p} - \mathbf{c}_0\right\Vert} }[/math] along the ray from which it was first observed, whose direction is [math]\displaystyle{ \mathbf{v} = (\cos \phi \sin \theta, -\sin \phi, \cos \phi \cos \theta) }[/math].

In computer vision, the inverse depth parametrization is a parametrization used in methods for 3D reconstruction from multiple images, such as simultaneous localization and mapping (SLAM).[1][2] Given a point [math]\displaystyle{ \mathbf{p} }[/math] in 3D space observed by a monocular pinhole camera from multiple views, the inverse depth parametrization of the point's position is a 6D vector that encodes the optical centre of the camera [math]\displaystyle{ \mathbf{c}_0 }[/math] when it first observed the point, and the position of the point along the ray passing through [math]\displaystyle{ \mathbf{p} }[/math] and [math]\displaystyle{ \mathbf{c}_0 }[/math].[3]

Inverse depth parametrization generally improves numerical stability and makes it possible to represent points with zero parallax, such as points at or near infinity, whose inverse depth is zero or close to zero. Moreover, the error associated with the observation of the point's position can be modelled with a Gaussian distribution when expressed in inverse depth. This property is required by methods, such as the Kalman filter, that assume a normally distributed measurement error. The major drawback is higher memory consumption, since the dimensionality of the point's representation doubles from three to six.[3]
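As a numerical illustration of the Gaussian-error property (a minimal sketch with made-up numbers, not taken from the cited papers), a Gaussian uncertainty on the inverse depth of a distant feature remains well behaved even when it includes [math]\displaystyle{ \rho = 0 }[/math] (a point at infinity), whereas the induced distribution on the depth [math]\displaystyle{ 1/\rho }[/math] is strongly skewed and far from Gaussian:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical feature: nominal inverse depth 0.02 (a depth of 50 units)
# with Gaussian standard deviation 0.01. Samples of rho at or below zero
# correspond to points at or beyond infinity, yet remain representable.
rho = rng.normal(loc=0.02, scale=0.01, size=100_000)
print(f"inverse depth: mean={rho.mean():.4f}, std={rho.std():.4f}")

# The induced depth 1/rho is heavy-tailed, so modelling the depth itself
# with a Gaussian would be a poor approximation.
depth = 1.0 / rho[rho > 1e-4]
print(f"depth: median={np.median(depth):.1f}, "
      f"99th percentile={np.percentile(depth, 99):.1f}")
</syntaxhighlight>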

Definition

Given a 3D point [math]\displaystyle{ \mathbf{p} = (x, y, z) }[/math] with world coordinates in a reference frame [math]\displaystyle{ (e_1, e_2, e_3) }[/math], observed from different views, the inverse depth parametrization [math]\displaystyle{ \mathbf{y} }[/math] of [math]\displaystyle{ \mathbf{p} }[/math] is given by:

[math]\displaystyle{ \mathbf{y} = (x_0, y_0, z_0, \theta, \phi, \rho) }[/math]

where the first five components encode the camera pose at the first observation of the point: [math]\displaystyle{ \mathbf{c}_0 = (x_0, y_0, z_0) }[/math] is the optical centre, [math]\displaystyle{ \theta }[/math] the azimuth and [math]\displaystyle{ \phi }[/math] the elevation angle of the ray from [math]\displaystyle{ \mathbf{c}_0 }[/math] to the point, and [math]\displaystyle{ \rho = \frac{1}{\left\Vert \mathbf{p} - \mathbf{c}_0\right\Vert} }[/math] the inverse depth of [math]\displaystyle{ \mathbf{p} }[/math] at the first observation.[3]
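The Euclidean coordinates of the point can be recovered from [math]\displaystyle{ \mathbf{y} }[/math] by travelling a distance of [math]\displaystyle{ 1/\rho }[/math] from [math]\displaystyle{ \mathbf{c}_0 }[/math] along the unit ray direction determined by the two angles:

[math]\displaystyle{ \mathbf{p} = \mathbf{c}_0 + \frac{1}{\rho}\,\mathbf{v}(\theta, \phi), \qquad \mathbf{v}(\theta, \phi) = (\cos \phi \sin \theta, -\sin \phi, \cos \phi \cos \theta) }[/math]

The following Python sketch converts between the two representations under this angle convention (illustrative code; the function names are assumptions, not from the cited papers):

<syntaxhighlight lang="python">
import numpy as np

def direction(theta, phi):
    """Unit ray direction v = (cos(phi)sin(theta), -sin(phi), cos(phi)cos(theta))
    for azimuth theta and elevation phi."""
    return np.array([np.cos(phi) * np.sin(theta),
                     -np.sin(phi),
                     np.cos(phi) * np.cos(theta)])

def to_inverse_depth(p, c0):
    """6D parametrization of point p first observed from optical centre c0."""
    r = p - c0
    theta = np.arctan2(r[0], r[2])                 # azimuth in the x-z plane
    phi = np.arctan2(-r[1], np.hypot(r[0], r[2]))  # elevation above the x-z plane
    rho = 1.0 / np.linalg.norm(r)                  # inverse depth
    return np.array([c0[0], c0[1], c0[2], theta, phi, rho])

def to_euclidean(y):
    """Recover p = c0 + (1/rho) * v(theta, phi)."""
    c0, theta, phi, rho = y[:3], y[3], y[4], y[5]
    return c0 + direction(theta, phi) / rho

# Round trip for a point first observed from the origin.
p = np.array([1.0, -2.0, 4.0])
y = to_inverse_depth(p, c0=np.zeros(3))
assert np.allclose(to_euclidean(y), p)
</syntaxhighlight>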

References

  1. Piniés et al. (2007)
  2. Sunderhauf et al. (2007)
  3. Civera et al. (2008)

Bibliography

  • Montiel, JM Martínez; Civera, Javier; Davison, Andrew J (2006). "Unified inverse depth parametrization for monocular SLAM". Robotics: Science and Systems. 
  • Civera, Javier; Davison, Andrew J; Montiel, JM Martínez (2008). "Inverse depth parametrization for monocular SLAM". IEEE Transactions on Robotics (IEEE) 24 (5): 932–945. doi:10.1109/TRO.2008.2003276. 
  • Piniés, Pedro; Lupton, Todd; Sukkarieh, Salah; Tardós, Juan D (2007). "Inertial Aiding of Inverse Depth SLAM using a Monocular Camera". Proceedings 2007 IEEE International Conference on Robotics and Automation. IEEE. pp. 2797–2802. doi:10.1109/ROBOT.2007.363895. ISBN 978-1-4244-0602-9. 
  • Sunderhauf, Niko; Lange, Sven; Protzel, Peter (2007). "Using the unscented Kalman filter in mono-SLAM with inverse depth parametrization for autonomous airship control". 2007 IEEE International Workshop on Safety, Security and Rescue Robotics (IEEE): 1–6.