Birchfield–Tomasi dissimilarity

In computer vision, the Birchfield–Tomasi dissimilarity is a pixelwise image dissimilarity measure that is robust with respect to sampling effects. When comparing two image elements, it fits the intensity of one pixel to the linearly interpolated intensity around the corresponding pixel in the other image.[1] It is used as a dissimilarity measure in stereo matching, where a one-dimensional search for correspondences is performed to recover a dense disparity map from a stereo image pair.[2][3][4]

Description

When performing pixelwise image matching, the measure of dissimilarity between pairs of pixels from different images is affected by differences in image acquisition such as illumination bias and noise. Even assuming no such differences between the two images, the pixel sampling process introduces additional inconsistencies, because each pixel is a sample obtained by integrating the continuous light signal over a finite region of space. Two pixels matching the same feature of the image content may therefore correspond to slightly different regions of the real object, which can reflect light differently and can be subject to partial occlusion, depth discontinuity, or different lens defocus, and thus generate different intensity signals.[1]

The Birchfield–Tomasi measure compensates for the sampling effect by considering the linear interpolation of the samples. Pixel similarity is then determined by finding the best match between the intensity of a pixel sample in one image and the interpolated function in an interval around a location in the other image.[1]

Considering the stereo matching problem for a rectified stereo pair, where the search for correspondences is performed in one dimension, given two columns [math]\displaystyle{ x_l }[/math] and [math]\displaystyle{ x_r }[/math] along the same scanline for the left and right image respectively, it is possible to define two symmetric functions

[math]\displaystyle{ \begin{align} d_l(x_l, x_r) &= \min_{x_r - \frac{1}{2} \le x \le x_r + \frac{1}{2}} \left| I_l(x_l) - \hat{I}_r(x) \right| \\ d_r(x_l, x_r) &= \min_{x_l - \frac{1}{2} \le x \le x_l + \frac{1}{2}} \left| \hat{I}_l(x) - I_r(x_r) \right| \end{align} }[/math]

where [math]\displaystyle{ \hat{I}_l }[/math] and [math]\displaystyle{ \hat{I}_r }[/math] are the linear interpolation functions of the left and right image intensity [math]\displaystyle{ I_l }[/math] and [math]\displaystyle{ I_r }[/math] along the scanline. The Birchfield–Tomasi dissimilarity can then be defined as[1]

[math]\displaystyle{ d(x_l, x_r) = \min \left\{ d_l(x_l, x_r), d_r(x_l, x_r) \right\}. }[/math]
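The definition above can be illustrated with a minimal sketch that evaluates it almost literally, by densely sampling the linear interpolants over the half-pixel intervals. The function name `bt_by_definition`, the `samples` parameter, and the clamping of the interpolant at the ends of the scanline (the behaviour of `numpy.interp`) are assumptions made for this example and are not part of the original formulation. Because of the sampling, the result is only an approximation of the true minimum.

```python
import numpy as np

def bt_by_definition(left, right, xl, xr, samples=201):
    """Approximate d(xl, xr) by sampling the linear interpolants densely.

    left, right: 1-D intensity arrays for the same scanline.
    xl, xr: integer column indices in the left and right scanline.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    cols_l = np.arange(left.size)
    cols_r = np.arange(right.size)

    # Sub-pixel positions covering [x - 1/2, x + 1/2] around each column.
    xs_r = np.linspace(xr - 0.5, xr + 0.5, samples)
    xs_l = np.linspace(xl - 0.5, xl + 0.5, samples)

    # np.interp evaluates the piecewise-linear interpolant; outside the
    # scanline it clamps to the edge value (an assumption of this sketch).
    d_l = np.min(np.abs(left[xl] - np.interp(xs_r, cols_r, right)))
    d_r = np.min(np.abs(np.interp(xs_l, cols_l, left) - right[xr]))
    return min(d_l, d_r)
```

For the made-up scanlines `left = [40, 50, 60]` and `right = [10, 14, 30]` with `xl = xr = 1`, this sketch returns a value of about 28, whereas the plain absolute difference of the two pixel intensities is 36.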

In practice the measure can be computed with only a small and constant overhead with respect to the calculation of the simple intensity difference, because it is not necessary to reconstruct the interpolant function. The interpolant is piecewise linear, with a single breakpoint at the pixel centre, so its extrema over the unit interval centred on a pixel are attained either at the pixel itself or at one of the two endpoints of the interval. Therefore, [math]\displaystyle{ d_l(x_l, x_r) }[/math] can be written as

[math]\displaystyle{ d_l(x_l, x_r) = \max \left\{ 0, I_l(x_l) - I_{max}, I_{min} - I_l(x_l) \right\} }[/math]

where

[math]\displaystyle{ \begin{align} I_{max} &= \max \left\{ I_r(x_r), I^{+}_{r}(x_r), I^{-}_{r}(x_r) \right\} \\ I_{min} &= \min \left\{ I_r(x_r), I^{+}_{r}(x_r), I^{-}_{r}(x_r) \right\} \end{align} }[/math]

denoting by [math]\displaystyle{ I^{+}_{r}(x_r) }[/math] and [math]\displaystyle{ I^{-}_{r}(x_r) }[/math] the values of the interpolated intensities at the right and left endpoints of a one-pixel interval centred on [math]\displaystyle{ x_r }[/math]:

[math]\displaystyle{ \begin{align} I^{+}_{r}(x_r) &= \frac{1}{2} \left( I_r(x_r) + I_r(x_r + 1) \right) \\ I^{-}_{r}(x_r) &= \frac{1}{2} \left( I_r(x_r - 1) + I_r(x_r) \right) . \end{align} }[/math]
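As an illustrative check of these expressions, consider intensity values chosen purely for the example: [math]\displaystyle{ I_r(x_r - 1) = 10 }[/math], [math]\displaystyle{ I_r(x_r) = 14 }[/math], [math]\displaystyle{ I_r(x_r + 1) = 30 }[/math] and [math]\displaystyle{ I_l(x_l) = 50 }[/math]. Then

[math]\displaystyle{ \begin{align} I^{+}_{r}(x_r) &= \frac{1}{2} \left( 14 + 30 \right) = 22, \qquad I^{-}_{r}(x_r) = \frac{1}{2} \left( 10 + 14 \right) = 12 \\ I_{max} &= \max \left\{ 14, 22, 12 \right\} = 22, \qquad I_{min} = \min \left\{ 14, 22, 12 \right\} = 12 \\ d_l(x_l, x_r) &= \max \left\{ 0, 50 - 22, 12 - 50 \right\} = 28 \end{align} }[/math]

which is smaller than the plain absolute difference [math]\displaystyle{ \left| 50 - 14 \right| = 36 }[/math].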

The other function [math]\displaystyle{ d_r(x_l, x_r) }[/math] can be similarly rewritten, completing the expression for [math]\displaystyle{ d }[/math].[1]
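The closed-form computation described above can be sketched in a few lines of Python. The names `half_pixel_extrema` and `bt_dissimilarity` are hypothetical, and the handling of the scanline borders (replicating the edge pixel) is an assumption of this sketch rather than something specified by the formulas.

```python
def half_pixel_extrema(row, x):
    """Return (I_min, I_max) of the linear interpolant of `row`
    over the interval [x - 1/2, x + 1/2]."""
    centre = float(row[x])
    # Assumption: at the scanline borders the edge pixel is replicated.
    prev = float(row[max(x - 1, 0)])
    nxt = float(row[min(x + 1, len(row) - 1)])
    i_minus = 0.5 * (prev + centre)  # interpolated value at x - 1/2
    i_plus = 0.5 * (centre + nxt)    # interpolated value at x + 1/2
    return min(centre, i_minus, i_plus), max(centre, i_minus, i_plus)


def bt_dissimilarity(left, right, xl, xr):
    """Dissimilarity d(xl, xr) between left[xl] and right[xr]."""
    il = float(left[xl])
    ir = float(right[xr])
    r_min, r_max = half_pixel_extrema(right, xr)
    l_min, l_max = half_pixel_extrema(left, xl)
    # d_l: left pixel against the interpolated right scanline.
    d_l = max(0.0, il - r_max, r_min - il)
    # d_r: right pixel against the interpolated left scanline.
    d_r = max(0.0, ir - l_max, l_min - ir)
    return min(d_l, d_r)
```

Each evaluation needs only the two neighbours of each pixel and a handful of comparisons, which reflects the small constant overhead over the simple intensity difference. In a full stereo matcher the per-pixel extrema would typically be precomputed once per scanline, although that optimisation is not shown here.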

References

  1. Birchfield and Tomasi (1998)
  2. Hirschmüller and Scharstein (2007)
  3. Szeliski and Scharstein (2004)
  4. Morales et al. (2013)
  • Birchfield, Stan; Tomasi, Carlo (1998). "A pixel dissimilarity measure that is insensitive to image sampling". IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE) 20 (4): 401–406. 
  • Hirschmüller, Heiko; Scharstein, Daniel (2007). "Evaluation of cost functions for stereo matching". 
  • Morales, Nestor; Camellini, Gabriele; Felisa, Mirko; Grisleri, Paolo; Zani, Paolo (2013). "Performance analysis of stereo reconstruction algorithms". 16th International IEEE Conference on Intelligent Transportation Systems. pp. 1298–1303. 
  • Szeliski, Richard; Scharstein, Daniel (2004). "Sampling the disparity space image". IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (3): 419–425.