Scale-space segmentation

A one-dimensional example of scale-space segmentation. A signal (black), multi-scale-smoothed versions of it (red), and segment averages (blue) based on scale-space segmentation.
The dendrogram corresponding to the segmentations in the figure above. Each "×" identifies the position of an extremum of the first derivative of one of 15 smoothed versions of the signal (red for maxima, blue for minima). Each "+" identifies the position that the extremum tracks back to at the finest scale. The signal features that persist to the highest scale (smoothest version) are evident as the tall structures that correspond to the major segment boundaries in the figure above.

Scale-space segmentation or multi-scale segmentation is a general framework for signal and image segmentation, based on the computation of image descriptors at multiple scales of smoothing.

One-dimensional hierarchical signal segmentation

Witkin's seminal work in scale space[1] included the notion that a one-dimensional signal could be unambiguously segmented into regions, with one scale parameter controlling the scale of segmentation.

A key observation is that the zero-crossings of the second derivatives (which are minima and maxima of the first derivative or slope) of multi-scale-smoothed versions of a signal form a nesting tree, which defines hierarchical relations between segments at different scales. Specifically, slope extrema at coarse scales can be traced back to corresponding features at fine scales. When a slope maximum and slope minimum annihilate each other at a larger scale, the three segments that they separated merge into one segment, thus defining the hierarchy of segments.
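The observation above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Witkin's implementation: the test signal, the function names (`gaussian_kernel`, `smooth`, `slope_extrema`), the 4-sigma kernel truncation, and the replicated borders are all choices made for the example. It smooths a signal at several scales and reports, for each scale, the samples at which the discrete slope has a local maximum or minimum, which is where the discrete second derivative changes sign.

```python
import math


def gaussian_kernel(sigma):
    """Sampled, normalized Gaussian, truncated at 4 sigma (an arbitrary cutoff)."""
    radius = int(math.ceil(4 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Convolve with a Gaussian, replicating the end samples at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    return [
        sum(w * signal[min(max(i + k - radius, 0), last)] for k, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def slope_extrema(smoothed):
    """Return the first sample of each new segment.

    A boundary is placed wherever the first difference (slope) has a strict
    local maximum or minimum, i.e. where the second difference changes sign.
    """
    slope = [b - a for a, b in zip(smoothed, smoothed[1:])]
    boundaries = []
    for i in range(1, len(slope) - 1):
        before = slope[i] - slope[i - 1]
        after = slope[i + 1] - slope[i]
        if before * after < 0:
            boundaries.append(i + 1)  # slope[i] lies between samples i and i + 1
    return boundaries


# Hypothetical test signal: a wide plateau followed by a narrow, low bump.
signal = [0.0] * 40 + [1.0] * 40 + [0.0] * 20 + [0.3] * 4 + [0.0] * 16

boundaries_by_scale = {
    sigma: slope_extrema(smooth(signal, sigma)) for sigma in (1, 4, 12)
}
for sigma, boundaries in boundaries_by_scale.items():
    print(sigma, boundaries)
# sigma 1  -> [40, 80, 100, 104]
# sigma 12 -> [40, 80]
```

With this signal, the four boundaries found at sigma = 1 reduce to two at sigma = 12: the slope maximum and slope minimum that bound the narrow bump are no longer present, so the bump and its two neighbours form a single segment. At sigma = 4 the bump's boundaries still exist but have drifted outward, which is why coarse-scale extrema are traced back to their fine-scale positions rather than used directly.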

Image segmentation and primal sketch

Numerous research works exist in this area, a few of which have reached a state where they can be applied either with interactive manual intervention (usually in medical imaging) or fully automatically. The following is a brief overview of the main research ideas on which current approaches are based.

The nesting structure that Witkin described is, however, specific to one-dimensional signals and does not transfer trivially to higher-dimensional images. Nevertheless, the general idea has inspired several other authors to investigate coarse-to-fine schemes for image segmentation. Koenderink[2] proposed studying how iso-intensity contours evolve over scales, and this approach was investigated in more detail by Lifshitz and Pizer.[3] The intensity of image features changes over scales, however, which makes it hard to trace coarse-scale image features to finer scales using iso-intensity information.

Lindeberg[4] studied the problem of linking local extrema and saddle points over scales, and proposed an image representation called the scale-space primal sketch. It makes explicit the relations between structures at different scales, which image features are stable over large ranges of scale, and the locally appropriate scales for those features. Bergholm[5] proposed detecting edges at coarse scales in scale space and then tracing them back to finer scales, with manual choice of both the coarse detection scale and the fine localization scale.

Gauch and Pizer[6] studied the complementary problem of ridges and valleys at multiple scales and developed a tool for interactive image segmentation based on multi-scale watersheds. Multi-scale watersheds applied to the gradient map have also been investigated by Olsen and Nielsen[7] and carried over to clinical use by Dam et al.[8] Vincken et al.[9] proposed a hyperstack for defining probabilistic relations between image structures at different scales. Ahuja and his co-workers[10][11] developed the use of image structures that are stable over scales into a fully automated system. A fully automatic brain segmentation algorithm based on closely related ideas of multi-scale watersheds has been presented by Undeman and Lindeberg[12] and has been extensively tested on brain databases.

These ideas for multi-scale image segmentation by linking image structures over scales have also been picked up by Florack and Kuijper.[13] Bijaoui and Rué[14] associate structures detected in scale space above a minimum noise threshold into an object tree that spans multiple scales and corresponds to a kind of feature in the original signal. Extracted features are accurately reconstructed using an iterative conjugate gradient matrix method.

Segmentation of vector functions of time

Lyon[15] extended scale-space segmentation in another direction, to vector-valued functions of time. In that setting the vector derivative has no maxima or minima and the second derivative has no zero crossings, so segment boundaries are placed instead at maxima of the Euclidean magnitude of the vector derivative of the smoothed vector signals. This technique has been applied to segmentation of speech and of text.[16]
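The boundary rule for vector-valued signals can be illustrated with a short Python sketch. It is an assumption-laden example rather than Lyon's implementation: the function names (`smooth`, `vector_boundaries`), the two-component test frames, the 4-sigma kernel truncation, and the use of a first difference as the derivative are all hypothetical choices. Each component is smoothed independently, and a boundary is placed at every local maximum of the Euclidean magnitude of the frame-to-frame difference.

```python
import math


def smooth(values, sigma):
    """Gaussian smoothing of one component, with replicated borders."""
    radius = int(math.ceil(4 * sigma))  # 4-sigma truncation is an arbitrary choice
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(weights)
    last = len(values) - 1
    return [
        sum(w * values[min(max(i + k, 0), last)] for k, w in zip(offsets, weights)) / total
        for i in range(len(values))
    ]


def vector_boundaries(frames, sigma):
    """frames: a list of equal-length vectors, one per time step.

    Returns the first frame of each new segment, placed at local maxima of
    the Euclidean magnitude of the smoothed vector's first difference.
    """
    channels = [smooth(list(channel), sigma) for channel in zip(*frames)]
    speed = [
        math.sqrt(sum((ch[t + 1] - ch[t]) ** 2 for ch in channels))
        for t in range(len(frames) - 1)
    ]
    return [
        t + 1
        for t in range(1, len(speed) - 1)
        if speed[t] > speed[t - 1] and speed[t] >= speed[t + 1]
    ]


# Hypothetical data: three steady states of a two-component signal.
frames = [(1.0, 0.0)] * 30 + [(0.0, 1.0)] * 30 + [(1.0, 1.0)] * 30

boundaries = vector_boundaries(frames, sigma=2)
print(boundaries)  # [30, 60]
```

Using the magnitude means that a change in any component, or in several at once, contributes to a single boundary, whereas per-component zero crossings would generally not coincide in time.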

References

  1. Witkin, A. (1984). "Scale-space filtering: A new approach to multi-scale description". ICASSP '84. IEEE International Conference on Acoustics, Speech, and Signal Processing. 9. pp. 150–153. doi:10.1109/ICASSP.1984.1172729. https://pdfs.semanticscholar.org/f58b/22395f9585c3da65bbc948c67eed3377f701.pdf. Retrieved 2019-08-01. 
  2. Koenderink, Jan (1984). "The structure of images". Biological Cybernetics 50: 363–370.
  3. Lifshitz, L.M.; Pizer, S.M. (1990). "A multiresolution hierarchical approach to image segmentation based on intensity extrema". IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (6): 529–540. doi:10.1109/34.56189. http://portal.acm.org/citation.cfm?id=80964&dl=GUIDE&coll=GUIDE. 
  4. Lindeberg, Tony (1993). "Detecting salient blob-like image structures and their scales with a scale-space primal sketch: A method for focus-of-attention". International Journal of Computer Vision 11 (3): 283–318. doi:10.1007/BF01469346. http://kth.diva-portal.org/smash/record.jsf?pid=diva2%3A472969&dswid=3437. 
  5. Bergholm, F. (1987). "Edge focusing". IEEE Transactions on Pattern Analysis and Machine Intelligence 9 (6): 726–741. doi:10.1109/tpami.1987.4767980. PMID 21869435. https://pubmed.ncbi.nlm.nih.gov/21869435/. 
  6. Gauch, J.M.; Pizer, S.M. (1993). "Multiresolution analysis of ridges and valleys in grey-scale images". IEEE Transactions on Pattern Analysis and Machine Intelligence 15 (6): 635–646. doi:10.1109/34.216734. http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=628490. 
  7. Olsen, Ole Fogh; Nielsen, Mads (1997). "Multi-scale gradient magnitude watershed segmentation". Image Analysis and Processing. Lecture Notes in Computer Science. 1310. pp. 6–13. doi:10.1007/3-540-63507-6_178. ISBN 978-3-540-63507-9. https://link.springer.com/content/pdf/10.1007/3-540-63507-6_178.pdf. 
  8. Dam, E.; Johansen, P.; Olsen, O.; Thomsen, A.; Darvann, T.; Dobrzenieck, A.; Hermann, N.; Kitai, N.; Kreiborg, S.; Larsen, P.; Nielsen, M. (2000). "Interactive multi-scale segmentation in clinical use". European Congress of Radiology 2000.
  9. Vincken, K.L.; Koster, A.S.E.; Viergever, M.A. (1997). "Probabilistic multiscale image segmentation". IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (2): 109–120. doi:10.1109/34.574787. 
  10. Tabb, M.; Ahuja, N. (1997). "Multiscale image segmentation by integrated edge and region detection". IEEE Transactions on Image Processing 6 (5): 642–655. doi:10.1109/83.568922. PMID 18282958. Bibcode: 1997ITIP....6..642T. https://pubmed.ncbi.nlm.nih.gov/18282958/.
  11. Akbas, Emre; Ahuja, Narendra (2010). "From Ramp Discontinuities to Segmentation Tree". Computer Vision – ACCV 2009. Lecture Notes in Computer Science. 5994. pp. 123–134. doi:10.1007/978-3-642-12307-8_12. ISBN 978-3-642-12306-1. https://doi.org/10.1007%2F978-3-642-12307-8_12. 
  12. Undeman, Carl; Lindeberg, Tony (2003). "Fully Automatic Segmentation of MRI Brain Images Using Probabilistic Anisotropic Diffusion and Multi-scale Watersheds". Scale Space Methods in Computer Vision. Lecture Notes in Computer Science. 2695. pp. 641–656. doi:10.1007/3-540-44935-3_45. ISBN 978-3-540-40368-5. http://kth.diva-portal.org/smash/record.jsf?pid=diva2%3A451266&dswid=-6010. 
  13. Florack, L. M. J.; Kuijper, A. (2000). "The topological structure of scale-space images". Journal of Mathematical Imaging and Vision 12 (1): 65–79. doi:10.1023/A:1008304909717. https://dspace.library.uu.nl/bitstream/handle/1874/18929/florack_98_the_topological.pdf?sequence=1. 
  14. Bijaoui, Albert; Rué, Frédéric (1995). "A multiscale vision model adapted to the astronomical images". Signal Processing 46 (3): 345–362. doi:10.1016/0165-1684(95)00093-4. https://dx.doi.org/10.1016/0165-1684(95)00093-4. 
  15. Lyon, Richard F. (1987). "Speech recognition in scale space". Proc. of 1987 ICASSP. San Diego, March. pp. 29.3.14.
  16. Slaney, M.; Ponceleon, D. (2001). "Hierarchical segmentation using latent semantic indexing in scalespace". Proc. Intl. Conf. on Acoustics, Speech, and Signal Processing (ICASSP '01). http://cobweb.ecn.purdue.edu/~malcolm/ibm/pubs/ICASSP2001-Segmentation.pdf.

See also