Phase-stretch Adaptive Gradient-field Extractor
Phase-Stretch Adaptive Gradient-Field Extractor (PAGE) is an edge detection algorithm based on the physics of electromagnetic diffraction and dispersion. It is a computational imaging algorithm that identifies edges, together with their orientation and sharpness, at locations in a digital image where the brightness changes abruptly. Edge detection is a basic operation performed by the eye and is crucial to visual perception. PAGE embeds an original image into a set of feature maps that select semantic information at different scales, orientations, and spatial frequencies, which can be used for object representation and classification. The algorithm performs well as an edge and texture extractor, in particular on images impaired by low light levels or low contrast. As a visualization aid, the edge angle is typically encoded as color in the output image. The code was first released in February 2022. It was then significantly refactored and improved to support GPU acceleration. In May 2022, it became one of the algorithms in PhyCV: the first physics-inspired computer vision library.
Operation principle
Phase-stretch Adaptive Gradient-field Extractor (PAGE) is a physics-inspired feature engineering algorithm[1] that computes a feature set composed of edges at different spatial frequencies (and hence spatial scales) and orientations.[2][3] Metaphorically speaking, PAGE emulates the physics of birefringent (orientation-dependent) diffractive propagation through a physical medium with a specific diffractive structure. The propagation converts a real-valued image into a complex function. Related information is contained in the real and imaginary components of the output. The PAGE output is the phase of this complex function. PAGE builds on the phase stretch transform (PST),[4] another physics-inspired edge detection algorithm. PST evolved from research on a class of real-time measurement and sensing methods known as photonic time stretch, including the time stretch analog-to-digital converter,[5] the time stretch dispersive Fourier transform,[6] and serial time-encoded amplified microscopy.[7]
In a birefringent optical medium, the dielectric constant of the medium, and hence its refractive index, is a function of spatial frequency and of the polarization in the transverse plane. To understand the analogy between PAGE and the electromagnetic propagation equations, consider an optical field with two orthogonal linear polarizations propagating through a medium. The Fourier content of the incoming signal,
[math]\displaystyle{ \tilde{E}_i\left(u,v;z\right)=FFT^2\left\{E_i(x,y;z)\right\} }[/math]
can be decomposed into the two orthogonal polarizations as
[math]\displaystyle{ \tilde{E}_i\left(z\right)=\tilde{E}_x(z) + \tilde{E}_y(z) }[/math]
where [math]\displaystyle{ FFT^2\left\{\right\} }[/math] is the two-dimensional fast Fourier transform over the transverse coordinates and [math]\displaystyle{ (u,v) }[/math] are spatial frequency variables. As the propagation constant [math]\displaystyle{ \beta=2\pi n/\lambda }[/math] is a function of refractive index, the two orthogonal polarizations [math]\displaystyle{ \tilde{E}_x }[/math] and [math]\displaystyle{ \tilde{E}_y }[/math] will have different propagation constants and hence a phase difference at the output, given by the following equation:
[math]\displaystyle{ \Delta \phi = \phi_x -\phi_y = \Delta \beta = \frac{\omega_m}{c} \mid {n_x-n_y} \mid }[/math]
By controlling the values of [math]\displaystyle{ n_x }[/math] and [math]\displaystyle{ n_y }[/math], as well as the dependence of the refractive index on frequency, [math]\displaystyle{ n_x(\omega) }[/math] and [math]\displaystyle{ n_y(\omega) }[/math], coherent detection at the output detects a hyper-dimensional feature set from a 2D image that corresponds to edges at user-defined orientations and spatial frequencies. Note that in the above definition of the phase, the propagation length has been set to 1.
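As a quick numerical illustration of the phase-difference relation above, the following minimal sketch evaluates it for a configurable propagation length (1 by default, as in the text). The function name `phase_difference` and the sample values (a 1550 nm wavelength and two nearby refractive indices) are assumptions chosen only for illustration; they do not come from the PAGE literature.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def phase_difference(omega, n_x, n_y, length=1.0):
    """Phase difference between two polarizations: (omega / c) * |n_x - n_y| * length."""
    return (omega / C) * abs(n_x - n_y) * length

# Hypothetical example values (not taken from the source).
wavelength = 1550e-9                      # metres
omega = 2.0 * math.pi * C / wavelength    # angular frequency, rad/s
dphi = phase_difference(omega, 1.4500, 1.4501)
print(f"phase difference over unit length: {dphi:.2f} rad")
```

A larger index contrast or a higher frequency gives a proportionally larger phase difference, which is the quantity PAGE manipulates through its kernel design.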
Physical and mathematical foundations of Phase-Stretch Adaptive Gradient-Field Extractor
The first step is to apply an optional smoothing kernel to reduce noise. This is typically performed in the frequency domain (after a Fourier transform), but it can also be done in the spatial domain using convolution. The image spectrum is then multiplied by a phase kernel that emulates birefringent, frequency-channelized diffractive propagation. Next, the result is transformed back into the spatial domain, and the spatial phase, which represents the desired feature vectors, is calculated. The final step of PAGE is to apply thresholding and morphological operations to the generated feature vectors to produce the final output. For a color image, these operations are performed separately on each color channel and the results are then combined into a single image, although each channel can also be viewed separately.
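The final thresholding step can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference implementation: the function name `threshold_features`, the single global threshold, and the min-max rescaling are all choices made here for clarity, and the morphological clean-up mentioned above is omitted.

```python
import numpy as np

def threshold_features(phase_map, thresh=0.5):
    """Rescale a phase map to [0, 1] and mark pixels above `thresh` as edges."""
    lo, hi = phase_map.min(), phase_map.max()
    if hi == lo:
        # A flat map carries no edge information.
        return np.zeros(phase_map.shape, dtype=bool)
    normalized = (phase_map - lo) / (hi - lo)
    return normalized > thresh

# Toy phase map (assumed values): the right-hand column has a strong response.
phase_map = np.array([[0.0, 0.1, 0.9],
                      [0.0, 0.2, 1.0],
                      [0.0, 0.1, 0.8]])
edges = threshold_features(phase_map, thresh=0.5)
```

In this toy example only the right-hand column exceeds the threshold, so it is the only column marked as an edge.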
Mathematically, this sequence of operations can be represented by the following equations. The birefringent Stretch operator [math]\displaystyle{ \mathbb{S}\left\{\right\} }[/math] is defined as follows:
[math]\displaystyle{ E_o\left[x,y\right]=\mathbb{S}\left\{E_i\left[x,y\right]\right\} =IFFT^2\bigg\{\tilde{K}\left[u,v,\theta \right]\cdot{}\tilde{L}\left[u,v\right] \cdot{}FFT^2\Big\{E_i\left[x,y\right]\Big\}\bigg\} }[/math]
where [math]\displaystyle{ E_o\left[x,y\right] }[/math] is a complex quantity defined as,
[math]\displaystyle{ E_o\left[x,y\right]=\left\vert{}E_o\left[x,y\right]\right\vert{}e^{j\theta{}\left[x,y\right]} }[/math]
In the above equations, [math]\displaystyle{ E_i\left[x,y\right] }[/math] is the input image, [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math] are the spatial variables, [math]\displaystyle{ FFT^2 }[/math] is the two-dimensional Fast Fourier Transform, [math]\displaystyle{ IFFT^2 }[/math] is the two-dimensional Inverse Fast Fourier Transform, and [math]\displaystyle{ u }[/math] and [math]\displaystyle{ v }[/math] are frequency variables. The function [math]\displaystyle{ \tilde{K} [u,v, \theta] }[/math] is called the PAGE kernel and the function [math]\displaystyle{ \tilde{L}[u,v] }[/math] is a denoising kernel, both implemented in the frequency domain. For the results shown on this Wiki page, [math]\displaystyle{ \tilde{L}\left[u,v\right] }[/math] is a Gaussian filter whose cutoff frequency is determined by the sigma of the Gaussian.
The PAGE operator [math]\displaystyle{ \mathbb{P}\left\{\right\} }[/math] is then defined as the phase of the output of the stretch operation [math]\displaystyle{ \mathbb{S}\left\{\right\} }[/math] applied to the input image [math]\displaystyle{ E_i\left[x,y\right] }[/math]:
[math]\displaystyle{ \mathbb{P}\left\{E_i\left[x,y\right]\right\}=\measuredangle \Big\{ \mathbb{S} \big\{ E_i\left[x,y\right] \big\} \Big\} }[/math]
where [math]\displaystyle{ \measuredangle \langle \cdot \rangle }[/math] is the angle operator.
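The stretch operator and the PAGE operator defined above map directly onto a few NumPy calls. The sketch below is illustrative only: the helper names, the frequency grid in cycles per pixel, the value of `sigma`, and in particular the quadratic stand-in phase kernel are assumptions made here. Any unit-modulus kernel can be passed in place of the stand-in.

```python
import numpy as np

def frequency_grid(shape):
    """Broadcastable frequency axes (u, v) in cycles per pixel, in FFT ordering."""
    u = np.fft.fftfreq(shape[0])[:, None]
    v = np.fft.fftfreq(shape[1])[None, :]
    return u, v

def gaussian_lowpass(shape, sigma):
    """Denoising kernel L[u, v]: a Gaussian in the frequency domain with peak 1 at DC."""
    u, v = frequency_grid(shape)
    return np.exp(-(u**2 + v**2) / (2.0 * sigma**2))

def stretch(image, kernel, lowpass):
    """S{E_i} = IFFT2{ K . L . FFT2{E_i} }; returns a complex-valued array."""
    return np.fft.ifft2(kernel * lowpass * np.fft.fft2(image))

def page_phase(image, kernel, lowpass):
    """P{E_i}: the angle (phase) of the stretch operator's output."""
    return np.angle(stretch(image, kernel, lowpass))

# Toy input and a stand-in phase kernel (hypothetical, NOT the PAGE kernel).
rng = np.random.default_rng(0)
image = rng.random((32, 32))
u, v = frequency_grid(image.shape)
kernel = np.exp(1j * 20.0 * (u**2 + v**2))
lowpass = gaussian_lowpass(image.shape, sigma=0.1)
phase = page_phase(image, kernel, lowpass)
```

With `kernel` and `lowpass` both set to arrays of ones, `stretch` returns the input image unchanged, which is a convenient sanity check.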
PAGE filter banks
PAGE filter banks are defined by the PAGE kernel [math]\displaystyle{ \tilde{K}\left[u,v,\theta \right] }[/math] and are designed to compute semantic information from an image at different orientations and frequencies. The PAGE kernel [math]\displaystyle{ \tilde{K}\left[u,v,\theta \right] }[/math] consists of a phase filter that is a function of the frequency variables [math]\displaystyle{ u }[/math] and [math]\displaystyle{ v }[/math] and of the angle variable [math]\displaystyle{ \theta }[/math], which controls the directionality of the edge. The spectral phase operator is expressed as a product of two phase functions, [math]\displaystyle{ \phi_1 }[/math] and [math]\displaystyle{ \phi_2 }[/math]. The first component, [math]\displaystyle{ \phi_1 }[/math], is a symmetric Gaussian filter that selects the spatial frequency range of the edges that are detected. The default center frequency is 0, which corresponds to a baseband filter; the center frequency and bandwidth can be changed to probe edges of different sharpness. In other words, it enables the filtering of edges occurring over different spatial scales. The second component, [math]\displaystyle{ \phi_2 }[/math], performs the edge detection. Since the output is based on the phase, it needs to be a complex-valued function. The PAGE operation transforms a real-valued input into a complex-valued quantity from which the phase is extracted.
A change of basis leads to the transformed frequency variables [math]\displaystyle{ u^{\prime} }[/math] and [math]\displaystyle{ v^{\prime} }[/math]
[math]\displaystyle{ u^{\prime} = u \cdot cos(\theta) + v \cdot sin(\theta) }[/math]
[math]\displaystyle{ v^{\prime} = -u \cdot sin(\theta) + v \cdot cos(\theta) }[/math]
such that the frequency vector rotates about the origin with [math]\displaystyle{ \theta }[/math]
[math]\displaystyle{ u^{\prime} + j v^{\prime} }[/math]
The PAGE kernel [math]\displaystyle{ \tilde{K}\left[u,v,\theta \right] }[/math] is defined as a function of the frequency variables [math]\displaystyle{ u }[/math] and [math]\displaystyle{ v }[/math] and the angle [math]\displaystyle{ \theta }[/math] as follows:
[math]\displaystyle{ \tilde{K}\left[u,v,\theta \right] = \tilde{K}\left[u^{\prime} , v^{\prime} \right]= \exp \Big\{j \cdot \phi_1({u^{\prime}})\cdot \phi_2({v^{\prime}}) \Big\} }[/math]
where
[math]\displaystyle{ \phi_1({u^{\prime}}) = S_{u^{\prime}} \cdot \frac{1}{\sigma_{u^{\prime}} \sqrt{2\pi}} \cdot \exp\left( -\frac{\left( |u^{\prime}| - \mu_{u^{\prime}} \right)^2}{2\sigma_{u^{\prime}}^2} \right) }[/math]
[math]\displaystyle{ \phi_2({v^{\prime}}) = S_{v^{\prime}} \cdot \frac{1}{|v^{\prime}| \sigma_{v^{\prime}} \sqrt{2\pi}} \cdot \exp\left( -\frac{\left( \ln |v^{\prime}| - \mu_{v^{\prime}} \right)^2}{2\sigma_{v^{\prime}}^2} \right) }[/math]
For all simulation examples here, the phase functions [math]\displaystyle{ \phi_1({u^{\prime}}) }[/math] and [math]\displaystyle{ \phi_2({v^{\prime}}) }[/math] are normalized in the range (0,1) for all values of [math]\displaystyle{ \theta }[/math] and then multiplied by [math]\displaystyle{ S_{u^{\prime}} }[/math] and [math]\displaystyle{ S_{v^{\prime}} }[/math] respectively, such that the strength of each kernel is mutable for different applications and image conditions.
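The kernel construction described above can be sketched in NumPy as follows. This is a hedged illustration, not the reference code: the function names, the default parameter values, the frequency grid in cycles per pixel, and the small `eps` guard against taking the logarithm of zero are assumptions introduced here. The sketch also assumes the standard rotation convention, with the second rotated coordinate computed as -u sin(theta) + v cos(theta).

```python
import numpy as np

def normalize01(a):
    """Scale an array to the range [0, 1]; a constant array maps to zeros."""
    span = a.max() - a.min()
    return (a - a.min()) / span if span > 0 else np.zeros_like(a)

def page_kernel(shape, theta, mu_u=0.0, sigma_u=0.05, mu_v=0.35, sigma_v=0.8,
                S_u=1.0, S_v=1.0, eps=1e-12):
    """Unit-modulus kernel exp(j * phi_1(u') * phi_2(v')) for one orientation theta."""
    u = np.fft.fftfreq(shape[0])[:, None]
    v = np.fft.fftfreq(shape[1])[None, :]

    # Rotate the frequency coordinates by theta (assumed rotation convention).
    u_p = u * np.cos(theta) + v * np.sin(theta)
    v_p = -u * np.sin(theta) + v * np.cos(theta)

    # phi_1: Gaussian in |u'|, selecting the spatial-frequency band.
    phi1 = (np.exp(-(np.abs(u_p) - mu_u) ** 2 / (2.0 * sigma_u**2))
            / (sigma_u * np.sqrt(2.0 * np.pi)))

    # phi_2: log-normal in |v'|; eps avoids log(0) at the zero-frequency bin.
    abs_v = np.maximum(np.abs(v_p), eps)
    phi2 = (np.exp(-(np.log(abs_v) - mu_v) ** 2 / (2.0 * sigma_v**2))
            / (abs_v * sigma_v * np.sqrt(2.0 * np.pi)))

    # Normalize each phase profile, then apply the strengths S_u' and S_v'.
    phi1 = S_u * normalize01(phi1)
    phi2 = S_v * normalize01(phi2)
    return np.exp(1j * phi1 * phi2)

K = page_kernel((16, 16), theta=0.3)
```

Because the kernel is a pure phase filter, its magnitude is 1 everywhere; setting either strength to 0 reduces it to the identity.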
Feature extraction
PAGE has the potential to be used as a preprocessing step for machine learning tasks such as image classification. An important step in any classification task is feature extraction. Feature extraction algorithms of note include histogram of oriented gradients, scale-invariant feature transform, and shape context descriptors. In each case, images are reduced to certain key features that aid in the tasks of object detection and classification. PAGE serves as a physics-inspired feature extractor and descriptor. It is able to return a hyper-dimensional feature mapping in which regions of great change in intensity are highlighted and grouped based on directionality.
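One way to picture the hyper-dimensional feature mapping is as a stack of phase maps, one per orientation. The sketch below is only an illustration of that idea: `directional_kernel` is a simplified, hypothetical stand-in for a PAGE kernel, and the function names, the four orientations, and the step-edge test image are assumptions made here.

```python
import numpy as np

def directional_kernel(shape, theta, strength=1.0, sigma=0.1):
    """Hypothetical stand-in for a PAGE kernel: a unit-modulus phase filter
    whose phase depends on frequency coordinates rotated by theta."""
    u = np.fft.fftfreq(shape[0])[:, None]
    v = np.fft.fftfreq(shape[1])[None, :]
    u_p = u * np.cos(theta) + v * np.sin(theta)
    v_p = -u * np.sin(theta) + v * np.cos(theta)
    phase = strength * np.exp(-u_p**2 / (2.0 * sigma**2)) * np.abs(v_p)
    return np.exp(1j * phase)

def feature_stack(image, thetas, kernel_fn=directional_kernel):
    """Return an array of shape (len(thetas), H, W): one phase map per orientation."""
    spectrum = np.fft.fft2(image)
    maps = [np.angle(np.fft.ifft2(kernel_fn(image.shape, t) * spectrum))
            for t in thetas]
    return np.stack(maps, axis=0)

# Toy image with a single vertical step edge.
image = np.zeros((32, 32))
image[:, 16:] = 1.0
thetas = np.linspace(0.0, np.pi, 4, endpoint=False)
features = feature_stack(image, thetas)  # shape (4, 32, 32)
```

Each slice `features[k]` can be fed to a downstream classifier as a separate channel, or the slices can be combined, for example by color-coding the orientation with the strongest response.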
Applications
Because PAGE is selective over edge width and orientation, the feature mapping it returns forms a rich feature matrix with high representational power, which makes it a candidate preprocessing step for machine learning tasks such as image classification and object detection.
PAGE has a diverse set of applications that span several fields. Diagnosis and classification of retinopathy, for example, are medically important tasks highly dependent upon segmentation of blood vessels of varied width and orientation. This segmentation, and further image analysis, can be accomplished through a directional edge filter such as PAGE. Similarly, digital subtraction angiography creates an image of blood vessels using a contrast medium that can be used in pathology for soft tissue. Such imagery can be processed using the PAGE filter for diagnosis and visualization purposes. Further applications of note requiring directional edge information such as that computed by PAGE include fingerprint, written character, and flora and fauna recognition.
Originally introduced in 2020, PAGE builds on the Phase Stretch Transform (PST).[8] Local Flow PST (LF-PST) is another algorithm introduced in 2020 that is based on PST and also performs orientation and scale dependent edge detection.[9] Local Flow PST has shown exceptional results in retina vessel detection for application to retinopathy.
See also
- Edge detection
- Feature detection (computer vision)
- Time stretch analog-to-digital converter
- Time stretch dispersive Fourier transform
- Phase stretch transform
- PhyCV
References
- ↑ Physics-based Feature Engineering. Jalali et al. Optics, Photonics and Laser Technology, 2019
- ↑ Suthar, Madhuri, and Bahram Jalali. "Phase-stretch adaptive gradient-field extractor (page)." Coding Theory. IntechOpen, 2020. 143.
- ↑ MacPhee, Callen, Madhuri Suthar, and Bahram Jalali. "Phase-Stretch Adaptive Gradient-Field Extractor (PAGE)." arXiv preprint arXiv:2202.03570 (2022).
- ↑ M. H. Asghari, and B. Jalali, "Physics-inspired image edge detection," IEEE Global Signal and Information Processing Symposium (GlobalSIP 2014), paper: WdBD-L.1, Atlanta, December 2014.
- ↑ Bhushan, A. S. et al. “Time-stretched analogue-to-digital conversion.” Electronics Letters 34 (1998): 839-841.
- ↑ Mahjoubfar, A., Churkin, D., Barland, S. et al. Time stretch and its applications. Nature Photon 11, 341–351 (2017). https://doi.org/10.1038/nphoton.2017.76
- ↑ K. Goda, K. K. Tsia, and B. Jalali, "Serial Time Encoded Amplified Microscopy," in Conference on Lasers and Electro-Optics/International Quantum Electronics Conference, OSA Technical Digest (CD) (Optica Publishing Group, 2009), paper CTuAA3.
- ↑ M. H. Asghari, and B. Jalali, "Physics-inspired image edge detection," IEEE Global Signal and Information Processing Symposium (GlobalSIP 2014), paper: WdBD-L.1, Atlanta, December 2014.
- ↑ Challoob M., Gao Y. (2020) A Local Flow Phase Stretch Transform for Robust Retinal Vessel Detection. In: Blanc-Talon J., Delmas P., Philips W., Popescu D., Scheunders P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2020. Lecture Notes in Computer Science, vol 12002. Springer, Cham. https://doi.org/10.1007/978-3-030-40605-9_22
Open-source code on GitHub