fusion methods

image pyramids

Image pyramids were first described for multiresolution image analysis and as a model for binocular fusion in human vision. A generic image pyramid is a sequence of images in which each image is constructed by lowpass filtering and subsampling its predecessor. Because of the subsampling, the image size is halved in both spatial directions at each level of the decomposition, which leads to a multiresolution signal representation. The difference between the input image and the filtered image at each level must be kept to allow an exact reconstruction from the pyramidal representation. The image pyramid approach thus yields a signal representation with two pyramids: the smoothing pyramid, which contains the averaged pixel values, and the difference pyramid, which contains the pixel differences, i.e. the edges. The difference pyramid can therefore be viewed as a multiresolution edge representation of the input image.
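
The construction and the exact reconstruction can be illustrated with a minimal sketch. It makes several assumptions that are not part of the description above: the lowpass filter is a simple 2x2 box average, the coarse image is brought back to full size by pixel replication, the image is a nested Python list whose side lengths are divisible by 2 at every level, and all function names are hypothetical.

```python
def reduce_level(img):
    """Lowpass filter with a 2x2 box average, then subsample by 2 (assumed filter)."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def expand_level(img):
    """Upsample by 2 in both directions by repeating each pixel (assumed interpolation)."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def build_pyramids(img, levels):
    """Return the smoothing pyramid and the difference pyramid of a 2-D list."""
    smoothing, difference = [img], []
    for _ in range(levels):
        current = smoothing[-1]
        coarse = reduce_level(current)
        predicted = expand_level(coarse)
        # Difference between the current level and its filtered version.
        difference.append(
            [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(current, predicted)]
        )
        smoothing.append(coarse)
    return smoothing, difference


def reconstruct(coarsest, difference):
    """Rebuild the input from the coarsest smoothing level and all differences."""
    img = coarsest
    for diff in reversed(difference):
        predicted = expand_level(img)
        img = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(predicted, diff)]
    return img


# Synthetic 8x8 test image, two decomposition levels.
image = [[float((3 * r + 5 * c) % 17) for c in range(8)] for r in range(8)]
smoothing, difference = build_pyramids(image, levels=2)
restored = reconstruct(smoothing[-1], difference)
print(len(smoothing[1]), len(smoothing[2]), restored == image)
```

Only the coarsest smoothing level and the difference pyramid are needed for the reconstruction; the intermediate smoothing levels are recovered on the way back up.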

The actual fusion process can be described by a generic multiresolution fusion scheme that is applicable to both image pyramids and the wavelet approach.
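
One possible instance of such a scheme is sketched below. The combination rule is an assumption made for illustration and is not taken from the text above: at every level the difference coefficient with the larger absolute value is kept, and the coarsest approximations are averaged. The box filter, the pixel replication and all function names are likewise hypothetical choices, and both input images are assumed to be registered and of equal size.

```python
def reduce_level(img):
    """2x2 box average followed by subsampling by 2 (assumed lowpass filter)."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def expand_level(img):
    """Upsample by 2 in both directions by repeating each pixel."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def decompose(img, levels):
    """Return (coarsest approximation, list of difference images)."""
    details = []
    for _ in range(levels):
        coarse = reduce_level(img)
        predicted = expand_level(coarse)
        details.append(
            [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(img, predicted)]
        )
        img = coarse
    return img, details


def reconstruct(coarsest, details):
    """Invert decompose() by adding the differences back, coarse to fine."""
    img = coarsest
    for diff in reversed(details):
        predicted = expand_level(img)
        img = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(predicted, diff)]
    return img


def fuse(img_a, img_b, levels=2):
    """Fuse two equally sized images in the pyramid domain."""
    base_a, det_a = decompose(img_a, levels)
    base_b, det_b = decompose(img_b, levels)
    # Assumed rule: keep the stronger edge response at each position.
    fused_details = [
        [[x if abs(x) >= abs(y) else y for x, y in zip(ra, rb)]
         for ra, rb in zip(da, db)]
        for da, db in zip(det_a, det_b)
    ]
    # Assumed rule: average the coarsest approximations.
    fused_base = [
        [(x + y) / 2.0 for x, y in zip(ra, rb)] for ra, rb in zip(base_a, base_b)
    ]
    return reconstruct(fused_base, fused_details)


image_a = [[float((3 * r + 5 * c) % 17) for c in range(8)] for r in range(8)]
image_b = [[float((7 * r + 2 * c) % 13) for c in range(8)] for r in range(8)]
fused = fuse(image_a, image_b, levels=2)
print(len(fused), len(fused[0]))
```

The three steps (decompose each input, combine the coefficients, reconstruct) stay the same if the pyramid is exchanged for another multiresolution transform; only decompose() and reconstruct() would change.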

Several modifications of the generic pyramid construction method described above exist. Some authors propose nonlinear pyramids, such as the ratio and contrast pyramids, in which the multiscale edge representation is computed by a pixel-by-pixel division of neighboring resolutions. A further modification is to replace the linear filters with nonlinear morphological filters, resulting in the morphological pyramid. Another type of image pyramid, the gradient pyramid, results if the input image is decomposed into its directional edge representation using directional derivative filters.
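
The pixel-by-pixel division of neighboring resolutions can be sketched for the ratio pyramid as follows. As assumptions for this illustration only, the lowpass filter is a 2x2 box average, the coarser level is expanded by pixel replication, all pixel values are strictly positive so that the division is defined, and the function names are hypothetical.

```python
def reduce_level(img):
    """2x2 box average followed by subsampling by 2 (assumed lowpass filter)."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def expand_level(img):
    """Upsample by 2 in both directions by repeating each pixel."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def build_ratio_pyramid(img, levels):
    """Return (coarsest approximation, list of ratio images)."""
    ratios = []
    for _ in range(levels):
        coarse = reduce_level(img)
        predicted = expand_level(coarse)
        # Division of neighboring resolutions; requires positive pixel values.
        ratios.append(
            [[a / b for a, b in zip(ra, rb)] for ra, rb in zip(img, predicted)]
        )
        img = coarse
    return img, ratios


def reconstruct_from_ratios(coarsest, ratios):
    """Invert build_ratio_pyramid() by multiplying the ratios back in."""
    img = coarsest
    for ratio in reversed(ratios):
        predicted = expand_level(img)
        img = [[p * q for p, q in zip(rp, rq)] for rp, rq in zip(predicted, ratio)]
    return img


# Strictly positive synthetic 8x8 image.
image = [[1.0 + (3 * r + 5 * c) % 17 for c in range(8)] for r in range(8)]
coarsest, ratios = build_ratio_pyramid(image, levels=2)
restored = reconstruct_from_ratios(coarsest, ratios)
error = max(abs(a - b) for ra, rb in zip(image, restored) for a, b in zip(ra, rb))
print(error < 1e-9)
```

In flat regions the ratio is 1, so values that deviate from 1 mark edges, in the same way that nonzero values do in the difference pyramid.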
