Saliency maps of attention models with different numbers of bases. (a) Source images; (b) human eye-tracking data; (c) saliency maps of Itti’s attention model; (d) saliency maps of the attention model with 100 bases (16 feature maps); (e) with 392 bases (50 feature maps); (f) with 576 bases (64 feature maps); (g) saliency maps of a fully connected network (392 bases, 1 feature map); (h) saliency maps of a randomly connected network (392 bases, 50 feature maps). The attention models in this paper simulate only bottom-up saliency detection, so their results do not always match human eye tracking, which sometimes involves top-down attention. In the last row, our model with 576 filters (64 invariant features) acts like a contour extractor that suppresses textures: it detects most contours of the target despite the heavily cluttered background. All filters are learned from the same training dataset.
Qi et al. BMC Neuroscience 2012 13:145 doi:10.1186/1471-2202-13-145