High-level semantic knowledge, in addition to low-level visual cues, is crucial for co-saliency detection. This project proposes a novel end-to-end deep learning approach for robust co-saliency detection that simultaneously learns a high-level group-wise semantic representation and the deep visual features of a given image group. The inter-image interaction at the semantic level and the complementarity between the group semantics and the visual features are exploited to improve the inference of co-salient regions. Specifically, the proposed approach consists of a co-category learning branch and a co-saliency detection branch. The former learns a group-wise semantic vector, using the co-category association of an image group as supervision; the latter infers precise co-saliency maps from the combination of group-semantic knowledge and deep visual cues. The group-semantic vector is used to augment the visual features at multiple scales and acts as top-down semantic guidance that boosts the bottom-up inference of co-saliency. Moreover, we develop a pyramidal attention (PA) module that enables the network to concentrate on important image patches and suppress distractions. The co-category learning and co-saliency detection branches are jointly optimized in a multi-task learning manner, which further improves the robustness of the approach. We construct a new large-scale co-saliency dataset, COCO-SEG, to facilitate research on co-saliency detection. Extensive experimental results on COCO-SEG and on the widely used benchmark Cosal2015 demonstrate the superiority of the proposed approach over state-of-the-art methods. This project is implemented in MATLAB.
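
As a rough illustration of two ideas described above (augmenting visual features with the group-semantic vector at several scales, and optimizing the two branches jointly), here is a minimal sketch. It is written in plain Python purely for readability and is not the project's MATLAB code. The function names, the tile-and-concatenate fusion, the use of binary cross-entropy for both branches, and the `category_weight` parameter are all assumptions made for this example; the abstract does not specify them.

```python
import math


def augment_with_group_semantics(feature_map, group_vector):
    """Concatenate a tiled group-semantic vector onto a C x H x W feature map.

    Assumed fusion scheme: each of the D entries of group_vector becomes one
    constant H x W channel, so the result has C + D channels.
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    tiled = [[[value] * width for _ in range(height)] for value in group_vector]
    copied = [[row[:] for row in channel] for channel in feature_map]
    return copied + tiled


def binary_cross_entropy(predicted, target, eps=1e-7):
    """Mean binary cross-entropy over two equal-length flat lists."""
    total = 0.0
    for p, t in zip(predicted, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(predicted)


def joint_loss(saliency_pred, saliency_gt, category_pred, category_gt,
               category_weight=1.0):
    """Hypothetical multi-task objective: saliency loss plus weighted category loss."""
    saliency_term = binary_cross_entropy(saliency_pred, saliency_gt)
    category_term = binary_cross_entropy(category_pred, category_gt)
    return saliency_term + category_weight * category_term


# The same group vector is injected into feature maps of different resolutions.
group_vector = [1.0, 0.0, 0.5]
coarse = [[[0.9]]]                     # 1 channel, 1 x 1
fine = [[[0.1, 0.2], [0.3, 0.4]]]      # 1 channel, 2 x 2
augmented = [augment_with_group_semantics(fm, group_vector)
             for fm in (coarse, fine)]
```

In this sketch each augmented map keeps its original spatial size and gains one channel per entry of the group vector, so the same semantic guidance is available at every scale. Setting `category_weight` to zero reduces the objective to the saliency term alone, which is one simple way to compare joint training against single-task training.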
