Deep patch learning for weakly supervised object classification and discovery


Patch-level image representation is very important for object classification and detection, since it is robust to spatial transformation, scale variation, and cluttered background. Many existing methods require fine-grained supervision (e.g., bounding-box annotations) to learn patch features; labeling images at this level takes great effort, which may limit the potential applications of these methods. In this paper, we propose to learn patch features via weak supervision, i.e., image-level labels only. To achieve this goal, we treat images as bags and patches as instances, and integrate weakly supervised multiple instance learning constraints into deep neural networks. Our method also integrates the traditional multiple stages of weakly supervised object classification and discovery into a unified deep convolutional neural network, and optimizes the network in an end-to-end way. The network handles the two tasks, object classification and object discovery, jointly, and shares hierarchical deep features between them. Through this joint learning strategy, weakly supervised object classification and discovery benefit each other. We test the proposed method on the challenging PASCAL VOC datasets. The results show that our method obtains state-of-the-art performance on object classification and very competitive results on object discovery, with faster testing speed than competitors.
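
The bag-and-instance formulation can be illustrated with a minimal sketch. It is not the network described in the paper: the abstract does not state how patch scores are aggregated, so max pooling over patches, the binary cross-entropy loss, the function names, and the toy scores below are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mil_image_scores(patch_scores):
    """Aggregate per-patch class scores of shape (P, C) into one
    image-level (bag) score per class, shape (C,).
    Assumption: max pooling, i.e. a bag is positive for a class
    if at least one of its patches is."""
    return patch_scores.max(axis=0)

def mil_loss(patch_scores, image_labels):
    """Binary cross-entropy between bag-level probabilities and
    image-level labels; no patch or box labels are used."""
    probs = sigmoid(mil_image_scores(patch_scores))
    eps = 1e-12  # avoids log(0)
    bce = image_labels * np.log(probs + eps) \
        + (1.0 - image_labels) * np.log(1.0 - probs + eps)
    return float(-np.mean(bce))

def discover(patch_scores, class_idx):
    """Object discovery under the same assumption: return the index
    of the highest-scoring patch for the given class."""
    return int(np.argmax(patch_scores[:, class_idx]))

# Toy bag: 3 patches, 2 classes (hypothetical scores).
patch_scores = np.array([[ 2.0, -3.0],
                         [-1.0, -2.0],
                         [ 0.5, -4.0]])
image_labels = np.array([1.0, 0.0])  # class 0 present, class 1 absent

loss = mil_loss(patch_scores, image_labels)
best_patch = discover(patch_scores, class_idx=0)
```

In this sketch the same patch scores serve both tasks: their aggregate drives the image-level classification loss, and their per-class maximum points to the patch proposed as the object location.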

Pattern Recognition