Automatic classification of breast cancer histopathological images is an important task in computer-assisted pathology analysis. However, extracting informative and non-redundant features for histopathological image classification is challenging because of the appearance variability caused by the heterogeneity of the disease and by the tissue preparation and staining processes. In this project, we propose a new feature extractor, called the deep manifold preserving autoencoder, to learn discriminative features from unlabeled data, and we integrate it with a softmax classifier to classify breast cancer histopathology images. Specifically, the autoencoder learns hierarchical features from unlabeled image patches by minimizing the distance between its input and output while simultaneously preserving the geometric structure of the whole input data set. After the unsupervised training, we connect the encoder layers of the trained deep manifold preserving autoencoder to a softmax classifier to construct a cascade model, and we fine-tune this deep neural network with labeled training data. The proposed method thus learns discriminative features from a large amount of unlabeled data by combining two views: preserving the structure of the input data set (the manifold learning view) and minimizing reconstruction error (the deep learning view). Extensive experiments on the public breast cancer dataset BreaKHis demonstrate the effectiveness of the proposed method. The project is implemented in MATLAB.
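
The two-part training objective described above can be illustrated with a minimal sketch. It is written in plain Python for readability rather than in the project's MATLAB, and it is not the project's implementation. The abstract does not specify the manifold term, so the k-nearest-neighbour graph, the heat-kernel weights, the graph-smoothness penalty, and the trade-off parameter `lam` are all assumptions chosen as one common way to preserve local geometric structure. All function names are hypothetical.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_heat_weights(inputs, k=2, sigma=1.0):
    """Symmetric k-nearest-neighbour graph with heat-kernel weights.

    Assumption: the abstract does not say how the geometric structure is
    encoded; a kNN graph is one common manifold-learning choice.
    """
    n = len(inputs)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # The k samples closest to sample i, excluding i itself.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: squared_distance(inputs[i], inputs[j]),
        )[:k]
        for j in neighbours:
            d2 = squared_distance(inputs[i], inputs[j])
            w = math.exp(-d2 / (2.0 * sigma ** 2))
            weights[i][j] = weights[j][i] = w
    return weights


def reconstruction_loss(inputs, outputs):
    """Mean squared distance between each input and its reconstruction."""
    return sum(squared_distance(x, y) for x, y in zip(inputs, outputs)) / len(inputs)


def manifold_loss(codes, weights):
    """Penalty that grows when neighbouring inputs get distant codes."""
    n = len(codes)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += weights[i][j] * squared_distance(codes[i], codes[j])
    return total / n


def combined_loss(inputs, codes, outputs, weights, lam=0.1):
    """Reconstruction error plus lam times the manifold penalty.

    lam is a hypothetical trade-off parameter, not a value from the project.
    """
    return reconstruction_loss(inputs, outputs) + lam * manifold_loss(codes, weights)
```

In this sketch, `inputs` stands for the unlabeled image patches flattened to vectors, `codes` for the encoder's hidden representations, and `outputs` for the decoder's reconstructions. Minimizing `combined_loss` over the network weights would correspond to the unsupervised stage, after which the encoder would be connected to the softmax classifier and fine-tuned on labeled data.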
