Diabetic retinopathy (DR) is a disease that damages retinal blood vessels and can lead to blindness. Color fundus images are usually used to diagnose this irreversible disease. Manual analysis of these images by clinicians is monotonous and error-prone. Hence, various hand-engineered computer vision techniques have been applied to detect DR and its stages automatically. However, these methods are computationally expensive and fail to extract highly nonlinear features, and hence fail to classify the different stages of DR effectively. This project focuses on classifying the different stages of DR with as few learnable parameters as possible, to speed up training and model convergence. VGG16, a spatial pyramid pooling (SPP) layer, and network-in-network (NiN) are stacked to form a highly nonlinear, scale-invariant deep model called the VGG-NiN model. By virtue of the SPP layer, the proposed VGG-NiN model can process a DR image at any scale. Moreover, the stacked NiN adds extra nonlinearity to the model and tends to improve classification. The proposed model performs better than state-of-the-art methods in terms of accuracy and computational resource utilization.
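The scale-invariance claim for the SPP layer can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `spatial_pyramid_pool`, the pyramid levels (1, 2, 4), the use of max pooling, and the single-channel list-of-rows feature map are all assumptions made for illustration. The point it shows is that feature maps of different sizes are pooled into a vector of the same length, which is what lets a fixed-size classifier head follow a variable-size input.

```python
import math


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool one 2D feature map (a list of rows) into a fixed-length list.

    For each pyramid level n, the map is divided into an n x n grid of bins
    and the maximum of each bin is kept. The output length is the sum of
    n * n over all levels, independent of the input height and width.
    The levels (1, 2, 4) are an assumption, not taken from the paper.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceil for the end, so the bins cover
            # the whole map and are never empty (they may overlap).
            row_start = (i * height) // n
            row_end = math.ceil((i + 1) * height / n)
            for j in range(n):
                col_start = (j * width) // n
                col_end = math.ceil((j + 1) * width / n)
                pooled.append(max(
                    feature_map[r][c]
                    for r in range(row_start, row_end)
                    for c in range(col_start, col_end)
                ))
    return pooled


def make_map(rows, cols):
    """Hypothetical helper: a rows x cols map with values 0 .. rows*cols - 1."""
    return [[r * cols + c for c in range(cols)] for r in range(rows)]


# Two inputs of different spatial size give vectors of equal length
# (1 + 4 + 16 = 21 values with the assumed levels).
small_vector = spatial_pyramid_pool(make_map(5, 7))
large_vector = spatial_pyramid_pool(make_map(13, 9))
print(len(small_vector), len(large_vector))
```

In a full model the same pooling would be applied to every channel of the last convolutional feature map and the results concatenated, so the vector length depends only on the number of channels and the chosen pyramid levels.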