This project proposes a harvesting system based on Internet of Things technology and smart image recognition. Farming decisions require extensive experience; with the proposed system, crop maturity is determined through object detection with trained neural network models, and mature crops are then harvested by robotic arms. Keras was used to construct a multilayer perceptron model that predicts the movements and positions of a multiaxial robotic arm. After object detection was run on an image, the pixel coordinates of the central point of the target crop were used as the neural network input, and the robotic arm movements and positions served as the output. A MobileNet version 2 (MobileNetV2) convolutional neural network was used as the image feature extraction model and was combined with a single-shot multibox detector (SSD) as the posterior layer to form the object detection model. Images were collected and tagged to train this model for crop detection. In the experiments, the trained model achieved a mean average precision (mAP) of 84%, which was higher than that of the other models compared; a mAP of 89% was observed from the arm picking results. This project is implemented with MATLAB software.
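The mapping from a detected crop to an arm command can be illustrated with a minimal sketch. It is written in plain Python as a stand-in for the Keras multilayer perceptron and is not the project's actual model. The layer sizes (2 inputs, 8 hidden units, 4 outputs), the normalization by image size, the random placeholder weights, and all function names are assumptions made only for illustration.

```python
import random

def box_center(xmin, ymin, xmax, ymax):
    """Return the pixel coordinates of a bounding box's central point."""
    return ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)

def normalize(center, width, height):
    """Scale pixel coordinates to [0, 1] (an assumed preprocessing step)."""
    return [center[0] / width, center[1] / height]

def relu(value):
    return max(0.0, value)

def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: out_j = act(b_j + sum_i in_i * w[i][j])."""
    outputs = []
    for j, bias in enumerate(biases):
        total = bias + sum(x * weights[i][j] for i, x in enumerate(inputs))
        outputs.append(activation(total) if activation else total)
    return outputs

def init_layer(n_in, n_out, rng):
    """Random placeholder weights; a trained model would learn these from data."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out

def predict_arm_command(center_px, image_size, layers):
    """Forward pass: hidden layers use ReLU, the output layer is linear."""
    activations = normalize(center_px, *image_size)
    for index, (weights, biases) in enumerate(layers):
        is_last = index == len(layers) - 1
        activations = dense(activations, weights, biases, None if is_last else relu)
    return activations

rng = random.Random(0)
# Assumed shape: 2 inputs (x, y) -> 8 hidden units -> 4 outputs (one per arm axis).
layers = [init_layer(2, 8, rng), init_layer(8, 4, rng)]

center = box_center(100, 50, 300, 250)  # hypothetical detection box in pixels
command = predict_arm_command(center, (640, 480), layers)
print(center, len(command))
```

With untrained weights the output values carry no meaning; the sketch only shows the data flow from a detection box to a fixed-length arm command vector.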