As industrial big data technology continues to develop, better computing methods are needed to extract value from the data. Information forecasting, an important part of data mining, has been applied successfully in several industries. However, deviation and redundancy in the data collected by sensors make it difficult for some methods to predict future information accurately. This project proposes a semi-supervised prediction model that uses an improved unsupervised clustering algorithm to establish a fuzzy partition function and then uses a neural network model to build the information prediction function. The main goal is an effective solution to the temporal analysis of massive industrial data. We built a data platform on Spark and used several marine environmental factor datasets and UCI public datasets as analysis objects. We compared the results of the proposed method with those of traditional methods and evaluated its running performance on the Spark platform. The results show that the proposed method achieves satisfactory prediction performance.
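The two-stage idea described above (a fuzzy partition from clustering, followed by a neural network predictor) can be illustrated with a minimal, single-machine sketch. It is not the project's implementation: plain fuzzy c-means stands in for the improved clustering algorithm, the data are assumed to be a one-dimensional time series, Spark is not used, and every name here (`fuzzy_memberships`, `fit_centers`, `make_features`, `TinyMLP`, the lag of 3, the hidden size of 8) is hypothetical.

```python
import math
import random


def fuzzy_memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of scalar x in each cluster (values sum to 1)."""
    d = [abs(x - c) for c in centers]
    # A point that coincides with one or more centers belongs fully to them.
    if any(di == 0.0 for di in d):
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    inv = [di ** (-2.0 / (m - 1.0)) for di in d]
    total = sum(inv)
    return [v / total for v in inv]


def fit_centers(data, k=3, m=2.0, iters=50):
    """Plain fuzzy c-means on 1-D data: alternate membership and center updates."""
    lo, hi = min(data), max(data)
    centers = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]  # evenly spread start
    for _ in range(iters):
        u = [fuzzy_memberships(x, centers, m) for x in data]
        new_centers = []
        for j in range(k):
            w = [row[j] ** m for row in u]
            new_centers.append(sum(wi * x for wi, x in zip(w, data)) / sum(w))
        centers = new_centers
    return centers


def make_features(series, centers, lag=3):
    """Inputs: the last `lag` values plus the fuzzy memberships of the latest value."""
    X, y = [], []
    for t in range(lag, len(series)):
        window = series[t - lag:t]
        X.append(window + fuzzy_memberships(window[-1], centers))
        y.append(series[t])
    return X, y


class TinyMLP:
    """One tanh hidden layer and a linear output, trained by per-sample gradient descent."""

    def __init__(self, n_in, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hj for w, hj in zip(self.W2, h)) + self.b2

    def train(self, X, y, lr=0.01, epochs=50):
        for _ in range(epochs):
            for x, target in zip(X, y):
                h = self._hidden(x)
                err = sum(w * hj for w, hj in zip(self.W2, h)) + self.b2 - target
                for j, hj in enumerate(h):
                    # Backpropagate through tanh using the pre-update output weight.
                    grad_h = err * self.W2[j] * (1.0 - hj * hj)
                    self.W2[j] -= lr * err * hj
                    for i, xi in enumerate(x):
                        self.W1[j][i] -= lr * grad_h * xi
                    self.b1[j] -= lr * grad_h
                self.b2 -= lr * err


if __name__ == "__main__":
    # Synthetic stand-in for a sensor series; not one of the datasets used in the project.
    series = [0.5 + 0.5 * math.sin(0.3 * t) for t in range(120)]
    centers = fit_centers(series, k=3)
    X, y = make_features(series, centers, lag=3)
    net = TinyMLP(n_in=len(X[0]))
    net.train(X, y)
    print("next-step prediction:", net.predict(X[-1]))
```

In this sketch the clustering stage needs no labels, while the network is fitted to known next-step values, which is one way to read the "semi-supervised" combination. Appending the membership vector to the lagged inputs is a design choice made for illustration only.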