PRACTICAL PRIVACY-PRESERVING FREQUENT ITEM SET MINING ON SUPERMARKET TRANSACTIONS

Abstract

Data mining is widely used to discover connections among the items in massive datasets. Association rule mining is one of the most popular data mining methods, and frequent item set mining is a fundamental part of it. Large-scale data are typically uploaded to an honest-but-curious cloud service provider (CSP), so it is imperative to protect the raw data from being obtained by the CSP and by third parties. Furthermore, supermarket transactions are sparse: methods designed for ordinary data are poorly suited to them and incur extra computation cost when applied to such datasets. In this project, we propose an efficient protocol that evaluates whether an item set is frequent under an encrypted mining query on supermarket transactions. To improve mining efficiency, we design a blocking algorithm that separates the encrypted transactions into blocks and computes bilinear pairings only on the ciphertexts of a subset of the blocks rather than on all ciphertexts, which reduces the computation cost of the mining process. Finally, we evaluate our protocol through theoretical analysis and simulation experiments in terms of computation cost, security, correctness, and running time. The results demonstrate that our protocol outputs correct mining results and clearly outperforms the previous solution in efficiency at the same security level.
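
As a rough illustration of the blocking idea, the following minimal Python sketch works on plaintext data only and is not the protocol itself. It assumes a hypothetical skip rule in which each block keeps the union of its items as a summary, and a block is examined only if that summary contains every queried item; the abstract does not specify how blocks are selected. The names `make_blocks`, `support`, and `is_frequent`, the block size, and the sample transactions are all assumptions made for this example. In the encrypted setting, each per-transaction check would correspond to an expensive ciphertext operation, so skipping blocks is where the savings would come from.

```python
from typing import FrozenSet, List, Tuple

Transaction = FrozenSet[str]
# A block is (summary, transactions); the summary is the union of the block's items.
Block = Tuple[FrozenSet[str], List[Transaction]]


def make_blocks(transactions: List[Transaction], block_size: int) -> List[Block]:
    """Split transactions into fixed-size blocks with a per-block item summary."""
    blocks: List[Block] = []
    for start in range(0, len(transactions), block_size):
        chunk = transactions[start:start + block_size]
        summary = frozenset().union(*chunk)
        blocks.append((summary, chunk))
    return blocks


def support(blocks: List[Block], itemset: FrozenSet[str]) -> Tuple[int, int]:
    """Return (support count, number of transactions actually examined)."""
    count = 0
    examined = 0
    for summary, chunk in blocks:
        # Hypothetical skip rule: a block whose summary lacks any queried item
        # cannot contain a matching transaction, so it is not examined.
        if not itemset <= summary:
            continue
        for transaction in chunk:
            examined += 1
            if itemset <= transaction:
                count += 1
    return count, examined


def is_frequent(blocks: List[Block], itemset: FrozenSet[str], min_support: int) -> bool:
    """An item set is frequent if its support count reaches the threshold."""
    count, _ = support(blocks, itemset)
    return count >= min_support


# Made-up sparse supermarket transactions for illustration.
transactions = [
    frozenset({"milk", "bread"}),
    frozenset({"milk", "eggs"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk", "bread", "butter"}),
    frozenset({"tea"}),
    frozenset({"coffee", "sugar"}),
]
blocks = make_blocks(transactions, block_size=2)
query = frozenset({"milk", "bread"})
count, examined = support(blocks, query)
```

With this sample data, the query touches only the blocks whose summaries contain both items, so fewer transactions are examined than the dataset holds. The sparser the data, the more blocks such a rule would tend to skip.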

 
