ABSTRACT
For cloud storage service vendors, balancing client-perceived IO performance against the vendor's own storage space cost is a long-standing challenge. When deduplication techniques are applied to cloud storage systems, optimizing this tradeoff becomes more pressing: enabling deduplication reduces storage space cost, but at some cost to IO performance because of extra processing overhead and data fragmentation. In this project, we address this challenge by proposing MUSE, a Multi-tiered and SLA-driven deduplication framework for cloud storage systems. First, we propose a novel notion of Dedup-SLA (deduplication-oriented service level agreement). By defining different levels of quantified performance/space-cost combinations, the Dedup-SLA serves as a refined service quality protocol between service vendor and customer. Second, MUSE adopts multi-tiered deduplication, which organizes several combined forms of deduplication into multiple tiers of varying “deduplication strength”. Third, we implement a mechanism called dynamic deduplication regulation (DDR) to adjust deduplication behavior at runtime: MUSE periodically switches between tiers according to the predefined Dedup-SLA and the current system status. We conduct comprehensive experiments to compare MUSE with several other types of deduplication schemes. The results demonstrate that MUSE significantly improves the IO-performance/space-cost balance compared to other schemes, hence delivering higher service quality for deduplication-enabled cloud storage systems.