Utilization of Data Deduplication to Enhance the Performance of Storage Systems in the Cloud
Keywords:
Data deduplication, HPC, POD

Abstract
The I/O bottleneck has become a growing challenge for large-scale cloud data
analysis due to the explosive growth in data volume. Existing studies have shown that moderate to
high data redundancy clearly exists in primary Cloud storage systems. Because of the
relatively high access locality and the small size of I/O requests issued to redundant
data, our experimental results reveal that data redundancy exhibits a much higher level of
intensity on the I/O path than on the disks. Moreover, directly applying deduplication to primary
Cloud storage systems is likely to cause memory contention and data fragmentation on the disks.
We therefore propose POD, a performance-oriented I/O deduplication scheme, rather than a
capacity-oriented I/O deduplication scheme, exemplified by iDedup, to improve the I/O performance
of primary storage systems in the Cloud without sacrificing the capacity savings of the latter.
Specifically, POD takes a two-pronged approach to improving the performance of primary storage
systems and minimizing the performance overhead of deduplication: a request-based selective
deduplication technique, called Select-Dedupe, to alleviate data fragmentation, and an adaptive
memory management scheme, called iCache, to ease the memory contention between the bursty read
traffic and the bursty write traffic. We have implemented a prototype of POD as a module in the
Linux operating system. The evaluations conducted on our lightweight POD prototype show that POD
outperforms iDedup in I/O performance by up to 87.9 percent, with an average of 58.8 percent.
Moreover, our results show that POD achieves comparable or better capacity savings than iDedup.
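The request-based selective deduplication idea behind Select-Dedupe can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 4 KB chunk size, the SHA-1 fingerprint function, and the all-or-nothing selection rule (deduplicate a write request only when every chunk in it is already stored) are simplifying assumptions made for the sketch.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size for the sketch


class SelectiveDedupStore:
    """Toy request-based selective deduplication (illustrative only).

    A write request is deduplicated only when *every* chunk in it is
    already stored; otherwise the whole request is written as new data,
    so a single logical request is never split between old and new
    blocks, which is the kind of fragmentation that hurts later reads.
    """

    def __init__(self):
        self.fingerprints = {}  # chunk digest -> block address
        self.blocks = []        # simulated disk: list of stored chunks
        self.dedup_hits = 0     # requests served entirely by references

    def _chunks(self, data: bytes):
        return [data[i:i + CHUNK_SIZE]
                for i in range(0, len(data), CHUNK_SIZE)]

    def write_request(self, data: bytes):
        """Return the list of block addresses backing this request."""
        chunks = self._chunks(data)
        digests = [hashlib.sha1(c).digest() for c in chunks]
        if all(d in self.fingerprints for d in digests):
            # Fully redundant request: reference existing blocks only.
            self.dedup_hits += 1
            return [self.fingerprints[d] for d in digests]
        # Partially new request: store all chunks contiguously as new data.
        addrs = []
        for chunk, digest in zip(chunks, digests):
            addr = len(self.blocks)
            self.blocks.append(chunk)
            self.fingerprints.setdefault(digest, addr)
            addrs.append(addr)
        return addrs
```

For example, writing the same two-chunk request twice stores its data once and serves the second write purely by reference, while a request that shares only one chunk with stored data is written out whole to keep its blocks contiguous.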