When using deduplication, which of the following ratios would be expected for general files?

vSAN can perform block-level deduplication and compression to save storage space. When you enable deduplication and compression on a vSAN all-flash cluster, redundant data within each disk group is reduced.

Deduplication removes redundant data blocks, whereas compression removes additional redundant data within each data block. These techniques work together to reduce the amount of space required to store the data. vSAN applies deduplication and then compression as it moves data from the cache tier to the capacity tier.
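The order of operations described above (deduplicate whole blocks first, then compress each surviving block) can be illustrated with a toy sketch. This is not vSAN's implementation: the 4 KB block size, the SHA-256 fingerprint, the zlib compressor, and the function names `dedup_then_compress`, `restore`, and `stored_bytes` are all assumptions chosen only for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def dedup_then_compress(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks, keep one copy of each unique
    block, and compress that copy.

    Returns (store, recipe): store maps a block fingerprint to the
    compressed block; recipe is the ordered list of fingerprints needed
    to rebuild the original data.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            # Deduplication: a block seen before is not stored again.
            # Compression: shrink redundancy inside the unique block.
            store[digest] = zlib.compress(block)
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the unique blocks and the recipe."""
    return b"".join(zlib.decompress(store[d]) for d in recipe)


def stored_bytes(store) -> int:
    """Total size of the compressed unique blocks."""
    return sum(len(c) for c in store.values())
```

For example, three identical blocks plus one different block are stored as two compressed blocks, and `restore` returns the original bytes unchanged.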

You can enable deduplication and compression as a cluster-wide setting, but they are applied on a disk group basis. When you enable deduplication and compression on a vSAN cluster, redundant data within a particular disk group is reduced to a single copy.

You can enable deduplication and compression when you create a new vSAN all-flash cluster or when you edit an existing vSAN all-flash cluster. For more information about creating and editing vSAN clusters, see "Enabling vSAN" in vSAN Planning and Deployment.

When you enable or disable deduplication and compression, vSAN performs a rolling reformat of every disk group on every host. Depending on the data stored on the vSAN datastore, this process might take a long time. Do not perform these operations frequently. If you plan to disable deduplication and compression, you must first verify that enough physical capacity is available to place your data.

Note: Deduplication and compression might not be effective for encrypted VMs, because VM Encryption encrypts data on the host before it is written out to storage. Consider storage tradeoffs when using VM Encryption.

How to Manage Disks in a Cluster with Deduplication and Compression

Consider the following guidelines when managing disks in a cluster with deduplication and compression enabled.

  • Avoid adding disks to a disk group incrementally. For more efficient deduplication and compression, consider adding a disk group to increase the cluster storage capacity.
  • When you add a disk group manually, add all the capacity disks at the same time.
  • You cannot remove a single disk from a disk group. You must remove the entire disk group to make modifications.
  • A single disk failure causes the entire disk group to fail.

Verifying Space Savings from Deduplication and Compression

The amount of storage reduction from deduplication and compression depends on many factors, including the type of data stored and the number of duplicate blocks. Larger disk groups tend to provide a higher deduplication ratio. You can check the results of deduplication and compression by viewing the Usage breakdown before dedup and compression in the vSAN Capacity monitor.

You can view the Usage breakdown before dedup and compression when you monitor vSAN capacity in the vSphere Client. It displays information about the results of deduplication and compression. The Used Before space indicates the logical space required before applying deduplication and compression, while the Used After space indicates the physical space used after applying deduplication and compression. The Used After space also displays an overview of the amount of space saved, and the Deduplication and Compression ratio.

The Deduplication and Compression ratio is based on the logical (Used Before) space required to store data before applying deduplication and compression, in relation to the physical (Used After) space required after applying deduplication and compression. Specifically, the ratio is the Used Before space divided by the Used After space. For example, if the Used Before space is 3 GB, but the physical Used After space is 1 GB, the deduplication and compression ratio is 3x.
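The ratio arithmetic above can be written as a short sketch. The function names `dedup_compression_ratio` and `space_saved` are hypothetical helpers for illustration and are not part of any vSAN API; the inputs are simply the Used Before and Used After values read from the Capacity monitor, in the same unit.

```python
def dedup_compression_ratio(used_before: float, used_after: float) -> float:
    """Return Used Before divided by Used After (for example, 3.0 means 3x)."""
    if used_after <= 0:
        raise ValueError("used_after must be greater than zero")
    return used_before / used_after


def space_saved(used_before: float, used_after: float) -> float:
    """Return the amount of space saved, in the same unit as the inputs."""
    return used_before - used_after


# The example from the text: 3 GB logical, 1 GB physical.
ratio = dedup_compression_ratio(3.0, 1.0)
saved = space_saved(3.0, 1.0)
print(f"{ratio:g}x ratio, {saved:g} GB saved")
```

With the values from the text, this gives a ratio of 3x and 2 GB saved.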

When deduplication and compression are enabled on the vSAN cluster, it might take several minutes for capacity updates to be reflected in the Capacity monitor as disk space is reclaimed and reallocated.

2015-09-14T22:30:00Z

Can anyone share their real-life deduplication ratios using Data Domain?

I know there are a number of variables but EMC is quoting us 6:1, we are seeing 7:1 in a POC but a colleague tells me he is getting 2:1. This is in Data Domain.

Thanks in advance!

Eric

14 Answers

Answered Oct 10, 2015

Sr. Information Officer at Merino Industries Ltd

Jun 01, 2022

Hi community, I work as a Sr. Information Officer at a manufacturing company. Currently, I'm looking to compare Dell PowerProtect DD (Data Domain) vs NetApp FAS Series. Also, if possible, I'd like to know how both compare to Pure Storage. Can you please share your input? Thanks.

Responsabile Data Management DC Area Nord Ovest at a tech services company with 501-1,000 employees

31 May 22

I think they are different types of storage for different purposes. If you are looking for storage to hold backup data, Data Domain is the perfect choice because that is its main use (most or all backup software has plugins to manage Data Domain systems). If you are looking for primary storage (where you put your servers' data), then you can look at NetApp FAS and Pure Storage. The latter is flash-native, so it's simpler to manage and configure. With NetApp FAS, you can also choose HDD-based systems with lower performance (and a cheaper price).

President & CDS at Dragon Slayer Consulting

31 May 22

@Dhruba Roy, your question conflates very different kinds of storage.

PowerProtect DD is Dell's latest version of Data Domain. It is ONLY useful as target storage for backups. Nothing else, not even archiving. If that is what you want, it does what it's supposed to do, albeit it's a bit pricey and underperforming.

There is much faster, cheaper, and more advanced backup target storage available, especially when measuring restore performance. I would suggest you take a hard look at a variety of backup target storage vendors, including Infinidat InfiniGuard, ExaGrid, Quantum, StorONE, iXsystems, and many more. Most backup target storage is all HDD, although some is hybrid SSD and HDD.

NetApp FAS is a general-purpose storage system for blocks and files. It can be all HDD, hybrid HDD and SSD, or all SSD (all-flash FAS or AFF). It's a solid all-around storage system with NetApp-pioneered capabilities, but expensive as a backup storage target.

Pure Storage FlashArray//X or //C are all-flash block storage arrays. Their FlashBlades are all-flash file and object storage systems. Good performers, but overkill and way too expensive for backup target storage.

I think you need to define what it is you really need. Of the 3 vendors you asked about, I am going to repeat myself: PowerProtect DD is ONLY useful as target storage for backups. The other two can do so, but are neither priced nor designed specifically for backup target storage. If general-purpose storage is what you need, NetApp and Pure Storage are good possibilities among many others.

© 2022 PeerSpot, All Rights Reserved.