DPM 2010 + Firestreamer + Data Domain

The technical support forum for Firestreamer (the virtual tape library).
Locked
ganars
Posts: 2
Joined: 30 Dec 2010, 16:59

Post by ganars »

My company recently switched from Symantec Backup Exec 12.5 to Microsoft DPM 2010. Backup Exec was using a Data Domain deduplication device as a "backup to disk" device in addition to storing longer term backups to tape.

We purchased Firestreamer so that we could continue storing backups on the Data Domain device. But now, we are seeing much lower levels of deduplication. The Firestreamer data files are compressed on average 8 to 1 whereas with Backup Exec it was closer to 20 to 1.

We are trying to determine the cause of this change. Most likely it has something to do with how the Data Domain software interprets the data stream being stored. This is especially true if "tape markers" are being used and their software is not aware of them.

One question I have been asked to answer is this:

Does Firestreamer introduce any type of "tape marker" into the virtual media files it creates or would such markers come from the backup software (Microsoft DPM 2010 in this case)?
jsf
Cristalink Support
Posts: 300
Joined: 29 Aug 2010, 09:03

Post by jsf »

I assume that you are using Firestreamer in file media mode, where each virtual tape corresponds to a .fsrm file. Firestreamer uses its own format to translate the tape data stream into normal files, and that format has a file header and block headers. If the Data Domain device tries to understand what's stored in a .fsrm file, it will see the original backup data intermixed with Firestreamer data.
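
To illustrate the point about headers being intermixed with the backup data, here is a minimal Python sketch of a container of that general shape. It is not the real .fsrm layout: the magic value, the header fields, the 64 KiB block size, and the names `wrap` and `unwrap` are all assumptions made up for the example.

```python
import struct

# Hypothetical layout, NOT the real .fsrm format.
FILE_MAGIC = b"DEMO"                 # made-up signature
FILE_HEADER = struct.Struct("<4sI")  # magic, tape block size
BLOCK_HEADER = struct.Struct("<I")   # length of the block that follows


def wrap(payload, block_size=65536):
    """Store a tape data stream as: file header, then (block header, block) pairs."""
    out = bytearray(FILE_HEADER.pack(FILE_MAGIC, block_size))
    for i in range(0, len(payload), block_size):
        block = payload[i:i + block_size]
        out += BLOCK_HEADER.pack(len(block))
        out += block
    return bytes(out)


def unwrap(container):
    """Recover the original stream by skipping every header."""
    magic, _block_size = FILE_HEADER.unpack_from(container, 0)
    if magic != FILE_MAGIC:
        raise ValueError("not a demo container")
    pos = FILE_HEADER.size
    parts = []
    while pos < len(container):
        (length,) = BLOCK_HEADER.unpack_from(container, pos)
        pos += BLOCK_HEADER.size
        parts.append(container[pos:pos + length])
        pos += length
    return b"".join(parts)


if __name__ == "__main__":
    data = bytes(range(256)) * 1000
    wrapped = wrap(data)
    print(len(data), len(wrapped), unwrap(wrapped) == data)
```

A device that knows the layout can do the equivalent of `unwrap` before deduplicating. A device that doesn't know it sees the payload with a few extra bytes inserted at every block boundary.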

Now it depends on how exactly the device works. If it depends on the knowledge of the file format (let's assume it's aware of the Backup Exec format), it's unlikely to work properly with Firestreamer because it doesn't know the Firestreamer format. We will contact Data Domain to see if they are interested in making their products compatible with Firestreamer.

If the device doesn't care about the format, then it depends on whether it looks for duplicate data blocks based on file system clusters or interprets files as uninterrupted data streams. If it deduplicates using file system clusters, it's unlikely to succeed because tape blocks are likely to be misaligned with cluster boundaries. If it tries to match blocks by interpreting file data as a stream, it has a better chance of succeeding, but it's unlikely the device uses this method because it requires more processing power than cluster-based matching.
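
The difference between the two approaches can be sketched in Python. This is only an illustration under stated assumptions, not how Data Domain or Firestreamer actually work: the header sizes, the 4 KiB cluster, the 64 KiB tape block, and the simple gear-style rolling hash used for stream matching are all made up for the example.

```python
import hashlib
import random

CLUSTER = 4096            # assumed file system cluster size
TAPE_BLOCK = 64 * 1024    # assumed tape block size
FILE_HEADER = b"F" * 32   # hypothetical container file header
BLOCK_HEADER = b"B" * 16  # hypothetical per-block header


def wrap(payload):
    """Interleave the hypothetical headers with the backup data."""
    out = bytearray(FILE_HEADER)
    for i in range(0, len(payload), TAPE_BLOCK):
        out += BLOCK_HEADER
        out += payload[i:i + TAPE_BLOCK]
    return bytes(out)


def fixed_chunks(data, size=CLUSTER):
    """Cluster-style matching: cut at fixed offsets."""
    return [data[i:i + size] for i in range(0, len(data), size)]


_table_rng = random.Random(1)
GEAR = [_table_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def content_chunks(data, bits=10):
    """Stream-style matching: cut where a rolling hash of the most recent
    bytes has its top `bits` bits equal to zero (about 1 KiB on average)."""
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & MASK64  # older bytes shift out after 64 steps
        if (h >> (64 - bits)) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def matched_fraction(chunker, raw, wrapped):
    """Fraction of the wrapped file's bytes in chunks also found in the raw data."""
    known = {hashlib.sha256(c).digest() for c in chunker(raw)}
    hit = sum(len(c) for c in chunker(wrapped)
              if hashlib.sha256(c).digest() in known)
    return hit / len(wrapped)


def compare(size=256 * 1024, seed=0):
    rng = random.Random(seed)
    raw = bytes(rng.getrandbits(8) for _ in range(size))
    wrapped = wrap(raw)
    return (matched_fraction(fixed_chunks, raw, wrapped),
            matched_fraction(content_chunks, raw, wrapped))


if __name__ == "__main__":
    fixed, stream = compare()
    print(f"fixed clusters: {fixed:.2f}  content-defined: {stream:.2f}")
```

With fixed cuts, the inserted headers shift every cluster, so almost nothing lines up with the unwrapped data. With content-defined cuts, the boundaries resynchronise shortly after each header, so most chunks still match, at the cost of hashing every byte of the stream.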

Firestreamer uses the same compression as the NTFS file system. You can try to disable the compression in Firestreamer to see if it improves the deduplication ratio.
Best regards,
John Smith
Cristalink Support
ganars
Posts: 2
Joined: 30 Dec 2010, 16:59

Post by ganars »

Thanks for the fast response John!

I am using Firestreamer in file media mode, and we turned off Firestreamer compression from the start. The Data Domain is simply acting as a CIFS target.
My understanding is that the dedupe software will accept any data format, but has optimizations it can use if it "detects" certain "tape markers".
I imagine it either removes or remaps the markers so that only the original backup data is deduped.

The options in their help file are:

auto Attempt to automatically determine what type of markers are in use (the default setting).
besr1 Backup Express System Restore, used for Symantec NetBackup family of products, which takes a sector level dump of a Windows drive.
cv1 CommVault Galaxy with VTL and file system backups.
eti1 HP NonStop systems using ETI-NET EZX/BackBox.
hpdp1 HP DP versions 5.1, 5.5, and 6.0 with VTL and file system backups.
ism1 Informix Onbar. Used when Informix database is backed up using onbar and its internal storage manager.
nw1 Legato NetWorker with VTL.
ssrt1 Synectics backup express
tsm1 IBM Tivoli Storage Manager on media servers with "small-endian" processor architecture, such as x86 Intel or AMD.
tsm2 IBM Tivoli Storage Manager on media servers with "big-endian" processor architecture, such as SPARC or IBM mainframe.
none Data with no markers.

We are using "auto" but I imagine it does not know how to deal with the "original backup data intermixed with Firestreamer data".
And more than likely it is using the file system clusters as you mentioned.

It would be great if they are open to working with you on compatibility, though they do have their own VTL option for Microsoft DPM. We were trying to avoid that route because a) we really like your product and its flexibility and b) for their VTL to work we would have to spend more money on software licensing and add Fibre Channel cards/connectivity.

Thanks again for your response and I will be sure to come back and add any additional information I find in case it is of use to any other customers.
jsf
Cristalink Support
Posts: 300
Joined: 29 Aug 2010, 09:03

Post by jsf »

Their "tape markers" most likely refer to proprietary formats used by various vendors for virtual tape. Firestreamer is from a vendor that Data Domain doesn't seem to support yet. We will try to contact them next week. I will post any updates here.
Best regards,
John Smith
Cristalink Support
cfrantsen
Posts: 1
Joined: 03 Jan 2012, 15:22

Post by cfrantsen »

Resurrecting this thread as we are facing the same problem: low compression rates on Data Domain compared to backups of the same data made with other software.

Did you find a solution for this?
jsf
Cristalink Support
Posts: 300
Joined: 29 Aug 2010, 09:03

Post by jsf »

Unfortunately, there is nothing we can do on our side. We tried to contact Data Domain in January 2011, but they were too bureaucratic. We couldn't even reach whoever was responsible for the deduplication storage there. If you are a customer of theirs, it may help to talk to their support people. You can say you contacted us and that Cristalink is willing to cooperate with Data Domain. If they are interested, they can simply email us to get the ball rolling.
Best regards,
John Smith
Cristalink Support