System process 50% CPU during backup then server locks up

The technical support forum for Firestreamer (the virtual tape library).
Locked
KenR
Posts: 3
Joined: 20 Apr 2012, 17:03

Post by KenR »

Windows Server 2008 SP2 64bit 16GB RAM
Firestreamer 3.3.0 x64
DPM 2007 SP1 w/ KB959605 & KB976542

To improve backup time to "tape" (was using USB), I have installed an ESATA card (SiI 3132 chipset) in the DPM server (Dell PowerEdge 2950). This has vastly improved backup times, but only when the server doesn't lock up with the System process pegged at 50%. Roughly 300GB can be transferred to the ESATA drive before the server becomes glacial in response, requiring a hard reboot (attempts at a proper shutdown or reboot haven't worked).

I've waited 5 hours with the System process at 50% CPU and haven't seen this issue resolve itself.

This issue has occurred when DPM is only performing a "tape" backup; no recovery points or synchronizations being done by DPM.

System process CPU usage ramps up once the backup begins.

Has occurred with two different ESATA drives (LACIE 2TB Rugged XL).

Besides trying a different ESATA card with a different chipset I’m not sure what else to look at here.

Pre-thanks for any thoughts, pointers, etc.

Ken
jsf
Cristalink Support
Posts: 300
Joined: 29 Aug 2010, 09:03

Post by jsf »

Most likely, you have a driver that leaks memory. In Task Manager, check if the available kernel memory steadily declines during a backup. There's a good chance the culprit is the storage controller driver, so try updating it first. For more information, please see "Why does my computer become unresponsive?" and "What hardware do you recommend for use with Firestreamer?"
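As a rough illustration of what "steadily declines" could mean in practice, here is a minimal Python sketch that classifies a series of available-memory readings noted down at regular intervals during a backup. The function name, the thresholds, and the sample values are all hypothetical and chosen only for illustration; they are not part of Firestreamer or Task Manager.

```python
def steadily_declining(samples_mb, min_total_drop=0.10, min_falling_share=0.8):
    """Return True if the readings fall most of the time and lose a notable share overall.

    samples_mb: available-memory readings in MB, taken at regular intervals.
    Both thresholds are assumptions picked for illustration only.
    """
    if len(samples_mb) < 3 or samples_mb[0] <= 0:
        return False  # too few readings (or a bad first reading) to call it a trend
    # Difference between each reading and the one before it.
    steps = [later - earlier for earlier, later in zip(samples_mb, samples_mb[1:])]
    falling_share = sum(1 for step in steps if step < 0) / len(steps)
    total_drop = (samples_mb[0] - samples_mb[-1]) / samples_mb[0]
    return falling_share >= min_falling_share and total_drop >= min_total_drop


# Hypothetical readings: one series that hovers around the same level,
# and one that keeps shrinking as a leaking driver would cause.
healthy = [12000, 11950, 12010, 11980, 12005, 11990]
leaking = [12000, 11200, 10300, 9500, 8600, 7800]

print(steadily_declining(healthy))  # False
print(steadily_declining(leaking))  # True
```

Readings that wobble around the same level are normal; what points to a leak is a series that keeps falling for the whole duration of the backup.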
Best regards,
John Smith
Cristalink Support
KenR
Posts: 3
Joined: 20 Apr 2012, 17:03

Post by KenR »

John,

Thanks for the reply. Seeing that your FAQ mentions other users have had issues with Silicon Image controllers (also the driver hasn't been updated since 2007...), I've dumped the SiI card for a Vantec UGT-IS100R using a JMicron JMB36X chipset: Driver 1.17.63.1 dated 5/19/2011. Wish I could report a fix, but now the system is failing in under 2 minutes after moving only 2GB of data. Not the kind of speed improvement I was looking for.

Using Process Explorer I see the thread within the System process that is maxing at 50% CPU is clfs3dvr.sys.

I'm going to check for Dell updates and see where that takes me.
jsf
Cristalink Support
Posts: 300
Joined: 29 Aug 2010, 09:03

Post by jsf »

Unfortunately, the quality of drivers is and always was a major problem. Have you checked the memory usage as I advised in my previous post?

What do you mean by "the system is failing", precisely? A blue screen? If yes, is there a recent MEMORY.DMP in c:\windows? If yes, are you familiar with WinDbg from Debugging Tools for Windows? It's relatively easy to use it to find out the name of the faulting driver.

I would suggest putting back the Silicon Image controller and trying to upgrade its driver first.
Best regards,
John Smith
Cristalink Support
KenR
Posts: 3
Joined: 20 Apr 2012, 17:03

Post by KenR »

John, thanks for your feedback.

While troubleshooting the issue this past week, the DPM Console crashed (Connection to the DPM service has been lost. ID 917).

I followed the document below, but to no avail. And a painful lesson learned: I did not have a backup of the DPM database.
http://blogs.technet.com/b/dpm/archive/ ... ssues.aspx

DPM 2007, SQL 2005, and Firestreamer 3.3.0 were uninstalled.
The JMicron card was removed and then reinstalled with latest drivers.
DPM 2010, SQL 2008, and Firestreamer 4.0.0 were installed.

Reading through the Firestreamer documentation, I set up the ESATA drive as a File Media drive as opposed to Drive Media. I was using Drive Media with 3.3.0, which I should have stated in my initial post, as it turns out this may be part of the issue I am encountering.

Backing up to File Media has proven successful thus far after two consecutive full backups.
However, it is just a smidgen faster than backing up to USB; nowhere near ESATA speeds.

Backing up to Drive Media is fast as one would expect with ESATA, but even with DPM 2010 and Firestreamer 4.0, the system process eventually pegs the processor at 50% (1 core minimal CPU hit, the other core pegged at 100% from the system process) and all data reading and writing ceases.

To the questions in your post above:
As you asked, I watched the Kernel memory during backups. When using File Media, Kernel memory total grew to a little over 4GB and then dropped back down when the backup finished. The system was responsive during this backup.
Using Drive Media, the Kernel memory total started at 228 and was up to 231 when the system failed, with about 300GB left to back up out of 900GB. Once the system "failed", response was sluggish if not totally unresponsive.

By failing I mean that the system process hits 50% CPU (1 core minimal CPU hit, the other core pegged at 100% from the system process) and the system becomes glacial to respond or may not at all. I have not encountered any blue screens.
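The 50% figure follows from simple averaging: as far as I understand it, Task Manager shows a process's CPU usage as a share of the total capacity of all cores, so one fully busy core out of two reads as 50%. A minimal sketch of that arithmetic, assuming two cores and idealized per-core loads (the function name and the numbers are illustrative only):

```python
def overall_cpu_percent(per_core_percent):
    """Average the per-core load to get the machine-wide CPU percentage."""
    if not per_core_percent:
        raise ValueError("need at least one core reading")
    return sum(per_core_percent) / len(per_core_percent)


# Idealized case from the description above: one core idle, one core pegged.
print(overall_cpu_percent([0.0, 100.0]))  # 50.0
```

In other words, a steady 50% on a two-core machine is what a single thread stuck in a busy loop looks like.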

Before the DPM 2007 Console crash stated above, I did the following as you suggested:
Per the SiI website, the most recent Win 64-bit BASE driver for the 3132 Silicon Image controller is 1.0.15.0 dated 10/16/2007, which is the one I’ve used. I have removed the device and driver from the system, reinstalled with the SiI driver and have still encountered the issue with the clfs3dvr.sys hitting 50% and glacial response of the system.

So right now I am planning on using File Media and am going to dig into why the transfer speeds are so slow. I will also test a simple file copy to the drive of a few hundred GB and see if that process is slow as well.
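For the plain-copy test, something along these lines could give a rough sequential write figure for the drive without DPM or Firestreamer in the picture. This is only a sketch: the function name, the default sizes, and the drive letter in the example comment are my own assumptions.

```python
import os
import time


def measure_write_throughput(path, total_mb=256, chunk_mb=4):
    """Write about total_mb of data to path sequentially and return the rate in MB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # one random buffer, reused for every write
    chunks = total_mb // chunk_mb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(chunks):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # push the data to the device so the OS cache does not flatter the result
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return (chunks * chunk_mb) / elapsed


# Example call (hypothetical drive letter and size):
# rate = measure_write_throughput(r"E:\throughput_test.bin", total_mb=2048)
# print(f"{rate:.1f} MB/s")
```

If a plain write like this is also slow, the bottleneck is below Firestreamer (controller, driver, or drive); if it is fast, the File Media settings are the more likely place to look.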

Thanks again for your help and for Firestreamer.
jsf
Cristalink Support
Posts: 300
Joined: 29 Aug 2010, 09:03

Post by jsf »

> While troubleshooting the issue this past week, the DPM Console crashed (Connection to the DPM service has been lost. ID 917).

Firestreamer is a stand-alone product. It's not aware of Microsoft DPM, and does not interact with Microsoft DPM on its own. Any issues with the DPM Console should be reported to Microsoft. We have nothing to do with this.

> Backing up to File Media has proven successful thus far after two consecutive full backups.
> However, it is just a smidgen faster than backing up to USB; nowhere near ESATA speeds.

Please see "Why is the performance so poor?"

> Backing up to Drive Media is fast as one would expect with ESATA, but even with DPM 2010 and Firestreamer 4.0, the system process eventually pegs the processor at 50% ... and all data reading and writing ceases.

With drive media, Firestreamer doesn't care about the interface and treats all direct-access devices (HDDs) in the same standard way. If drive media works fine with USB but fails with eSATA, the problem is with the eSATA storage stack.

> I watched the Kernel memory during backups and when using File Media, Kernel memory total grew to a little over 4GB and then back down when the backup finished

The amount of available memory should roughly stay the same during a backup and should not gradually decrease.
Best regards,
John Smith
Cristalink Support