Archive for the 'Data Protection Manager' Category

SQL Auto Protection Fails

Using DPM 2012, I had an error with SQL Auto Protection.  The error says to run “AutoProtectInstances.ps1”.  When I did that, I got this error:

Start-DPMAutoProtection : DPM could not enumerate SQL Server instances using Windows Management Instrumentation on the protected computer <ComputerName>. (ID: 965)
Please make sure that Windows Management Instrumentation for SQL server is in good state.

A quick search turned up this article talking about protecting SharePoint.  The underlying problem was with the SQL instance they were using, so this fix worked:

You do have to run the cmd from an elevated prompt.
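For reference, the command in question re-registers the SQL Server WMI provider by recompiling its MOF file.  This is a sketch from memory, and the exact path (the "100" and "Program Files (x86)" pieces) depends on your SQL Server version and bitness:

```powershell
# Re-register the SQL Server WMI provider (path varies by SQL version and bitness)
mofcomp "C:\Program Files (x86)\Microsoft SQL Server\100\Shared\sqlmgmproviderxpsp2up.mof"
```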

Error installing DPM 2010 Beta

I was installing the DPM 2010 Beta (finally) and had an issue getting SQL 2008 to install.  I eventually figured out that I had the install files stored too deeply in a network share.  I figured this out by running the SQL install directly; when it went to check prerequisites, it flagged an error on one section, and when you click for more info, this is what you get:

Rule "Long path names to files on SQL Server installation media" failed.

SQL Server installation media on a network share or in a custom folder can cause installation failure if the total length of the path exceeds 260 characters. To correct this issue, utilize Net Use functionality or shorten the path name to the SQL Server setup.exe file.

So, I moved it to a shorter path and it installed just fine.
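The rule text mentions Net Use as the workaround; mapping a drive letter to the share shortens every path under it.  A quick sketch, with hypothetical server and share names:

```powershell
# Map a short drive letter to the share holding the install media (hypothetical names)
net use S: \\fileserver\installs
# Run setup from the much shorter path
& "S:\sql2008\setup.exe"
```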

Data Protection Manager 2010

So I am a bit late realizing this, but the beta for DPM 2010 is now available on the Connect site.  I haven’t read anything about it yet, so mainly I am posting this to make myself look into it.

Follow up to the DPM recovery point expiration issues

Previously, I blogged about issues I was having where old recovery points were not being expired and removed from my DPM servers.  I had to open a ticket with Microsoft and worked with them to determine the cause; since then, they have released a fix.

The fix that Microsoft developed is here:

A few people have asked for the PowerShell script “show-pruneshadowcopies.ps1” that Microsoft provided and I mentioned in my previous post (here).  The script looks like this:

#displays all RPs for data sources and shows which RPs would be deleted by the regular pruneshadowcopies.ps1
# Outputs to a logfile:  C:\Program Files\Microsoft DPM\DPM\bin\SHOW-PRUNESHADOWCOPIES.LOG

#Author    : Mike J
#Date    : 02/24/2009


function GetDistinctDays([Microsoft.Internal.EnterpriseStorage.Dls.UI.ObjectModel.OMCommon.ProtectionGroup] $group,
    [Microsoft.Internal.EnterpriseStorage.Dls.UI.ObjectModel.OMCommon.Datasource] $ds)
{
    # Disk-to-tape protection groups keep no disk recovery points to prune
    if ($group.ProtectionType -eq [Microsoft.Internal.EnterpriseStorage.Dls.UI.ObjectModel.OMCommon.ProtectionType]::DiskToTape)
    {
        return 0
    }
    $scheduleList = Get-PolicySchedule -ProtectionGroup $group -ShortTerm
    if ($ds -is [Microsoft.Internal.EnterpriseStorage.Dls.UI.ObjectModel.FileSystem.FsDataSource])
    {
        $jobType = [Microsoft.Internal.EnterpriseStorage.Dls.Intent.JobTypeType]::ShadowCopy
    }
    else
    {
        $jobType = [Microsoft.Internal.EnterpriseStorage.Dls.Intent.JobTypeType]::FullReplicationForApplication
        if ($ds.ProtectionType -eq [Microsoft.Internal.EnterpriseStorage.Dls.Intent.ReplicaProtectionType]::ProtectFromDPM)
        {
            return 2
        }
    }
    Write-Host "Look for jobType $jobType"

    foreach ($schedule in $scheduleList)
    {
        Write-Host ("schedule jobType {0}" -f $schedule.JobType)
        if ($schedule.JobType -eq $jobType)
        {
            return [Math]::Ceiling(($schedule.WeekDays.Length * $ds.RecoveryRangeinDays) / 7)
        }
    }
    return 0
}

function IsShadowCopyExternal($id)
{
    $result = $false

    $ctx = New-Object -TypeName Microsoft.Internal.EnterpriseStorage.Dls.DB.SqlContext

    $cmd = $ctx.CreateCommand()
    $cmd.CommandText = "select COUNT(*) from tbl_RM_ShadowCopy where shadowcopyid = '$id'"
    Write-Host $cmd.CommandText
    $countObj = $cmd.ExecuteScalar()
    Write-Host $countObj
    if ($countObj -eq 0)
    {
        $result = $true
    }
    return $result
}

function IsShadowCopyInUse($id)
{
    $result = $true

    $ctx = New-Object -TypeName Microsoft.Internal.EnterpriseStorage.Dls.DB.SqlContext

    $cmd = $ctx.CreateCommand()
    $cmd.CommandText = "select ArchiveTaskId, RecoveryJobId from tbl_RM_ShadowCopy where ShadowCopyId = '$id'"
    Write-Host $cmd.CommandText
    $reader = $cmd.ExecuteReader()
    while ($reader.Read())
    {
        if ($reader.IsDBNull(0) -and $reader.IsDBNull(1))
        {
            $result = $false
        }
    }
    return $result
}

# $logfile is taken from the output path in the header comment; $version was not
# defined anywhere in the posted copy of the script, so a placeholder is used.
$logfile = "C:\Program Files\Microsoft DPM\DPM\bin\SHOW-PRUNESHADOWCOPIES.LOG"
$version = "unknown"

"**********************************" > $logfile
"Version $version" >> $logfile
Get-Date >> $logfile

$dpmservername = & "hostname"

$dpmsrv = Connect-DPMServer $dpmservername

if (!$dpmsrv)
{
    Write-Host "Unable to connect to $dpmservername"
    exit 1
}

Write-Host $dpmservername
"Selected DPM server = $dpmservername" >> $logfile
$pgList = Get-ProtectionGroup $dpmservername
if (!$pgList)
{
    Write-Host "No PGs found"
    Disconnect-DPMServer $dpmservername
    exit 2
}

Write-Host ("Number of ProtectionGroups = {0}" -f $pgList.Length)
$replicaList = @{}
$latestScDateList = @{}

foreach ($pg in $pgList)
{
    $dslist = Get-Datasource $pg
    if ($dslist.Length -gt 0)
    {
        Write-Host ("Number of datasources in this PG = {0}" -f $dslist.Length)
        ("Number of datasources in this PG = {0}" -f $dslist.Length) >> $logfile
        foreach ($ds in $dslist)
        {
            Write-Host ("DS NAME=  $ds")
            ("DS NAME=  $ds") >> $logfile
        }
        foreach ($ds in $dslist)
        {
            # Only disk-based recovery points are candidates for pruning
            $rplist = Get-RecoveryPoint $ds | Where-Object { $_.DataLocation -eq 'Disk' }
            Write-Host ("Number of recovery points for $ds {0}" -f $rplist.Length)
            ("Number of recovery points for $ds {0}" -f $rplist.Length) >> $logfile
            $countDistinctDays = GetDistinctDays $pg $ds
            Write-Host ("Number of days with fulls = $countDistinctDays")
            ("Number of days with fulls = $countDistinctDays") >> $logfile
            if ($countDistinctDays -eq 0)
            {
                Write-Host "D2T PG. No recovery points to delete"
                "D2T PG. No recovery points to delete" >> $logfile
            }
            $replicaList[$ds.ReplicaPath] = $ds.RecoveryRangeinDays
            $latestScDateList[$ds.ReplicaPath] = New-Object DateTime 0,0
            $lastDayOfRetentionRange = ([DateTime]::UtcNow).AddDays($ds.RecoveryRangeinDays * -1)
            Write-Host ("Distinct days to count = {0}. LastDayOfRetentionRange = {1} " -f $countDistinctDays, $lastDayOfRetentionRange)
            ("Distinct days to count = {0}. LastDayOfRetentionRange = {1} " -f $countDistinctDays, $lastDayOfRetentionRange) >> $logfile
            $distinctDays = 0
            $lastDistinctDay = (Get-Date).Date
            $numberOfRecoveryPointsDeleted = 0

            if ($rplist)
            {
                foreach ($rp in ($rplist | Sort-Object -Property UtcRepresentedPointInTime -Descending))
                {
                    if ($rp)
                    {
                        if ($rp.UtcRepresentedPointInTime.Date -lt $lastDistinctDay)
                        {
                            $distinctDays += 1
                            $lastDistinctDay = $rp.UtcRepresentedPointInTime.Date
                        }
                        Write-Host (" $ds")
                        (" $ds") >> $logfile
                        Write-Host ("  Recovery Point #$distinctDays RPtime={0}" -f $rp.UtcRepresentedPointInTime)
                        ("  Recovery Point #$distinctDays RPtime={0}" -f $rp.UtcRepresentedPointInTime) >> $logfile
                        # A recovery point would be pruned once it is both outside the
                        # retention range and beyond the number of distinct days to keep
                        if (($distinctDays -gt $countDistinctDays) -and ($rp.UtcRepresentedPointInTime -lt $lastDayOfRetentionRange))
                        {
                            Write-Host ("Recovery Point would be deleted ! - RPtime={0}" -f $rp.UtcRepresentedPointInTime) -ForegroundColor Red
                            ("Recovery Point would be deleted ! - RPtime={0} <<<<<<<" -f $rp.UtcRepresentedPointInTime) >> $logfile
                            # remove-recoverypoint $rp -ForceDeletion -confirm:$true | out-null
                            $numberOfRecoveryPointsDeleted += 1
                        }
                        else
                        {
                            Write-Host "    Recovery point not expired yet"
                            "    Recovery point not yet expired" >> $logfile
                        }
                    }
                    else
                    {
                        Write-Host "Got a NULL rp"
                        "Got a NULL rp" >> $logfile
                    }
                }

                Write-Host "Number of RPs that would be deleted = $numberOfRecoveryPointsDeleted"
                "Number of RPs that would be deleted = $numberOfRecoveryPointsDeleted" >> $logfile
            }
        }
    }
}

Disconnect-DPMServer $dpmservername
Write-Host "Exiting from script"


Hyper-V and DPM – Some issues that you may see

We have several (15 or so) Hyper-V hosts running a number (126 or so) of guests.  We use DPM to back up our servers, but only a few of our VMs are backed up at the host level.  Most are backed up as regular clients.  I have been having trouble with a couple of the ones that we do back up at the host level, and just got around to looking for the answer to what is going on.  Lucky for me, I waited long enough for the Core Team to come up with some suggestions:

Ask the Core Team : DPM 2007 – Troubleshooting protection for Hyper-V

This post is about servers running Windows Server 2008 with the Hyper-V role installed that are being protected by System Center Data Protection Manager 2007.  There may be one or many virtual machines on each host/parent partition, and they may be running Windows 2003 and/or Windows 2008.


DPM v 3

I just watched a webcast on DPM v3 and thought I would share some of what I got from that.

In the last 18 months, DPM 2007 (v2) delivered application protection for Exchange, SQL Server, SharePoint, and virtualization environments running Virtual Server and Hyper-V.  Disaster recovery with Iron Mountain, local datasource protection, and client backups have also come out through DPM 2007, its first feature update, and Service Pack 1.  Now it is time to show what is coming next for DPM.

A few top line items are support for the following:

  • support for Exchange 14, and more granular restore
  • protection of an entire SQL instance, with auto-discovery of new DBs
  • protection of thousands of DBs per DPM server
  • end user recovery by the SQL admin (role-based access from the DPM console)
  • Office 14
  • AD appears as a data source in the DPM UI
  • image restore from a centrally managed DPM server, executed locally
  • support for Windows guests on VMware hosts
  • SAP running on MS SQL

and some other improvements:

  • up to 100 servers, 1000 laptops, 2000 databases per DPM server
  • management pack updates
  • automatic re-running of jobs and improved self-healing  —  This is a huge one in my book
  • auto protect new sources for SQL and MOSS
  • improved scheduling capabilities
  • one click DPM DR failover and failback
  • continued support for SAN (scripts/whitepapers)

platform requirements:

  • DPM Server must be 64-bit Windows Server 2008 R2
  • Integration capability with Windows EBS 2008 R2

DPM does not remove expired recovery points

I have been using DPM for about 7 months now.  (I tested with it for a few months before that.)  I never installed 2006, but 2007 seems to be working ok.  I have a few complaints, but I have complaints about all the backup software that I have ever used.  None of it really makes me happy.  But on to the story…

I have 3 production DPM servers.  One of them has a large number of protection group members.  8 Protection Groups, 328 Members.  And that is just to protect 39 computers, but one of the SQL servers has about 150 databases.

I noticed the problem because I kept running out of space on the recovery point volumes.  I had a particular 2008 domain controller whose system state recovery point volume had to be extended every couple of days.  I was keeping the recovery points on disk for 5 days, so it finally occurred to me that it should not take more than 200 GB to keep 5 days’ worth of recovery points for the system state.

I called and opened a ticket with Microsoft, and we have been working on this for almost 2 months.  So far, the best I can tell is that the process that clears the old recovery points slowly eats up memory.  That, coupled with the fact that I have a lot of PG members, means that the job frequently fails before it completes.  As the number of recovery points grows, the job that clears them (pruneshadowcopies) takes longer and uses more memory.  This increases the chance that it will fail…

I don’t have a solution to this problem yet, other than a few work-arounds and a way to manually run the process:

  • add more RAM to your DPM server, especially if you are running SQL locally on the box.
  • reduce the number of PG members.  Fewer members means fewer recovery points, and less chance the prune job will fail.
  • open the DPM Management Shell (DPM PowerShell) and run “pruneshadowcopies.ps1”.  This manually runs the job that DPM triggers at midnight every night.  If you have a lot of recovery points that haven’t been pruned, this will probably fail (crash) a few times before it finishes.  I have had it run all weekend and then crash, and I have seen it run for just an hour and then crash.  Keep running it, and it will eventually finish.
  • Hope that Microsoft comes up with a real fix soon…

To see if you have this problem, there is a version of the pruneshadowcopies script that just shows the recovery points, without actually expiring them.  The tech that I have been working with on my case sent it to me. 
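If you want to check for this yourself, both scripts run from the DPM Management Shell out of the DPM bin folder.  The path below is the default install location; adjust it to match your server:

```powershell
# From the DPM Management Shell (default install path assumed)
cd "C:\Program Files\Microsoft DPM\DPM\bin"
.\show-pruneshadowcopies.ps1   # report-only; logs to SHOW-PRUNESHADOWCOPIES.LOG
.\pruneshadowcopies.ps1        # the actual pruning job DPM triggers at midnight
```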

Remove DPM agent from the DPM agent console

I blogged about this last year, but when I moved my blog, I lost part of the post (the picture) so I just deleted the post.  Then I noticed that Google is still sending people here to find the answer, so…

If you have DPM Protected Computer that goes away before you uninstall the agent, it isn’t obvious how you get the agent removed from the console.  Or at least it wasn’t immediately obvious to me.

  1. In the Management/Agents tab, right click on the agent (it will have a red x and “Unavailable” in the Agent Status column) and select Uninstall…
  2. Verify your list of agents (you can select more than one)
  3. Click on “Uninstall Agents”
  4. Enter the appropriate credentials.  This must be an account that has permissions to remove the agent from the DPM server; even though the protected computer doesn’t exist anymore, it still has to be a valid account.
  5. Select the “Manually restart the selected servers later” radio button
  6. Click ok.

So far, that isn’t any different than any other client uninstall.   At this point, you will have the option to close the window, and go on about your business.  And if the protected computer was still available, that would be perfectly fine to do.  But since the protected computer isn’t still available, you have to wait for the error to pop up.  First you will see that the uninstall failed and then you get this message:


Basically, it says: I couldn’t find that computer to remove the agent; do you want me to just forget that it existed?  You click “Yes” and the entry for that computer is removed from the DPM database.  Now wasn’t that obvious?
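If you would rather not click through the console, some DPM versions also ship a Remove-ProductionServer.ps1 script in the bin folder that detaches a protected computer from the DPM database.  Treat this as a sketch: the script’s availability and parameters depend on your DPM version, and the names below are placeholders:

```powershell
# Placeholder names; run from the DPM Management Shell
.\Remove-ProductionServer.ps1 -DPMServerName MYDPMSERVER -PSName oldserver.example.com
```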

Replica disk threshold exceeded, or Recovery Point Volume threshold exceeded

Great! :(  Now what?

Well, if you have a new DPM server and not a lot of protection groups created, and you haven’t been protecting anything much, you can just click on the link in the warning message that says “Allocate more disk space for replica…”  That pulls up a pretty window that looks like the one below:


So you go ahead and make the number in the “Replica Volume” field a little bigger, hit OK, and go on about your business.  Unless…

Sometimes you may need to use DISKPART to manually add space to the volume.  If you try the above method and get a failure message instead of success, you are either out of disk space, or you have more than one disk on your DPM server and one of the disks is full.  In order to extend the volume onto another disk, you have to use DISKPART.  DPM (at this version) won’t do it for you.

  1. Open a command prompt (run as administrator if you are using a 2008 Server for your DPM server) and type “diskpart”. 
  2. Type “List Volume” at the prompt.
  3. Right click, choose “Select All”, then press Enter to copy the output to the clipboard
  4. Paste it into Notepad so you can search for the data source
    1. You should see a line similar to this:

        Volume 534       DPM-Prolo  NTFS   Simple      2050 MB  Healthy
          C:\Program Files\Microsoft DPM\DPM\Volumes\Replica\SqlServerWriter\PrologPilot\
        Volume 535       DPM-Prolo  NTFS   Simple      2050 MB  Healthy
          C:\Program Files\Microsoft DPM\DPM\Volumes\DiffArea\SqlServerWriter\PrologPilot\
        Volume 536       DPM-Non VSS  NTFS   Simple      1540 MB  Healthy
          C:\Program Files\Microsoft DPM\DPM\Volumes\Replica\Non VSS Datasource Writer\Computer\SystemState\SystemState

    2. The volume number comes before what it describes, and there are 2 volumes for each protected object: a Replica volume and a DiffArea volume.  The replica volume is a copy of the data as it is on the protected member.  The DiffArea is where the recovery points are stored.  The “Non VSS Datasource Writer” entry is the system state in the example.
  5. At the DISKPART> prompt, type “select volume” followed by the volume number, e.g.: select volume 534
  6. If you want to see the details about the volume, type “detail volume”; the output looks similar to:

    DISKPART> detail volume

      Disk ###  Status      Size     Free     Dyn  Gpt
      --------  ----------  -------  -------  ---  ---
    * Disk 2    Online      2560 GB   356 GB   *    *

    Read-only              : No
    Hidden                 : No
    No Default Drive Letter: Yes
    Shadow Copy            : No
    Dismounted             : No
    BitLocker Encrypted    : No

    Volume Capacity        : 1030 MB
    Volume Free Space      :  186 MB

  7. In order to increase the space for the replica volume, type: EXTEND SIZE=1024 DISK=2.  This extends the selected volume by 1 GB (1024 MB) on disk 2.
  8. Now you have to go back in and tell DPM that you extended the volume.  (I believe it may figure it out on its own eventually, but I prefer to get the warning cleared up sooner rather than later, so I go update DPM.)


Note:  Each time you run DISKPART, you are likely to see different numbers for the volumes.  I haven’t looked into why that is, but I do know that the system volume is always one of the last in the list.  For that reason, I recommend that you always view detail volume after you select a volume, to make sure you are seeing the volume you intend to work with.
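Putting the steps together, a typical DISKPART session looks like this (the volume and disk numbers come from the example above; yours will differ):

```
DISKPART> list volume
DISKPART> select volume 534
DISKPART> detail volume
DISKPART> extend size=1024 disk=2
```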

The stub received bad data – DPM backup of a SQL DB

My DPM server:
Server 2008 x64
DPM 2007 SP1
SQL 2005 SP3 (local to DPM)

The protected server:
Server 2003 SP2 x64
SQL 2005 SP3

I have about 30 databases being backed up from the one client machine.  All of them back up just fine except one.  Every time I try to do a synchronization or a full backup, I get the following error:

Triggering synchronization on *myserver\mydatabase* failed: Error 46: DPM failed to perform the operation because too many objects have been selected. Select fewer objects and then retry this operation.

Error details: The stub received bad data (0x800706F7)

Recommended action: Select fewer objects. 1) If you are trying to protect a large number of data sources on a volume, consider protecting the whole volume instead of individual data sources. 2) If you are trying to recover a large number of folders or files from a volume, consider recovering the parent folder, or divide the recovery into multiple operations.

This happens even if I select just the one database.  All the other databases back up correctly.

After posting in the newsgroup, I got a question from a Microsoft person asking whether full text indexing was enabled, and if so, how many catalogs.  Upon investigation, it appears that this is one of the only databases I have with full text indexing enabled.  It has 32 indexes.

So with the suggestion of a colleague, I did a rebuild of the indexes:

    1. In SQL Server Management Studio (SSMS) expand the database/Storage/Full Text Catalogs
    2. Right click on the Full Text Catalogs folder and select Rebuild All
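The same rebuild can also be kicked off without the GUI.  This is a sketch using Invoke-Sqlcmd; the instance, database, and catalog names are placeholders, and on older installs the cmdlet comes from the SQLPS snap-in rather than the SqlServer module:

```powershell
# Placeholders: MYSERVER, MyDatabase, MyCatalog
Import-Module SqlServer
Invoke-Sqlcmd -ServerInstance "MYSERVER" -Database "MyDatabase" `
    -Query "ALTER FULLTEXT CATALOG [MyCatalog] REBUILD"
```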

Then I went back to my DPM server, to the protection group and selected the database in question.  I did a “Create recovery point – Disk”, “Create a recovery point by using express full backup”.

That worked, so maybe that means the problem is fixed…