Backup fails due to occupied tapes
Post Backup fails due to occupied tapes 
Hello all.


This is my first post here. I've recently started working with HP DataProtector (DP), and have problems getting the backup process to function correctly. In one of my backup groups I'm getting these messages:

Some of the backup devices are occupied. Session is waiting for all the devices to get free
Timeout waiting for the devices to get free. The session will terminate.
None of the Disk Agents completed successfully. Session has failed.


If I understand correctly, DP will hold the backup session until all necessary tapes are free. If so, the backup group above is probably scheduled to start too soon after another backup group. Is this a correct assumption? If it is, I need to find an open slot for backing up this group. Is there a structured way of doing this (maybe DP has a feature that can suggest time slots)?

Regards,
Kenneth Holter
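
One structured way to look for an open slot is to list when existing sessions ran and compute the gaps between them. The sketch below is not a DP feature; it assumes you have collected session start and end times yourself, and the function name `free_windows` and the sample times are made up for illustration. It also assumes every session competes for the same drives.

```python
from datetime import datetime, timedelta

def free_windows(sessions, window_start, window_end, min_length):
    """Return (start, end) gaps inside the backup window in which no session runs.

    sessions is a list of (start, end) datetimes; gaps shorter than
    min_length are ignored.
    """
    gaps = []
    cursor = window_start
    for start, end in sorted(sessions):
        start = min(start, window_end)
        if start - cursor >= min_length:
            gaps.append((cursor, start))
        cursor = max(cursor, min(end, window_end))
    if window_end - cursor >= min_length:
        gaps.append((cursor, window_end))
    return gaps

# Hypothetical session history for one night (not real DP output).
night = datetime(2024, 1, 1, 18, 0)
sessions = [
    (night, night + timedelta(hours=3, minutes=30)),            # 18:00-21:30
    (night + timedelta(hours=3), night + timedelta(hours=5)),   # 21:00-23:00
    (night + timedelta(hours=7), night + timedelta(hours=10)),  # 01:00-04:00
]
gaps = free_windows(sessions, night, night + timedelta(hours=12), timedelta(hours=1))
for start, end in gaps:
    print(f"{start:%H:%M} - {end:%H:%M}")
```

With this sample data the free windows are 23:00 to 01:00 and 04:00 to 06:00, so a group that needs about an hour could be moved into either.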

Post Backup fails due to occupied tapes 
Kenneth,

I don't mean to state the obvious, but Data Protector is waiting for a tape device to be freed up.

How are you allocating tape drives? Are you allocating specific drives, or are you using the load balancing function? I like to use the load balancing function so that a job won't fail waiting for a specific drive when others are available.

There are two possible causes here.

Case 1: The first is that your job is waiting on a specific drive, but another backup job is using it. The way to fix this is to configure your jobs with the load balancing function, as I said above (assuming this is a relatively recent version of DP).

Look in the logs of the jobs. If the logs indicate that the robot tried to load a tape before it was discovered that the drive was busy, you could have a configuration problem within DP with the locking mechanism for the drives--causing it to not properly track which drives are in use.
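
To make the difference between a pinned drive and load balancing concrete, here is a toy model. It is only a sketch of the idea, not how DP implements device locking; the function `allocate`, the drive names, and the `min_drives`/`max_drives` parameters are assumptions for illustration.

```python
def allocate(busy, candidates, min_drives=1, max_drives=1):
    """Pick drives for a job from its candidate list.

    Returns None (the job has to wait) when fewer than min_drives of the
    candidates are free; otherwise returns up to max_drives free drives.
    """
    free = [drive for drive in candidates if drive not in busy]
    if len(free) < min_drives:
        return None
    return free[:max_drives]

# Hypothetical library with four drives; another job is using Drive1.
busy = {"Drive1"}

pinned = allocate(busy, ["Drive1"])
balanced = allocate(busy, ["Drive1", "Drive2", "Drive3", "Drive4"])

print(pinned)    # None: the job pinned to Drive1 waits and may time out
print(balanced)  # ['Drive2']: the load-balanced job takes a free drive
```

The pinned job can only wait for Drive1, while the job that lists all four drives starts on whichever one is free and leaves the rest alone.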

Case 2:

The other cause would be that a tape is stuck in a drive. If this is the case, you should see the following symptoms:
-- The DP GUI does not show this drive in use
-- If you have physical access to the library, you can see the tape in the drive --or-- if you log into the web GUI of the library, you should see that the drive has a tape in it

You may also have other error messages in other jobs saying that DP was unable to verify the tape header.

If you have these symptoms, they indicate a serious problem: you are losing data from other backups without realizing it. If this is the case, I recommend that you bring in a skilled consultant to fix your environment, because the data loss is already happening. I can point you in the right direction if you need it.

Post Scheduling in DP 
IMHO the scheduling function in HPDP is weak (though not as confusing as NetBackup to me). Schedules can only be set in 15-minute steps, due to the GUI settings and the 'omnitrig' cron trigger mechanism. And, as you have found out, jobs can easily time out waiting for drives to free up, especially if you assign specific drives to specific backup specs. You can increase the timeout, but that can make it harder to fit all jobs into the backup window. We use IBM Tivoli Workload Scheduler (a/k/a Maestro) to get around this issue. Not an ad, and I'm sure HP, CA and others have similar job scheduling products.

You define a set # of tape resources (to match your available drives) in Maestro, set up your backup specs to run (daily/weekly/whatever), and the schedules can run one right after the other with no delays as soon as the correct # of drives (which can be set per-schedule) are free. Just make sure that no conflicting scheduled jobs are set up in the calendar in DP itself, and that every spec has every available drive assigned (just set min drives to 1 and max drives to the # you want each to use).
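
The resource-counting idea can be sketched in a few lines. This is a simplified model, not Maestro's or DP's actual behaviour: jobs are started strictly in queue order, each holds a fixed number of drives for its whole run, and the function name `run_queue`, the job names, and the durations are invented for the example.

```python
import heapq

def run_queue(jobs, total_drives):
    """Start each job, in queue order, as soon as enough drives are free.

    jobs is a list of (name, drives_needed, duration_hours).
    Returns a dict mapping job name to start time in hours.
    """
    free = total_drives
    now = 0.0
    running = []  # min-heap of (end_time, drives_held)
    starts = {}
    for name, need, duration in jobs:
        if need > total_drives:
            raise ValueError(f"{name} needs more drives than exist")
        # Wait for running jobs to finish until enough drives are free.
        while free < need:
            end, released = heapq.heappop(running)
            now = max(now, end)
            free += released
        starts[name] = now
        free -= need
        heapq.heappush(running, (now + duration, need))
    return starts

# Hypothetical queue on a 4-drive library.
jobs = [
    ("small_a", 1, 1.0),
    ("small_b", 1, 0.5),
    ("big_db", 4, 3.0),
    ("files", 2, 2.0),
]
starts = run_queue(jobs, total_drives=4)
for name, start in starts.items():
    print(f"{name}: starts at hour {start}")
```

Here the two small jobs start immediately, big_db starts at hour 1.0 once both have released their drives, and files starts at hour 4.0 after big_db completes. No fixed start times or timeouts are involved; a job simply waits in the queue until its drive count is available.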

Note you still have to balance out the big jobs with the little ones, and if you have a job which hogs all your drives for the entire backup window you won't ever get to run your little jobs (which means you probably need to add more / faster drives!). Another bug/gotcha (which may be fixed in later versions of DP and Maestro): in our environment, if a big job starts on 6 drives, then even if it has finished on 5 of them and is only writing to 1 drive, it won't release the other 5 to the free pool until the job is fully complete. SAP and Oracle are the main culprits here due to the way the backup agents try to load balance objects across drives. To minimize this we typically run all our small jobs on 1-2 drives first and then, as they finish & free up enough drives, the big ones can start.
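
A rough way to put a number on that gotcha is to count the drive-hours that stay locked but idle when drives are only released at the end of the job. The per-drive write times below are invented, and `idle_drive_hours` is a hypothetical helper, not anything DP or Maestro reports.

```python
def idle_drive_hours(per_drive_hours):
    """Drive-hours spent locked but idle when no drive is released until the
    slowest drive of the job has finished."""
    job_end = max(per_drive_hours)
    return sum(job_end - hours for hours in per_drive_hours)

# Hypothetical job on 6 drives: five finish after 2 hours, one writes for 10.
idle = idle_drive_hours([2, 2, 2, 2, 2, 10])
print(idle)  # 40: five drives sit idle for 8 hours each
```

Those 40 drive-hours are capacity the small jobs cannot use, which is why running the small jobs first helps.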

We are also looking at doing Curtis' recommendation about spreading out the fulls during all 7 weeknights instead of just Saturdays/Sundays.

Post  
It's a big issue for me, as I have some backups which run for more than 24h. I've even increased the timeout to 30h, but there are still times when heavy load prevents some small jobs from running.

Is there a way to select, for example, 4 drives for a job, have it run on just one of those four, and let HP DP leave the other 3 drives free for other backups? I would then select 4 drives for every job and let HP DP use whichever one of them is free. This would be a great feature, but I haven't figured out how to do it.
