How do you decipher the output of "bpdbjobs -report -all_columns"?

You can get a lot of information about a job from bpdbjobs -report -all_columns. First, we'll start with the somewhat-documented format of the bpdbjobs -report -all_columns output:

jobid,jobtype,state,status,class,schedule,client,server,started,elapsed,ended,
stunit,try,operation,kbytes,files,pathlastwritten,percent,jobpid,owner,subtype,
classtype,schedule_type,priority,group,masterserver,retentionunits,
retentionperiod,compression,kbyteslastwritten,fileslastwritten,filelistcount,
[files]...,trycount,[trypid,trystunit,tryserver,trystarted,tryelapsed,tryended,
trystatus,trystatusdescription,trystatuscount,[trystatuslines]...,
trybyteswritten,tryfileswritten]...

How does one decode that? The first thing you have to realize is that there are two "fields" that have a variable count:

- The number of entries in the filelist (e.g. /u01, /u02, /u03, etc.)
- The number of lines in the status output
  (e.g. Mounting tape x, Positioning tape x, etc.)
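
Because of those two variable-length groups, the total number of comma-separated fields differs from job to job. As a rough sanity check, the expected total can be worked out from the counts themselves. The following is only a sketch in Python, assuming the layout listed above with nothing following the last try; the function and constant names are made up for illustration:

```python
FIXED_FIELDS = 32      # jobid .. filelistcount
TRY_HEAD_FIELDS = 9    # trypid .. trystatuscount
TRY_TAIL_FIELDS = 2    # trybyteswritten, tryfileswritten

def expected_field_count(filelistcount, status_counts):
    """Total fields in one record, given the filelist length and the
    number of status lines in each try (one list entry per try)."""
    total = FIXED_FIELDS + filelistcount + 1   # +1 for trycount itself
    for count in status_counts:
        total += TRY_HEAD_FIELDS + count + TRY_TAIL_FIELDS
    return total

# One filelist entry, two tries with 43 and 26 status lines
print(expected_field_count(1, [43, 26]))   # 125
```

If a split record has a different number of fields than this predicts, an embedded comma is a likely suspect.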


If you know the format above, you can get the time of the successful try. Let's take a look.

An explanation

This time is virtually worthless, unless you want to know when the job got queued:

Field 9 = start time (Time job first got queued)
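
The time fields are seconds since the Unix epoch (sometimes zero-padded), and the elapsed fields are plain counts of seconds. A small Python sketch of making them readable follows; the helper names are illustrative, and UTC is used only so the result does not depend on the local timezone:

```python
from datetime import datetime, timezone

def epoch_to_utc(field):
    """Convert an epoch-seconds field such as '0968697806' to a UTC string."""
    stamp = datetime.fromtimestamp(int(field), tz=timezone.utc)
    return stamp.strftime('%Y-%m-%d %H:%M:%S')

def seconds_to_hms(field):
    """Convert an elapsed-seconds field such as '0000010002' to H:MM:SS."""
    minutes, seconds = divmod(int(field), 60)
    hours, minutes = divmod(minutes, 60)
    return '%d:%02d:%02d' % (hours, minutes, seconds)

print(epoch_to_utc('0968697806'))     # 2000-09-11 18:43:26
print(seconds_to_hms('0000010002'))   # 2:46:42
```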

Up to Field 32, all fields are fixed. Then Field 32 tells you how many entries there are in the filelist fields.

Field 32 = filelistcount (How many files are listed in the filelist)

Then, if you add that value to 33, you'll get the field that shows you the number of tries.

Field 33 + filelistcount = trycount (How many tries there are)

If there's only one try, and you want its start time, then add 33, filelistcount, and 4, and you've got the field that shows you the start time of the first try:

Field 33 + filelistcount + 4 = [first]trystarted (The starttime of the first try)

But, if there were _two_ tries, then you've got to wade past the status entries. First, you'll need the number of entries in the status field. To get that, add _9_ to 33 and the filelistcount:

Field 33 + filelistcount + 9 = trystatuscount (The number of status entries in the first try)

Then, to get the start time of the SECOND try, add 33, filelistcount, 9, trystatuscount, and 6:

Field 33 + filelistcount + 9 + trystatuscount + 6 = [second]trystarted (The start time of the second try)
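
The arithmetic above can be written down as a handful of tiny functions. This is just a sketch in Python with invented function names; the numbers returned are 1-based field numbers, the same as awk's $N, so subtract 1 if you index a split list in Perl or Python:

```python
def trycount_field(filelistcount):
    """Field holding the number of tries."""
    return 33 + filelistcount

def first_trystarted_field(filelistcount):
    """Field holding the start time of the first try."""
    return 33 + filelistcount + 4

def first_trystatuscount_field(filelistcount):
    """Field holding the number of status entries in the first try."""
    return 33 + filelistcount + 9

def second_trystarted_field(filelistcount, trystatuscount):
    """Field holding the start time of the second try."""
    return 33 + filelistcount + 9 + trystatuscount + 6

# A job with one filelist entry and 43 status entries in its first try
print(trycount_field(1))                # 34
print(first_trystarted_field(1))        # 38
print(first_trystatuscount_field(1))    # 43
print(second_trystarted_field(1, 43))   # 92
```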

An example

First off, I'll tell you that perl is much better at this than awk. If you go over a certain number of characters, awk will complain that the line is too long. Perl doesn't complain at all. Perl is your friend.

The following is a real status line from a backup. First, we want the filelistcount (You'll notice that we grep for "^306" which is the job id of this job. You can see that in the example output below):

 # bpdbjobs -report -all_columns|grep '^306,' |awk -F',' '{print $32}'  
 1

So, there is only one entry in the filelist. (Which is usually the case in my configurations, since I usually select 'Allow multiple data streams,' and use "ALL_LOCAL_DRIVES" as the file list, which results in a separate job for each filesystem.) Check it out:

 # bpdbjobs -report -all_columns|grep '^306,' |awk -F',' '{print $33}'
 /u06

33 + 1 = 34

 # bpdbjobs -report -all_columns|grep '^306,' |awk -F',' '{print $34}'
 2

So, there were two tries. We'll have to do it the hard way. First, just for GP, we'll get the start time of the first try:

33 + 1 + 4 = 38

 # bpdbjobs -report -all_columns|grep '^306,' |awk -F',' '{print $38}'
 968698322

Now, let's get the start time of the second try. But first, we'll have to get the number of status entries.

33 + 1 + 9 = 43

 # bpdbjobs -report -all_columns|grep '^306,' |awk -F',' '{print $43}'
 43

So there are 43 status entries. (Yes, it's a coincidence that the field number and its value are the same.) So we add that to the other numbers, add 6, and we've got the start time of the second try.

33 + 1 + 9 + 43 + 6 = 92

 # bpdbjobs -report -all_columns|grep '^306,' |awk -F',' '{print $92}'
 968703872

There you go!
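
Doing this by hand gets old once a job has more than two tries. The same walk can be done in a loop: skip the filelist, read trycount, then hop from try to try using each trystatuscount. Here is a hedged Python sketch of that idea. The function name is invented, the record is a shortened, made-up one (placeholder values in the first 31 fields, only a couple of status lines per try), and it assumes the line has already been split correctly on commas:

```python
def try_start_times(fields):
    """Return a (trystarted, trystatus) pair for every try in a split record."""
    idx = 31                        # 0-based index of filelistcount (field 32)
    idx += 1 + int(fields[idx])     # skip filelistcount and the filelist entries
    trycount = int(fields[idx])
    idx += 1                        # now at trypid of the first try
    tries = []
    for _ in range(trycount):
        started = fields[idx + 3]           # trypid, trystunit, tryserver, trystarted
        status = fields[idx + 6]            # tryelapsed, tryended, trystatus
        statuscount = int(fields[idx + 8])  # trystatusdescription, trystatuscount
        tries.append((started, status))
        idx += 9 + statuscount + 2          # try header, status lines, bytes/files written
    return tries

# Made-up record: 31 placeholder fields, filelistcount, filelist, trycount, two tries
record = ['x'] * 31 + ['1', '/u06', '2']
record += ['2302', 'vault', 'vault', '968698322', '5549', '968703871',
           '24', 'socket write failed', '2', 'status a', 'status b',
           '17114076', '200']
record += ['2302', 'vault', 'vault', '968703872', '3936', '968707808',
           '0', 'the requested operation was successfully completed',
           '1', 'status c', '18497655', '344']

tries = try_start_times(record)
print(tries)
# Start time of the try that finished with status 0
print([started for started, status in tries if status == '0'])
```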


Example output of bpdbjobs

Here is an actual status line from a run of bpdbjobs -report -all_columns :

(It really is a single line. It appears here broken into several lines in order to keep the page formatting sane.)


306,0,3,0,elvis,Full,elvis,vault,0968697806,0000010002,0968707808,vault,2,,18497655,344,,,2302,root,0,0,0,0,other,vault,4,1,0,0,0,1,/u06,
2,2302,vault,vault,968698322,5549,968703871,24,socket write failed,43,
09/11/2000 11:52:00 - connecting,
09/11/2000 11:52:00 - connected; connect time: 000:00:00,09/11/2000 11:52:00 - mounting CID304,
09/11/2000 11:52:00 - mounted; mount time: 000:00:00,09/11/2000 11:52:00 - positioning to file 75,
09/11/2000 11:52:00 - positioned;position time: 000:00:00,09/11/2000 11:52:00 - begin writing,
09/11/2000 11:52:00 - positioning to file 76,
09/11/2000 11:52:00 - positioned;position time: 000:00:00,09/11/2000 11:59:20 -positioning to file 77,
09/11/2000 11:59:20 - positioned;position time: 000:00:00,09/11/2000 12:06:20 -positioning to file 78,
09/11/2000 12:06:20 - positioned;position time: 000:00:00,09/11/2000 12:13:00 -positioning to file 79,
09/11/2000 12:13:00 - positioned;position time: 000:00:00,09/11/2000 12:14:30 -positioning to file 80,
09/11/2000 12:14:30 - positioned;position time: 000:00:00,09/11/2000 12:21:30 -positioning to file 81,
09/11/2000 12:21:30 - positioned;position time: 000:00:00,09/11/2000 12:23:40 -positioning to file 82,
09/11/2000 12:23:40 - positioned;position time: 000:00:00,09/11/2000 12:28:30 -positioning to file 83,
09/11/2000 12:28:30 - positioned;position time: 000:00:00,09/11/2000 12:30:50 -positioning to file 84,
09/11/2000 12:30:50 - positioned;position time: 000:00:00,09/11/2000 12:31:30 -positioning to file 85,
09/11/2000 12:31:30 - positioned;position time: 000:00:00,09/11/2000 12:38:30 -positioning to file 86,
09/11/2000 12:38:30 - positioned;position time: 000:00:00,09/11/2000 12:46:00 -positioning to file 87,
09/11/2000 12:46:00 - positioned;position time: 000:00:00,09/11/2000 12:53:10 -positioning to file 88,
09/11/2000 12:53:10 - positioned;position time: 000:00:00,09/11/2000 12:53:40 -positioning to file 89,
09/11/2000 12:53:40 - positioned;position time: 000:00:00,09/11/2000 13:01:30 -positioning to file 90,
09/11/2000 13:01:30 - positioned;position time: 000:00:00,09/11/2000 13:09:50 -positioning to file 91,
09/11/2000 13:09:50 - positioned;position time: 000:00:00,09/11/2000 13:19:20 -positioning to file 92,
09/11/2000 13:19:20 - positioned;position time: 000:00:00,09/11/2000 13:23:20 -mounting CID312,
09/11/2000 13:24:30 - end writing;write time: 001:32:28,
17114076,200,2302,vault,vault,968703872,3936,968707808,0,the requested operation was successfully completed,26,
09/11/2000 13:24:30 - connecting,
09/11/2000 13:24:30 -connected; connect time: 000:00:00,09/11/2000 13:24:40 -mounting CID312,
09/11/2000 13:24:40 -mounted; mount time: 000:00:00,09/11/2000 13:24:40 -positioning to file 1,
09/11/2000 13:24:50 -positioned; position time: 000:00:09,09/11/2000 13:24:50 - begin writing,
09/11/2000 13:33:50 -positioning to file 2,09/11/2000 13:33:50 -positioned; position time: 000:00:00,
09/11/2000 13:40:30 -positioning to file 3,09/11/2000 13:40:30 - positioned;position time: 000:00:00,
09/11/2000 13:47:10 -positioning to file 4,09/11/2000 13:47:10 - positioned;position time: 000:00:00,
09/11/2000 13:54:00 -positioning to file 5,09/11/2000 13:54:00 - positioned;position time: 000:00:00,
09/11/2000 14:00:30 -positioning to file 6,09/11/2000 14:00:30 - positioned;position time: 000:00:00,
09/11/2000 14:07:20 -positioning to file 7,09/11/2000 14:07:20 - positioned;position time: 000:00:00,
09/11/2000 14:14:20 -positioning to file 8,09/11/2000 14:14:20 - positioned;position time: 000:00:00,
09/11/2000 14:21:20 -positioning to file 9,09/11/2000 14:21:20 - positioned;position time: 000:00:00,
09/11/2000 14:27:50 -positioning to file 10,09/11/2000 14:27:50 - positioned;position time: 000:00:00,
09/11/2000 14:30:00 - end writing;write time: 001:05:09,18497655,344

This snippet will make a nice Perl data structure containing the whole beast. It works for me. NO WARRANTY!

#!/usr/bin/perl
use Data::Dumper;
@job_info_labels = ('jobid', 'jobtype', 'state', 'status', 'class', 'sched', 'client', 'server', 'start',
               'elapsed', 'end', 'stunit', 'try', 'operation', 'kbytes', 'files', 'path_last_written',
             'percent', 'jobpid', 'owner', 'subtype', 'classtype', 'schedtype', 'priority', 'group',
             'master_server', 'retention_units', 'retention_period', 'compression', 'kbyteslastwritten',
             'fileslastwritten', 'filelistcount');
@job_try_labels_1 = ('trypid', 'trystunit', 'tryserver', 'trystarted', 'tryelapsed', 'tryended',
                     'trystatus', 'trystatusdescription', 'trystatuscount');
@job_try_labels_2 = ('trybyteswritten','tryfileswritten');

while(<>) {
    chomp;
    undef($job);
    # This next line replaces escaped commas (\,) in the middle of some messages with a ;.
    # That leaves only the commas that are separating fields.
    s/\\,/;/g;
    @columns = split(',');
    $idx = 0;
    foreach my $label (@job_info_labels) {
        $job->{$label} = $columns[$idx];
        $idx++;
    }
    @files = ();
    if ($job->{'filelistcount'} > 0) {
        @files = @columns[$idx..$idx+$job->{'filelistcount'}-1];
    }
    $job->{'files'} = [@files];
    $idx += $job->{'filelistcount'} ;
    $job->{'trycount'} = $columns[$idx];
    # Now get the try data.  Within the try data
    # is a list of try status lines.
    @trylist = ();
    $idx++;
    for (1..$job->{'trycount'}) {
        undef($trydata);
        foreach my $trylabel ( @job_try_labels_1 ) {
            $trydata->{$trylabel} = $columns[$idx];
            $idx++;
        }
        @trystatuslines = @columns[$idx..$idx+$trydata->{'trystatuscount'}-1];
        $trydata->{'trystatuslines'} = [@trystatuslines];
        $idx += $trydata->{'trystatuscount'};
        foreach my $trylabel ( @job_try_labels_2 ) {
            $trydata->{$trylabel} = $columns[$idx];
            $idx++;
        }
        push @trylist, $trydata;
    }
    $job->{'trylist'} = [@trylist];
    push @jobs, $job;
}

print Dumper(@jobs);
exit(0);

This information was good for understanding the format of the -all_columns output, but there are a couple of interesting flaws in the logic of the Perl program:

The filelist information is not presented well. If you are looking for the _number_ of files, you may be surprised at what you get.

There is no error-checking for parsing the list when there are embedded commas within the record (i.e. path_last_written can contain commas during an active job, and certain status codes have commas in the text in the try details). The commas ARE escaped (\,) but filelists can contain confusing commas (C:\,d:\,e:\).
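
One way to honor the escaped commas is to let a CSV parser do the splitting with the backslash as its escape character, which is also what the Python script further down does. A minimal sketch follows; the function name and the sample text are made up, and note that this does NOT solve the Windows filelist case, since a path ending in a backslash followed by a real separator comma looks exactly like an escaped comma:

```python
import csv

def split_record(line):
    """Split one record, treating backslash-comma as a literal comma."""
    return next(csv.reader([line], escapechar='\\'))

# Made-up fragment with one escaped comma inside a status description
sample = r'306,24,socket write failed\, retrying,43'
print(split_record(sample))
# ['306', '24', 'socket write failed, retrying', '43']
```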

If you are into Python, I have written another script that is [very] loosely based on the original Perl program. It currently REQUIRES Python 2.3 and is still something of a work in progress, but it has a lot more checks to be more consistent in its data presentation. It currently defaults to dumping the non-try related fields in csv format, but can also be used (with a -a switch) to dump out all the try data, too.

This most recent version covers 3.x, 4.x and 5.x output. The headers are for ALL fields, but the data will adjust for the type of input being given. It will handle input from several differing sources at the same time (4.5 and 5.1, for example), but I don't recommend doing this unless you have to. It gets a little weird.

Please don't hesitate to contact me if you have any questions. As always, no warranty on any code; all I can tell you is that I use it at work ;-)

#!/usr/bin/python
#
# bpdbreport.py
#
# Take input from bpdbjobs -report -all_columns
# and change info into human readable format
#
# Changelog:
#
# 2005-06-06:   DWR
#   Added output logic to determine whether the job data is
#   3.x, 4.x or 5.x. The header line is currently always a 5.x
#   header, the last fields just aren't used unless it is data
#   from a 5.x environment. Currently in testing mode.
# 2005-05-03:   DWR
#   Added logic to throw out lines that have embedded newline
#   characters in string values from the -most_columns output.
#   The csv module doesn't handle it very well. I still need to
#   figure out how to handle the broken lines for reporting.
# 2004-09-03:   DWR
#   Added "--hoursago","-q (quiet)", and "--shelve_dicts" options
#   to help facilitate creating a dictionary for later retrieval
#   by a different python script.
# 2004-08-02:   DWR
#   Added try/except block around test for start/end time and for
#   which state the job is currently listed. Discovered that it is
#   possible for a job to exist with absolutely NO information.
# 2004-02-27:   DWR
#   Added basic framework for displaying all job entries, not just
#   jobs with a State of "Done". The two new switches are:
#   --show_all   (shows Active,Done,Queued,Re-Queued)
#   --show_active (shows Active and Done)
#   The layout of this is not the cleanest in the world just yet.
#   It's just being handled by an if/elif to make separate dicts
#   for each type of Job State and then outputting data from each
#   dict. It should really try to determine a precedence to show
#   only the most recent Job State for a given jobid.
#   Maybe delete jobid from other dicts if it exists in done_master?
#   Changed output so that the header is a separate module.
# 2003-12-03:   DWR
#   Changed end time commandline entry to include end date
#   It was taking the date as dd/mmm/yyyy and assuming 00:00:00
#   for the time, which does NOT include the data from that day.
# 2003-10-30:   DWR
#   Discovered what job type 5 is: Import
#   Discovered what job type 3 is: Verify
#   Added import and verify categories
# 2003-10-28:   DWR
#   Added --mdy switch to readability output so that the interim
#   reporting solution can continue to function. The default format
#   for dates is dd/mmm/yyyy, and this allows for usage of dd/mm/yyyy
# 2003-10-21:   DWR
#   Added readability data handling to individual try information
#   Changed line processing to use the csv module in python 2.3
#   instead of the messy logic to walk the line looking for special
#   cases that broke because of embedded and escaped commas. Because
#   of this, the script now REQUIRES python 2.3
# 2003-10-02:   DWR
#   Started adding Job types for Netbackup 4.5, still need types
#   3 and 5
# 2003-09-17:   DWR
#   Cleaned up logic to find escaped commas in text so that the
#   fields will parse correctly. Known sections where this applies
#   are; path_last_written, filelist, and trystatusdetails.
# 2003-09-16:   DWR
#   Added more debug code for -d switch. This is almost completely
#   for my benefit only to aid in fixing other problems with the
#   script itself. Users would generally not be using it from day
#   to day. Problem lines will generate ERROR: line messages to
#   stderr now instead of generating a traceback. Most of this exception
#   handling is wrapped around the process_line module since that is
#   where the bulk of the problems would occur.
# 2003-08-11:   DWR
#   Added readability module to the all_data output so human
#   readable output is now possible (-a -v).
# 2003-08-08:   DWR
#   Fixed bug where embedded commas in path last written field
#   would screw up field splitting. Also started to add debug
#   mode switch to get more info that can be collected on stderr


import os
import re
import csv
import sys
import time
import types
import getopt
import string
import cPickle
import fileinput
import traceback

# definitions for indexed job columns
job_type  = {   '0' : 'Backup',
                '1' : 'Archive',
                '2' : 'Restore',
                '3' : 'Verify',
                '4' : 'Duplicate',
                '5' : 'Import',
                '6' : 'DB Backup',
                '7' : 'Vault'
            }
job_state = { '0' : 'Queued', '1' : 'Active', '2' : 'Re-Queued', '3' : 'Done' }
sched_type = { '0': 'Full', '1' : 'Differential', '2' : 'User Backup', '3' : 'User Archive', '4' : 'Cumulative' }
sub_type = {'0': 'Immediate', '1': 'Scheduled', '2': 'User-Initiated' }
retention_units = { '0': 'Unknown', '1': 'Days', '2': 'Unknown' }



def usage():
    print >>sys.stderr, '''\nbpdbreport.py usage:

    bpdbreport.py [switches] <filelist> | -

    -a                       all data format (includes try information)
    -d                       run in debug mode (outputs to stderr)
    -f format_file           column output format file
    -s dd/mmm/yyyy           define start time (default is epoch)
    -e dd/mmm/yyyy           define end time (default is current localtime)
    -h                       print this help and exit
    --hoursago hours         sets start time to number of hours ago
                               --hoursago and -s/-e should be mutually
                               exclusive, but they aren't yet. Use only
                               one or the other.
    --mdy                    change verbose date output format to mm/dd/yyyy
    --shelve_dicts filename  output dictionary to a python pickle object
                               This option implies -q
    --show_active            show Done and Active jobs
                               (may show duplicates if multiple files are used)
    --show_all               show All jobs
                               (may show duplicates if multiple files are used)
    -q                       quiet (no output to stdout)
    --usage                  print detailed help message and exit
    -v                       verbose (human readable output)

    Default output is the first 32 columns from bpdbjobs -report -all_columns
    formatted data. Columns can be defined by format file (see --usage for
    sample format file)

    examples:
        get all entries in verbose format from stdin:
            bpdbreport.py -s 05/may/2003 -e 05/jun/2003 -v -

        get data from file named all_columns.output and display columns defined
        in sample.fmt file
            bpdbreport.py -f sample.fmt all_columns.output

    Lines that generate bad data (mostly because of bugs or bad commas) will
    spit out error lines to stderr in the format of:
        ERROR: inputline
'''



def detailed_usage():
    usage()
    print >>sys.stderr, '''

# sample format file for column output
# This will skip all lines that start with # and
# all whitespace lines.
# Any lines that are incorrect will be dropped

jobid
jobtype
state
status
class
sched
client
server
start
elapsed
end
stunit
try
operation
kbytes
files
path_last_written
percent
jobpid
owner
subtype
classtype
schedtype
priority
group
master_server
retention_units
retention_period
compression
kbyteslastwritten
fileslastwritten
filelistcount
parentjob
kbpersec
copy
robot
vault
profile
session
ejecttapes
srcstunit
srcserver
srcmedia
dstmedia
stream
suspendable
resumable
restartable
datamovement
frozenimage
backupid
killable
controllinghost'''


def sec_to_hms( input ):
    input = seconds = int(input)
    hours = seconds / 3600
    seconds = seconds - hours*3600
    minutes = seconds / 60
    seconds = seconds - minutes*60
    return (hours,minutes,seconds)




def process_line(buffer):
    dict = {}
    idx = 0
    info_labels = ( 'jobid', 'jobtype', 'state', 'status', 'class', 'sched',
                    'client', 'server', 'start', 'elapsed', 'end', 'stunit',
                    'try', 'operation', 'kbytes', 'files', 'path_last_written',#17
                    'percent', 'jobpid', 'owner', 'subtype', 'classtype',
                    'schedtype', 'priority', 'group', 'master_server',
                    'retention_units', 'retention_period', 'compression',
                    'kbyteslastwritten', 'fileslastwritten', 'filelistcount' )
    try_labels1 = ( 'trypid', 'trystunit', 'tryserver', 'trystarted', 'tryelapsed',
                    'tryended', 'trystatus', 'trystatusdescription', 'trystatuscount' )
    try_labels2 = ( 'trybyteswritten','tryfileswritten' )
    info_labels4x = ( 'parentjob', 'kbpersec', 'copy', 'robot', 'vault', 'profile',
                    'session', 'ejecttapes', 'srcstunit', 'srcserver', 'srcmedia',
                    'dstmedia', 'stream' )
    info_labels5x = ( 'suspendable','resumable','restartable','datamovement',
                    'frozenimage','backupid','killable','controllinghost' )


    for label in info_labels:
        dict[label] = buffer[idx]
        idx += 1

    try:
        if int(dict['filelistcount']) > 0:
            for f in range ( idx, idx+int(dict['filelistcount']) ):
                try:
                    dict['filelist'].append(buffer[idx])
                except:
                    dict['filelist'] = [buffer[idx]]
                idx += 1
        dict['trycount'] = buffer[idx]
        idx += 1

        for job_try in range(1,int(dict['trycount'])+1):
            try_idx = 'try'+str(job_try)
            dict[try_idx] = {}
            for trylabel in try_labels1:
                dict[try_idx][trylabel] = buffer[idx]
                idx += 1
            if int(dict[try_idx]['trystatuscount']) > 0:
                for f in range ( idx, idx+int(dict[try_idx]['trystatuscount']) ):
                    try:
                        dict[try_idx]['trystatuslines'].append(buffer[idx])
                    except:
                        dict[try_idx]['trystatuslines'] = [buffer[idx]]
                    idx += 1
            for trylabel in try_labels2:
                dict[try_idx][trylabel] = buffer[idx]
                idx += 1
        try:
            for label in info_labels4x:
                dict[label] = buffer[idx]
                idx += 1
        except:
            pass
        try:
            for label in info_labels5x:
                dict[label] = buffer[idx]
                idx += 1
        except:
            pass

        return dict, 0, False
    except:
        return dict, sys.exc_info(),buffer



def get_output_cols( format_file ):
    try:
        col_fp = open( format_file, 'r' )
    except:
        print >>sys.stderr, 'could not open format file'
        return None

    col_fmt = []
    for line in col_fp:
        # Drop anything after a '#', then skip lines that are empty or whitespace
        buf = line.rstrip('\n').split('#',1)[0].strip()
        if buf:
            col_fmt.append(buf)
    col_fp.close()
    return col_fmt


def print_dict(d,k,t):
    if debug_mode:
        print >>sys.stderr,'DEBUG:   ',k,'{'
        keys = d.keys()
        keys.sort()
        for item in keys:
            if type(d[item]) is types.DictType:
                print_dict(d[item],item,t+1)
            elif type(d[item]) is types.ListType:
                print_list(d[item],item,t+1)
            else:
                print >>sys.stderr,'DEBUG:   ','\t'*t+item,':',d[item]
        print >>sys.stderr,'DEBUG:   ','}'
    else:
        print k,'{'
        keys = d.keys()
        keys.sort()
        for item in keys:
            if type(d[item]) is types.DictType:
                print_dict(d[item],item,t+1)
            elif type(d[item]) is types.ListType:
                print_list(d[item],item,t+1)
            else:
                if verbose:
                    print '\t'*t+item,':',readability( item, d[item] )
                else:
                    print '\t'*t+item,':',d[item]
        print '}'


def print_list(l,k,t):
    if debug_mode:
        idx = 0
        print >>sys.stderr,'DEBUG:   ','\t'*(t-1)+k,'{'
        for item in l:
            if type(item) is types.DictType:
                print_dict(l[idx],item,t+1)
            elif type(item) is types.ListType:
                print_list(l[idx],item,t+1)
            else:
                print >>sys.stderr,'DEBUG:   ','\t'*t+l[idx]
            idx += 1
        print >>sys.stderr,'DEBUG:   ','\t'*(t-1)+'}'
    else:
        idx = 0
        print '\t'*(t-1)+k,'{'
        for item in l:
            if type(item) is types.DictType:
                print_dict(l[idx],item,t+1)
            elif type(item) is types.ListType:
                print_list(l[idx],item,t+1)
            else:
                if verbose:
                    print '\t'*t+readability( item, l[idx] )
                else:
                    print '\t'*t+l[idx]
            idx += 1
        print '\t'*(t-1)+'}'

def output_data( d,col_fmt_input ):
    list4x = [ 'parentjob', 'kbpersec', 'copy', 'robot', 'vault', 'profile',
            'session', 'ejecttapes', 'srcstunit', 'srcserver', 'srcmedia',
            'dstmedia', 'stream' ]
    list5x = [ 'suspendable','resumable','restartable','datamovement',
            'frozenimage','backupid','killable','controllinghost' ]

    keys = d.keys()
    keys.sort()

    if all_data:
        for key in keys:
            print key,'{'
            k = d[key].keys()
            k.sort()
            for item in k:
                if type(d[key][item]) is types.ListType:
                    print_list(d[key][item],item,1)
                elif type(d[key][item]) is types.DictType:
                    print_dict(d[key][item],item,1)
                else:
                    if verbose:
                        print item,':',readability( item, d[key][item] )
                    else:
                        print item,':',d[key][item]
            print '}*** END',key,'***\n'
        return

    for key in keys:
        col_fmt = col_fmt_input
        nbuVersion = get_nbuVersion(d[key])

        if not col_fmt:
            col_fmt = [ 'jobid', 'jobtype', 'state', 'status', 'class', 'sched',
                    'client', 'server', 'start', 'elapsed', 'end', 'stunit',
                    'try', 'operation', 'kbytes', 'files', 'path_last_written',
                    'percent', 'jobpid', 'owner', 'subtype', 'classtype',
                    'schedtype', 'priority', 'group', 'master_server',
                    'retention_units', 'retention_period', 'compression',
                    'kbyteslastwritten', 'fileslastwritten', 'filelistcount' ]
            if nbuVersion == '4x':
                for item in list4x:
                    col_fmt.append(item)
            elif nbuVersion == '5x':
                for item in list4x:
                    col_fmt.append(item)
                for item in list5x:
                    col_fmt.append(item)

        col_out = ''
        for column in col_fmt:
            if verbose:
                data = readability(column, d[key][column])
                col_out += data+','
            else:
                col_out += d[key][column]+','
        col_out = col_out.rstrip(',')
        print col_out


def get_nbuVersion(key):
    if key.has_key('suspendable'):
        nbuVersion = '5x'
    elif key.has_key('kbpersec'):
        nbuVersion = '4x'
    else:
        nbuVersion = '3x'
    return nbuVersion


def print_header( col_fmt,d ):
    ''' This pretty much assumes a 5.1 header for the csv output'''
    if not col_fmt:
        col_fmt = [ 'jobid', 'jobtype', 'state', 'status', 'class', 'sched',
                'client', 'server', 'start', 'elapsed', 'end', 'stunit',
                'try', 'operation', 'kbytes', 'files', 'path_last_written',
                'percent', 'jobpid', 'owner', 'subtype', 'classtype',
                'schedtype', 'priority', 'group', 'master_server',
                'retention_units', 'retention_period', 'compression',
                'kbyteslastwritten', 'fileslastwritten', 'filelistcount',
                'parentjob', 'kbpersec', 'copy', 'robot', 'vault', 'profile',
                'session', 'ejecttapes', 'srcstunit', 'srcserver', 'srcmedia',
                'dstmedia', 'stream', 'suspendable','resumable','restartable',
                'datamovement', 'frozenimage','backupid','killable','controllinghost' ]

    col_out = ''
    for header in col_fmt:
        col_out += header.upper()+','
    col_out = col_out.rstrip(',')
    print col_out


def readability( key, string ):
    try:
        if key == 'jobtype':
            string = job_type[string]
        elif key == 'state':
            string = job_state[string]
        elif key == 'schedtype':
            string = sched_type[string]
        elif key == 'subtype':
            string = sub_type[string]
        elif key in ['start','end','trystarted','tryended']:
            if mdy:
                string = time.strftime( '%m/%d/%Y %H:%M:%S', time.localtime(int(string)))
            else:
                string = time.strftime( '%d/%b/%Y %H:%M:%S', time.localtime(int(string)))
        elif key in ['elapsed','tryelapsed']:
            (h,m,s) = sec_to_hms(string)
            string = '%d:%02d:%02d' % (h,m,s)
    except:
        pass
    return string


def output_debug_dict( d ):
    keys = d.keys()
    keys.sort()

    print >>sys.stderr, 'DEBUG:   ', d['jobid'],'{'
    for key in keys:
        if type(d[key]) is types.ListType:
            print_list(d[key],key,1)
        elif type(d[key]) is types.DictType:
            print_dict(d[key],key,1)
        else:
            print >>sys.stderr, 'DEBUG:   ', key,':',d[key]
    print >>sys.stderr, 'DEBUG:   }*** END',d['jobid'],'***'
    return



if __name__ == '__main__':
    try:
        opts, args = getopt.getopt(sys.argv[1:],
                                    "f:s:e:hvxadq", ["hoursago=","shelve_dicts=", "show_active", "show_all", "usage", "mdy"])
    except getopt.GetoptError:
        # print help information to stderr and exit:
        usage()
        sys.exit(2)
    if not args:
        for o, a in opts:
            if o == "-h":
                usage()
                sys.exit()
            if o == "--usage":
                detailed_usage()
                sys.exit()
        print >>sys.stderr, '\nArgument list can not be empty'
        print >>sys.stderr, 'use "-" for stdin'

        usage()
        sys.exit(1)

    # Check to see if filenames are valid files
    argflag = False
    for arg in args:
        if arg != '-':
            if not os.path.exists(arg):
                argflag = True
                print >>sys.stderr, '\nFile', arg, 'does not exist.'
    if argflag:
        usage()
        sys.exit(1)

    # Commandline argument defaults
    all_data        = False                         # -a
    debug_mode      = False                         # -d
    xplicite        = False                         # -x (not used yet)
    verbose         = False                         # -v
    mdy             = False                         # --mdy
    start_date      = 0                             # -s
    show_active     = False                         # --show_active
    show_all        = False                         # --show_all
    end_date        = time.mktime(time.localtime()) # -e
    format_file     = ''                            # -f
    col_fmt         = ''                            # parsed column output string
    shelve_dicts    = False                         # shelve data for future use
    output          = True                          # -q default output, option turns it off

    for o, a in opts:
        if o == "-h":
            usage()
            sys.exit()
        if o == "--usage":
            detailed_usage()
            sys.exit()
        if o == "-f":
            format_file = a
        if o == "-d":
            debug_mode = True
        if o == "--shelve_dicts":
            shelve_dicts = True
            output = False
            pkl = a
        if o == "-q":
            output = False
        if o == "-v":
            verbose = True
        if o == "--show_active":
            show_active = True
        if o == "--show_all":
            show_all = True
            show_active = False
        if o == "--mdy":
            mdy = True
        if o == "--hoursago":
            hoursago = a
            start_date = end_date - int(hoursago) * 3600
        if o == "-s":
            try:
                start_date = time.mktime(time.strptime(a, '%d/%b/%Y'))
            except ValueError:
                print >>sys.stderr, '\nDate values must be in dd/mmm/yyyy format'
                usage()
                sys.exit(1)
        if o == "-e":
            try:
                end_date = time.mktime(time.strptime(a, '%d/%b/%Y'))
                end_date += 86399   # Add 23:59:59 to enddate to include that day
            except ValueError:
                print >>sys.stderr, '\nDate values must be in dd/mmm/yyyy format'
                usage()
                sys.exit(1)
        if o == "-a":
            all_data = True
        if o == "-x":
            xplicite = True

    done_master = {}
    active_master = {}
    queued_master = {}
    requeued_master = {}

    try:
        if debug_mode:
            print >>sys.stderr, 'DEBUG: Options and Arguments:'
            for o,a in opts:
                print >>sys.stderr, 'DEBUG:   ', o, a
        for inputline in fileinput.input(args):
            try:
                for line in csv.reader([inputline], escapechar='\\'):
                    try:
                        d, exc, buf_debug = process_line(line)
                        if exc:
                            # A bare "raise" has no active exception here; raise explicitly.
                            raise RuntimeError(exc[1])
                    except:
                        if debug_mode:
                            print >>sys.stderr, 'DEBUG:  ', '*'*30
                            print >>sys.stderr, 'DEBUG:   Filename:            ', fileinput.filename()
                            print >>sys.stderr, 'DEBUG:   Line Number:         ', fileinput.lineno()
                            print >>sys.stderr, 'DEBUG:   Exception:           ', exc[0]
                            print >>sys.stderr, 'DEBUG:   Exception:           ', exc[1]
                            print >>sys.stderr, 'DEBUG:   Dict Contents:       '
                            output_debug_dict(d)
                            print >>sys.stderr, 'DEBUG:   ', buf_debug
                            print >>sys.stderr, 'DEBUG:   ', line
                            print >>sys.stderr, 'DEBUG:  ', '*'*30
                        else:
                            print >>sys.stderr, 'ERROR: ', line
                    else:
                        try:
                            if int(d['start']) >= start_date and int(d['start']) <= end_date:
                                # To make this cleaner, maybe cross check dicts based on
                                # the assumption that Done jobs are the most important?
                                # Keep only the first record seen for each jobid.
                                state = int(d['state'])
                                if state == 0:
                                    queued_master.setdefault(d['jobid'], d)
                                elif state == 1:
                                    active_master.setdefault(d['jobid'], d)
                                elif state == 2:
                                    requeued_master.setdefault(d['jobid'], d)
                                elif state == 3:
                                    done_master.setdefault(d['jobid'], d)
                        except:
                            if debug_mode:
                                exc = sys.exc_info()
                                print >>sys.stderr, 'DEBUG:  ', '*'*30
                                print >>sys.stderr, 'DEBUG:   Filename:            ', fileinput.filename()
                                print >>sys.stderr, 'DEBUG:   Line Number:         ', fileinput.lineno()
                                print >>sys.stderr, 'DEBUG:   Exception:           ', exc[0]
                                print >>sys.stderr, 'DEBUG:   Exception:           ', exc[1]
                                print >>sys.stderr, 'DEBUG:   Dict Contents:       '
                                output_debug_dict(d)
                                print >>sys.stderr, 'DEBUG:   ', buf_debug
                                print >>sys.stderr, 'DEBUG:   ', line
                                print >>sys.stderr, 'DEBUG:  ', '*'*30
                            else:
                                print >>sys.stderr, 'ERROR: ', line
            except:
                print >>sys.stderr, 'ERROR: ', inputline

        if format_file:
            col_fmt = get_output_cols(format_file)
        if output:
            if not all_data:
                print_header(col_fmt,done_master)
            output_data(done_master, col_fmt)
            # --show_all already includes active jobs, so print them only once.
            if show_all:
                output_data(active_master, col_fmt)
                output_data(queued_master, col_fmt)
                output_data(requeued_master, col_fmt)
            elif show_active:
                output_data(active_master, col_fmt)
        if shelve_dicts:
            fp_output = open(pkl, 'wb')
            cPickle.dump(done_master,fp_output,1)
            fp_output.close()

    except KeyboardInterrupt:   # Catch premature ^C
        traceback.print_tb(sys.exc_info()[2])
        sys.exit(3)

# modeline vim:set ts=4 sw=4 et:

I found that some of the fields contain an embedded comma, which makes the "split" in the Perl script problematic. However, such commas appear to be escaped, e.g. "\,". So a simple workaround is to precede the "split" command with a substitute command that replaces the "\," string with something else. I chose "s/\\,/;/g".
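
For readers working in Python rather than Perl, the same substitution workaround can be sketched as follows. This is only an illustration: the function name split_escaped and the sample line are made up for this example, and the ";" placeholder is simply the one chosen above. Like the Perl substitution, it does not cope with a field that ends in an escaped backslash.

```python
def split_escaped(line, placeholder=";"):
    """Split a bpdbjobs line on commas after replacing escaped commas.

    Hypothetical helper: every "\\," sequence is replaced with
    `placeholder` first, so the plain split no longer breaks those fields.
    """
    return line.rstrip("\n").replace("\\,", placeholder).split(",")


# Made-up sample line: the third field contains an escaped comma.
sample = r"101,0,begin writing\, tape A00001,done"
print(split_escaped(sample))
```

The placeholder ends up in the output, so pick a character that cannot be confused with real data in your reports.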

Here's another script that's been posted a few times. Still not sure it works with NBU 6:


##
## bpdbjobs_parse()
##
## This function is derived from the following Veritas command:
##
##   /usr/openv/netbackup/bin/admincmd/bpdbjobs
##
## --PLB 12/19/2001
##
sub bpdbjobs_parse {
    my $self = shift;
    $_ = shift;
    my $tmpfile;
    chomp;
    s/'/\\'/g; # Backslash-escape every single quote before parsing.
    ##
    ## jobid,jobtype,state,status,class,schedule, client, server, started, elapsed,
    ## ended, stunit, try, operation, kbytes, files, pathlastwritten, percent,
    ## jobpid, owner, subtype, classtype, schedule_type, priority, group,
    ## masterserver, retentionunits, retentionperiod, compression,
    ## kbyteslastwritten, fileslastwritten, filelistcount, [files]..., trycount,
    ## [trypid, trystunit, tryserver, trystarted, tryelapsed, tryended, trystatus,
    ## trystatusdescription, trystatuscount, [trystatuslines]..., trybyteswritten,
    ## tryfileswritten]
    ##
    my(
        $jobid,
        $jobtype,
        $state,
        $status,
        $class,
        $schedule,
        $client,
        $server,
        $started,
        $elapsed,
        $ended,
        $stunit,
        $try,
        $operation,
        $kbytes,
        $files,
        $pathlastwritten,
        $percent,
        $jobpid,
        $owner,
        $subtype,
        $classtype,
        $schedule_type,
        $priority,
        $group,
        $masterserver,
        $retentionunits,
        $retentionperiod,
        $compression,
        $kbyteslastwritten,
        $fileslastwritten,
        @files_and_tries,
    ) = parse_line(",", 0, $_);  # presumably Text::ParseWords::parse_line; needs "use Text::ParseWords;"
    # Patterns are anchored so that, e.g., a jobtype of "10" is not taken for "0".
    for( $jobtype ) {
        /^0$/ and do { $jobtype = "backup"    ; last };
        /^1$/ and do { $jobtype = "archive"   ; last };
        /^2$/ and do { $jobtype = "restore"   ; last };
        /^3$/ and do { $jobtype = "verify "   ; last };
        /^4$/ and do { $jobtype = "duplicate" ; last };
        /^5$/ and do { $jobtype = "import "   ; last };
        /^6$/ and do { $jobtype = "db_backup" ; last };
        /^7$/ and do { $jobtype = "vault"     ; last };
    }
    for( $state ) {
        /^0$/ and do { $state = "queued"      ; last };
        /^1$/ and do { $state = "active"      ; last };
        /^2$/ and do { $state = "re-queued"   ; last };
        /^3$/ and do { $state = "done"        ; last };
    }
    my($filelistcount) = shift(@files_and_tries);
    my(@files);
    for(1..$filelistcount) {
        # Strip leading and trailing whitespace
        $tmpfile = shift @files_and_tries;
        $tmpfile =~ s/^\s*//;
        $tmpfile =~ s/\s*$//;
        push( @files, $tmpfile );
    }
    my($specifiedfiles) = join(", ", @files);
    my($trycount) = shift(@files_and_tries);
    my(%tries,$trynum);
    foreach $trynum (1..$trycount) {
        my($trypid)                    = shift(@files_and_tries);
        my($trystunit)                 = shift(@files_and_tries);
        my($tryserver)                 = shift(@files_and_tries);
        my($trystarted)                = shift(@files_and_tries);
        my($tryelapsed)                = shift(@files_and_tries);
        my($tryended)                  = shift(@files_and_tries);
        my($trystatus)                 = shift(@files_and_tries);
        my($trystatusdescription)      = shift(@files_and_tries);
        my($trystatuscount)            = shift(@files_and_tries);
        my(@trystatuslines);
        for(1..$trystatuscount) {
            push(@trystatuslines, shift(@files_and_tries) );
        }
        my($trystatuslines) = join("\n", @trystatuslines);
        my($trykbyteswritten)      = shift(@files_and_tries);
        my($tryfileswritten)       = shift(@files_and_tries);
        %tries = (
            %tries,
            "try_${trynum}_pid"               => "$trypid",
            "try_${trynum}_stunit"            => "$trystunit",
            "try_${trynum}_server"            => "$tryserver",
            "try_${trynum}_started"           => "$trystarted",
            "try_${trynum}_elapsed"           => "$tryelapsed",
            "try_${trynum}_ended"             => "$tryended",
            "try_${trynum}_status"            => "$trystatus",
            "try_${trynum}_statusdescription" => "$trystatusdescription",
            "try_${trynum}_statuscount"       => "$trystatuscount",
            "try_${trynum}_statuslines"       => "$trystatuslines",
            "try_${trynum}_kbyteswritten"     => "$trykbyteswritten",
            "try_${trynum}_fileswritten"      => "$tryfileswritten",
        );
    }
    my(%record) = (
        jobid             => "$jobid",
        jobtype           => "$jobtype",
        state             => "$state",
        status            => "$status",
        class             => "$class",
        schedule          => "$schedule",
        client            => "$client",
        server            => "$server",
        started           => "$started",
        elapsed           => "$elapsed",
        ended             => "$ended",
        stunit            => "$stunit",
        try               => "$try",
        operation         => "$operation",
        kbytes            => "$kbytes",
        files             => "$files",
        path              => "$pathlastwritten",
        percent           => "$percent",
        jobpid            => "$jobpid",
        owner             => "$owner",
        subtype           => "$subtype",
        classtype         => "$classtype",
        schedule_type     => "$schedule_type",
        priority          => "$priority",
        group             => "$group",
        masterserver      => "$masterserver",
        retentionunits    => "$retentionunits",
        retentionperiod   => "$retentionperiod",
        compression       => "$compression",
        kbyteslastwritten => "$kbyteslastwritten",
        fileslastwritten  => "$fileslastwritten",
        filelistcount     => "$filelistcount",
        specifiedfiles    => "$specifiedfiles",
        trycount          => "$trycount",
    );
    %record = (%record, %tries);
    return %record;
}
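

The walk through the two variable-length sections (the file list, then the status lines inside each try) may be easier to follow in a compact form. Here is a rough Python sketch of the same idea, not a port of the Perl sub: parse_fields, FIXED_FIELDS and TRY_FIELDS are names invented for this example, the sample values are made up, and the input is assumed to be already split into fields. The key names follow the Perl sub above (e.g. "kbyteswritten").

```python
# The 31 fixed fields that precede filelistcount, in documented order.
FIXED_FIELDS = [
    "jobid", "jobtype", "state", "status", "class", "schedule", "client",
    "server", "started", "elapsed", "ended", "stunit", "try", "operation",
    "kbytes", "files", "pathlastwritten", "percent", "jobpid", "owner",
    "subtype", "classtype", "schedule_type", "priority", "group",
    "masterserver", "retentionunits", "retentionperiod", "compression",
    "kbyteslastwritten", "fileslastwritten",
]
# Fixed fields at the start of each try, before trystatuscount.
TRY_FIELDS = [
    "pid", "stunit", "server", "started", "elapsed", "ended",
    "status", "statusdescription",
]


def parse_fields(fields):
    """Turn an already-split list of fields into a nested dict."""
    it = iter(fields)
    record = {name: next(it) for name in FIXED_FIELDS}
    filelistcount = int(next(it))
    record["filelist"] = [next(it).strip() for _ in range(filelistcount)]
    record["tries"] = []
    for _ in range(int(next(it))):          # trycount
        attempt = {name: next(it) for name in TRY_FIELDS}
        statuscount = int(next(it))         # trystatuscount
        attempt["statuslines"] = [next(it) for _ in range(statuscount)]
        attempt["kbyteswritten"] = next(it)
        attempt["fileswritten"] = next(it)
        record["tries"].append(attempt)
    return record


# Made-up record: 31 placeholder fixed fields, two files, one try.
sample = [str(n) for n in range(1, 32)]
sample += ["2", "/u01", " /u02"]
sample += ["1", "4242", "stu1", "media1", "1000", "60", "1060", "0", "ok",
           "2", "mounting", "positioning", "512", "10"]
job = parse_fields(sample)
print(job["filelist"], job["tries"][0]["started"])
```

Consuming the fields through an iterator avoids computing offsets such as "33 + filelistcount + 4" by hand; a truncated line simply raises StopIteration.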


The escaped-comma substitution isn't a perfect solution, because some fields can end in a backslash-escaped backslash. For instance: "...,C:\\,...". In this case the comma isn't escaped; the backslash is.

I tried to come up with a perfect solution using "split" but couldn't. I ended up using a global match like this:

my @line = ($line =~ /(.*?(?<!\\)(?:\\\\)*)(?:,|$)/g);

As far as I could test, it seems to work correctly, assigning all the comma-separated fields in $line to @line.

--Gnustavo 04:57, 9 December 2007 (PST)
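
For comparison, the same rule (a comma separates fields unless the backslash before it is itself unescaped) can be sketched in Python with a small hand-written scanner. This is not a translation of the regular expression above, just one possible way to get the same split; split_unescaped and the sample line are invented for this example, and escape sequences are left in the fields as-is.

```python
def split_unescaped(line):
    """Split on commas that are not preceded by an escaping backslash.

    A backslash escapes the next character, so "\\\\," (escaped backslash
    followed by a comma) ends a field while "\\," does not. Escape
    sequences are kept verbatim in the returned fields.
    """
    fields, current, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            current.append(line[i:i + 2])   # keep the escaped pair together
            i += 2
        elif ch == ",":
            fields.append("".join(current))  # unescaped comma ends the field
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


# Made-up sample: second field ends in an escaped backslash,
# third field contains an escaped comma.
sample = r"17,C:\\,begin writing\, tape A00001,done"
print(split_unescaped(sample))
```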