From a previous life spent trying to keep on top of, and improve, a NetBackup and Bacula enterprise backup environment, my abiding feeling is that trying to
a) understand what's going on
b) optimise (e.g. minimise failures, maximise throughput and ROI) and
c) plan new environments
is a black art, akin to trying to nail jelly to a wall. I think (though we never had the funds to assess them) that there are reporting tools "out there", perhaps specific to particular products, but that none of them does anything particularly clever. By that I mean going beyond symptoms (what has failed or is at risk, and for how long) to drill into underlying problems, finding and presenting correlations between what is going on and where, in the backup network and beyond, and guiding support staff on where best to focus their energies.
Am I right, or is what I could call "the backup reporting problem" already solved, and what we experienced was just the result of a lack of investment and of not understanding what was available?
I find myself in a position to do some research into current data analysis and machine learning techniques, and would like to apply them to areas a), b) and c) above. But if ours was just a local problem that we could have solved with tools already on the market, then there's not a lot of point and I'll find something else to work on!
Thanks for any feedback,
Phil Weber
