How to warn the sysadmin in case of trouble?
Post How to warn the sysadmin in case of trouble? 
Dear all,

Today I bring you another important issue regarding backing up things: how do you keep your sysadmin informed of what's happening?

This question is prompted by a real-life example I had with a client. They had a simple backup script running daily, which emailed the sysadmin a report of what had been done. Everything was working fine... until it wasn't. There was a problem with the server and they had to retrieve things from the backup. But there was a problem: the backup script had stopped working months before, due to *insert some irrelevant reason*. The sysadmin didn't notice he had stopped receiving the daily reports until it was too late.

What's the bottom line? To me, it's that a daily report that everything was ok is not good. It might be better to receive a report when things stop being ok.

I was wondering what approaches you guys usually follow to achieve this. I take it you might have some sort of warning and not just a "blind trust" that all the backups are working as expected.


Cheers,

Miguel Almeida

Post How to warn the sysadmin in case of trouble? 
On 2012-01-10 12:50 PM, Miguel Almeida <miguel < at > almeida.at> wrote:
> This question is brought with a real life example I had with a client.
> They had a simple backup script running daily, which emailed the
> sysadmin with a report of what had been done. Everything was working
> fine...until it wasn't. There was a problem with the server and they had
> to retrieve things from the backup. But there was a problem...the backup
> script had stopped work months before, due to *insert some irrelevant
> reason*. The sysadmin didn't notice he stopped receiving the daily
> reports until it was too late.

Well, that was the admin's fault. There is nothing you can do to protect
people from their own laziness or stupidity.

The bottom line is, if I am getting daily email reports for something
that is important, I *will* miss them if they don't arrive.

> What's the bottom line? To me, it's that a daily report that everything
> was ok is not good. It might be better to receive a report *when things
> stop being ok.*

Well, the problem with that approach is, what if something breaks that
causes the warnings to not get sent? You would have no way of knowing.

--

Best regards,

Charles

------------------------------------------------------------------------------
Write once. Port to many.
Get the SDK and tools to simplify cross-platform app development. Create
new or port existing apps to sell to consumers worldwide. Explore the
Intel AppUpSM program developer opportunity. appdeveloper.intel.com/join
http://p.sf.net/sfu/intel-appdev
_______________________________________________
rsnapshot-discuss mailing list
rsnapshot-discuss < at > lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rsnapshot-discuss

Post How to warn the sysadmin in case of trouble? 
On Tue, Jan 10, 2012 at 17:50, Miguel Almeida <miguel < at > almeida.at> wrote:
> I was wondering what approaches you guys usually follow to achieve this. I
> take it you might have some sort of warning and not just a "blind trust"
> that all the backups are working as expected.

Integrate with the existing monitoring solution - if somebody doesn't
have one of those then they don't care enough anyway ;)

I've got a very simple string match being used with Xymon against the
rsnapshot log file, and the log file is rotated daily.
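
Rob doesn't post his Xymon rule, but the idea of a string match against a daily-rotated log can be sketched in Python. This is only an illustration: the two patterns, the function names and the log path are assumptions, not documented rsnapshot or Xymon behaviour, so check them against the lines your own log actually contains.

```python
import re
from pathlib import Path

# Assumed patterns: adjust them to whatever your rsnapshot version really logs.
SUCCESS = re.compile(r"completed successfully")
FAILURE = re.compile(r"ERROR", re.IGNORECASE)


def log_status(log_text: str) -> str:
    """Classify one day's log text as 'ok', 'error' or 'missing'."""
    if FAILURE.search(log_text):
        return "error"
    if SUCCESS.search(log_text):
        return "ok"
    # No success line at all: the job probably never ran or never finished.
    return "missing"


def check_log(path: str) -> str:
    """Read the log at `path` (a hypothetical location) and classify it."""
    log_file = Path(path)
    if not log_file.exists():
        return "missing"
    return log_status(log_file.read_text(errors="replace"))
```

Because the log is rotated daily, "missing" is the interesting case: an empty or absent log means the backup did not run today, which is exactly the failure a success-only email report hides.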

--
                 Please keep list traffic on the list.

Rob MacGregor
      Whoever fights monsters should see to it that in the process he
        doesn't become a monster.                  Friedrich Nietzsche

Post How to warn the sysadmin in case of trouble? 
Quoting Charles Marcus <CMarcus < at > Media-Brokers.com>:

> On 2012-01-10 12:50 PM, Miguel Almeida <miguel < at > almeida.at> wrote:
>> What's the bottom line? To me, it's that a daily report that everything
>> was ok is not good. It might be better to receive a report *when things
>> stop being ok.*
>
> Well, the problem with that approach is, what if something breaks that
> causes the warnings to not get sent? You would have no way of knowing.

Except if you have a ton of systems, then you will have so many 'ok'
notifications that it's easy to lose track.
A computer is much better at handling all those 'oks' than a person is.

So, IMHO, it's better to have multiple 'bad' notifications ready to
fire (such as script failure + zenoss/AlienVault/WhatsUp), and manual
infrequent spot checks. Though every few months is a bit too
infrequent...

Rick


Post How to warn the sysadmin in case of trouble? 
On Tue, Jan 10, 2012 at 05:50:50PM +0000, Miguel Almeida wrote:

> Today I bring you another important issue regarding backing up things:
> how do you keep your sysadmin informed of what's happening?

I have a strict policy of not talking to my sysadmin, ever, lest I
accidentally let slip that I used to be one and I then get pulled back
to the Dark Side of IT :-)

> What's the bottom line? To me, it's that a daily report that everything
> was ok is not good. It might be better to receive a report when things
> stop being ok.
>
> I was wondering what approaches you guys usually follow to achieve this.
> I take it you might have some sort of warning and not just a "blind
> trust" that all the backups are working as expected.

For my own backups, I have a daily cron job that checks that daily.0 is
less than 24 hours old and emails me if it isn't. If I were using
rsnapshot at work, I'd have a look to see if I could make red flashing
lights appear in a Nagios dashboard instead.
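
A minimal sketch of that kind of freshness check, in Python rather than whatever David actually uses. It assumes the modification time of the `daily.0` directory reflects the last rsnapshot run; the function names and the idea of passing `now` in for testing are illustrative, not taken from the thread.

```python
import os
import time
from typing import Optional

MAX_AGE_SECONDS = 24 * 60 * 60  # "less than 24 hours old"


def snapshot_age(path: str, now: Optional[float] = None) -> Optional[float]:
    """Return the age of `path` in seconds, or None if it does not exist."""
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return None
    current = time.time() if now is None else now
    return current - mtime


def stale_message(path: str, max_age: float = MAX_AGE_SECONDS,
                  now: Optional[float] = None) -> Optional[str]:
    """Return a warning if the snapshot is missing or too old, else None."""
    age = snapshot_age(path, now)
    if age is None:
        return f"{path} does not exist"
    if age > max_age:
        return f"{path} is {age / 3600:.1f} hours old"
    return None
```

Run from cron, a script would call `stale_message()` on the snapshot path (for example a hypothetical `/backup/snapshots/daily.0`) and print the result only when it is not `None`; cron then mails that output. If the backup itself runs every 24 hours, a slightly larger threshold avoids false alarms from small timing differences.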

--
David Cantrell | http://www.cantrell.org.uk/david

Suffer the little children to come unto me, as
their buying habits are most easily influenced.
-- Marketroid Jesus

Post How to warn the sysadmin in case of trouble? 
On Tue, Jan 10, 2012 at 12:26 PM, David Cantrell <david < at > cantrell.org.uk> wrote:
> On Tue, Jan 10, 2012 at 05:50:50PM +0000, Miguel Almeida wrote:
>> I was wondering what approaches you guys usually follow to achieve this.
>> I take it you might have some sort of warning and not just a "blind
>> trust" that all the backups are working as expected.
>
> For my own backups, I have a daily cron job that checks that daily.0 is
> less than 24 hours old and emails me if it isn't.  If I were using
> rsnapshot at work, I'd have a look to see if I could make red flashing
> lights appear in a Nagios dashboard instead.

Due to this thread, I ran a test on my similar script ... and found
out that the emails go to my spam folder. I've fixed that by telling
gmail that that specific address never spams, but it's something else
to check!

-scott

Post How to warn the sysadmin in case of trouble? 
I like Nagios. I have a script that reads all my rsnapshot configs and writes out the Nagios check config files.

There are 2 checks involved (not counting the host-up check):
  1. A verification script that all rsnapshot configs have been read in and are in Nagios <- got to monitor the monitor to make sure it's correct
  2. A check that each backup mount is less than X time old, measured from its last update.
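
Aaron's scripts aren't shown, so the following is only a rough Python sketch of what the two checks could look like. The exit codes are the standard Nagios plugin ones; the function names, the thresholds and the use of the directory's modification time are assumptions made for illustration.

```python
import os
import time
from typing import Iterable, List, Optional, Tuple

# Standard Nagios plugin exit codes (UNKNOWN is listed for completeness).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def unmonitored(configured: Iterable[str], monitored: Iterable[str]) -> List[str]:
    """Check 1: backups named in the rsnapshot configs that have no Nagios check."""
    return sorted(set(configured) - set(monitored))


def check_backup_age(path: str, warn_hours: float, crit_hours: float,
                     now: Optional[float] = None) -> Tuple[int, str]:
    """Check 2: return (exit_code, status_line) for one backup directory."""
    try:
        mtime = os.stat(path).st_mtime
    except OSError as exc:
        # Design choice: a backup we cannot even stat is reported as a failure.
        return CRITICAL, f"CRITICAL - cannot stat {path}: {exc.strerror}"
    current = time.time() if now is None else now
    age_hours = (current - mtime) / 3600
    if age_hours >= crit_hours:
        return CRITICAL, f"CRITICAL - {path} last updated {age_hours:.1f}h ago"
    if age_hours >= warn_hours:
        return WARNING, f"WARNING - {path} last updated {age_hours:.1f}h ago"
    return OK, f"OK - {path} last updated {age_hours:.1f}h ago"
```

A real plugin would print the status line and exit with the returned code, for example `code, line = check_backup_age(path, 26, 50)` followed by `print(line)` and `sys.exit(code)`. A non-empty result from `unmonitored()` would be reported the same way.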

Once I get back to the office I can see if these are scripts I can release or if they are "protected" by work copyright.

Thanks,
Aaron

Post How to warn the sysadmin in case of trouble? 
Why make it complicated? Rsnapshot and cron have this feature built in.

If you simply set "verbosity 2" in your rsnapshot.conf, then it will only print out messages (and thus, the Cron daemon will only email you) if there is an error.

If you want to get daily "success" messages, just set "verbosity 3". Then you'll get an email after every run. This is not convenient because you must read them on a daily basis (to see if the text indicates an error instead of the expected daily output).

Note, for cron-generated emails to work you need to do something like the following:

- If using root's crontab, then set a valid email address alias for "root" in /etc/aliases. Then run "newaliases". You should do this for every server you monitor anyway, as various system processes (like failed security updates, intrusion detection, mdraid, or other services) will email root < at > localhost by default.

- If using a local user account for the crontab, have that user's email forward to your actual address.

- Alternatively, you can just set the MAILTO variable in your crontab. Then it doesn't matter what aliases are set up, it will send mail to whatever's in MAILTO.
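
The mechanism behind all of this is that cron mails whatever a job prints, so "email me only on failure" really means "stay silent on success". For a job that has no verbosity setting of its own, a small wrapper can give the same behaviour. The Python sketch below is a hypothetical helper, not part of rsnapshot or cron, and `run_quietly` is a made-up name.

```python
import subprocess
import sys
from typing import List


def run_quietly(command: List[str]) -> int:
    """Run `command`; stay silent on success, replay its output on failure."""
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    if result.returncode != 0:
        # Anything printed here is what cron would put in the email body.
        print(f"command failed with exit status {result.returncode}: "
              f"{' '.join(command)}")
        print(result.stdout, end="")
    return result.returncode
```

A cron job would call `sys.exit(run_quietly(sys.argv[1:]))` so that the wrapped command's exit status is preserved. This only covers failures of a job that actually starts; it says nothing if cron or the machine itself is down.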


I do agree that all critical services should have an external monitor with alerts (I prefer ServerDensity). But for a simple "email me if it fails" feature, nothing else is required.


--Derek Simkowiak

------------------------------------------------------------------------------
Ridiculously easy VDI. With Citrix VDI-in-a-Box, you don't need a complex
infrastructure or vast IT resources to deliver seamless, secure access to
virtual desktops. With this all-in-one solution, easily deploy virtual
desktops for less than the cost of PCs and save 60% on VDI infrastructure
costs. Try it free! http://p.sf.net/sfu/Citrix-VDIinabox

Post How to warn the sysadmin in case of trouble? 
On Tue, Jan 10, 2012 at 6:19 PM, Derek Simkowiak <derek < at > simkowiak.net> wrote:
> Why do complicated?  Rsnapshot and cron have this feature built-in.
>
> If you simply set "verbosity 2" in your rsnapshot.conf, then it will
> only print out messages (and thus, the Cron daemon will only email you) if
> there is an error.

It will only email you in case of error messages if the backup server
is itself running and executing cron jobs! You definitely want to
monitor that the jobs on the system are running and that they are
running correctly, regardless of how you choose to break that up.

-scott

Post How to warn the sysadmin in case of trouble? 
Scott> It will only email you in case of error messages if the backup server is itself running and executing cron jobs!
Nico> And if it doesn't run, for whatever reason, you get a deafening silence.
Sheesh guys. Obviously.

If your server isn't powered, or if you have a bad network cable, or if your DNS is broken, or any one of a million other things, then OF COURSE, you won't get the email. For example, the alerts might be going into your Spam folder (right Scott?).

The OP's question was simply how to get alerts if Rsnapshot failed. I didn't think a blueprint for a multi-regional high availability setup (plus an essay on Spam filtering practices) was warranted.

I did explicitly state that "all critical services should have an external monitor with alerts". I like ServerDensity because then the monitoring happens on a completely independent, external network (and a separate organization is responsible for making sure the alerts are working).

> Also, sad to say, most Linux and UNIX setups send all cron mail to the local root account
...I also gave three different ways to tell cron how to send email to the right place (along with several reasons why you should always set up a valid root alias).

And finally, "all critical services should have an external monitor with alerts".


--Derek

Commercial plug: http://www.cool-st.com/managed-sercon-servers/

On 01/11/2012 05:05 PM, Nico Kadel-Garcia wrote:
> On Tue, Jan 10, 2012 at 9:19 PM, Derek Simkowiak <derek < at > simkowiak.net> wrote:
>> Why do complicated?  Rsnapshot and cron have this feature built-in.
>>
>> If you simply set "verbosity 2" in your rsnapshot.conf, then it will
>> only print out messages (and thus, the Cron daemon will only email you) if
>> there is an error.
>
> And if it doesn't run, for whatever reason, you get a deafening
> silence. Also, sad to say, most Linux and UNIX setups send all cron
> mail to the local root account, not anywhere else you could find it.
> So unless you log in and check it, the stuff is sitting there in
> /var/spool/mail/root, unread.

