Why does NetWorker seem to hang?


A. Mike Myers submits (28 Mar 2000): There seems to be a problem with the tuning parameters Sun recommends for high-speed networks (e.g. Gigabit Ethernet and ATM), specifically the tcp_xmit_hiwat and tcp_recv_hiwat tunables, which Sun recommends setting to 64K each (info doc 17416, which appears to quote the AnswerBook). Values this high appear to trigger a bug in the TCP/IP stack. Returning them to their default of 8K lowers throughput a little, but it has greatly increased the reliability of our environment.
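
As a rough illustration of returning the two high-water marks to their defaults, here is a minimal Python sketch that only builds the ndd command lines; it does not run them. It assumes the tunables live under the /dev/tcp driver, that "8K" means 8192 bytes, and that you would run the printed commands as root on the Solaris host. The helper names (ndd_set, restore_default_watermarks) are made up for this example.

```python
# Sketch only: builds ndd command lines, it does not execute anything.
# Assumption: "8K" in the text means 8192 bytes.
DEFAULT_WATERMARK = 8 * 1024


def ndd_set(param, value, device="/dev/tcp"):
    """Return the ndd command line that sets one TCP tunable."""
    return f"ndd -set {device} {param} {value}"


def restore_default_watermarks():
    """Return the commands that put both high-water marks back at 8K."""
    params = ("tcp_xmit_hiwat", "tcp_recv_hiwat")
    return [ndd_set(p, DEFAULT_WATERMARK) for p in params]


if __name__ == "__main__":
    for cmd in restore_default_watermarks():
        print(cmd)
```

Values set with ndd do not survive a reboot, so if a startup script on the host sets these tunables to 64K, that script would need the same change.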

Another person (alain.gourves@paribas.com) sent me the following, which may also address the problem (or at least it did for them). This was after I sent him our experience with returning to the defaults: To correct our problem, we raised the TCP/IP parameter tcp_conn_req_max_q to 256 (128 by default) and left tcp_recv_hiwat and tcp_xmit_hiwat at 65635.
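
The alternative described here changes only the connection queue and leaves the high-water marks alone. The sketch below, again in Python and again only generating command text, shows one way to emit ndd commands solely for tunables whose value would actually change. The function name plan_changes and the two dictionaries are hypothetical; the 128 and 256 values come from the message above, and /dev/tcp is assumed as the driver.

```python
def plan_changes(current, desired, device="/dev/tcp"):
    """Return ndd -set commands only for tunables whose value would change."""
    return [
        f"ndd -set {device} {name} {value}"
        for name, value in desired.items()
        if current.get(name) != value
    ]


# Values as reported in the text: the queue starts at its default of 128.
# In practice the current values would be read from the host first.
current = {"tcp_conn_req_max_q": 128}
desired = {"tcp_conn_req_max_q": 256}

if __name__ == "__main__":
    for cmd in plan_changes(current, desired):
        print(cmd)
```

Because the high-water marks are not listed in the desired settings, no command is generated for them, which matches the "left as they were" approach in the message.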

I found a great tuning document at this site: http://www.rvs.uni-hannover.de/people/voeckler/tune/EN/tune.html

Sun has closed the issue because the organization that originally opened it was happy with the workaround (reducing the values back to their defaults). We're looking into getting it reopened and actually fixed.

Another hang issue, which may or may not be specific to Solaris (the platform we run on), involves the "Nsrmmd polling interval" and "Nsrmmd control timeout" settings. These apply only to storage nodes (they control how often NetWorker checks whether nsrmmd is still running), and we ran into this hang after adding a new storage node at the end of a WAN link. We changed ours from the defaults (3 and 5 minutes, I believe) to 10 minutes each. This solved some problems that were destabilizing our environment. Apparently this is addressed in the upcoming release of Legato (it is a known bug).