RHEL6/XFS/NW 7.6.2
Post RHEL6/XFS/NW 7.6.2 
Good day, everyone.

I am here today because I have this year upgraded my NW server infrastructure (server and
storage nodes) to x86_64 RHEL6 on four year old IBM 3650's and purchased the extra (Academic
pricing) license to use XFS on the storage nodes with NexSan fibre arrays (point-to-point
attached, no fibre switch). Twice since I have upgraded to 7.6.2 (once at NW 7.6.2.3 the last
weekend of August, and just this past last weekend of October at NW 7.6.2.5) the one storage
node that I have converted to using XFS has hung between midnight and 1am on the Saturday
morning.

When it happened in August, I upgraded to the latest RedHat maintenance (new kernel) because
the dump that the system took pointed into the kernel and likely at the XFS code. This time,
there was no dump, the system just hung. The support person on call took a screen capture to
show what was on the unresponsive console, forced a reboot, and then dropped and broke his
laptop before giving me the screen image (OUCH!). There's nothing in the logs anywhere of
course...

Before I ask EMC and RedHat to point fingers at each other, I thought I'd ping you illustrious
folk to see if any of you know of anything that would indicate I have either royally screwed up
by choosing XFS or if perhaps you have more experience with XFS and can suggest something in
the way of tuning to "make it stop" (right now, I'm using mostly default options, with the
single exception of adding the inode64 option to allow inodes to be placed beyond the first 4TB
of the 17TB disk).

Thanks for any pointers!

--
Frank Swasey | http://www.uvm.edu/~fcs
Sr Systems Administrator | Always remember: You are UNIQUE,
University of Vermont | just like everyone else.
"I am not young enough to know everything." - Oscar Wilde (1854-1900)


via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER

Post RHEL6/XFS/NW 7.6.2 
In regard to: [Networker] RHEL6/XFS/NW 7.6.2, Francis Swasey said (at...:

Before I ask EMC and RedHat to point fingers at each other,

:)

I thought
I'd ping you illustrious folk to see if any of you know of anything that
would indicate I have either royally screwed up by choosing XFS or if
perhaps you have more experience with XFS and can suggest something in
the way of tuning to "make it stop" (right now, I'm using mostly default
options, with the single exception of adding the inode64 option to allow
inodes to be placed beyond the first 4TB of the 17TB disk).

XFS obviously isn't as widely tested on Linux as something like ext3, but
sometimes it's the only viable choice.

It's been a while since I've used it on Linux and the usage pattern was
different, so I don't think I can offer any useful configuration
suggestions for XFS.

If you haven't already done so, though, I highly suggest you enable the
Linux "Magic SysRq" key on the troublesome storage node, as well as enable
netdump(8). Have the SysRq key guide printed out and available to your
operators, so they can get a task list, lock list, memory state, etc.
Might help Red Hat track down what's going on.
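
For readers who have not used it, here is a minimal sketch of enabling Magic SysRq and sending a command from a shell. The `/proc/sys/kernel/sysrq` and `/proc/sysrq-trigger` paths and the key letters come from the kernel's SysRq documentation; the helper names `sysrq_enable`, `sysrq_key` and `sysrq_send` are made up for illustration, and the real writes assume root.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; run as root for real writes.

# Enable all SysRq functions until the next reboot. To persist the
# setting, add "kernel.sysrq = 1" to /etc/sysctl.conf.
sysrq_enable() {
    echo 1 > /proc/sys/kernel/sysrq
}

# Map a readable name to the SysRq command letter.
sysrq_key() {
    case "$1" in
        tasks)   echo t ;;  # dump the state of every task
        blocked) echo w ;;  # dump tasks in uninterruptible (blocked) state
        memory)  echo m ;;  # dump memory usage
        locks)   echo d ;;  # show held locks (needs lock debugging in the kernel)
        cpus)    echo l ;;  # backtrace of all active CPUs
        sync)    echo s ;;  # sync all mounted filesystems
        crash)   echo c ;;  # force a crash, which triggers kdump if configured
        *)       echo "unknown SysRq name: $1" >&2; return 1 ;;
    esac
}

# Send a SysRq command; the output lands in the kernel log and console.
# With DRY_RUN=1 the command is only printed, not executed.
sysrq_send() {
    key=$(sysrq_key "$1") || return 1
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "echo $key > /proc/sysrq-trigger"
    else
        echo "$key" > /proc/sysrq-trigger
    fi
}
```

On a hung console the same letters are typed as Alt+SysRq+<letter>. From a shell, `DRY_RUN=1 sysrq_send blocked` prints `echo w > /proc/sysrq-trigger` without touching the system.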

Tim
--
Tim Mooney Tim.Mooney < at > ndsu.edu
Enterprise Computing & Infrastructure 701-231-1076 (Voice)
Room 242-J6, IACC Building 701-231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164


Post RHEL6/XFS/NW 7.6.2 
On Nov 1, 2011, at 9:59 AM, Francis Swasey wrote:

Good day, everyone.

I am here today because I have this year upgraded my NW server infrastructure (server and
storage nodes) to x86_64 RHEL6 on four year old IBM 3650's and purchased the extra (Academic
pricing) license to use XFS on the storage nodes with NexSan fibre arrays (point-to-point
attached, no fibre switch). Twice since I have upgraded to 7.6.2 (once at NW 7.6.2.3 the last
weekend of August, and just this past last weekend of October at NW 7.6.2.5) the one storage
node that I have converted to using XFS has hung between midnight and 1am on the Saturday
morning.

When it happened in August, I upgraded to the latest RedHat maintenance (new kernel) because
the dump that the system took pointed into the kernel and likely at the XFS code. This time,
there was no dump, the system just hung. The support person on call took a screen capture to
show what was on the unresponsive console, forced a reboot, and then dropped and broke his
laptop before giving me the screen image (OUCH!). There's nothing in the logs anywhere of
course...

Before I ask EMC and RedHat to point fingers at each other, I thought I'd ping you illustrious
folk to see if any of you know of anything that would indicate I have either royally screwed up
by choosing XFS or if perhaps you have more experience with XFS and can suggest something in
the way of tuning to "make it stop" (right now, I'm using mostly default options, with the
single exception of adding the inode64 option to allow inodes to be placed beyond the first 4TB
of the 17TB disk).

Thanks for any pointers!

--
Frank Swasey | http://www.uvm.edu/~fcs
Sr Systems Administrator | Always remember: You are UNIQUE,
University of Vermont | just like everyone else.
"I am not young enough to know everything." - Oscar Wilde (1854-1900)




I've been using XFS on Fedora/CentOS for many years for our primary storage. I had something similar occur with a troublesome server/external disk chassis some years ago. I replaced various software and hardware parts and never fixed it. A seemingly identical server/storage pair never had the issue. I used SysRq to reboot the thing, as Tim suggested. The problem only went away after I replaced the parts entirely; I never really resolved it.

We have not put CentOS 6 in production yet, so currently I am using CentOS 5 with HP EVA Fibre Channel or HP SAS external disk carriers. No issues since I dumped the old hardware. Hard to say if it was the disk carriers, the RAID cards, or what. This was a first-gen ProLiant DL585, so a while ago.

I have some EXT4 in service, seems fine within its limitations.
I generally don't set up FS over 5 TB since I have lots of groups with storage and want parallel backup streams whenever possible.
Depending on size/use of the FS I might create it with higher agcount for more parallel journalling.

Chuck



Post RHEL6/XFS/NW 7.6.2 
On Tue, Nov 1, 2011 at 8:59 AM, Francis Swasey <Frank.Swasey < at > uvm.edu> wrote:


Before I ask EMC and RedHat to point fingers at each other, I thought I'd
ping you illustrious
folk to see if any of you know of anything that would indicate I have
either royally screwed up
by choosing XFS or if perhaps you have more experience with XFS and can
suggest something in
the way of tuning to "make it stop" (right now, I'm using mostly default
options, with the
single exception of adding the inode64 option to allow inodes to be placed
beyond the first 4TB
of the 17TB disk).

Thanks for any pointers!


Frank, is there any opportunity for you to take the storage node offline,
remove networker from the picture, and put some sequential and then random
IO stress on the subsystem?

I have a 24TB XFS AFTD volume right now, but few clients to stress it
(aside maybe for staging operations...); however it stayed rock solid
during benchmarking (dd, bonnie++, iozone).
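
As a rough illustration of the sequential part of such a test, here is a minimal `dd` sketch. `seq_write_test` is a hypothetical helper, the directory and size in the usage note are placeholders, and writing zeros is only a crude load; bonnie++ or iozone would be needed for the random I/O side.

```shell
#!/bin/sh
# Sketch only: seq_write_test is a hypothetical helper, not a benchmark suite.

# Write SIZE_MB mebibytes of zeros into DIR in 1 MiB blocks, force the
# data to disk before dd exits (conv=fsync), then remove the test file.
seq_write_test() {
    dir=$1
    size_mb=$2
    file="$dir/ddtest.$$"
    dd if=/dev/zero of="$file" bs=1M count="$size_mb" conv=fsync || return 1
    rm -f "$file"
}
```

For example, `seq_write_test /mnt/aftd 102400` would write about 100 GiB to a placeholder mount point; dd reports the throughput on stderr when it finishes.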

file system mount options:
rw,noatime,nodiratime,logbufs=8,logbsize=256K,nobarrier,osyncisdsync,inode64

create options:
mkfs.xfs -l version=2 -d su=128k,sw=11

to match raid stripe size and the number of RAID bands in the device (22
disks, RAID1+0)

At boot:
elevator=noop

and a sizeable read-ahead for sequential reads (with read-ahead completely
disabled on the RAID controller itself):
/sbin/blockdev --setra 16384 /dev/sdxx


Post RHEL6/XFS/NW 7.6.2 
Sent from a medium size mobile device

On Nov 1, 2011, at 21:40, Eugene Vilensky <evilensky < at > gmail.com> wrote:

On Tue, Nov 1, 2011 at 8:59 AM, Francis Swasey <Frank.Swasey < at > uvm.edu> wrote:

Before I ask EMC and RedHat to point fingers at each other, I thought I'd ping you illustrious
folk to see if any of you know of anything that would indicate I have either royally screwed up
by choosing XFS or if perhaps you have more experience with XFS and can suggest something in
the way of tuning to "make it stop" (right now, I'm using mostly default options, with the
single exception of adding the inode64 option to allow inodes to be placed beyond the first 4TB
of the 17TB disk).

Thanks for any pointers!

Frank, is there any opportunity for you to take the storage node offline, remove networker from the picture, and put some sequential and then random IO stress on the subsystem?


There is, as long as I empty off these 34TB of data first. I will get started on that. I did performance testing of XFS vs EXT4 in a different environment before I deployed here. Perhaps there is something unique with the QLogic HBAs or the NexSan itself here.


I have a 24TB XFS AFTD volume right now, but few clients to stress it (aside maybe for staging operations...); however it stayed rock solid during benchmarking (dd, bonnie++, iozone).

I am a firm believer in teach vs give (as in fish...) - So, if you don't mind, would you care to explain why you chose to use those mount/mkfs options? Perhaps there is some documentation that I have missed in my various searches.


file system mount options:
rw,noatime,nodiratime,logbufs=8,logbsize=256K,nobarrier,osyncisdsync,inode64

create options:
mkfs.xfs -l version=2 -d su=128k,sw=11

to match raid stripe size and the number of RAID bands in the device (22 disks, RAID1+0)

At boot:
elevator=noop

and a sizeable read-ahead for sequential reads (with read-ahead completely disabled on the RAID controller itself):
/sbin/blockdev --setra 16384 /dev/sdxx

And above all, thank you for being willing to share your knowledge.

Frank


Post RHEL6/XFS/NW 7.6.2 
On 11/1/11 11:56 AM, Tim Mooney wrote:
If you haven't already done so, though, I highly suggest you enable the
Linux "Magic SysRq" key on the troublesome storage node, as well as enable
netdump(8). Have the SysRq key guide printed out and available to your
operators, so they can get a task list, lock list, memory state, etc.
Might help Red Hat track down what's going on.

I have not yet set up "Magic SysRq", but I will now. Thank you for the idea.

--
Frank Swasey | http://www.uvm.edu/~fcs
Sr Systems Administrator | Always remember: You are UNIQUE,
University of Vermont | just like everyone else.
"I am not young enough to know everything." - Oscar Wilde (1854-1900)



Post RHEL6/XFS/NW 7.6.2 
On 11/1/11 12:43 PM, Charles Weber wrote:
I have some EXT4 in service, seems fine within its limitations.
I generally don't set up FS over 5 TB since I have lots of groups with storage and want parallel backup streams whenever possible.
Depending on size/use of the FS I might create it with higher agcount for more parallel journalling.

I am using EXT4 where the size of the AFTD is below 16TB myself (it's free with Red Hat, so it
makes sense). I have customers that want to have save sets that are 4TB and larger. So far,
I've been able to push them off (and keep them at 2TB max). However, the day is coming when I
will have to give in and allow 4TB save sets - 10Gb networks will make that doable in a 24 hour
period.

--
Frank Swasey | http://www.uvm.edu/~fcs
Sr Systems Administrator | Always remember: You are UNIQUE,
University of Vermont | just like everyone else.
"I am not young enough to know everything." - Oscar Wilde (1854-1900)



Post RHEL6/XFS/NW 7.6.2 
On Wed, Nov 2, 2011 at 6:55 AM, Francis Swasey <Frank.Swasey < at > uvm.edu> wrote:

I am a firm believer in teach vs give (as in fish...) - So, if you don't mind, would you care to explain why you chose to use those mount/mkfs options?  Perhaps there is some documentation that I have missed in my various searches.

Disclaimer: some of this came from a Gentoo guide / and Google, only
later referenced to man pages. And some might remain
unconfirmed...this is all on a UPS-protected DAS with BBU-protected
cache.

file system mount options:
rw,noatime,nodiratime,logbufs=8,logbsize=256K,nobarrier,osyncisdsync,inode64

logbufs and logbsize set the maximum number of log buffers and their
maximum size to keep in RAM. The general explanation is that this allows
longer delays between journal commits, so fewer IOs go to journal updates.
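
To put a number on that, a small sketch of the arithmetic: 8 buffers of 256 KiB each come to 2 MiB of in-memory log buffers per mounted filesystem. The helper name `xfs_logbuf_kib` is made up for illustration.

```shell
#!/bin/sh
# Sketch only: xfs_logbuf_kib is a hypothetical helper.

# Total in-memory log buffer space in KiB for a given logbufs count
# and logbsize (in KiB).
xfs_logbuf_kib() {
    logbufs=$1
    logbsize_kib=$2
    echo $(( logbufs * logbsize_kib ))
}
```

`xfs_logbuf_kib 8 256` prints `2048`, so the memory cost of the largest setting is small on a storage node.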


create options:
mkfs.xfs -l version=2 -d su=128k,sw=11

to match raid stripe size and the number of RAID bands in the device
(22 disks, RAID1+0, 11 bands), with 128K raid stripe
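
The geometry arithmetic can be sketched as follows, assuming a RAID 1+0 set where half the disks carry distinct data. The helper names and the `/dev/sdX` device are placeholders, the command is only printed rather than run, and it uses the `-l version=2` spelling from the mkfs.xfs man page.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and nothing is formatted.

# Number of data-bearing stripe members in a RAID 1+0 set of N disks.
raid10_sw() {
    echo $(( $1 / 2 ))
}

# Full stripe width in KiB: stripe unit (KiB) times stripe members.
full_stripe_kib() {
    echo $(( $1 * $2 ))
}

# Print the mkfs.xfs command for DISKS disks, a stripe unit of SU_KIB,
# and device DEV, instead of running it.
mkfs_xfs_cmd() {
    disks=$1
    su_kib=$2
    dev=$3
    sw=$(raid10_sw "$disks")
    echo "mkfs.xfs -l version=2 -d su=${su_kib}k,sw=${sw} $dev"
}
```

`mkfs_xfs_cmd 22 128 /dev/sdX` prints `mkfs.xfs -l version=2 -d su=128k,sw=11 /dev/sdX`, and `full_stripe_kib 128 11` prints `1408`, the full stripe in KiB.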

At boot:
elevator=noop

Don't let the scheduler muck with IO write ordering since there is
cache and limited intelligence on the DAS to take care of such things.
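
A sketch of checking and changing the scheduler per device at runtime, as an alternative to the boot parameter. It assumes the sysfs layout of kernels from the RHEL 6 era, where `/sys/block/<dev>/queue/scheduler` lists the schedulers with the active one in brackets; the helper names are made up.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; set_noop needs root.

# Extract the active scheduler (the bracketed name) from a line such as
# "noop anticipatory deadline [cfq]".
active_scheduler() {
    echo "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Show the active scheduler of a block device given as e.g. "sdb".
show_scheduler() {
    active_scheduler "$(cat "/sys/block/$1/queue/scheduler")"
}

# Switch one device to noop at runtime; the change is lost on reboot.
set_noop() {
    echo noop > "/sys/block/$1/queue/scheduler"
}
```

`active_scheduler "noop anticipatory deadline [cfq]"` prints `cfq`. Setting the scheduler per device leaves the default in place for the system disks.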


and a sizeable read-ahead for sequential reads (with read-ahead completely disabled on the RAID controller itself):
/sbin/blockdev --setra 16384 /dev/sdxx

 The block device should be smarter than "Adaptive read ahead" or
other hardware read-ahead options.  dd benchmarking (non scientific)
agrees.
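
For scale, `blockdev --setra` takes its value in 512-byte sectors, so 16384 corresponds to 8 MiB of read-ahead. A small sketch of the conversion, with a made-up helper name:

```shell
#!/bin/sh
# Sketch only: ra_sectors_to_kib is a hypothetical helper.

# Convert a blockdev read-ahead value (512-byte sectors) to KiB.
ra_sectors_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

# The current value can be read back with:
#   /sbin/blockdev --getra /dev/sdX
```

`ra_sectors_to_kib 16384` prints `8192`. The setting does not survive a reboot, so it would need to be reapplied from a boot script.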

Hope this helps. This is my first large XFS device but so far so
good. I also aligned the LVM metadata to stripe size with

pvcreate --metadatasize=128KB /dev/sdb --dataalignment=128KB

and am not using partitions on the device. Otherwise I'd start my
partition on a 128K stripe boundary also.
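
One way to verify the alignment afterwards is to check that the start of the first physical extent is a whole multiple of the stripe unit. The sketch below shows only the arithmetic; `is_aligned` is a made-up helper and the offsets in the usage note are illustrative values, not measurements from this system.

```shell
#!/bin/sh
# Sketch only: is_aligned is a hypothetical helper.

# Succeed when OFFSET_KIB is a whole multiple of STRIPE_KIB.
is_aligned() {
    [ $(( $1 % $2 )) -eq 0 ]
}

# The start of the first physical extent can be listed with:
#   pvs --noheadings --units k -o pe_start /dev/sdb
```

`is_aligned 1024 128` succeeds, while `is_aligned 192 128` fails because 192 KiB falls in the middle of a 128K stripe unit.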

