did ltfs screw up my LTO5 tape?
Post did ltfs screw up my LTO5 tape? 
Hi all,

Thinking I could actually use LTFS (HP's LTFS 1.2.0) for production, I began rsync'ing 1.2TB of data onto an LTFS-formatted Maxell LTO5 tape using this HP SAS LTO5 drive:

Vendor: HP Model: Ultrium 5-SCSI Rev: Z39D
Type: Sequential-Access ANSI SCSI revision: 06

I don't know how many gigs were written to the tape, but eventually this popped up in messages :

messages-20110731:Jul 30 03:41:42 mobymc kernel: INFO: task ltfs:18133 blocked for more than 120 seconds.
messages-20110731:Jul 30 03:41:42 mobymc kernel: ltfs D ffff88021fe70e00 0 18133 1 0x00000000
messages-20110731:Jul 30 03:43:42 mobymc kernel: INFO: task ltfs:18133 blocked for more than 120 seconds.
messages-20110731:Jul 30 03:43:42 mobymc kernel: ltfs D ffff88021fe70e00 0 18133 1 0x00000000
messages-20110731:Jul 30 03:45:42 mobymc kernel: INFO: task ltfs:18133 blocked for more than 120 seconds.
messages-20110731:Jul 30 03:45:42 mobymc kernel: ltfs D ffff88021fe70e00 0 18133 1 0x00000000
messages-20110731:Jul 30 03:47:42 mobymc kernel: INFO: task ltfs:18133 blocked for more than 120 seconds.
messages-20110731:Jul 30 03:47:42 mobymc kernel: ltfs D ffff88021fe70e00 0 18133 1 0x00000000
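
The "D" in those lines is the process state for uninterruptible sleep, which is why nothing that touched the stuck process would return. As a rough sketch for checking a log for the same symptom, something like the following counts the hung-task warnings per task. The function name and the log path in the usage comment are placeholders, and it assumes the kernel message wording shown above:

```shell
# Count kernel "hung task" warnings per task:pid in a syslog-style file.
# Usage (path is only an example): hung_tasks /var/log/messages
hung_tasks() {
    grep -E 'INFO: task .* blocked for more than [0-9]+ seconds' "$1" |
        sed -E 's/.*INFO: task ([^ ]+) blocked.*/\1/' |
        sort | uniq -c | awk '{print $2, $1}'
}
```

Each output line is the task name and PID followed by the number of warnings logged for it.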

There was nothing wrong with the FC-connected EonStor from which the data was being pulled, but the rsync process was basically stuck. I used --progress with rsync so I know it was really stuck, plus I left it on for more than an hour and not a single additional byte was transferred. I couldn't umount the tape (without -l), nor run lsof (hang), nor run "ps aux | grep something" (hang); ltfs had seriously fsck'd the system. I rebooted the system and planned on using good old tar to make my backup.

I started the tar with :

mt -f /dev/nst0 rewind
mt -f /dev/nst0 compression 1
tar -b 1024 -cvWf /dev/nst0 directory

but then tar gave an I/O error after transferring what I think was several gigs. dmesg showed:

st0: Block limits 1 - 16777215 bytes.

I then recalled that ltfs makes two partitions on the tape, one for "metadata" and one for data. Rather than kill the next few hours with an "mt erase", I ran :

mt -f /dev/nst0 rewind
dd if=/dev/zero of=/dev/nst0 bs=4k count=6291456

and wrote 24GB worth of zeros to the tape. I was then able to write the first dataset of 1.2TB and another (after doing an mt fsf 1 after the first tar) dataset of ~200GB . I also ran a script which dumps the output of "tar -b 1024 -tvf /dev/nst0" for each tar position on the tape for another script that I have that tells me how much I have stored on the tape (using the tar -tvf dumps) if I need to use the tape later in the future. Everything was going great.
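
For illustration, the size-totalling step could be sketched like this. It is not the actual script, just a minimal version that assumes GNU tar's -tv listing format (size in the third column); the function name and listing filename are made up:

```shell
# Sum the size column (field 3 in GNU tar's -tv output) of a saved listing.
# Reads the listing on stdin and prints the total in bytes.
tar_listing_bytes() {
    awk '{ total += $3 } END { printf "%.0f\n", total }'
}

# Example usage against a saved listing (filename is hypothetical):
#   tar -b 1024 -tvf /dev/nst0 > file0.lst
#   tar_listing_bytes < file0.lst
```

Directories are listed with size 0, so they don't affect the total.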

Then I came in Monday, ran an mt -f /dev/nst0 rewind and then tried to eject the tape with "mt -f /dev/nst0 eject" and got this beautiful error:

messages:Aug 1 12:30:58 mobymc kernel: st0: Add. Sense: Medium removal prevented

I could still rewind the tape and fsf the tape, but it wouldn't eject. I tried pushing the eject button on the drive several times, but it sounded like the drive was trying to push the tape out, couldn't, and so was re-seating the tape back onto the motor. I finally held down the eject button for a few seconds, lifted the plastic door cover, and slightly nudged the tape into the drive, and it came out. To make sure the drive had just had a random hiccup, I put the tape back in and was again able to run mt eject without problems. The tape drive and tapes are practically new and have had almost no real use.

Now I wanted to make sure the tape still had the tar'd data, but to my surprise the tar's on the tape were gone! Every attempt at tar -b 1024 -tvf kept returning the error that the data didn't look like a tar archive. I had used the W flag with tar, so tar itself had verified the data, and then I ran through all 1.4TB worth of tar data on the tape to generate the entire tvf listings the night before, but now I couldn't retrieve anything on the tape!

Thinking that I'd have to re-do all the tars again I opted to do a full mt -f /dev/nst0 erase. Again to my surprise, rather than taking several hours for the full erase, it finished in about 15 mins. I then tried tar'ing the 1.2TB data set onto the tape, but again it prematurely stopped tar'ing after only a few gigs of data were written with an I/O error to the tape (tape full), as if LTFS's initial metadata partition was still on the tape. This time however, the st block limits error didn't show up in dmesg. Showing no mercy to the tape I ran this :

mt -f /dev/nst0 erase
dd if=/dev/zero of=/dev/nst0 bs=524288

524288 is the blocksize mentioned in this LTFS user guide (thus the -b 1024 blocking factor used with tar): http://bizsupport2.austin.hp.com/bc/docs/support/SupportManual/c02262008/c02262008.pdf . dd reported having written 3.4TB of data at 285MB/s. I was then convinced that the tape was "clean". I tried my tar of the 1.2TB data set again, but still got the same tape full / tar I/O error after only a few GB were written.
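
To spell out the arithmetic behind those numbers: tar's -b option counts 512-byte records, so -b 1024 works out to 524288 bytes per block, and the earlier dd with bs=4k count=6291456 comes to 24 GiB. A quick sketch (the function names are just for illustration):

```shell
# tar's -b counts 512-byte records, so -b N gives N * 512 bytes per block.
tar_block_bytes() { echo $(( $1 * 512 )); }

# Total bytes written by dd for a given bs (in bytes) and count.
dd_total_bytes() { echo $(( $1 * $2 )); }

tar_block_bytes 1024          # prints 524288
dd_total_bytes 4096 6291456   # prints 25769803776 (24 GiB)
```

The 3.4TB figure from dd is well above LTO-5's 1.5TB native capacity, which is presumably the drive's compression at work on a stream of zeros rather than a sign that the whole tape surface was overwritten.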

I had another LTO5 tape untouched by LTFS, so I started tar'ing the data to this tape a few hours ago and everything is going well so far. Has anyone had similar experiences with LTFS? Is there any way to rescue the misbehaving LTO5 tape? How did my tars get corrupted?

Thanks,
Sabuj Pattanayek

Post LTFS is NOT Production Ready.... 
First off....LTFS is NOT ready for real-time production. Why? There are many reasons....in fact, TOLIS Group (the makers of BRU) have gone so far as to actually publish a list of LTFS caveats publicly that they feel hinder the current state of LTFS.

Some of the most notable ones are:

--LTFS doesn't support spanning volumes — Unlike BRU PE, if you are writing to an LTFS volume, you must be aware of how much data you've written to the tape since LTFS, like disk, does not have a mechanism for prompting for a new tape. If you attempt to write more data than the tape will hold, you will simply get a "no space left on device" error and the write will fail. In the same situation, BRU PE will safely write what it can onto the current tape and then prompt for an additional tape and complete the operation on the new volume.

--LTFS doesn't offer a verification mechanism — Unlike BRU formatted tapes, LTFS volumes can only be verified on the system where they were created and written and then only by performing a full file-by-file comparison against the original files on disk.

--LTFS tapes can't be used for normal backup and archival operations — With the current LTFS tools, once a tape is formatted for use as an LTFS volume, it is not possible to unformat it for use as a normal data tape. This means that tapes defined for LTFS use can only be used for LTFS purposes. If you are only testing the LTFS waters, be sure that you're not depleting your normal backup media pool since you won't be able to use that tape with your normal backup application.

So, basically, now that you have that LTO-5 tape formatted as an LTFS tape, it will remain an LTFS tape until the LTFS makers have come up with a way to remove the partitioning of the LTFS format to return the tape to a standard LTO-5 data tape that TAR, CPIO, and DD will be able to use and span multiple volumes.

Your current tape, once filled, must stop there. You cannot fill up an LTFS tape, pause, switch tapes, and continue on a second tape. What you're writing must fit on the first tape and the first tape only. Unlike with TAR, which will allow a single TAR archive to span multiple tapes.
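
Since nothing will prompt for a second tape, it may be worth checking up front whether the data fits. Here's a rough sketch of such a check, assuming the mounted LTFS volume reports its free space through df like any other filesystem (I can't vouch for how accurate that figure is once compression is involved). The function names, the 5% default margin, and the /mnt/ltfs mount point are all made up for the example:

```shell
# Succeeds (exit 0) only if needed_kb fits within avail_kb minus a safety margin.
# $1 = needed KB, $2 = available KB, $3 = margin in percent (default 5).
fits_with_margin() {
    needed_kb=$1 avail_kb=$2 margin_pct=${3:-5}
    usable_kb=$(( avail_kb - avail_kb * margin_pct / 100 ))
    [ "$needed_kb" -le "$usable_kb" ]
}

# Compare a source directory against the free space on a mounted volume.
# Example: check_fits /data/project /mnt/ltfs   (both paths are placeholders)
check_fits() {
    src=$1 mnt=$2
    needed=$(du -sk "$src" | awk '{print $1}')
    avail=$(df -Pk "$mnt" | awk 'NR==2 {print $4}')
    fits_with_margin "$needed" "$avail"
}
```

Running check_fits before the copy and refusing to start when it fails avoids finding out about the "no space left on device" error partway through.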

Furthermore, without verification, how is anyone able to determine that the data was written to the tape correctly? The LTO organization has declared that the drive performs what's known as "Read After Write Verify" and doesn't need a "verify" process. Sure, and I completely trust the government to do what's best for the people that elected them and the country as a whole because of the checks and balances designed into the government system. Tape drives do have read after write verify; they've done this for over 20 years. However, garbage in, garbage out. If the data is damaged before the tape write head gets the data (i.e. TAR messes something up, the SCSI interface damages something, the LTFS driver hiccups, or the tape drive itself is faulty internally), the read head will only verify that the write head wrote the "correct" junk to the tape. There's no other mechanism for verifying against the original source data (comparison verify) or for checksum verification. Therefore, you're entrusting all of your data to the LTFS layer and any applications using it to write the data to tape.
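
Outside of LTFS itself, one way to get a checksum verify is to build a manifest of the source files before writing and check a read-back copy against it afterwards. This is only a sketch: the function names are made up, and it assumes GNU coreutils and findutils (sha256sum, sort -z, xargs -0 -r):

```shell
# Write a SHA-256 manifest for every regular file under a directory.
# Paths are stored relative to the directory so the manifest can be checked
# from any restore location. Pass an absolute manifest path outside the tree.
make_manifest() {
    src=$1 manifest=$2
    ( cd "$src" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# Verify a restored copy against the manifest.
# Exits non-zero if any file differs or is missing.
verify_manifest() {
    dir=$1 manifest=$2
    ( cd "$dir" && sha256sum --quiet -c "$manifest" )
}
```

This catches damage anywhere between the source disk and the read-back copy, but it still costs a full read of the tape, and it can't tell you where in the chain the damage happened.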

Also remember that when you delete data from an LTFS tape, the space the file, or files, took up on the tape is not "freed", since the only thing that happens is that the reference to the file is removed from the index. The LTO organization declares this is done intentionally so that if you accidentally delete a file, you can perform a re-index of the whole tape to get the actual data back (it rebuilds the listing of files on the tape). So that was some decent thinking on their part, but because they have not documented that "deleted" data is not freed from the tape, the user assumes that it is, and that deleting a 100GB file allows another 100GB to be written in its place, which is not the case.

Given that you've used TAR to basically clobber the data partition of the LTFS tape, the tape itself is now hosed unless you know of a way to "reformat" the tape...I'm not aware of one. With the data partition of the tape having been trashed, the tape itself is more than likely useless. With that said, give reformatting the tape a shot and maybe LTFS can rebuild the data partition on the tape. No 'dd' command that you run will remove the LTFS partitioning.

Post LTFS format 
The Dual partition is not part of LTFS, it is defined by the LTO consortium and thus is reversible.

Your LTFS-formatted cartridge can be formatted to a single partition using standard SCSI commands. Refer to your drive vendor's SCSI reference for the commands that can be issued to do this. They are not standard Linux commands, of course, but if you build the command correctly it will make your media a single partition.
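
As a hedged illustration of what "building the command" might look like: the SCSI stream command set defines a 6-byte FORMAT MEDIUM command (opcode 0x04) whose FORMAT field selects either the default format or partitioning according to the medium partition mode page (0x11). The sketch below only prints an sg_raw invocation (sg_raw is from sg3_utils) and does not send anything to the drive. Whether the default format gives a single partition on a particular drive, and what preconditions apply (such as the tape being rewound first), are assumptions to check against the vendor's SCSI reference. /dev/sgN is a placeholder and the function names are made up:

```shell
# Build the 6-byte FORMAT MEDIUM CDB (opcode 0x04) as hex bytes.
# $1 = FORMAT field: 0 = default format, 1 = partition per mode page 0x11.
format_medium_cdb() {
    printf '04 00 %02x 00 00 00\n' "${1:-0}"
}

# Print (do NOT run) an sg_raw invocation for a SCSI generic device.
# Formatting destroys everything on the tape; verify against the drive manual.
print_format_cmd() {
    echo "sg_raw $1 $(format_medium_cdb "${2:-0}")"
}

print_format_cmd /dev/sgN    # prints: sg_raw /dev/sgN 04 00 00 00 00 00
```

Printing the command rather than executing it is deliberate, since a wrong byte here goes straight to the drive.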

As previously stated, the tar clobbered the LTFS tape. In most cases you will need to uninstall LTFS from the system to use the cartridge in a standard tape back-up.

LTFS seems to be working well for me. You might want to check the HP troubleshooting guide for LTFS.

Post Partially Correct... 
>>The Dual partition is not part of LTFS, it is defined by the LTO consortium and thus is reversible.


Yes, that is correct, it's not an LTFS thing and yes, it's reversible. However, the current firmware and LTFS tools that are publicly available do not support the specific operation to erase the partitions. The latest BETA ones do, but the BETA is only available to select groups right now.

>>Your LTFS formatted cartridge can be formatted to a single partition using standard SCSI commands.

Correct. That also means that you're telling someone who probably doesn't know how to issue direct SCSI commands to do so in a manner that is supported, and if the user is on Mac OS X it could be more of an issue because Mac OS X has no native tape support like Linux and other Unix operating systems do.

You're also asking someone to learn T10 specifications regarding SCSI-2 logic commands and issue commands to a drive. Someone without the knowledge of this can really mess up their drive if they don't know what they are doing. Is the juice worth the squeeze?

My suggestion is to hold on to those tapes and when the next release of LTFS is made public you can revert your tapes back to usable LTO-5 data tapes with a single LTFS command. A process that the user is currently familiar with since they've been using the LTFS tools to make the tape, mount it, and unmount it already. Patience can be a virtue.

>>In most cases you will need to uninstall LTFS from the system to use the cartridge in a standard tape back-up.

Why would you need to uninstall LTFS? If you don't use the LTFS tools to format the tape, the tape remains a standard LTO-5 tape. It's only after you've formatted the tape for LTFS that it becomes an LTFS tape and uninstalling the LTFS tools does not make the tape usable like normal. It's still an LTFS tape. After all, it's been formatted.

Formatting a hard drive into three partitions, installing an OS and then uninstalling an OS doesn't remove the partitions. You must still go back and remove the partitions using hard drive partitioning tools. LTFS is the same way. You've formatted the tape using LTFS. The partitioning capability is defined by the TPC, but the TPC doesn't dictate HOW to partition, just that it does support partitions.

If you go look, one of the big problems right now with LTFS is that there is no standard. IBM's version of LTFS is a bit different than HP's version and as such they are "kinda" compatible. The primary reason is that there is no official standard for the way LTFS is to be done and how it's to be implemented. While this is a feat that the TPC is working to overcome, the fact remains that there is currently no standard for cross compatibility of an HP LTFS tape and an IBM LTO-5 drive. You will see "oddities" if you "cross the streams" essentially.
