Testing your hard drive in Linux
I recently needed to test a hard drive in Linux and had a hard time finding out how to do it properly. In DOS, you can run a Surface Scan in ScanDisk. Linux has nothing called a Surface Scan; the equivalent is called checking for bad blocks.
What is a block? Basically, it is a fixed-size chunk of space on the hard drive. A 40 GB partition, for example, is divided into a large number of indexed blocks of perhaps 4096 bytes each. Block 0 is the first 4096 bytes, Block 1 the second 4096 bytes, and so on. An important thing to know is that these "blocks" belong to the filesystem. A block size is chosen for the filesystem at the time of formatting. The partition itself does not have a block size; the filesystem does.
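The block numbering above is just arithmetic, and a small shell sketch can make it concrete. The function name block_range is made up for illustration, and 4096 is only the example block size used in the text; a real filesystem may use a different one.

```shell
# Example block size from the text; an assumption, not a fixed rule.
BLOCK_SIZE=4096

# block_range N: print the first and last byte offset covered by block N.
block_range() {
    first=$(( $1 * BLOCK_SIZE ))
    last=$(( first + BLOCK_SIZE - 1 ))
    echo "$first $last"
}

block_range 0   # prints: 0 4095
block_range 1   # prints: 4096 8191
```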
If part of your hard drive is damaged, the block or blocks that contain the damaged part should be marked bad. Basically, this means the block number is added to a list of bad blocks. Then, you give the list to the filesystem on the partition. The filesystem stores it somewhere and remembers not to use those bad blocks. If you use e2fsck, the process of giving the list to the filesystem is automated. Because that avoids mistakes, it is the preferable route.
There are two general ways to find the bad blocks.
The first way is to just try reading every block. If one of the reads causes the hard drive to report an error, the block in question is marked bad. This is not the best way, however, because a damaged part of the disk does not always produce an error when it is read. The second, slower method is to write data to every block on the hard disk and make sure it is the same when it is read back. It is possible to do this without erasing the data in your partition, but that makes it take longer. This second method, read/write, is what a DOS Surface Scan does.
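To make the non-destructive read/write idea concrete, here is a rough sketch of the steps for a single block: save the contents, write a test pattern, read it back, compare, then restore. It is only an illustration and not how badblocks itself is implemented. It works on an ordinary scratch file rather than a real device (on a real disk, a read-back like this could be served from the operating system's cache instead of the hardware). The name rw_test_block, the 0x55 pattern, and the 4096-byte block size are all assumptions made for the example.

```shell
BLOCK_SIZE=4096   # example value, as in the text

# rw_test_block TARGET N
# Returns 0 if block N of TARGET stored the test pattern correctly.
# The original contents of the block are restored afterwards.
rw_test_block() {
    target=$1
    n=$2
    saved=$(mktemp)
    pattern=$(mktemp)
    readback=$(mktemp)

    # 1. Save the current contents of the block.
    dd if="$target" of="$saved" bs="$BLOCK_SIZE" skip="$n" count=1 2>/dev/null

    # 2. Build a pattern block (every byte is 'U', 0x55) and write it over the block.
    head -c "$BLOCK_SIZE" /dev/zero | tr '\000' 'U' > "$pattern"
    dd if="$pattern" of="$target" bs="$BLOCK_SIZE" seek="$n" count=1 conv=notrunc 2>/dev/null

    # 3. Read the block back and compare it with the pattern.
    dd if="$target" of="$readback" bs="$BLOCK_SIZE" skip="$n" count=1 2>/dev/null
    cmp -s "$pattern" "$readback"
    result=$?

    # 4. Put the original data back, whatever the result was.
    dd if="$saved" of="$target" bs="$BLOCK_SIZE" seek="$n" count=1 conv=notrunc 2>/dev/null

    rm -f "$saved" "$pattern" "$readback"
    return "$result"
}

# Try it on a 4-block scratch file filled with random data.
scratch=$(mktemp)
head -c $(( BLOCK_SIZE * 4 )) /dev/urandom > "$scratch"
if rw_test_block "$scratch" 2; then
    echo "block 2 ok"
else
    echo "block 2 BAD"
fi
rm -f "$scratch"
```

Saving and restoring each block is the extra work that makes the non-destructive mode slower than the destructive one.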
Programs to use
In Linux, there is pretty much only one program that is used to check for bad blocks. It is called, surprisingly enough, "badblocks". You should only use this program directly, though, when you are checking a blank partition or a filesystem other than ext2 or ext3. When checking an ext2 or ext3 partition, you should use e2fsck, which runs badblocks in the background.
Use e2fsck when checking an ext2 or ext3 filesystem. The partition should be unmounted while it is checked. Both methods below automatically save the bad blocks they find into the filesystem, so that those parts of the hard drive are no longer used.
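Read-only method: e2fsck -c -C 0 /dev/hda1 ---OR--- e2fsck -c -C 0 -y /dev/hda1 (The -y answers yes to all questions, so it is sure to finish by itself. The -C option of e2fsck takes a file descriptor number as its argument; 0 makes it print a progress bar.)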
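Non-destructive read/write method: e2fsck -c -c -C 0 /dev/hda1 ---OR--- e2fsck -c -c -C 0 -y /dev/hda1 (The -y answers yes to all questions, so it is sure to finish by itself. The 0 after -C is the file descriptor argument that option requires; 0 makes e2fsck print a progress bar.)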
Use badblocks directly when checking a blank partition. You can also use it on a partition with a filesystem other than ext2 or ext3, though there may be an equivalent of e2fsck for your filesystem, so you might try that first. When you use badblocks, the bad block list for your partition is not saved in the filesystem automatically. It is possible to save the list to a file and then have the filesystem read it in. The catch is that you must set the block size in badblocks (the -b option) to the block size the filesystem has, or will have. Otherwise the block numbers will not correspond to the blocks in that filesystem. I'm not going to describe how to import the block list into the filesystem; see the man pages for that.
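To see why the block sizes have to match, it helps to work out where one block number lands under two different block sizes. The sketch below is plain arithmetic; convert_block is a made-up helper name, and the sizes 1024 and 4096 are just example values (1024 is, as far as I know, what badblocks uses when no -b is given).

```shell
# convert_block N FROM_SIZE TO_SIZE
# Print the number of the TO_SIZE block that contains the first byte
# of block N when N was counted in FROM_SIZE blocks.
convert_block() {
    echo $(( $1 * $2 / $3 ))
}

convert_block 8 1024 4096   # prints: 2
convert_block 9 1024 4096   # prints: 2
convert_block 2 4096 1024   # prints: 8
```

So a bad block reported as number 8 in a scan with 1024-byte blocks actually sits inside block 2 of a 4096-byte filesystem. If the list were imported as it is, the filesystem would avoid its own block 8, which is a different part of the disk.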
Read-only method: badblocks -b 4096 -p 4 -c 32768 /dev/hda1
Destructive read/write method: badblocks -b 4096 -p 4 -c 16384 -w -s /dev/hda1
Other things missing from this page
There is a non-destructive read/write mode of badblocks. (You should use e2fsck for ext2 and ext3 filesystems,