Thailand News and Discussion Forum | ASEANNOW


CentOS RAID

Featured Replies

For years I have had a server in the Netherlands that hosts my web pages and email.

It runs rock solid and I never have to touch it.

I know it has two hard disks in RAID, so I have some safety against data loss due to a disk failure.

Now I just found out that I had mixed up RAID 0 and RAID 1 in my mind (I thought RAID 0 is what RAID 1 actually is): how can I find out via telnet whether the server runs RAID 0 or RAID 1?

Is there a simple command?

Which brings me to my second concern... If it is RAID 1 like it should be, is there a simple check whether both disks are healthy, or whether it is running on just one?

Is it running software or hardware RAID?

I'm not familiar with CentOS, but if it has procfs and is running software RAID, try:

cat /proc/mdstat

That will at least tell you which kind of RAID you are running.
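If the output is long, the RAID personality can be pulled out with a quick grep. The line below is a hypothetical sample of /proc/mdstat; on the server you would read the real file instead:

```shell
# Hypothetical one-line sample of /proc/mdstat; on the real server, run:
#   grep -o 'raid[0-9]*' /proc/mdstat | sort -u
mdstat_sample='md0 : active raid1 sdb3[1] sda3[0]'
printf '%s\n' "$mdstat_sample" | grep -o 'raid[0-9]*'
```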

It's been a while since I ran mdraid (which I'm assuming you are running). However, assuming things haven't changed, telnet in and, as root, do this:

cat /etc/raidtab

There should be a 'raid-level' line that tells you what you're running.
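Note that /etc/raidtab comes from the old raidtools package and may well be absent on a CentOS box; there mdadm can report the level directly. A sketch, assuming the array is /dev/md0 — the parsing below is demonstrated on a hypothetical captured snippet of the output:

```shell
# Hypothetical snippet of `mdadm --detail /dev/md0` output; on the real
# server you would run:  mdadm --detail /dev/md0 | grep 'Raid Level'
detail_sample='/dev/md0:
        Version : 0.90
     Raid Level : raid1'
printf '%s\n' "$detail_sample" | awk -F' : ' '/Raid Level/ {print $2}'
```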

For the health, look at hdparm. Without knowing what the disks are, you're going to need to do this:

hdparm -I /dev/sda

and change /dev/sda to whatever your devices are.

It will spit out a lot of information. Alternatively, you can dig into the S.M.A.R.T. info with smartmontools.

If you're running your drives on a proper hardware controller, you need to tell us the model. A quick paste of lspci output will help.

  • Author

thanks!

I got:

[root@nohavename ~]# cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
md1 : active raid1 sdb2[1] sda2[0]
      20972736 blocks [2/2] [UU]
md2 : active raid1 sdb5[1] sda5[0]
      220965888 blocks [2/2] [UU]
md0 : active raid1 sdb3[1] sda3[0]
      2096384 blocks [2/2] [UU]
unused devices: <none>

[root@nohavename ~]# cat /etc/raidtab
cat: /etc/raidtab: No such file or directory

[root@nohavename ~]# hdparm -I /dev/sda
/dev/sda:
 HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device

So if I understood correctly: it is RAID 1, right?

But I did not understand much more, especially not whether it is healthy.

Thanks for the help

Yes, those are RAID1 and the system thinks they are in sync with redundancy. However, as you seem to be aware, the disks could be unhealthy and the system just hasn't noticed yet, particularly if they are many years old and have data that is never accessed.

# smartctl -a /dev/sda

# smartctl -a /dev/sdb

these commands will tell you something about the SMART diagnostic status of the drives. Of interest are Reallocated_Sector_Ct to tell you of bad blocks that have been remapped, Current_Pending_Sector to tell you of blocks that are currently having trouble, and Offline_Uncorrectable to tell you blocks that are unrecoverable. You might also be interested in temperature (to see if your server is having cooling problems) and the lifetime counters such as start/stop count, power on hours, etc. Search the web for SMART attributes for more information on this topic.
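As a quick way to eyeball just those three attributes, you can filter the smartctl output. The attribute lines below are a hypothetical sample; on the server you would pipe the real `smartctl -a` output through the same awk:

```shell
# Hypothetical SMART attribute lines; on the real server, run:
#   smartctl -a /dev/sda | awk '/Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable/ {print $2, $NF}'
smart_sample='  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0'
printf '%s\n' "$smart_sample" | awk '{print $2, $NF}'
```

The last column is the raw count; anything above 0 on these three attributes deserves attention.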

Before reading further, you might want to attempt backups from the disks if you are worried about their health. The more activity you cause, the more chance that a failing disk will completely fail, so it is good to prioritize and copy your most important files off the disks before doing anything else!

As for more active checks, I have my systems set to do the following automatically:

1. In /etc/smartd.conf I have the following entry on Fedora (CentOS might need one entry per drive, but I am not sure):

DEVICESCAN -H -m root -a -o on -S on -s (S/../.././02|L/../../6/03)

this triggers SMART self-tests on a regular basis. I have to be honest, I don't even remember picking the rule, as I've been copying it from system to system for many years — but the -s pattern decodes to a short self-test daily at 02:00 (S/../.././02) and a long self-test every Saturday at 03:00 (L/../../6/03).

2. In a small /etc/cron.weekly/md-scan.sh script, I have:

#!/bin/sh
# initiate MD block-check sync action on all MD devices
for f in /sys/block/md*/md/sync_action
do
	if [ -w "$f" ]
	then
		echo check > "$f"
	fi
done

this will actually cause the software RAID system to access and check the redundancy in RAID1 (or parity codes in RAID5 etc) and eventually access every block on the RAID volume. This is good to make sure your data is really there on a bulk server that has lots of data files that go unaccessed for months or years by applications. It will help detect a failing disk much sooner, so less chance of a catastrophic RAID array failure. Of course, this causes a prolonged burst of activity on the disks for a large server filesystem...
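Assuming the script above, you can watch such a check run in /proc/mdstat, and afterwards /sys/block/md*/md/mismatch_cnt reports how many blocks disagreed (0 is what you want). The progress line below is a hypothetical sample; the grep just pulls the percentage out of it:

```shell
# Hypothetical /proc/mdstat progress line during a check; on the server:
#   grep -o '[0-9.]*%' /proc/mdstat
progress_sample='[=>...............]  check =  7.3% (16241920/220965888) finish=55.2min'
printf '%s\n' "$progress_sample" | grep -o '[0-9.]*%'
```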

  • Author

# smartctl -a /dev/sdb

smartctl version 5.33 [i686-redhat-linux-gnu] Copyright © 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: ATA Maxtor 7Y250M0 Version: YAR5
SATA disks accessed via libata are not currently supported by
smartmontools. When libata is given an ATA pass-thru ioctl() then an
additional '-d libata' device type will be added to smartmontools.

that does not seem to help much.

But anyhow, I want to change to a new server in a few months... I hope this one will do the job for a few more months...

You should read the mdadm man page and pay attention to the --monitor part.
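A minimal sketch of what that looks like, assuming mail delivery to root works on the box (on CentOS the mdmonitor service reads the alert address from /etc/mdadm.conf):

```
# /etc/mdadm.conf – alert address for the monitor (hypothetical example)
MAILADDR root

# The monitor can also be started by hand as root:
#   mdadm --monitor --scan --daemonise --delay=1800
# and  mdadm --monitor --scan --test --oneshot  sends one test mail per array,
# which is a good way to verify the alerts actually reach you.
```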

  • 3 weeks later...

As you are using RAID 1, you do not need to worry too much about a drive failing. If one drive fails, you just replace it with a new identical one; the RAID system keeps running as if nothing happened, and after you add the new drive the mirror will be rebuilt.

The only thing to remember is to replace the correct drive when one fails!
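For reference, the usual mdadm sequence for that swap looks like the sketch below. It is printed as a dry run here, since /dev/md0 and /dev/sdb1 are placeholders you must replace with your own device names — check the disk's serial number (e.g. in the smartctl output) before pulling the physical drive:

```shell
# Dry-run sketch of replacing a failed RAID1 member (placeholder names).
# Remove the DRY_RUN prefix and run as root to do it for real.
DRY_RUN=echo
$DRY_RUN mdadm /dev/md0 --fail /dev/sdb1
$DRY_RUN mdadm /dev/md0 --remove /dev/sdb1
# ...swap the physical disk, partition it like the surviving one...
$DRY_RUN mdadm /dev/md0 --add /dev/sdb1   # rebuild starts automatically
```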

Chris
