Thursday, September 24, 2015

Disk replacement procedure in Solaris 10

When a disk has to be replaced on a Solaris global zone server, we need to run some commands before and after the physical HDD swap. Here is the disk replacement procedure for a SPARC T3-2 server.


1. First we need to identify the faulty drive using the command below. According to this output, the disk 5000c50031f89323 is in predictive failure state and needs to be replaced.

*******************************************************************
#root@solaris-test>fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Sep 03 09:12:29 cb61c63d-bbc9-eb1c-f247-e53e8c8573b8  DISK-8000-0X   Major

Host        : solaris-test
Platform    : ORCL,SPARC-T3-2   Chassis_id  : 1111BDRD7A
Product_sn  : 1111BDRD7A

Fault class : fault.io.disk.predictive-failure
Affects     : dev:///:devid=id1,sd@n5000c50031f89323//scsi_vhci/disk@g5000c50031f89323
                  faulted but still in service
FRU         : "/SYS/SASBP/HDD1" (hc://:product-id=ORCL,SPARC-T3-2:product-sn=1111BDRD7A:server-id=whts18600:chassis-id=1111BDRD7A:serial=0010527270S1--------6SE270S1:part=SEAGATE-ST930003SSUN300G:revision=0B70/chassis=0/motherboard=0/hba=0/bay=1/disk=0)
                  faulty

Description : SMART health-monitoring firmware reported that a disk failure is
              imminent.

Response    : None.

Impact      : It is likely that the continued operation of this disk will
              result in data loss.

Action      : Use 'fmadm faulty' to provide a more detailed view of this event.
              Please refer to the associated reference document at
              http://sun.com/msg/DISK-8000-0X for the latest service procedures
              and policies regarding this diagnosis.
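
When the fmadm output is long, the disk ID can be pulled out with a small filter instead of being read by eye. This is only a sketch: the function name faulty_disk_wwns is made up for illustration, and it assumes the "Affects" line has the same devid=id1,sd@n... layout as in the output above.

```shell
# Hypothetical helper: read `fmadm faulty` output on stdin and print the
# disk ID (WWN) found on each "Affects" line.
# Assumes the devid=id1,sd@n<hex> layout shown in the output above.
faulty_disk_wwns() {
    sed -n 's/^Affects.*devid=id1,sd@n\([0-9A-Fa-f]*\).*/\1/p'
}

# Usage on the server (as root):
#   fmadm faulty | faulty_disk_wwns
```

The printed ID is lower case, while zpool and cfgadm show the same ID in upper case inside the device name, so compare them case-insensitively.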

2. If we check the zpool status we get the exact device name of the disk. Here the faulty disk is c0t5000C50031F89323d0s0, one side of the rpool mirror.


#root@solaris-test > zpool status rpool
  pool: rpool
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on older software versions.
 scan: none requested
config:

        NAME                         STATE     READ WRITE CKSUM
        rpool                        ONLINE       0     0     0
          mirror-0                   ONLINE       0     0     0
            c0t5000C50031F9FB07d0s0  ONLINE       0     0     0
            c0t5000C50031F89323d0s0  ONLINE       0     0     0

errors: No known data errors

3. We get the attachment point (Ap_Id) of the disk from the command below. It is needed for the unconfigure step.


#root@solaris-test > cfgadm -alv |grep c0t5000C50031F89323
c4::w5000c50031f89321,0        connected    configured   unknown    Client Device: /dev/dsk/c0t5000C50031F89323d0s0(sd1)
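
The lookup in step 3 can be wrapped in a small function so that the Ap_Id can be stored in a variable for the next step. This is a sketch only: ap_id_for_disk is a made-up name, and it assumes the cfgadm -alv line layout shown above, with the Ap_Id in the first column and the client device path on the same line.

```shell
# Hypothetical helper: read `cfgadm -alv` output on stdin and print the
# Ap_Id (first column) of every line that mentions the given disk.
# $1 is the disk name without the slice, e.g. c0t5000C50031F89323d0
ap_id_for_disk() {
    grep "/dev/dsk/$1" | awk '{ print $1 }'
}

# Usage on the server (as root):
#   AP_ID=$(cfgadm -alv | ap_id_for_disk c0t5000C50031F89323d0)
#   echo "$AP_ID"
```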

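4. Now we need to unconfigure the disk, using the Ap_Id found in step 3. If cfgadm reports the device as busy because it is still an active member of rpool, detach it from the mirror first (for example: zpool detach rpool c0t5000C50031F89323d0s0) and then retry.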


#root@solaris-test >cfgadm -c unconfigure c4::w5000c50031f89321,0

5. Now we can see that the disk is in the unconfigured state


#root@solaris-test > cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c2                             fc-fabric    connected    configured   unknown
c2::50060e8006cfb113           disk         connected    configured   unusable
c2::50060e80166d5f37           disk         connected    configured   unknown
c3                             scsi-sas     connected    configured   unknown
c3::w5000c50031f9fb05,0        disk-path    connected    configured   unknown
c4                             scsi-sas     connected    configured   unknown
c4::w5000c50031f89321,0        disk-path    connected    unconfigured unknown
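
Before pulling the drive, the check in step 5 can be scripted so that nobody removes a disk that is still configured. This is a minimal sketch: is_unconfigured is a made-up name, and it assumes the cfgadm -al column layout shown above.

```shell
# Hypothetical helper: read `cfgadm -al` output on stdin and succeed
# only if the given Ap_Id is listed with occupant state "unconfigured".
# $1 is the Ap_Id, e.g. c4::w5000c50031f89321,0
is_unconfigured() {
    grep "^$1 " | grep ' unconfigured ' >/dev/null
}

# Usage on the server (as root):
#   if cfgadm -al | is_unconfigured 'c4::w5000c50031f89321,0'; then
#       echo "safe to remove the disk"
#   fi
```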

6. Now we need to physically replace the disk with the new one


7. After the disk replacement, if we check again we can see that the new disk is in the configured state


#root@solaris-test > cfgadm -al

c4                             scsi-sas     connected    configured   unknown
c4::w5000cca00ab91e01,0        disk-path    connected    configured   unknown

root@solaris-test> cfgadm -alv |grep c0t5000CCA00AB91E00d0
c4::w5000cca00ab91e01,0        connected    configured   unknown    Client Device: /dev/dsk/c0t5000CCA00AB91E00d0s0(sd6)

8. Now we need to copy the partition table (VTOC) from the healthy root disk to the new disk


#root@solaris-test> prtvtoc /dev/rdsk/c0t5000C50031F9FB07d0s2 | fmthard -s - /dev/rdsk/c0t5000CCA00AC682C4d0s2
fmthard:  New volume table of contents now in place.

9. Now we need to attach the new disk to the existing pool, as a mirror of the healthy disk, using the command below (you may get a warning like the one shown, which can be ignored here)


#root@solaris-test > zpool attach rpool c0t5000C50031F9FB07d0s0 c0t5000CCA00AC682C4d0s0
warning: device in use checking failed: Unknown error
Make sure to wait until resilver is done before rebooting.

We can check the status of the pool using the command below (once the mirroring is completed, the resilvering message will clear)

root@solaris-test > zpool status rpool
  pool: rpool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scan: resilver in progress since Thu Sep 24 00:15:42 2015
    757M scanned out of 188G at 16.5M/s, 3h14m to go
    757M resilvered, 0.39% done
config:

        NAME                         STATE     READ WRITE CKSUM
        rpool                        ONLINE       0     0     0
          mirror-0                   ONLINE       0     0     0
            c0t5000C50031F9FB07d0s0  ONLINE       0     0     0
            c0t5000CCA00AC682C4d0s0  ONLINE       0     0     0  (resilvering)

errors: No known data errors
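
Since the resilver can take hours, it may be handy to poll the pool instead of rerunning zpool status by hand. The following is only a sketch: resilver_running and resilver_percent are made-up names, and they assume the "resilver in progress" and "% done" wording shown in the output above.

```shell
# Hypothetical helper: read `zpool status` output on stdin and succeed
# while it still reports a resilver in progress.
resilver_running() {
    grep 'resilver in progress' >/dev/null
}

# Hypothetical helper: print the "% done" figure from the same output.
resilver_percent() {
    sed -n 's/.*resilvered, \([0-9.]*\)% done.*/\1/p'
}

# Usage on the server (as root), checking once a minute:
#   while zpool status rpool | resilver_running; do
#       zpool status rpool | resilver_percent
#       sleep 60
#   done
```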

10. Now we need to install the boot block on the new disk, so that the server can also boot from it.


#root@solaris-test > installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk  /dev/rdsk/c0t5000CCA00AC682C4d0s0


Once the resilver has finished, if you check the status of the pool you can see that the new disk is in place



#root@solaris-test > zpool status rpool
  pool: rpool
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on older software versions.
 scan: resilvered 188G in 0h58m with 0 errors on Thu Sep 24 01:14:30 2015
config:

        NAME                         STATE     READ WRITE CKSUM
        rpool                        ONLINE       0     0     0
          mirror-0                   ONLINE       0     0     0
            c0t5000C50031F9FB07d0s0  ONLINE       0     0     0
            c0t5000CCA00AC682C4d0s0  ONLINE       0     0     0

errors: No known data errors




So the disk has been successfully replaced and the pool status is fine.