Wednesday, November 16, 2016

How to get rid of the 'dgdisabled' flag: vxdisk list showing dg disabled



I issued the vxdctl enable command in a two-node VCS cluster, and it disabled all the disk groups (DGs) as shown below:

[root@ruabon1 ~]# vxdctl enable

[root@ruabon1 ~]# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
cciss/c0d0   auto:none       -            -            online invalid
sda          auto:cdsdisk    POEBS_ORAAPPL_01  POEBS_ORAAPPL_DG online dgdisabled
sdaa         auto:cdsdisk    POEBS_ORADATA01_15  POEBS_ORA_IMP_EXP_DG online dgdisabled
sdah         auto:cdsdisk    POEBS_ORAAPPL_03  POEBS_ORAAPPL_NFS_DG online dgdisabled
sdb          auto:cdsdisk    POEBS_ORADATA01_01  POEBS_ORADATA01_DG online dgdisabled
sdbq         auto:cdsdisk    POEBS_ORADATA01_17  POEBS_ORADATA01_DG online dgdisabled
sdc          auto:cdsdisk    POEBS_ORADATA01_02  POEBS_ORADATA01_DG online dgdisabled
sdd          auto:cdsdisk    POEBS_ORAREDO01_01  POEBS_ORAREDO01_DG online dgdisabled
sde          auto:cdsdisk    POEBS_ORALOGS_01  POEBS_ORALOGS_DG online dgdisabled
sdf          auto:cdsdisk    POEBS_ORAREDO02_01  POEBS_ORAREDO02_DG online dgdisabled
sdg          auto:cdsdisk    POEBS_ORADATA01_09  POEBS_ORADATA01_DG online dgdisabled
sdh          auto:cdsdisk    POEBS_ORADATA01_16  POEBS_ORADATA01_DG online dgdisabled
sdi          auto:cdsdisk    POEBS_ORAREDO01_02x  POEBS_ORAREDO01_DG online dgdisabled
sdj          auto:cdsdisk    POEBS_ORAREDO02_02x  POEBS_ORAREDO02_DG online dgdisabled
sdk          auto:cdsdisk    POEBS_ORA_IMP_EXP_DG01  POEBS_ORA_IMP_EXP_DG online dgdisabled
sdl          auto:cdsdisk    POEBS_ORAAPPL_02  POEBS_ORAAPPL_NFS_DG online dgdisabled
sdm          auto:cdsdisk    POEBS_MQ_01  POEBS_MQ_DG  online dgdisabled
sdn          auto:cdsdisk    POEBS_ORADATA01_03  POEBS_ORADATA01_DG online dgdisabled
sdo          auto:cdsdisk    POEBS_ORADATA01_04  POEBS_ORADATA01_DG online dgdisabled
sdp          auto:cdsdisk    POEBS_ORADATA01_05  POEBS_ORADATA01_DG online dgdisabled
sdq          auto:cdsdisk    POEBS_ORADATA01_06  POEBS_ORADATA01_DG online dgdisabled
sdr          auto:cdsdisk    POEBS_ORADATA01_07  POEBS_ORADATA01_DG online dgdisabled
sds          auto:cdsdisk    POEBS_ORADATA01_08  POEBS_ORADATA01_DG online dgdisabled
sdt          auto:cdsdisk    POEBS_ORADATA01_10  POEBS_ORADATA01_DG online dgdisabled
sdu          auto:cdsdisk    POEBS_ORADATA01_11  POEBS_ORADATA01_DG online dgdisabled
sdv          auto:cdsdisk    POEBS_ORADATA01_12  POEBS_ORADATA01_DG online dgdisabled
sdw          auto:cdsdisk    POEBS_ORADATA01_13  POEBS_ORADATA01_DG online dgdisabled
sdx          auto:cdsdisk    POEBS_ORADATA01_14  POEBS_ORADATA01_DG online dgdisabled
sdy          auto            -            -            error
sdz          auto            -            -            error



[root@ruabon1 ~]# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen             

A  ruabon1              RUNNING              0                   
A  ruabon2              RUNNING              0                   

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State         

B  POEBS_DB        ruabon1              Y          N               PARTIAL       
B  POEBS_DB        ruabon2              Y          N               OFFLINE       
B  POEBS_MQ        ruabon1              Y          N               PARTIAL       
B  POEBS_MQ        ruabon2              Y          N               OFFLINE       

-- RESOURCES FAILED
-- Group           Type                 Resource             System             

C  POEBS_DB        Volume               mozart01_vol         ruabon1            
C  POEBS_DB        Volume               mozart02vol          ruabon1            
C  POEBS_DB        Volume               oraappl_nfs_vol      ruabon1            
C  POEBS_DB        Volume               oraappl_vol          ruabon1            
C  POEBS_DB        Volume               oradata01_vol        ruabon1            
C  POEBS_DB        Volume               oraexport_vol        ruabon1            
C  POEBS_DB        Volume               oraimport_vol        ruabon1            
C  POEBS_DB        Volume               oralogs_vol          ruabon1            
C  POEBS_DB        Volume               oraredo01_vol        ruabon1            
C  POEBS_DB        Volume               oraredo02_vol        ruabon1            
C  POEBS_MQ        Application          poebs_mq             ruabon1            

-- RESOURCES OFFLINING
-- Group           Type            Resource             System               IState

F  POEBS_DB        Mount           oraappl_nfs_fs       ruabon1              W_OFFLINE_PATH


Workaround
============


When the "dgdisabled" flag is displayed like that for your disk group, it means that vxconfigd lost access to all enabled configuration copies for the disk group. Despite the loss of access to the configuration copies, file systems remain enabled because no I/O resulted in fatal errors.

This is typical of a brief SAN outage on the order of a few seconds, short enough for file system I/O to complete after retrying.

The condition itself means that the in-memory disk group is no longer connected to the underlying on-disk configuration copies. VxVM will not allow any further modification to the disk group. The condition exists to let you bring down file systems and applications sanely.
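You can check whether vxconfigd still has a usable configuration copy on a given disk from the detailed per-disk listing; for example, for one of the affected devices above:

# vxdisk list sda

The config entries in the "Defined regions" section of this output show the state of each configuration copy held on that disk.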

If you have high confidence that your disks are all in a good state and no corruption has occurred, you can attempt to restart vxconfigd. When vxconfigd starts back up, it will scan the underlying disks and, if everything is clean and correct, reattach those devices to the disk group.

NOTE, however, that this procedure can further degrade your environment if it fails.

1. Freeze all Service Groups with VxVM resources:

# hagrp -freeze <group>
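For the cluster in this example, that means freezing both affected groups (names taken from the hastatus output above):

# hagrp -freeze POEBS_DB
# hagrp -freeze POEBS_MQ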

2. Restart vxconfigd:

# vxconfigd -k -x syslog
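Here "-k" kills the already-running vxconfigd before starting the new one, and "-x syslog" sends its console messages to syslog. If you want to watch the rescan as it happens, you can follow the system log from a second terminal (the log path below is an assumption; it varies by distribution):

# tail -f /var/log/messages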

3. Confirm resulting status:

# vxdg list

# vxdisk list
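If the restart succeeded, no disk should still carry the "dgdisabled" flag; a quick check over the same listing (this should print 0):

# vxdisk list | grep -c dgdisabled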

4. Unfreeze the Service Groups if the DG state is now corrected:

# hagrp -unfreeze <group>
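Again using the group names from the hastatus output above:

# hagrp -unfreeze POEBS_DB
# hagrp -unfreeze POEBS_MQ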



After correcting the disk group condition, you will need to look at your cluster configuration for the "oraappl_nfs_fs" resource and determine its mount point and block device. From the block device you can determine the disk group and volume name.
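One way to read both attributes is directly from the VCS resource; for the resource named in the hastatus output above:

# hares -display oraappl_nfs_fs -attribute MountPoint BlockDevice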

Unmount the mount point and remount it.
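With the values from the previous step (shown as placeholders here, since the actual paths are site-specific):

# umount <mount_point>
# mount -t vxfs <block_device> <mount_point>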

Verify that the volume is in a good state with vxprint -htg <diskgroup>.
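For example, assuming the failed NFS mount lives in the POEBS_ORAAPPL_NFS_DG seen in the vxdisk list output (confirm against the BlockDevice attribute first):

# vxprint -htg POEBS_ORAAPPL_NFS_DG

The volume (v) lines should show a state of ENABLED/ACTIVE.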



A node reboot could potentially correct all of this automatically as well.
