Thursday, December 15, 2016

Veritas Cluster Server service in Solaris 10 reports "maintenance"

Problem

When attempting to enable the svc:/system/vcs:default service with 'svcadm enable vcs', the service transitions to the maintenance state.

Error Message

svc:/system/vcs:default (Veritas Cluster Server (VCS) Init service)
 State: maintenance since Thu Oct 14 12:15:57 2010
Reason: Start method failed repeatedly, last exited with status 2.
   See: http://sun.com/msg/SMF-8000-KS
   See: man -M /opt/VRTS/man/man1m/ -s 1M vcsconfig
   See: /var/svc/log/system-vcs:default.log
Impact: This service is not running.

Cause

This situation can occur if the '/etc/default/vcs' file is present but the VCS_START variable has been explicitly set to '0' (i.e., do not start or stop VCS via svc:/system/vcs:default) in /etc/default/vcs.

Solution


Set the VCS_START variable to '1' in /etc/default/vcs, then clear the service out of the maintenance state.
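
A minimal sketch of the fix, assuming the default file location (the grep output simply confirms the new setting):

# grep VCS_START /etc/default/vcs
VCS_START=1
# svcadm clear svc:/system/vcs:default
# svcs vcs

If VCS should also be stopped through SMF at shutdown, VCS_STOP is normally set to '1' in the same file.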

LLT service in SMF is showing in maintenance state






After installing VCS on a new node and trying to add it to an existing cluster, the LLT service may not start through SMF on the new node; it shows the LLT service in maintenance status. Because of this, the rest of the cluster services will not start during system boot.
However, a manual start of LLT and GAB works fine.
Error Message

LLT service in SMF is showing in maintenance status.
# svcs -xv

svc:/system/llt:default (Veritas Low Latency Transport (LLT) Init service)
 State: maintenance since Thu Jul 29 13:40:32 2010
Reason: Start method failed repeatedly, last exited with status 2.
   See: http://sun.com/msg/SMF-8000-KS
   See: man -M /opt/VRTSllt/man/man1m/ -s 1M lltconfig
   See: /var/svc/log/system-llt:default.log
Impact: 3 dependent services are not running:
        svc:/system/gab:default
        svc:/system/vcs:default
        svc:/system/vxfen:default

LLT is in the maintenance state even though its dependencies are online.
# svcs -l llt
fmri         svc:/system/llt:default
name         Veritas Low Latency Transport (LLT) Init service
enabled      true
state        maintenance
next_state   none
state_time   Fri Jul 30 02:09:03 2010
logfile      /var/svc/log/system-llt:default.log
restarter    svc:/system/svc/restarter:default
dependency   require_all/none svc:/system/filesystem/local (online)
dependency   optional_all/none svc:/network/initial (online)

The log file /var/svc/log/system-llt:default.log shows the following errors.

[ Jul 30 02:13:17 Executing start method ("/lib/svc/method/llt start") ]
silent failure
[ Jul 30 02:13:17 Method "start" exited with status 2 ]
Cause

SMF uses the /lib/svc/method/llt script to start LLT at boot time and to stop it at shutdown; the script reads its environment from "/etc/default/llt":

# cat /etc/default/llt
#
# This file is sourced :
#       from /etc/init.d/llt            for Solaris < 2.10
#       from /lib/svc/method/llt        for Solaris 2.10
#
# Set the two environment variables below as follows:
#
#       1 = start or stop llt
#       0 = do not start or stop llt
#

LLT_START=0
LLT_STOP=0


If LLT_START is set to zero, the SMF service does not start LLT at boot time.

Solution

Modify /etc/default/llt and set LLT_START and LLT_STOP to 1:

# cat /etc/default/llt
#
# This file is sourced :
#       from /etc/init.d/llt            for Solaris < 2.10
#       from /lib/svc/method/llt        for Solaris 2.10
#
# Set the two environment variables below as follows:
#
#       1 = start or stop llt
#       0 = do not start or stop llt
#

LLT_START=1
LLT_STOP=1
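
After correcting the file, the service still has to be taken out of the maintenance state. A minimal sketch, using the service FMRIs shown in the svcs output above (the dependent gab, vxfen and vcs services should then come online on their own; if not, clear them the same way):

# svcadm clear svc:/system/llt:default
# svcs llt gab vxfen vcs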

Wednesday, November 16, 2016

How to increase the size of the root file system (slice) on an encapsulated root disk - VxVM


This technical document describes a procedure to increase the root file system (slice) where the root disk is encapsulated under VERITAS Volume Manager (tm). There may be other ways to achieve this objective, and the procedure described may not apply in full to every system running Volume Manager, but it can likely be adapted with some changes to suit most common Volume Manager configurations and installations.


Assumptions:

The boot (root) disk is encapsulated.
The root diskgroup (rootdg) may consist of more than one disk.
One free disk with at least as much capacity as the current root disk.


Preliminary details:

If the free disk required for this procedure is only as big as the current root disk, then there must be enough free space on the root disk to allow for the increased root file system size. If this free space is not available on the disk, then the proposed increase to the root file system size must come from a corresponding reduction in the size, or omission of some other file system (or partition) on the root disk.
The free disk described in this procedure is used to prepare a new root disk with the necessary file system (slice) sizes. The original root disk can be preserved at least until the system boots from the new root disk successfully.
The general procedure to accomplish this task is as follows:


  1. Initialize new root disk and add to rootdg diskgroup
  2. Mirror root volume onto new disk from current root disk
  3. Increase size of root volume and file system
  4. Mirror other volumes from current root disk
  5. Break mirrors on new root disk
  6. Create underlying physical partitions on disk
  7. Create swap partition on disk
  8. Reboot system off slices from new root disk
  9. Remove volumes from old root disk
  10. Encapsulate new root disk and add to rootdg disk group
  11. Reboot off new root disk using volumes
  12. Mirror new root disk to another equivalent disk


If the new root disk is the same size as the original root disk, then upon successful encapsulation of the new root disk, it is possible to mirror back to the original root disk, thereby having two identical mirrors. If the new root disk is of larger capacity, then another disk of equivalent size will be necessary for the root mirror.
In the detailed procedure listed below, the rootdg initially consists of the boot disk (rootdisk=c2t8d0) and one additional disk (root0=c2t3d0) containing Volume Manager volumes. The procedure would be unchanged even if the boot disk has a mirror. The new disk that will eventually become the boot disk is c2t2d0.


Detailed procedure

1. Initialize new disk:
Is new disk same size as current boot disk?
Yes
Is Volume Manager version 3.2?
Yes
Check private region length for boot disk using prtvtoc command. Here, c2t8d0 is the root disk.
# prtvtoc /dev/rdsk/c2t8d0


Slice 7 (tag 15) is the vtoc entry representing the private region. Note the sector count of 2744. This will be the public region offset for the new disk.
Initialize the new disk (c2t2d0) using the vxdisksetup command.
# /etc/vx/bin/vxdisksetup -i c2t2d0 puboffset=2744
Go to step 2.
No (Volume Manager version < 3.2)
Initialize the new disk (c2t2d0) using the vxdisksetup command.
# /etc/vx/bin/vxdisksetup -i c2t2d0
Go to step 2.
No (new disk of higher capacity)
Initialize the new disk (c2t2d0) using the vxdisksetup command.
# /etc/vx/bin/vxdisksetup -i c2t2d0

2. Add new disk to rootdg. In this example, the disk is named newroot (c2t2d0).
# vxdg -g rootdg adddisk newroot=c2t2d0

The rootdg configuration now looks as follows.
# vxdisk list


# vxprint -Qqhtg rootdg


3. Mirror the root slice to disk newroot.
# /etc/vx/bin/vxrootmir newroot

Check rootvol details.
# vxprint -Qqhtr rootvol


4. A series of steps to increase the size of the root slice on the new disk.
Disassociate the newly created plex rootvol-02 from rootvol volume.
# vxplex dis rootvol-02

Make a volume named rootalt with usage type "gen" and associate plex rootvol-02.
# vxmake -U gen vol rootalt plex=rootvol-02

Make the volume active.
# vxvol init active rootalt

Volume rootalt, as it appears now.
# vxprint -Qqhtr rootalt


Run a file system check on the "ufs" file system contained within volume rootalt. Fix the file system state in the super block and any other errors encountered. This will clear the file system super-block flag and enable us to mount the file system.
# fsck -F ufs /dev/vx/rdsk/rootdg/rootalt

Mount the rootalt volume on /mnt (or /a)
# mount -F ufs /dev/vx/dsk/rootdg/rootalt /mnt

Estimate the new size for the root slice (rootalt). The new size must be such that the sub-disk ends on a disk cylinder boundary. In this example, the rootvol (and the newly created mirror volume, rootalt) is 4195576 disk sectors long (1 disk sector = 512 bytes). This translates to a size that is marginally over 2 GB. This size will be increased by 500 MB to a new size that is roughly 2.5 GB.
# prtvtoc /dev/rdsk/c2t2d0s2


From the above, it is clear that the root partition size of 4195576 sectors corresponds to 3058 disk cylinders (4195576 sectors / 1372 sectors per cylinder = 3058 cylinders).
The proposed increase is 500 MB. This translates to approximately 746 disk cylinders. The calculation for this being as follows:
500 MB = 500 * 1024 KB = 512000 KB = 512000 * 2 disk sectors = 1024000 disk sectors
1024000 sectors / 1372 sectors per cylinder = 746.35 cylinders.
In this discussion, a figure of 745 cylinders has been used. This means that the new size for the root volume will be 3803 cylinders (3058 + 745), which translates to a size of 5217716 sectors (3803 cylinders * 1372 sectors per cylinder).
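
As a quick sanity check of this arithmetic, the same figures can be reproduced in a shell (values are the ones from this example; any shell with $(( )) arithmetic, such as ksh or bash, will do):

# SECTORS_PER_CYL=1372
# CURRENT_SECTORS=4195576
# GROW_CYLS=745
# echo $(( CURRENT_SECTORS / SECTORS_PER_CYL ))
3058
# echo $(( (CURRENT_SECTORS / SECTORS_PER_CYL + GROW_CYLS) * SECTORS_PER_CYL ))
5217716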

Increase the size of volume rootalt to the new size.
# vxassist growto rootalt 5217716

Increase the size of the "ufs" file system to the new size.
# /usr/sbin/growfs -M /mnt /dev/vx/rdsk/rootdg/rootalt

Verify the size of the rootalt volume
# vxprint -Qqhtr rootalt


The new root file system (mounted on /mnt) shows increased file system size.
# df -k


5. Re-write the disk slice information that represents the sub-disk (newroot-01) for volume rootalt.
The current entry (see the prtvtoc command output at step 4, slice 0) contains the old size.
# /etc/vx/bin/vxmksdpart -g rootdg newroot-01 0 0x2 0x0
(command usage: vxmksdpart -g disk-group sub-disk disk-partition tag-in-hexadecimal flag-in-hexadecimal)

Now, the slice information for disk newroot (c2t2d0) is:
# prtvtoc -s /dev/rdsk/c2t2d0s2


6. Delete the rootalt volume from the new root disk.
# cd /
# umount /mnt

Stop the volume.
# vxvol stop rootalt

Disassociate the plex from the volume and remove the volume.
# vxplex dis rootvol-02
# vxedit rm rootalt

Disassociate the sub-disk from the plex and remove the plex.
# vxsd dis newroot-01
# vxedit rm rootvol-02

7. Mirror all the other volumes from the current root disk to the new root disk.
Do not mirror swap volumes. Swap slices will be created on the new disk manually.
In this example, the volumes to mirror are var and opt.

# vxassist -g rootdg mirror var newroot
# vxassist -g rootdg mirror opt newroot

The rootdg diskgroup now looks like this.
# vxprint -Qqhtg rootdg


8. Create the underlying physical partition for each of these volumes, using the vxmksdpart command.

From the sub-disks for the var and opt volumes on the new root disk, create the vtoc entries.
# /etc/vx/bin/vxmksdpart -g rootdg newroot-02 5 0x7 0x0
# /etc/vx/bin/vxmksdpart -g rootdg newroot-03 6 0x0 0x0

9. Remove all plexes and sub-disks from the new root disk.
Remove the disk from the rootdg disk group and take the disk out of Volume Manager control.

Remove the plexes first.
# vxplex -o rm dis var-02
# vxplex -o rm dis opt-02

Remove the only remaining sub-disk next.
# vxedit rm newroot-01

Remove the disk from rootdg.
# vxdg -g rootdg rmdisk newroot

Take the disk out of Volume Manager control.
# /etc/vx/bin/vxdiskunsetup c2t2d0

10. Using the Solaris format command, create a disk partition for swap on the new root disk.
The size of this partition may be adjusted according to the requirements of the system.
In this example, slice 1 has been used for the swap partition.
In case the new root disk has unallocated space at the end of the disk, create a dummy slice that uses up that space. If this is not done, the private region uses space from the end of the disk when the disk is encapsulated at step 14. Since we already have free space at the beginning of the disk that was used for the private region when this disk was under Volume Manager control (see the prtvtoc output of the disk at step 5), we may not want this space to be wasted.

Please note that adding this dummy slice is not essential. Its only purpose is to make use of the space already available at the beginning of the disk for the private region. The dummy volume created by the encapsulation at step 14 may be deleted. The volume table of contents (VTOC) on the new root disk looks as shown below:
# prtvtoc -s /dev/rdsk/c2t2d0s2


Slice 0=root slice, slice 1=swap, slice 5=var, slice 6=opt, slice 7=dummy slice
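
As a non-interactive alternative to the interactive format session, the swap and dummy slices could also be written with fmthard; a sketch only, with placeholder values (the start sector and size shown in angle brackets are hypothetical and must fall on cylinder boundaries; tag 3 = swap, tag 0 = unassigned):

# fmthard -d 1:3:01:<start-sector>:<size-in-sectors> /dev/rdsk/c2t2d0s2
# fmthard -d 7:0:00:<start-sector>:<size-in-sectors> /dev/rdsk/c2t2d0s2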

11. Perform a file system check on all partitions containing file systems. In this example, there are three "ufs" file systems.
Clear the super-block flag for the file systems and any other errors encountered.
# fsck -F ufs /dev/rdsk/c2t2d0s0
# fsck -F ufs /dev/rdsk/c2t2d0s5
# fsck -F ufs /dev/rdsk/c2t2d0s6

12. Mount the root slice from the new disk on to /mnt and edit the /etc/system and /etc/vfstab files.
# mount -F ufs /dev/dsk/c2t2d0s0 /mnt

Edit the /etc/system file and comment out the lines relevant to Volume Manager booting off an encapsulated boot disk.
# vi /mnt/etc/system

Comment out the following two lines by placing an asterisk in the first column and then save the file.
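
The original listing is not reproduced here; on a typical VxVM-encapsulated root disk, the two entries added by encapsulation are the following (shown before commenting):

rootdev:/pseudo/vxio@0:0
set vxio:vol_rootdev_is_volume=1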



Make a copy of the /etc/vfstab file and then edit it. For all the file systems and swap volumes on the current root disk, change the volume devices to disk partition devices (slices on the new boot disk). Save the edited file.
# cd /mnt/etc
# cp vfstab vfstab.orig
# vi vfstab

In this example, the vfstab looks like this:
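
The original listing is not reproduced here; a sketch of what the edited entries might look like for this example, given the slice layout created at step 10 (the fsck pass and mount-at-boot columns are illustrative):

/dev/dsk/c2t2d0s1  -                   -     swap  -  no  -
/dev/dsk/c2t2d0s0  /dev/rdsk/c2t2d0s0  /     ufs   1  no  -
/dev/dsk/c2t2d0s5  /dev/rdsk/c2t2d0s5  /var  ufs   1  no  -
/dev/dsk/c2t2d0s6  /dev/rdsk/c2t2d0s6  /opt  ufs   2  yes -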



Unmount the root slice on the new boot disk.
# cd /
# umount /mnt

13. Take the system to the "ok" prompt and then boot off the new boot disk.
# init 0

ok> devalias

From the devalias output, select the new boot disk device alias. In our case, this is represented by the vx-newroot alias (a reset command may be required for the alias to be visible).

ok> boot vx-newroot

After the system has rebooted off the slices on the new boot disk, the file system mounts look as follows:
# df -k


Please note that although the system has used the new boot disk to boot off slices, Volume Manager is still able to start up due to the presence of the rootdg diskgroup. All rootdg volumes are started. However, due to the changes made to the vfstab file (at step 12), none of the file systems on the volumes in the old root disk should be mounted.

14. Encapsulate the new root disk and add it to the rootdg diskgroup.
Before we remove the original root disk from the rootdg diskgroup, it may be best to ensure that the underlying physical partitions exist for the volumes on that disk. By ensuring this, we can revert to using that disk if necessary. Use the prtvtoc command to check the current slice information for the old root disk. Missing slices may be added to the disk VTOC using the vxmksdpart command (see the previous examples at steps 5 and 8).

Remove all volumes from the old root disk (and its mirrors, if any).
# vxassist -g rootdg remove volume rootvol
# vxassist -g rootdg remove volume swapvol
# vxassist -g rootdg remove volume var
# vxassist -g rootdg remove volume opt

Rename the old root disk. In this example, rootdisk is being renamed as rootold.
# vxedit -g rootdg rename rootdisk rootold

Encapsulate the new root disk (c2t2d0).
# /etc/vx/bin/vxencap rootdisk=c2t2d0

Set the default system boot device to be the new root disk. In our example, a device alias called vx-newroot already exists for the new disk.
# /etc/vx/bin/vxeeprom boot-device vx-newroot

Reboot the system. It will reboot twice, and come up on volumes on the new root disk.
# shutdown -g0 -y -i6

Examining the file systems mounted after the reboot, we can see volumes being used instead of slices.
# df -k


Remove the old root disk (and mirror disk, if applicable) from the rootdg diskgroup.
# vxdg -g rootdg rmdisk rootold

The disks on the system now appear as follows:
# vxdisk list


The rootdg diskgroup now consists of all the original volumes.
# vxprint -Qqhtg rootdg


In the output, note the volume rootdisk7vol. This volume has been created on account of the dummy slice that was added at step 10. This volume may be deleted as it does not contain any useful information.
# vxassist remove volume rootdisk7vol

The new root disk's VTOC looks as follows.
# prtvtoc -s /dev/rdsk/c2t2d0s2


15. If the new root disk is to be mirrored, ensure that a disk of at least similar capacity is available for the mirror.
You could, for example, replace the old boot disk with a new disk that is the same size as the new root disk.
In case a new disk will be used for the mirror, follow the steps described at step 1 (substitute the literal "new disk" with "mirror disk") to initialize the disk. Add the disk to the rootdg diskgroup.
Mirror all volumes on the boot disk to mirror disk (rootmir).
# /etc/vx/bin/vxmirror rootdisk rootmir  

At the conclusion of the above steps, we are left with a running Volume Manager configuration with an expanded root volume.

Basic understanding of main.cf: the Cluster Server (VCS) configuration file


This TechNote will examine the contents of a basic VERITAS Cluster Server (VCS) configuration file - /etc/VRTSvcs/conf/config/main.cf.

The basic unit of a VERITAS Cluster Server configuration is the resource.  VERITAS Cluster Server consists of the engine (the "had" and "hashadow" daemons) and agents.  Agents are responsible for keeping track of Cluster Server resources of a well-defined type.  The resource types are defined, by default, in the /etc/VRTSvcs/conf/config/types.cf file.

For example, the mount agent is responsible for mounting, unmounting, and monitoring the status of VCS controlled file systems.  It is necessary for the mount agent, therefore, to be aware of at least the device to mount ("BlockDevice" attribute) and the mount point (the aptly named "MountPoint" attribute).  For more information on the agents bundled with VCS and their attributes, see the VERITAS Cluster Server Bundled Agent Guide.

Resources will also have dependencies.  A resource dependency exists when one resource requires another to be online or offline before it can itself be brought online or offline.  By way of example, a mount resource might depend on a volume resource (the volume agent monitors a VERITAS Volume Manager volume).

A collection of resources and their dependencies is a service group.  VERITAS Cluster Server failover occurs at the service group level.  This configuration is stored in the file /etc/VRTSvcs/conf/config/main.cf.  In the following example, the actual file contents appear first, with an explanation of each entry's purpose underneath.

# more main.cf
include "types.cf"

"include" statements tell VCS which files contain the definitions of resource types.  The file "types.cf" contains the definitions for the standard VCS bundled agents.  Other VERITAS agents, such as the Enterprise Agent for Oracle, are bundled with their own definition file (OracleTypes.cf, in the Oracle case), requiring a separate "include" statement.

cluster vcs_01 (
       )
The name of the cluster of which this node is a part.

system camcs12 (
       )

system camcs9 (
       )
The "system" statements tell the cluster nodes which other systems are members of the same cluster.  Note that these are VCS-defined system names, which do not necessarily have to be the same as the system's host name defined in /etc/hosts.  These names are defined in /etc/llthosts.

group MyServiceGroup (
The "group" keyword indicates that a service group configuration follows.  The first portion of the service group contains attributes which apply at the service group level, such as SystemList (see following) or Parallel.

SystemList = { camcs9 = 0, camcs12 = 1, camcs13 = 2 }
The "SystemList" attribute tells VCS which nodes of this cluster can activate this service group, and in what order VCS should attempt to fail over.  Assuming the service group was online on camcs9, on failover VCS will first attempt to bring the group up on camcs12.  If the online should fail on camcs12, it will then try to bring it up on camcs13. 

AutoStartList = { camcs9 }
The "AutoStartList" attribute defines which nodes will automatically bring the service group online at VCS startup.  If the AutoStartList contains more than one system, each system will first check to see if the service is already online on another node in the cluster before attempting to bring the service group online locally.

)
The above 'close-paren' ends the service group attribute section.  At this point, the resources which make up the service group are now listed, along with their attributes.

Application MyApp(
A Resource begins with the Type of resource (Application), the user defined resource name ("MyApp"), and an open-paren.  Resource attributes will be read in until a close-paren is encountered.  Each resource has a unique set of attributes - for more information on each of the attributes, see the VCS Bundled Agent Guide.

StartProgram = "/apps/MyApp/online"
StopProgram = "/apps/MyApp/offline"
CleanProgram = "/apps/MyApp/clean"
MonitorProgram = "/apps/MyApp/monitor"
      )
Close-paren, indicating the application resource configuration is finished.

Other resources will follow, filling out the rest of the service group:

DiskGroup MyDG (
DiskGroup = "mydg"
)

Mount MyMount (
         MountPoint = "/mount"
         BlockDevice = "/dev/vx/dsk/mydg/myvol"
         )

Volume MyVol (
DiskGroup = "mydg"
Volume = "myvol"
)

The final portion of the service group configuration is the resource dependency tree.  Dependencies are specified in the format "resourceX requires resourceY" - resourceY is the prerequisite for resourceX.  A resource will not be brought online until its prerequisite resources are online.  Additionally, a resource will not be brought offline while other resources that require it are still online.

MyApp requires MyMount
MyMount requires MyVol
MyVol requires MyDG

In this case, the DiskGroup resource MyDG will be brought online first, followed by the volume resource MyVol, followed by MyMount, and finally MyApp.  The order will be reversed on offline - MyApp will be brought offline first (as there are no resources which require it), followed by MyMount, MyVol, and finally MyDG.
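
The same dependency links can also be created from the command line with the configuration opened for writing; a minimal sketch using the resource names from the example above:

# haconf -makerw
# hares -link MyApp MyMount
# hares -link MyMount MyVol
# hares -link MyVol MyDG
# haconf -dump -makero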

How to get rid of the 'dgdisabled' flag: vxdisk list showing dg disabled



I issued the vxdctl enable command in a two-node VCS cluster and it left all the disk groups disabled, as shown below.

[root@ruabon1 ~]# vxdctl enable

You have new mail in /var/spool/mail/root
[root@ruabon1 ~]#
[root@ruabon1 ~]# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
cciss/c0d0   auto:none       -            -            online invalid
sda          auto:cdsdisk    POEBS_ORAAPPL_01  POEBS_ORAAPPL_DG online dgdisabled
sdaa         auto:cdsdisk    POEBS_ORADATA01_15  POEBS_ORA_IMP_EXP_DG online dgdisabled
sdah         auto:cdsdisk    POEBS_ORAAPPL_03  POEBS_ORAAPPL_NFS_DG online dgdisabled
sdb          auto:cdsdisk    POEBS_ORADATA01_01  POEBS_ORADATA01_DG online dgdisabled
sdbq         auto:cdsdisk    POEBS_ORADATA01_17  POEBS_ORADATA01_DG online dgdisabled
sdc          auto:cdsdisk    POEBS_ORADATA01_02  POEBS_ORADATA01_DG online dgdisabled
sdd          auto:cdsdisk    POEBS_ORAREDO01_01  POEBS_ORAREDO01_DG online dgdisabled
sde          auto:cdsdisk    POEBS_ORALOGS_01  POEBS_ORALOGS_DG online dgdisabled
sdf          auto:cdsdisk    POEBS_ORAREDO02_01  POEBS_ORAREDO02_DG online dgdisabled
sdg          auto:cdsdisk    POEBS_ORADATA01_09  POEBS_ORADATA01_DG online dgdisabled
sdh          auto:cdsdisk    POEBS_ORADATA01_16  POEBS_ORADATA01_DG online dgdisabled
sdi          auto:cdsdisk    POEBS_ORAREDO01_02x  POEBS_ORAREDO01_DG online dgdisabled
sdj          auto:cdsdisk    POEBS_ORAREDO02_02x  POEBS_ORAREDO02_DG online dgdisabled
sdk          auto:cdsdisk    POEBS_ORA_IMP_EXP_DG01  POEBS_ORA_IMP_EXP_DG online dgdisabled
sdl          auto:cdsdisk    POEBS_ORAAPPL_02  POEBS_ORAAPPL_NFS_DG online dgdisabled
sdm          auto:cdsdisk    POEBS_MQ_01  POEBS_MQ_DG  online dgdisabled
sdn          auto:cdsdisk    POEBS_ORADATA01_03  POEBS_ORADATA01_DG online dgdisabled
sdo          auto:cdsdisk    POEBS_ORADATA01_04  POEBS_ORADATA01_DG online dgdisabled
sdp          auto:cdsdisk    POEBS_ORADATA01_05  POEBS_ORADATA01_DG online dgdisabled
sdq          auto:cdsdisk    POEBS_ORADATA01_06  POEBS_ORADATA01_DG online dgdisabled
sdr          auto:cdsdisk    POEBS_ORADATA01_07  POEBS_ORADATA01_DG online dgdisabled
sds          auto:cdsdisk    POEBS_ORADATA01_08  POEBS_ORADATA01_DG online dgdisabled
sdt          auto:cdsdisk    POEBS_ORADATA01_10  POEBS_ORADATA01_DG online dgdisabled
sdu          auto:cdsdisk    POEBS_ORADATA01_11  POEBS_ORADATA01_DG online dgdisabled
sdv          auto:cdsdisk    POEBS_ORADATA01_12  POEBS_ORADATA01_DG online dgdisabled
sdw          auto:cdsdisk    POEBS_ORADATA01_13  POEBS_ORADATA01_DG online dgdisabled
sdx          auto:cdsdisk    POEBS_ORADATA01_14  POEBS_ORADATA01_DG online dgdisabled
sdy          auto            -            -            error
sdz          auto            -            -            error



[root@ruabon1 ~]# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen             

A  ruabon1              RUNNING              0                   
A  ruabon2              RUNNING              0                   

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State         

B  POEBS_DB        ruabon1              Y          N               PARTIAL       
B  POEBS_DB        ruabon2              Y          N               OFFLINE       
B  POEBS_MQ        ruabon1              Y          N               PARTIAL       
B  POEBS_MQ        ruabon2              Y          N               OFFLINE       

-- RESOURCES FAILED
-- Group           Type                 Resource             System             

C  POEBS_DB        Volume               mozart01_vol         ruabon1            
C  POEBS_DB        Volume               mozart02vol          ruabon1            
C  POEBS_DB        Volume               oraappl_nfs_vol      ruabon1            
C  POEBS_DB        Volume               oraappl_vol          ruabon1            
C  POEBS_DB        Volume               oradata01_vol        ruabon1            
C  POEBS_DB        Volume               oraexport_vol        ruabon1            
C  POEBS_DB        Volume               oraimport_vol        ruabon1            
C  POEBS_DB        Volume               oralogs_vol          ruabon1            
C  POEBS_DB        Volume               oraredo01_vol        ruabon1            
C  POEBS_DB        Volume               oraredo02_vol        ruabon1            
C  POEBS_MQ        Application          poebs_mq             ruabon1            

-- RESOURCES OFFLINING
-- Group           Type            Resource             System               IState

F  POEBS_DB        Mount           oraappl_nfs_fs       ruabon1              W_OFFLINE_PATH


Workaround
============


When the "dgdisabled" flag is displayed like that for your diskgroup, it means that vxconfigd lost access to all enabled configuration copies for the disk group.  Despite the loss of access to the configuration copies, file systems remain enabled because no IO resulted in fatal errors.

This is typical of a brief SAN outage on the order of a few seconds, allowing file system IO to complete after retrying.

The condition itself means that the in-memory disk group is no longer connected to the underlying disk group configuration copies.  VxVM will not allow any further modification to the disk group.  This condition allows you to sanely bring down file systems and applications.

If you have high confidence that your disks are all in a good state and no corruption has occurred, you can attempt to restart vxconfigd.  When vxconfigd starts back up, it will scan the underlying disks and, if everything is clean and correct, reattach those devices to the disk group.

NOTE however that this procedure will further degrade your environment if it fails.

1. Freeze all Service Groups with VxVM resources (a concrete sketch using the group and system names above follows these steps):

# hagrp -freeze <group> -sys <system>

2. Restart vxconfigd

# vxconfigd -k -x syslog

3. Confirm resulting status:

# vxdg list

# vxdisk list

4. Unfreeze Service Groups if DG is now corrected

# hagrp -unfreeze <group> -sys <system>
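
Putting the steps together for this environment, a minimal sketch (group and system names taken from the hastatus output above; repeat the freeze/unfreeze for each affected node):

# hagrp -freeze POEBS_DB -sys ruabon1
# hagrp -freeze POEBS_MQ -sys ruabon1
# vxconfigd -k -x syslog
# vxdg list
# vxdisk list | grep dgdisabled
# hagrp -unfreeze POEBS_DB -sys ruabon1
# hagrp -unfreeze POEBS_MQ -sys ruabon1

The grep should return nothing once the condition has cleared.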



After correcting the disk group condition, you will need to look at your cluster configuration for the "oraappl_nfs_fs" resource and determine what its mount point and block device are.  From the block device you can determine disk group and volume name.  

Unmount the mount point and remount.

Verify that the volume is in a good state in vxprint -htg <diskgroup>.



A node reboot could potentially correct all of this automatically as well.

Upgrading CVM protocol version and VxVM disk group version


The default Cluster Volume Manager protocol version is 110.

Run the following command to verify the CVM protocol version:

# /opt/VRTS/bin/vxdctl protocolversion
If the protocol version is not 110, run the following command to upgrade the version:

# /opt/VRTS/bin/vxdctl upgrade
All Veritas Volume Manager disk groups have an associated version number. Each VxVM release supports a specific set of disk group versions and can import and perform tasks on disk groups with those versions. Some new features and tasks work only on disk groups with the current disk group version. Before you can perform those tasks, you need to upgrade the existing disk group version to 170.

Check the existing disk group version:

# vxdg list dg_name | grep -i version
If the disk group version is not 170, run the following command on the master node to upgrade the version:

# vxdg -T 170 upgrade dg_name
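
To check the version of every imported disk group at once, a small sketch (the awk skips the NAME/STATE/ID header line of vxdg list; any POSIX shell):

# for dg in $(vxdg list | awk 'NR>1 {print $1}'); do
>     printf '%s: ' "$dg"; vxdg list "$dg" | awk '/^version/ {print $2}'
> done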

How to get the VxVM diskgroup version


If you need to determine the version of a Veritas diskgroup, it can be done in two ways:

vxdg command:

Execute vxdg list <diskgroup> and look for the version field in the output.

root@vmnode1:~# vxdg list dg_sap
Group:     dg_sap
dgid:      1273503890.14.vmnode1
import-id: 1024.10
flags:     cds
version:   140 <--- VERSION!
alignment: 8192 (bytes)
local-activation: read-write
ssb:            on
detach-policy: global
dg-fail-policy: dgdisable
copies:    nconfig=default nlog=default
config:    seqno=0.1076 permlen=24072 free=24068 templen=2 loglen=3648
config disk disk27 copy 1 len=24072 state=clean online
config disk disk28 copy 1 len=24072 state=clean online
log disk disk27 copy 1 len=3648
log disk disk28 copy 1 len=3648
root@vmnode1:~#

vxprint command:

Run vxprint -l <diskgroup> and again look for the version field as shown in the example.

root@vmnode1:~# vxprint -l dg_sap
Disk group: dg_sap

Group:    dg_sap
info:     dgid=1273503890.14.vmnode1
version:  140 <--- VERSION!
alignment: 8192 (bytes)
activation: read-write
detach-policy: global
dg-fail-policy: dgdisable
copies:   nconfig=default nlog=default
devices:  max=32767 cur=1
minors:   >= 4000
cds=on

root@vmnode1:~#
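
For scripting, the version field can be extracted in one line; a small sketch using the same disk group:

# vxdg list dg_sap | awk '/^version/ {print $2}'
140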

How to convert a diskgroup to cds, and all the disks to cdsdisk type.


This procedure converts a diskgroup to cds, and all the disks to cdsdisk type to allow for cross-platform functionality. 

Note the "sliceddg" diskgroup has two "sliced" type devices listed:
# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t0d0s2 auto:sliced rootdg01 rootdg online
c0t1d0s2 auto:none - - online invalid
c2t0d0s2 auto:cdsdisk t0 cdsdg online
c2t1d0s2 auto:cdsdisk t1 cdsdg online
c2t2d0s2 auto:cdsdisk t2 cdsdg online
c2t3d0s2 auto:cdsdisk t3 cdsdg online
c2t4d0s2 auto:cdsdisk t4 cdsdg online
c2t8d0s2 auto:sliced t8 sliceddg online
c2t9d0s2 auto:sliced t9 sliceddg online

The diskgroup "flags" field would list "cds" if this were a cds diskgroup; this one is not cds yet:
# vxdg list sliceddg
Group: sliceddg
dgid: 1241191141.35.ms1
import-id: 1024.17
flags:
version: 140
alignment: 8192 (bytes)
ssb: on
autotagging: on
detach-policy: global
dg-fail-policy: dgdisable
copies: nconfig=default nlog=default
config: seqno=0.1073 permlen=48144 free=48139 templen=4 loglen=7296
config disk c2t8d0s2 copy 1 len=50144 state=clean online
config disk c2t9d0s2 copy 1 len=50144 state=clean online
config disk c2t11d0s2 copy 1 len=48144 state=clean online
log disk c2t8d0s2 copy 1 len=7597
log disk c2t9d0s2 copy 1 len=7597

Run the following command to change the sliceddg diskgroup to "cds":
# vxcdsconvert -g sliceddg group

Ensure "cds" is listed in "flags" now after the convert:
# vxdg list sliceddg
Group: sliceddg
dgid: 1241191141.35.ms1
import-id: 1024.17
flags: cds
version: 140
alignment: 8192 (bytes)
ssb: on
autotagging: on
detach-policy: global
dg-fail-policy: dgdisable
copies: nconfig=default nlog=default
config: seqno=0.1084 permlen=48144 free=48139 templen=4 loglen=7296
config disk c2t8d0s2 copy 1 len=49936 state=clean online
config disk c2t9d0s2 copy 1 len=49936 state=clean online
config disk c2t11d0s2 copy 1 len=48144 state=clean online
log disk c2t8d0s2 copy 1 len=7568
log disk c2t9d0s2 copy 1 len=7568

Now convert all the disks to "cdsdisk" type:
# vxcdsconvert -g sliceddg alldisks

The two disks in sliceddg are now "cdsdisk" type:
# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t0d0s2 auto:sliced rootdg01 rootdg online
c0t1d0s2 auto:none - - online invalid
c2t0d0s2 auto:cdsdisk t0 cdsdg online
c2t1d0s2 auto:cdsdisk t1 cdsdg online
c2t2d0s2 auto:cdsdisk t2 cdsdg online
c2t3d0s2 auto:cdsdisk t3 cdsdg online
c2t4d0s2 auto:cdsdisk t4 cdsdg online
c2t8d0s2 auto:cdsdisk t8 sliceddg online
c2t9d0s2 auto:cdsdisk t9 sliceddg online

The public and private regions are now on a single slice, as expected for the cdsdisk type:
# vxdisk -g sliceddg list t8
Device: c2t8d0s2
devicetag: c2t8d0
type: auto
hostid: ms1
disk: name=t8 id=1240930523.23.ms1
group: name=sliceddg id=1241191141.35.ms1
info: format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags: online ready private autoconfig autoimport imported
pubpaths: block=/dev/vx/dmp/c2t8d0s2 char=/dev/vx/rdmp/c2t8d0s2
guid: {9c0140d4-1dd1-11b2-abd5-0003bad88c3e}
udid: SEAGATE%5FST39204LCSUN9.0G%5FDISKS%5F3BV0LWSY00007103A8FZ
site: -
version: 3.1
iosize: min=512 (bytes) max=2048 (blocks)
public: slice=2 offset=68224 len=17613856 disk_offset=0
private: slice=2 offset=256 len=67968 disk_offset=0
update: time=1247852277 seqno=0.9
ssb: actual_seqno=0.0
headers: 0 240
configs: count=1 len=49936
logs: count=1 len=7568
Defined regions:
config priv 000048-000239[000192]: copy=01 offset=000000 enabled
config priv 000256-049999[049744]: copy=01 offset=000192 enabled
log priv 050000-057567[007568]: copy=01 offset=000000 enabled
lockrgn priv 057568-057711[000144]: part=00 offset=000000
Multipathing information:
numpaths: 1
c2t8d0s2 state=enabled

The new table of contents on one of the converted devices has changed:
# prtvtoc /dev/rdsk/c2t8d0s2
* /dev/rdsk/c2t8d0s2 partition map
*
* Dimensions:
* 512 bytes/sector
* 133 sectors/track
* 27 tracks/cylinder
* 3591 sectors/cylinder
* 4926 cylinders
* 4924 accessible cylinders
*
* Flags:
* 1: unmountable
* 10: read-only
*
* First Sector Last
* Partition Tag Flags Sector Count Sector Mount Directory
2 5 01 0 17682084 17682083
7 15 01 0 17682084 17682083

How to convert cds disk format to sliced disk format


Converting from sliced to cds is possible using the vxcdsconvert command, but there is no direct command to convert a cds disk to a sliced disk. 

One workaround is to evacuate the existing data to a new device that is initialized as a sliced disk. 
Here is an example of how to evacuate data from a cds disk to a sliced disk. 

Example: 
Initialize one device as cdsdisk and one device as sliced: 
--------------- 
c3t50060E80004372C0d6s2 auto:sliced - - online 
c3t50060E80004372C0d17s2 auto:cdsdisk - - online 
--------------- 
Create a disk group with cdsdisk: 
--------------- 
# vxdg init mixdg cds_disk01=c3t50060E80004372C0d17s2 

# vxdisk -g mixdg list 
DEVICE TYPE DISK GROUP STATUS 
c3t50060E80004372C0d17s2 auto:cdsdisk cds_disk01 mixdg online <====== 
--------------- 
Create a volume and file system in the mixdg disk group 
--------------- 
# vxassist -g mixdg maxsize 
Maximum volume size: 1918976 (937Mb) 
# vxassist -g mixdg make datavol 1918976 
# mkfs -F vxfs /dev/vx/rdsk/mixdg/datavol 
version 7 layout 
1929216 sectors, 964608 blocks of size 1024, log size 16384 blocks 
largefiles supported 
--------------- 
Mount the filesystem and populate with data 
--------------- 
# mount -F vxfs /dev/vx/dsk/mixdg/datavol /datavol 
# df -k 
Filesystem kbytes used avail capacity Mounted on 
/dev/vx/dsk/mixdg/datavol 
959488 174847 735831 20% /datavol 
--------------- 
After creation of the cds diskgroup, the data needs to be 'evacuated' to a new sliced disk. 
Important: Make sure there is equal or larger space on the new device compared to the current device. 

To verify they have equal size, compare the public region lengths. 
This is the original cdsdisk which has existing data and part of the diskgroup "mixdg" 
--------------- 
# vxdisk list c3t50060E80004372C0d17s2 
Device: c3t50060E80004372C0d17s2 
devicetag: c3t50060E80004372C0d17 


iosize: min=512 (bytes) max=2048 (blocks) 
public: slice=2 offset=65792 len=1920000 disk_offset=0 <==== length is "1920000" 
private: slice=2 offset=256 len=65536 disk_offset=0 
update: time=1273452460 seqno=0.7 
ssb: actual_seqno=0.0 
--------------- 
This is the new LUN; it has the same size as the existing cds disk 
--------------- 
# vxdisk list c3t50060E80004372C0d6s2 
Device: c3t50060E80004372C0d6s2 
devicetag: c3t50060E80004372C0d6 


version: 2.1 
iosize: min=512 (bytes) max=2048 (blocks) 
public: slice=4 offset=0 len=1920000 disk_offset=76800 <==== pub length is "1920000" 
private: slice=3 offset=1 len=76799 disk_offset=0 
update: time=1273451487 seqno=0.2 
--------------- 
Add a new device with equal or larger size than the existing cds disk in the diskgroup 
--------------- 
# vxdg -g mixdg adddisk sliced_disk01=c3t50060E80004372C0d6s2 
VxVM vxdg ERROR V-5-1-6478 Device c3t50060E80004372C0d6s2 cannot be added to a CDS disk group 
-- The error message indicates that the diskgroup cds attribute is "on". 
-- A non-cds disk cannot be added while cds=on. 
Turn 'cds' off so that a sliced-format disk can be added to the existing cds disk group. 
# vxdg -g mixdg set cds=off 
Add the sliced disk to the diskgroup: 
# vxdg -g mixdg adddisk sliced_disk01=c3t50060E80004372C0d6s2 
# vxdisk -g mixdg list 
DEVICE TYPE DISK GROUP STATUS 
c3t50060E80004372C0d6s2 auto:sliced sliced_disk01 mixdg online 
c3t50060E80004372C0d17s2 auto:cdsdisk cds_disk01 mixdg online 
--------------- 
vxprint output showing the device in use is "cds_disk01" 
--------------- 
# vxprint -rthg mixdg 


dg mixdg default default 27000 1273448675.146.license 
dm cds_disk01 c3t50060E80004372C0d17s2 auto 65536 1931008 - 
dm sliced_disk01 c3t50060E80004372C0d6s2 auto 76543 1881600 - 
v datavol - ENABLED ACTIVE 1929216 SELECT - fsgen 
pl datavol-01 datavol ENABLED ACTIVE 1929216 CONCAT - RW 
sd cds_disk01-01 datavol-01 cds_disk01 0 1929216 0 c3t50060E80004372C0d17 ENA 
--------------- 
Evacuate data from cdsdisk to sliced disk: 
--------------- 
# vxevac -g mixdg cds_disk01 sliced_disk01 
-- this takes time to complete, depending on how big the data is 
--------------- 
Verify that the data is now on the "sliced_disk01" disk: 
--------------- 
# vxprint -rthg mixdg 


dg mixdg default default 38000 1273452460.153.license 
dm cds_disk01 c3t50060E80004372C0d17s2 auto 65536 1920000 - 
dm sliced_disk01 c3t50060E80004372C0d6s2 auto 76799 1920000 - 
v datavol - ENABLED ACTIVE 1918976 SELECT - fsgen 
pl datavol-01 datavol ENABLED ACTIVE 1918976 CONCAT - RW 
sd sliced_disk01-01 datavol-01 sliced_disk01 0 1918976 0 c3t50060E80004372C0d6 ENA 
--------------- 
-- Verify access to the data/file system 

Remove the cds disk from the diskgroup: 
--------------- 
# vxdg -g mixdg rmdisk cds_disk01 

# vxdisk -g mixdg list 
DEVICE TYPE DISK GROUP STATUS 
c3t50060E80004372C0d6s2 auto:sliced sliced_disk01 mixdg online 
--------------- 
Uninitialize the original device 
--------------- 
# vxdiskunsetup -C c3t50060E80004372C0d17 
--------------- 
All data has now been evacuated to the new device formatted as sliced. 

If the volume data needs to be moved back, use the same procedure. 

Initialize the original device as sliced 
--------------- 
# vxdisksetup -i c3t50060E80004372C0d17 format=sliced 
--------------- 
NOTE: Make sure the public region length of the new disk to which the data will be evacuated is the same or larger. 
Add the disk back to the disk group 
--------------- 
# vxdg -g mixdg adddisk orig_sliced_disk01=c3t50060E80004372C0d17s2 

# vxdisk -g mixdg list 
DEVICE TYPE DISK GROUP STATUS 
c3t50060E80004372C0d6s2 auto:sliced sliced_disk01 mixdg online 
c3t50060E80004372C0d17s2 auto:sliced orig_sliced_disk01 mixdg online 

# vxprint -rthg mixdg 


dg mixdg default default 38000 1273452460.153.license 
dm orig_sliced_disk01 c3t50060E80004372C0d17s2 auto 76543 1881600 - 
dm sliced_disk01 c3t50060E80004372C0d6s2 auto 76799 1920000 - 
v datavol - ENABLED ACTIVE 1918976 SELECT - fsgen 
pl datavol-01 datavol ENABLED ACTIVE 1918976 CONCAT - RW 
sd sliced_disk01-01 datavol-01 sliced_disk01 0 1918976 0 c3t50060E80004372C0d6 ENA 
--------------- 
Evacuate data from the temporary sliced disk (sliced_disk01) back to the original physical device: 
--------------- 
# vxevac -g mixdg sliced_disk01 orig_sliced_disk01 
-- this takes time to complete, depending on how big the data being evacuated is. 
--------------- 
# vxprint -rthg mixdg 


dg mixdg default default 38000 1273452460.153.license 
dm orig_sliced_disk01 c3t50060E80004372C0d17s2 auto 65536 1920000 - 
dm sliced_disk01 c3t50060E80004372C0d6s2 auto 76799 1920000 - 
v datavol - ENABLED ACTIVE 1918976 SELECT - fsgen 
pl datavol-01 datavol ENABLED ACTIVE 1918976 CONCAT - RW 
sd orig_sliced_disk01-01 datavol-01 orig_sliced_disk01 0 1918976 0 c3t50060E80004372C0d17 ENA 
--------------- 
Remove the temporary disk from the diskgroup: 
--------------- 
# vxdg -g mixdg rmdisk sliced_disk01 

# vxdisk -g mixdg list 
DEVICE TYPE DISK GROUP STATUS 
c3t50060E80004372C0d17s2 auto:sliced orig_sliced_disk01 mixdg online 
--------------- 
Uninitialize the sliced_disk01 device 
--------------- 
# vxdiskunsetup -C c3t50060E80004372C0d6 
---------------