Sunday, November 6, 2016
How to dynamically remove a node from a live cluster without interruptions
Before changing the VERITAS Cluster Server (VCS) configuration file, main.cf, make a backup copy of the current main.cf. In this example, csvcs6 is removed from a two-node cluster. Run these commands on csvcs5, the node that stays in the cluster.
1. cp -p /etc/VRTSvcs/conf/config/main.cf /etc/VRTSvcs/conf/config/main.cf.last_known.good
2. Check the current systems, group(s), and resource(s) status
# hastatus -sum
-- SYSTEM STATE
-- System State Frozen
A csvcs5 RUNNING 0
A csvcs6 RUNNING 0
-- GROUP STATE
-- Group System Probed AutoDisabled State
B test_A csvcs5 Y N ONLINE
B test_A csvcs6 Y N OFFLINE
B test_B csvcs6 Y N ONLINE
B wvcs csvcs5 Y N OFFLINE
B wvcs csvcs6 Y N ONLINE
Based on the output, the cluster consists of two nodes, csvcs5 and csvcs6. Service groups test_A and wvcs are configured to run on both nodes. Service group test_B is configured to run on csvcs6 only.
Service groups test_B and wvcs are both online on csvcs6. If wvcs has to stay online, switch it over to csvcs5 now.
hagrp -switch <service_group> -to <node>
# hagrp -switch wvcs -to csvcs5
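Before switching groups by hand, it can help to list exactly which groups are still online on the node being removed. The following is a minimal sketch, assuming the group lines of `hastatus -sum` keep the column layout shown above (B, group, system, probed, autodisabled, state). The function name groups_online_on is hypothetical and not part of VCS.

```shell
#!/bin/sh
# Sketch only: groups_online_on is a hypothetical helper, not a VCS command.
# Reads `hastatus -sum` output on stdin and prints every service group
# whose state is ONLINE on the given node.
# Assumed group line layout: B <group> <system> <probed> <autodisabled> <state>
groups_online_on() {
    awk -v node="$1" '$1 == "B" && $3 == node && $6 == "ONLINE" { print $2 }'
}

# Usage on a live cluster (not executed here):
#   hastatus -sum | groups_online_on csvcs6
```

With the status shown in step 2, this would list test_B and wvcs, which are the groups that need a decision (switch or delete) before csvcs6 is stopped.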
3. Check for service group dependency
# hagrp -dep
#Parent Child Relationship
test_B test_A online global
4. Make VCS configuration writable
# haconf -makerw
5. Unlink the group dependency, if there is one. In this case, service group test_B requires test_A.
hagrp -unlink <parent_group> <child_group>
# hagrp -unlink test_B test_A
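When a cluster has several dependencies, the unlink commands can be derived from the `hagrp -dep` output instead of being typed one by one. The sketch below assumes the output format shown in step 3 (a header line starting with #, then parent, child, and relationship columns). It only prints the commands so they can be reviewed first. The function name print_unlinks_for is hypothetical.

```shell
#!/bin/sh
# Sketch only: print_unlinks_for is a hypothetical helper, not a VCS command.
# Reads `hagrp -dep` output on stdin and prints one `hagrp -unlink` command
# for every dependency in which the given group is the parent or the child.
# Nothing is executed; review the printed commands before running them.
print_unlinks_for() {
    awk -v g="$1" '
        /^#/ { next }                          # skip the header line
        NF >= 2 && ($1 == g || $2 == g) {
            print "hagrp -unlink", $1, $2      # parent first, then child
        }'
}

# Usage on a live cluster (not executed here):
#   hagrp -dep | print_unlinks_for test_B
```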
6. Stop VCS on csvcs6, the node to be removed.
hastop -sys <node>
# hastop -sys csvcs6
7. Check the status again, making sure csvcs6 is EXITED and the failover service group is online on the running node.
# hastatus -sum
-- SYSTEM STATE
-- System State Frozen
A csvcs5 RUNNING 0
A csvcs6 EXITED 0
-- GROUP STATE
-- Group System Probed AutoDisabled State
B test_A csvcs5 Y N ONLINE
B test_A csvcs6 Y N OFFLINE
B test_B csvcs6 Y N OFFLINE
B wvcs csvcs5 Y N ONLINE
B wvcs csvcs6 Y N OFFLINE
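The EXITED check can also be scripted so that the later steps run only once the node has really left the cluster. This is a rough sketch, assuming the system lines of `hastatus -sum` look like the ones above (A, system, state, frozen). The function names system_state and node_has_exited are hypothetical.

```shell
#!/bin/sh
# Sketch only: both functions are hypothetical helpers, not VCS commands.
# system_state prints the state column for one system, reading
# `hastatus -sum` output on stdin.
# Assumed system line layout: A <system> <state> <frozen>
system_state() {
    awk -v node="$1" '$1 == "A" && $2 == node { print $3 }'
}

# node_has_exited succeeds (exit status 0) only if the node is EXITED.
node_has_exited() {
    [ "$(system_state "$1")" = "EXITED" ]
}

# Usage on a live cluster (not executed here):
#   hastatus -sum | node_has_exited csvcs6 && echo "csvcs6 has left the cluster"
```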
8. Delete csvcs6 from wvcs and test_A SystemList.
hagrp -modify <service_group> SystemList -delete <node>
# hagrp -modify wvcs SystemList -delete csvcs6
# hagrp -modify test_A SystemList -delete csvcs6
9. Check all the resources belonging to the service group and delete all the resources from group test_B before removing the group.
hagrp -resources <service_group>
# hagrp -resources test_B
jprocess
kprocess
hares -delete <resource_name>
# hares -delete jprocess
# hares -delete kprocess
hagrp -delete <service_group>
# hagrp -delete test_B
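For a group with many resources, the delete commands can be generated from the `hagrp -resources` output. The sketch below assumes that command prints one resource name per line, as in the example above, and it only prints the commands rather than running them. The function name print_group_removal is hypothetical. Resources that depend on each other may need their links removed first, which this sketch does not handle.

```shell
#!/bin/sh
# Sketch only: print_group_removal is a hypothetical helper, not a VCS command.
# Reads resource names on stdin (one per line) and prints the commands that
# would delete each resource and then the service group itself.
# Nothing is executed; review the printed commands before running them.
print_group_removal() {
    group="$1"
    while IFS= read -r res; do
        if [ -n "$res" ]; then
            printf 'hares -delete %s\n' "$res"
        fi
    done
    printf 'hagrp -delete %s\n' "$group"
}

# Usage on a live cluster (not executed here):
#   hagrp -resources test_B | print_group_removal test_B
```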
10. Check the status again, making sure all the service groups are online on the remaining node, csvcs5 in this case.
# hastatus -sum
-- SYSTEM STATE
-- System State Frozen
A csvcs5 RUNNING 0
A csvcs6 EXITED 0
-- GROUP STATE
-- Group System Probed AutoDisabled State
B test_A csvcs5 Y N ONLINE
B wvcs csvcs5 Y N ONLINE
11. Delete system (node) from cluster, save the configuration, and make it read only.
# hasys -delete csvcs6
# haconf -dump -makero
12. Depending on how the cluster is defined and how many nodes it has, it might be necessary to reduce the number in the " /sbin/gabconfig -c -n # " line of the /etc/gabtab file on all the running nodes within the cluster. If # is larger than the number of nodes in the cluster, GAB will not seed automatically.
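The gabtab check in step 12 can be sketched as a small script that compares the seed count with the number of nodes left in the cluster. It assumes the gabtab file contains a single line of the form "/sbin/gabconfig -c -n N" (with or without a space before N). The function names gab_seed_count and check_gab_seed are hypothetical.

```shell
#!/bin/sh
# Sketch only: both functions are hypothetical helpers, not VCS commands.
# gab_seed_count prints the number that follows -n on the gabconfig line
# of the given gabtab file (accepts both "-n2" and "-n 2").
gab_seed_count() {
    sed -n 's/.*gabconfig.*-n[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1"
}

# check_gab_seed <gabtab_file> <nodes_in_cluster>
# Returns 0 if the seed count fits, 1 if it is too large, 2 if none was found.
check_gab_seed() {
    seed=$(gab_seed_count "$1")
    if [ -z "$seed" ]; then
        echo "no seed count found in $1"
        return 2
    fi
    if [ "$seed" -gt "$2" ]; then
        echo "seed count $seed exceeds $2 node(s): GAB will not auto-seed"
        return 1
    fi
    echo "seed count $seed is fine for $2 node(s)"
}

# Usage on each remaining node (not executed here):
#   check_gab_seed /etc/gabtab 1
```

In the two-node example, a gabtab that still says -n 2 would be flagged once csvcs6 is gone, because only one node remains.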
To prevent VCS from starting after rebooting, do the following on the removed node (csvcs6):
1. Unconfigure and unload GAB
/sbin/gabconfig -u
modunload -i `modinfo | grep gab | awk '{print $1}'`
2. Unconfigure and unload LLT
/sbin/lltconfig -U
modunload -i `modinfo | grep llt | awk '{print $1}'`
3. Prevent LLT, GAB and VCS from starting up in the future
mv /etc/rc2.d/S70llt /etc/rc2.d/s70llt
mv /etc/rc2.d/S92gab /etc/rc2.d/s92gab
mv /etc/rc3.d/S99vcs /etc/rc3.d/s99vcs
4. If VCS is not going to run on this node again, all the VCS-related packages and files can now be removed.
pkgrm VRTSperl
pkgrm VRTSvcs
pkgrm VRTSgab
pkgrm VRTSllt
rm /etc/llttab
rm /etc/gabtab
NOTE: Because VCS configurations are complex and vary widely, it is not possible to cover every possible situation and condition of a cluster configuration in one technote. The steps above are the essential ones for common configurations in most VCS setups, and they give some idea of how to deal with more complex setups.