Veritas Cluster Server Cluster File System – the ironic.
Posted on June 13th, 2009
Last Monday, I was at a customer site to install Veritas Cluster Server with Cluster File System. It is rarely used in Malaysia due to its functionality and, of course, the license price. As far as I know from the presales department, the license is quoted per CPU in the server: the more CPUs you have, the more you pay for the license. Most of the time, customers opt for the High Availability version only, not the Cluster File System version.
VCSCFS enables you to share SAN storage between two or more cluster nodes in active mode, and it is equipped with I/O fencing that prevents data loss in any split-brain situation. The setup is fairly straightforward, but configuring it is a real pain in the ass. The configuration files that you will see in this cluster type are:
main.cf
types.cf
CVMTypes.cf
CFSTypes.cf
master.main.cf
Let’s take a look at the infrastructure setup first:
HP-UX 11.23 IA64 running on HP rx7640 – 8 MP processors, 16 cores, 80 GB memory, 2 x 146 GB SCSI HDD in RAID 1
Veritas Cluster Server Cluster File System 5.0
The following patches need to be applied to make sure everything runs fine on this setup:
HP-UX Patch ID Description
PHCO_32385 Enables fscat(1M).
PHCO_32387 Enables getext(1M).
PHCO_32388 Enables setext(1M).
PHCO_32389 Enables vxdump(1M).
PHCO_32390 Enables vxrestore(1M).
PHCO_32391 Enables vxfsstat(1M).
PHCO_32392 Enables vxtunefs(1M).
PHCO_32393 Enables vxupgrade(1M).
PHCO_32488 Enables LIBC for VxFS 4.1 and later file systems.
PHCO_32523 Enhancement to quota(1) for supporting large uids.
PHCO_32524 Enhancement to edquota for supporting large uids.
PHCO_32551 Enhancement to quotaon/quotaoff for supporting large uids.
PHCO_32552 Enhancement to repquota for supporting large uids.
PHCO_32596 Enables df(1M).
PHCO_32608 Enables bdf(1M).
PHCO_32609 Enables fstyp(1M).
PHCO_32610 Enables mount(1M).
PHCO_32611 Fix fs_wrapper to accept vxfs from subtype.
PHCO_33238 swapon(1M) cumulative patch.
PHCO_34036 LVM commands patch.
PHCO_34208 SAM cumulative patch.
PHCO_34191 Cumulative libc patch.
PHSS_32674 Obam patch (backend for the SAM patch).
PHKL_31500 Sept04 Base Patch
PHKL_32272 Changes to fix intermittent failures in getacl/setacl.
PHKL_32430 Changes to separate vxfs symbols from libdebug.a, so that symbols of VxFS 4.1 and later are easily available in q4/p4.
PHKL_32431 Changes to disallow mounting of a file system on a vnode having VNOMOUNT set. Enhancements for supporting quotas on large uids.
PHKL_33312 LVM Cumulative Patch.
PHKL_34010 Cumulative VM Patch.
PHKL_36745 LVM Cumulative Patch
PHCO_36744 LVM Commands Patch
PHCO_37114 VRTS 5.0 MP1RP2 VRTSvxfs Command Patch
PHKL_37113 VRTS 5.0 MP1RP2 VRTSvxfs Kernel Patch
EnableVXFS Bundle (B.11.23.04 or later version is required after installing the latest HP-UX patches)
FSLibEnh Enhancement to LIBC libraries to understand VxFS disk layout Version 6 and later.
DiskQuota-Enh Enhancements to various quota related commands to support large uids.
FSCmdsEnh Enhancements to the mount command to support VxFS 5.0.
Setup diagram;

To start, we need to prepare the system first. For this setup, we will use 3 network connections, two of them for GAB and LLT. What are GAB and LLT?
Group Membership Services/Atomic Broadcast (GAB)
The Group Membership Services/Atomic Broadcast protocol (GAB) is responsible for cluster membership and reliable cluster communications. GAB has two major functions.
Cluster membership
GAB maintains cluster membership by receiving input on the status of the heartbeat from each system via LLT. When a system no longer receives heartbeats from a cluster peer, LLT passes the heartbeat loss to GAB. GAB marks the peer as DOWN and excludes it from the cluster. In most configurations, membership arbitration is used to prevent network partitions.
Cluster communications
GAB’s second function is reliable cluster communications. GAB provides guaranteed delivery of messages to all cluster systems. The Atomic Broadcast functionality is used by HAD to ensure that all systems within the cluster receive all configuration change messages, or are rolled back to the previous state, much like a database atomic commit. While the communications function in GAB is known as Atomic Broadcast, no actual network broadcast traffic is generated. An Atomic Broadcast message is a series of point to point unicast messages from the sending system to each receiving system, with a corresponding acknowledgement from each receiving system.
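To make the "all systems or none" idea concrete, here is a minimal Python sketch of it. This is not GAB's actual implementation; the Node class, its reachable flag, and the atomic_broadcast function are hypothetical names made up for illustration. It only shows a change being sent to each node one by one, and committed only when every node acknowledges it.

```python
class Node:
    """A cluster member that stages a change and acknowledges it (hypothetical model)."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.config = {}      # committed configuration
        self.pending = None   # staged change, not yet committed

    def receive(self, change):
        """Stage one point-to-point message; the return value is the acknowledgement."""
        if not self.reachable:
            return False
        self.pending = dict(change)
        return True

    def commit(self):
        """Apply the staged change to the committed configuration."""
        if self.pending is not None:
            self.config.update(self.pending)
            self.pending = None

    def rollback(self):
        """Discard the staged change, keeping the previous state."""
        self.pending = None


def atomic_broadcast(nodes, change):
    """Send `change` to each node in turn; commit only if every node acknowledges."""
    acks = [node.receive(change) for node in nodes]
    if all(acks):
        for node in nodes:
            node.commit()
        return True
    for node in nodes:
        node.rollback()
    return False
```

If one node does not acknowledge, every node keeps its previous configuration, which is the rollback behaviour described above.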
Low Latency Transport (LLT)
The Low Latency Transport protocol is used for all cluster communications as a high-performance, low-latency replacement for the IP stack. LLT has two major functions.
Traffic distribution
LLT provides the communications backbone for GAB. LLT distributes (load balances) inter-system communication across all configured network links. This distribution ensures all cluster communications are evenly distributed across all network links for performance and fault resilience. If a link fails, traffic is redirected to the remaining links. A maximum of eight network links are supported.
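The distribution and failover behaviour can be sketched in a few lines of Python. The text does not say which balancing algorithm LLT uses, so the round-robin below is purely my assumption, and LinkSet and the link names are hypothetical. Only the eight-link limit comes from the description above.

```python
MAX_LINKS = 8  # the text states that at most eight network links are supported


class LinkSet:
    """Round-robin traffic over the links that are currently up (assumed policy)."""

    def __init__(self, names):
        if not 1 <= len(names) <= MAX_LINKS:
            raise ValueError("between 1 and %d links are required" % MAX_LINKS)
        self.up = {name: True for name in names}
        self._counter = 0

    def fail(self, name):
        """Mark a link as down so no more traffic is sent over it."""
        self.up[name] = False

    def restore(self, name):
        """Mark a link as up again."""
        self.up[name] = True

    def pick(self):
        """Return the link to use for the next message."""
        active = [name for name, ok in self.up.items() if ok]
        if not active:
            raise RuntimeError("no links available")
        link = active[self._counter % len(active)]
        self._counter += 1
        return link
```

With two links, pick() alternates between them; after fail() on one of them, every message goes over the remaining link.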
Heartbeat
LLT is responsible for sending and receiving heartbeat traffic over each configured network link. LLT heartbeat is an Ethernet broadcast packet. This broadcast heartbeat method allows a single packet to notify all other cluster members that the sender is functional, as well as provide the necessary address information for the receiver to send unicast traffic back to the sender. The heartbeat is the only broadcast traffic generated by VCS. Each system sends 2 heartbeat packets per second per interface. All other cluster communications, including all status and configuration traffic, are point to point unicast. This heartbeat is used by the Group Membership Services to determine cluster membership.
The heartbeat signal is defined as follows:
LLT on each system in the cluster sends heartbeat packets out on all configured LLT interfaces every half second.
LLT on each system tracks the heartbeat status from each peer on each configured LLT interface.
LLT on each system forwards the heartbeat status of each system in the cluster to the local Group Membership Services function of GAB.
GAB receives the status of heartbeat from all cluster systems from LLT and makes membership determination based on this information.
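The four steps above can be sketched as a small Python model. It is a simplification and not the real LLT/GAB code: LltTracker, gab_membership, and the node and link names are hypothetical, and the PEER_TIMEOUT value is an illustrative assumption (the text only gives the half-second heartbeat interval). The rule that a peer counts as alive if any one link still hears from it is also my assumption.

```python
HEARTBEAT_INTERVAL = 0.5  # seconds between heartbeats, as stated in the text
PEER_TIMEOUT = 16.0       # seconds of silence before a peer is considered down (assumed value)


class LltTracker:
    """Tracks the last heartbeat seen from each peer on each link."""

    def __init__(self, peers, links):
        self.peers = list(peers)
        self.last_seen = {(p, l): None for p in peers for l in links}

    def heartbeat(self, peer, link, now):
        """Record a heartbeat received from `peer` on `link` at time `now`."""
        self.last_seen[(peer, link)] = now

    def status(self, now):
        """Return {peer: alive}; a peer is alive if any link heard from it recently."""
        result = {}
        for peer in self.peers:
            result[peer] = any(
                seen is not None and now - seen <= PEER_TIMEOUT
                for (p, _link), seen in self.last_seen.items()
                if p == peer
            )
        return result


def gab_membership(status):
    """Membership decision: keep only the peers that LLT reports as alive."""
    return sorted(peer for peer, alive in status.items() if alive)
```

Feeding the output of status() into gab_membership() mirrors the hand-off from LLT to GAB: a peer that stays silent on all links for longer than the timeout drops out of the membership list.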
For installation on 2 nodes, you first need to configure rsh on the second node so that the first node, which executes the installation script, can connect to it. Do not forget to edit the hosts file on both nodes so that they recognize each other. Attached below is the installation guide from Symantec:
Veritas Storage Foundation™ Cluster File System Installation Guide - HPUX - 5.0
After the installation, I ran into one problem when executing VEA: the GUI loads, but it cannot connect to the management server. There is one secret command that needs to be executed, but I forgot where I put my note. I will update you guys once I find it. Later, guys.