Sunday, February 04, 2007

SFCFS – Storage Foundation Cluster File System

SFCFS is a product for sharing a single file system among multiple nodes. LUNs from a storage subsystem are shared among the nodes in a cluster, and SFCFS maintains consistency on the FS shared by multiple nodes.

Say node-A, node-B and node-C share a file system /users. If node-A goes down, users can still access the /users filesystem on node-B and node-C in the cluster.

How does SFCFS help?

  1. Applications can be load balanced across nodes.
  2. The shared FS can be resized dynamically.
  3. Clusters are easy to extend, since a new node can simply mount the shared FS.
  4. Downtime for planned maintenance on systems is reduced.

In this short tutorial, I will explain the key components of the SFCFS architecture, walk through a simple example, and point out some caveats to watch for during installation.

Part 1 – SFCFS key components

SFCFS includes component products like Veritas Cluster Server (VCS), Cluster File System (CFS) and the cluster functionality of Veritas Volume Manager (CVM).

  1. VCS manages the communication between nodes and the mechanism to add and register nodes.
  2. CFS manages the coherency in metadata on the shared filesystem. The node that mounts the FS first becomes the primary node, and is responsible for updating the metadata on the shared FS.
  3. CVM makes the logical volumes accessible throughout the cluster.

Part 2 – Membership ports

Each component in SFCFS registers with a membership port. Knowing the various ports and their significance helps in troubleshooting and in checking that membership is as expected.

port a – heartbeat membership.

port b – I/O fencing membership.

port f – Cluster file systems membership.

port u – temporary port used by CVM.

port v – CVM membership.

port q – used by CVM

Part 3 – Installing SFCFS

Although the entire installation process cannot be detailed in this tutorial, a few key steps are highlighted here.

Select a node (preferably the one that will be primary) to begin installation. Enable direct login via rsh to all other nodes from the primary node.

Mount the CDROM and proceed by running the installer script.

Enter the names of the systems to be included in the cluster, separated by spaces.

Select “n” for I/O fencing since it has to be manually configured later.

At our place we had uninstalled all Veritas components prior to installing SFCFS. If you have VxVM present on the systems, you will be asked to choose a new or existing configuration.

Enter a cluster name : core_compute

Enter a cluster id : 123

Configure your heartbeat links appropriately. Heartbeat links are private channels used for communication by the VCS. Keep a list of interfaces ready for configuration as link1 and link2 for each node.

For further help, refer to the installation and admin guides for SFCFS on

Part 4 – Verifying the installation

a. LLT

The following files are important:

1. /etc/llthosts – This file is just like /etc/hosts, except that instead of mapping IP addresses to hostnames, it maps LLT node IDs (as set by set-node in /etc/llttab) to hostnames. You need this file for VCS to start. It should look like this:

0       coredev05
1       coredev06

2. /etc/llttab - low-latency transport configuration file. It should look like this:

# this sets our node ID, must be unique in cluster
set-node 0
# set the heartbeat links
link hme1 /dev/hme:1 - ether - -
# link-lowpri is for public networks
link-lowpri hme0 /dev/hme:0 - ether - -
# set cluster number, must be unique
set-cluster 123
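
A quick sanity check for the two LLT files is to confirm that the node ID in llttab actually has an entry in llthosts. The snippet below is only a sketch: llt_node_id and llt_host_for_id are hypothetical helper names (not Veritas tools), it assumes set-node is given as a number as in the example above, and it runs against sample copies of the files rather than the real ones in /etc.

```shell
#!/bin/sh
# Hypothetical helpers, not part of SFCFS.

# Print the node ID from the "set-node" line of an llttab file.
# Comment lines are skipped because their first field is "#".
llt_node_id() {
    awk '$1 == "set-node" { print $2; exit }' "$1"
}

# Print the hostname that an llthosts file maps to the given node ID.
llt_host_for_id() {
    awk -v id="$2" '$1 == id { print $2; exit }' "$1"
}

# Demo on sample copies of the two files shown above.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf '0 coredev05\n1 coredev06\n' > "$tmpdir/llthosts"
printf '# node id\nset-node 0\nlink hme1 /dev/hme:1 - ether - -\nset-cluster 123\n' > "$tmpdir/llttab"

id=$(llt_node_id "$tmpdir/llttab")
host=$(llt_host_for_id "$tmpdir/llthosts" "$id")
echo "node $id -> ${host:-MISSING}"
```

With the sample files this prints "node 0 -> coredev05". Pointing the helpers at /etc/llttab and /etc/llthosts on each node should give that node's own hostname.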

b. GAB

GAB requires only one configuration file, /etc/gabtab.

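This file holds the gabconfig command that starts GAB. The -n option gives the number of nodes that must be up before the cluster seeds, so it normally matches the number of nodes in the cluster.

/sbin/gabconfig -c -n3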
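
Since the -n value in gabtab should normally agree with the number of nodes listed in llthosts, the two can be compared with a few lines of shell. This is a sketch under assumptions: gab_seed_count and llt_host_count are hypothetical helper names, and the demo uses sample files for a three-node cluster (node-A, node-B, node-C) instead of the real files in /etc.

```shell
#!/bin/sh
# Hypothetical helpers, not part of SFCFS.

# Print N from the "-nN" (or "-n N") option on the gabconfig line.
gab_seed_count() {
    sed -n 's/.*gabconfig.*-n *\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Count the non-empty, non-comment lines of an llthosts file.
llt_host_count() {
    awk 'NF && $1 !~ /^#/ { n++ } END { print n + 0 }' "$1"
}

# Demo on sample files for a three-node cluster.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf '/sbin/gabconfig -c -n3\n' > "$tmpdir/gabtab"
printf '0 node-A\n1 node-B\n2 node-C\n' > "$tmpdir/llthosts"

seed=$(gab_seed_count "$tmpdir/gabtab")
hosts=$(llt_host_count "$tmpdir/llthosts")
if [ "$seed" -eq "$hosts" ]; then
    echo "ok: gabtab seeds at $seed nodes, llthosts lists $hosts"
else
    echo "mismatch: gabtab seeds at $seed nodes, llthosts lists $hosts"
fi
```

A mismatch is worth a second look, because a seed count higher than the number of nodes that can ever join would keep the cluster from seeding on its own.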

c. Who’s the master?

Type vxdctl -c mode on any node to see whether it is the CVM master or a slave.
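
If you want to use the master check in a script, the role can be picked out of the command's output. The sketch below assumes the output has a first line ending in MASTER or SLAVE (something like "mode: enabled: cluster active - MASTER"), which is an assumption about the format and may differ between versions. cvm_role is a hypothetical helper name, and the demo feeds it a sample string instead of calling vxdctl.

```shell
#!/bin/sh
# Hypothetical helper, not part of SFCFS.
# Classify "vxdctl -c mode" output as MASTER, SLAVE or UNKNOWN.
# Only the first line is inspected, so the hostname on the
# "master:" line cannot confuse the match.
cvm_role() {
    first=$(printf '%s\n' "$1" | head -n 1)
    case "$first" in
        *MASTER*) echo MASTER ;;
        *SLAVE*)  echo SLAVE ;;
        *)        echo UNKNOWN ;;
    esac
}

# Sample text in the ASSUMED output format.
sample='mode: enabled: cluster active - MASTER
master: coredev05'
cvm_role "$sample"
```

On a live node the call would look like cvm_role "$(vxdctl -c mode)".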

Part 5 – IO Fencing

IO fencing is a reliable way to prevent split brain when the cluster is in jeopardy. If all heartbeat links fail, the nodes can no longer coordinate metadata updates, and writes to the shared storage would corrupt data.

IO fencing needs an odd number of disks, at least 3 of them, configured as co-ordinator disks. The co-ordinator disks need to support SCSI3-PGR reservations.

Choose 20 MB LUNs as co-ordinator disks. You could use even smaller LUNs, but we have run into problems with small LUNs.

We nowadays use 50 MB Luns to avoid any last minute blues.
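
The "odd number, at least 3" rule for co-ordinator disks is easy to check before building the disk group. A minimal sketch follows; valid_coordinator_count is a hypothetical helper name, not a Veritas command.

```shell
#!/bin/sh
# Hypothetical helper, not part of SFCFS.
# Succeed only if the disk count is at least 3 and odd.
valid_coordinator_count() {
    [ "$1" -ge 3 ] && [ $(( $1 % 2 )) -eq 1 ]
}

for count in 2 3 4 5; do
    if valid_coordinator_count "$count"; then
        echo "$count disks: ok"
    else
        echo "$count disks: rejected"
    fi
done
```

This accepts 3 and 5 and rejects 2 and 4.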

Use the vxfentsthdw utility to test the disks for SCSI3-PGR support. Use this tool carefully, since it wipes out all data on the disks.

Add all the selected disks to a disk group using VxVM commands, on any node.

# vxdg init coorddg disk01=c7t11d3 disk02=c7t11d4 disk03=c7t11d5

Deport the disk group coorddg.

# vxdg deport coorddg

Re-import the disk group with the -t (temporary) option to prevent it from being auto-imported during reboot.

# vxdg -t import coorddg

Deport it again to prevent access by anybody.

# vxdg deport coorddg

Add the co-ordinator disk group name to the file /etc/vxfendg on each system.

# echo "coorddg" > /etc/vxfendg

On each system, start IO fencing.

# /etc/init.d/vxfen start

This creates the file /etc/vxfentab, which lists the disks in the DG named in /etc/vxfendg.

In this example the /etc/vxfentab would consist of the following

Part 6 – Regular Operations

That covers installation and configuring IO fencing. Now to share and mount data disk groups.

a. To add the disk group bijudg and its volume vol1 to the configuration and mount it, perform the following steps.

# cfsmntadm add bijudg vol1 /testmount all=rw

This associates the mount point /testmount with the cluster.

# cfsmount /testmount
# df -k /testmount
/dev/vx/dsk/bijudg/var-crash  15482443      10 15327609     1%    /testmount

b. Resizing a FS

First verify that you are resizing the FS from the primary node for that filesystem. vxdctl shows the CVM master and fsclustadm shows the CFS primary.

# vxdctl -c mode
# fsclustadm -v showprimary /testmount

Then use the vxresize command.

# vxresize -g bijudg vol1 +5g

That's all, folks!


At 4:53 PM, Blogger Wei said...

Good summary

At 9:08 PM, Blogger BIJUCYBORG said...

Thanks Wei,

Keep checking my blog, I will soon post good tutorials on upgrade of firmware on Brocade directors and switches.

