Tuesday, January 10, 2006

Configuring QLogic (2340) on Solaris (sparc)

I'm not going to discuss how to install a QLogic card on Solaris here, just how to configure it and give visibility to a set of LUNs. We will use the SANsurfer CLI utilities ("scli") to easily set up persistent binding of LUNs to a particular HBA.

I found that configuring a QLogic was really easy compared to the JNI and FCAW cards. But hey, I never regret working on those, since I learnt quite a few SAN fundamentals on them. The more primitive the technology, the more you learn (not always, though).

First Task : Install SANsurfer FC HBA CLI for Solaris SPARC

Download the QLogic CLI utilities for Solaris SPARC from
http://download.qlogic.com/drivers/36071/scli-1.06.16-38.SPARC-X86.Solaris.pkg.Z

$ uncompress scli-1.06.16-38.SPARC-X86.Solaris.pkg.Z
$ pkgadd -d ./scli-1.06.16-38.SPARC-X86.Solaris.pkg
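
To confirm the package registered cleanly, a quick check can help (the exact package name varies between scli releases, so the grep pattern below is only a guess):

$ pkginfo | grep -i ql
$ which scli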

Second Task : Connect the storage JBOD/Array via FC

Third Task : Find the instance of qlogic card connected to the desired storage.

$ cat /etc/path_to_inst | grep ql

Here I found that the instance is zero.
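
For reference, the matching line looks something like the one below; the device path is borrowed from the format output further down, and the driver name may show up as qla2300 or qlc depending on which driver stack is installed, so treat it purely as an illustration. The second field is the instance number:

"/pci@1f,4000/fibre-channel@2" 0 "qla2300"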

Fourth Task : Checking and Configuring

$ scli -t 0

-----------------------------------------------------------------------------
HBA Port 0 - QLA2340 OS: 0 Port Name: 21-00-00-E0-8B-1C-A9-9A Port Id: 00-00-E8
-----------------------------------------------------------------------------
Path : 0
Target : 0
Device ID : 0x00
Port ID : 00-00-EF
Product Vendor : NEXSAN
Product ID : ATAbeastF
Product Revision : 8239
Node Name : 20-03-00-04-02-E8-0C-FE
Port Name : 50-00-40-20-03-E8-0C-FE
Product Type : Disk
Number of LUN(s) : 5
Target OS Name :
Status : Online
--------------------------------------------------------------------------------
Now run the following command

$ scli -p 0 20-03-00-04-02-E8-0C-FE 50-00-40-20-03-E8-0C-FE 00-00-EF 0
Info: Running dynamic update, please wait...

Configuration saved on HBA port 0. Changes have been saved to persistent storage.
i.e. the general form is:

$ scli -p <HBA instance> <target node name> <target port name> <target port ID> <target ID>

Check the binding

$ scli -p 0 view

-----------------------------------------------------------------------------
HBA Port 0 - QLA2340 OS: 0 Port Name: 21-00-00-E0-8B-1C-A9-9A Port Id: 00-00-E8
-----------------------------------------------------------------------------
Bind  Type     Device Node Name         Device Port Name         Port ID   ID
----  -------  -----------------------  -----------------------  --------  ---
Yes   Disk     20-03-00-04-02-E8-0C-FE  50-00-40-20-03-E8-0C-FE  00-00-EF  0

That's it, voila! Run echo | format to see the new LUNs.

26. c3t0d0 
/pci@1f,4000/fibre-channel@2/sd@0,0
27. c3t0d1
/pci@1f,4000/fibre-channel@2/sd@0,1
28. c3t0d3
/pci@1f,4000/fibre-channel@2/sd@0,3
29. c3t0d4
/pci@1f,4000/fibre-channel@2/sd@0,4
30. c3t0d25
/pci@1f,4000/fibre-channel@2/sd@0,19
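
If the new LUNs don't show up in format right away, rebuilding the device nodes usually does the trick (a hedged extra step, not always needed; older setups may need a reconfiguration reboot instead):

# devfsadm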

Post your queries as comments.

Sunday, January 08, 2006

Netapp Upgrade

Part 1 : Ontap + Adding disk shelves

This was the second ONTAP upgrade within a year. This time around we added two SATA disk shelves and upgraded to ONTAP 7.0.3 to get the benefits of FlexVol.

The ONTAP upgrade was pretty simple: we just downloaded the software for the FAS900 series from now.netapp.com and extracted it into a particular directory.

The root of the filer (vol0) was mounted on /mnt
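
Just to illustrate, from the Solaris admin host that looks something like this (the filer hostname here is made up):

# mount -F nfs filer1:/vol/vol0 /mnt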

./install_netapp -k /mnt

The upgrade was smooth, as simple as copying files.

We then connected a laptop to the filer's console. At the filer prompt we did the following.

Disabled autosupport, since this was a scheduled downtime
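
The option toggled was along these lines; remember to turn it back on once the maintenance window is over:

options autosupport.enable off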

Disabled the cluster with the following command

cf disable

The executables then need to be copied to the boot blocks of the filer:

download

A message appears confirming that the boot blocks were successfully copied to the disks.

We then fired the halt command, and the system came down to the ok prompt.

Fired the bye command to reboot
ok bye

We then performed the same upgrade on the other filer.

We then connected each new disk shelf to one controller. The shelves were not cross-connected at this point, since cf was disabled.

Checked whether the new set of disks could be detected by the filer using
sysconfig -a

Upgraded the disk firmware; this firmware is packaged with the new ONTAP release.

disk_fw_update

Enabled the cluster

cf enable

Finally, we cross-connected the new disk shelves to both filers.
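
Before calling it done, it's worth confirming that both heads see each other again; cf status reports whether takeover is possible:

cf status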

Part 2: Creating Aggregate volumes.

Aggregates are the new feature enabled by the ONTAP upgrade. We created a 14-disk aggregate with 13 disks. Sounds confusing, but I'll try to explain.

We purchased 2 x 14-disk shelves. One disk needs to be a spare as a best-practice policy (a good point to note is that spares aren't global), so effectively we had 13 disks. The aggregate options normally recommend 6, 14 or 28 disks, so we went ahead with 14.

This aggregate can be expanded to 28 on the fly.

Now we would be creating FlexVols out of the aggregate. Here each volume spans 13 disks, giving better performance. FlexVols can be resized on the fly; the caution to be exercised is that for a volume to be resized it should be around 20% free.
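
For illustration, the 7-mode commands involved look roughly like this; the aggregate and volume names and sizes are made up, so adjust for your environment:

aggr create aggr1 -r 14 13
vol create oravol1 aggr1 500g
vol size oravol1 +100g

The first command creates the aggregate with 13 disks and a RAID group size of 14, the second carves a FlexVol out of it, and the third grows that volume on the fly later.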

NetApp lived up to its name as a storage appliance: just 2 hours to complete the whole upgrade. No wonder it has orders of 20 petabytes.

Sunday, January 01, 2006

Veritas Layered Volumes

Layered volumes are VxVM objects built on top of other VxVM volumes. Their advantage is that they offer greater redundancy.

Examples are

1. Stripe Pro (RAID 1+0, also known as stripe-mirror)
2. Concat Pro

The difference is that Concat Pro is a concatenated layout over mirrored subvolumes, while Stripe Pro is a striped layout over mirrored subvolumes.

To convert from a non-layered to a layered volume, use the vxassist convert option.
Relayout works only between non-layered formats; for example, we can relayout from stripe to concat.

Use the -o ordered option to gain some control over the disk assignment during layered volume creation.

The following command will create a layered volume as shown below

vxassist -g oradg -o ordered make oravol 150g layout=stripe-mirror ncol=2 mydisk01 mydisk02 mydisk03 mydisk04

stripe (oravol, 2 columns)
  column 1: mydisk01-01  mirrored by  mydisk03-01
  column 2: mydisk02-01  mirrored by  mydisk04-01

Here's a classic case from the VxVM mailing list; I have modified it a bit for clarity and brevity.

Case : Converting Stripe volume to Stripe pro volume.

I need to set up a Stripe Pro (RAID 1+0) volume. I'm using 14 drives of a 22-drive Sun A5200 disk array (JBOD).
We have added a back plane with an equal number of disks to the array. The front plane already has 6 disks in a stripe volume, oravol.

I need to convert the oravol volume to a stripe pro for better redundancy.
FRONT:            |  BACK:
----------------     -----------------
c3t32 [oradb01]  <-->  c3t48 [oradb02]
c3t33 [oradb03]  <-->  c3t50 [oradb04]
c3t36 [oradb05]  <-->  c3t52 [oradb06]
c3t37 [oradb07]  <-->  c3t53 [oradb08]
c3t38 [oradb00]  <-->  c3t54 [oradb10]
c3t40 [oradb11]  <-->  c3t56 [oradb12]
c3t42 [oradb13]        c3t58 [oradb14]  ---> Spare
 
Any ideas on how to do this, with actual command-line recommendations?
 
Solution
Mirror all the disks on the other half, thus creating a mirrored stripe:
    --------------------------------------------------------------------------------------
    # vxassist -g oradg mirror oravol init=active \
    alloc=oradb02,oradb04,oradb06,oradb08,oradb10,oradb12

Use vxassist to convert layout from RAID 0+1 to 1+0
    -----------------------------------------------------------------
    # vxassist  -g oradg  convert oravol layout=stripe-mirror,nolog
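
To double-check the result, inspect the volume afterwards; the plexes should now show the layered stripe-mirror structure:

# vxprint -g oradg -ht oravol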

Here are some examples compiled with input from various searches on Google.

Case 1: A power failure in the storage array pulled some disks out of Volume Manager's control.

If the vxprint command shows that the plexes have a kernel state of NODEVICE, then the disks have gone offline as a result of the failure.

vxreattach is the command that helps here.

Solution :

First check whether vxreattach is possible.

vxreattach -c c#t#d#s#

Example: vxreattach -c c2t29d10s2

This will hopefully show what the disk name used to be. If so, then run the reattach in the background:

vxreattach -br c2t29d9s2

Find the plexes that are in the DISABLED RECOVER state.

Here we consider that the volume automation is disabled and that its plex automation-01 is in the DISABLED RECOVER state.

For all DISABLED RECOVER plexes, perform the commands:

# vxmend -o force off automation-01

# vxmend on automation-01

# vxmend fix clean automation-01

# vxvol start automation

fsck the volume before mounting.
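
Assuming the volume carries a VxFS file system (an assumption for illustration; use whatever fstype you actually have), the check looks like:

# fsck -F vxfs /dev/vx/rdsk/<diskgroup>/automation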

Case 2: 
From vxprint -ht:
v  bm-u1        -            ENABLED  ACTIVE   60817408 SELECT   -        fsgen
pl bm-u1-01     bm-u1        DISABLED TEMPRMSD 60817408 CONCAT   -        RW
sd raid60-v2-03 bm-u1-01     raid60-v2 49872896 60817408 0       fabric_5 ENA
pl bm-u1-02     bm-u1        ENABLED  STALE    60817408 CONCAT   -        WO
sd raid-82-vol1-11 bm-u1-02  raid-82-vol1 358612992 60817408 0   fabric_1 ENA
 
I want to remove this volume; it's not mounted anywhere, and the usual
commands say:
 
# vxvol stop bm-u1
vxvm:vxvol: ERROR: Volume bm-u1 in use by another utility
# vxtask list
TASKID  PTID TYPE/STATE    PCT   PROGRESS
# vxassist remove volume bm-u1
vxvm:vxassist: ERROR:  Volume bm-u1 is adding a mirror
 
How do I get rid of it?  (data is of no importance)
 
Solution:
1) vxmend -g disk_group -r clear all bm-u1
2) vxedit -r rm bm-u1
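
Afterwards, a quick vxprint should confirm that the volume and its plexes are gone:

# vxprint -g disk_group -ht
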
Case 3:
Recovering a RAID-5 volume after a single disk failure should happen automatically, but things may not always work right. Largely this is due to parity corruption. Here are the steps to check and recover in such a situation.
If you run /etc/vx/bin/vxr5check on a volume and it tells you that the parity is bad, here is a procedure that allows you to rebuild it.
1. # vxvol -g <diskgroup> stop <volume>
2. # vxmend -g <diskgroup> fix empty <volume>
3. # vxvol -g <diskgroup> start <volume>

Don't worry about the "fix empty" stage deleting data; it will not. This procedure is taken from Sun SRDB 12266.
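
Once the volume is back online, the parity can be verified again; the disk group and volume names below are placeholders, and the exact options may vary with your VxVM version, so check the man page:

# /etc/vx/bin/vxr5check -g <diskgroup> <volume>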