Solaris and ‘Growing’ / ‘Expanding’ your Zpool – With little down time …

<zfs nerd post>

At home (on this server) I have a Zpool consisting of 3 disks in a RAIDZ1 group. I needed more disk space, so one by one I changed my 3 x 500GB drives to 1TB drives. This would have been pretty much no down time (except for an export and import of the pool) if my hardware had hot-swappable drive bays – the cheap Dell SC430 doesn't do that unfortunately.

It took around 8 hours each for the first two disks – but the last one took over 20 hours! No idea why – my system was idle with just the resilvering taking place. If you have had experience with this, leave me a comment with how long the process took you.

So my pool ‘dump’ went from

dump                   921G    37K   112G     1%    /dump

to

dump                   1.8T    37K   1.1T     1%    /dump

If you ever want to do this, ZFS just sees the new size once all the disks in the group match and grows the pool. All you need to do is replace your disks one by one and let each one resilver. No mucking around moving data. The best part is your data is still available during the whole process – albeit slow to access, but if you need something you can get it. 🙂
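For the command-level view, here is a rough sketch of the per-disk cycle described above – the c1t*d0 device names are placeholders for your own, and on this vintage of ZFS an export/import is what makes the extra space appear (newer releases have an autoexpand property):

# After physically swapping in the bigger disk, tell ZFS to resilver onto it
zpool replace dump c1t1d0
zpool status dump          # watch the resilver and wait for it to finish
# ...repeat the swap + replace + wait for each remaining disk...

# Once every member of the RAIDZ1 group is the larger size,
# export and import so the pool picks up the extra capacity
zpool export dump
zpool import dump
zfs list dump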

</zfs nerd post>

Maxing out Gig Ethernet with Sun x4500 Thumpers

Well the time has come where I have finally got some hardware that can max out gig Ethernet. I sent 3.4 TB in 9 hours! That's awesome! GG Cisco 3750 and 2 x Sun x4500 Thumpers running OpenSolaris snv_105. Good times – I bet the copper was warm 🙂

root@dumper1:sbin> ./zfsMbufferInitialSend.sh
Starting ZFS send to dumper-tmp.ansto.gov.au @ Fri Mar  6 16:02:47 EST 2009
in @  0.0 kB/s, out @ 34.9 MB/s, 3428 GB total, buffer   0% full

summary: 3428 GByte in 8 h 57 min  109 MB/s

real    537m40.533s
user    34m32.281s
sys     326m18.227s
Completed @ Sat Mar  7 01:00:34 EST 2009

If you use x4500s or ever need to zfs send, compile mbuffer today! It rocks – I went from 30 MB/s with SSH to maxing out Gigabit Ethernet. I will post instructions on everything I did soon.
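Full instructions to come, but the basic shape is a snapshot streamed through mbuffer on both ends instead of ssh. A minimal sketch, assuming mbuffer is installed on both hosts – the pool/snapshot names, port and buffer sizes here are examples only:

# On the receiving thumper: listen on a port and feed the stream into zfs receive
mbuffer -I 9090 -s 128k -m 1G | zfs receive -F tank/backup

# On the sending thumper: snapshot, then stream it across with a big buffer
zfs snapshot -r tank@initial
zfs send -R tank@initial | mbuffer -s 128k -m 1G -O dumper-tmp.ansto.gov.au:9090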

Solaris 10 + Jumbo Frames + Link Aggregation with Cisco 3750 Switch + NFS Exporting / Mounting

Do you want to get the most out of your x4500's network throughput with NFS? Read on if you do.

So, at work I am lucky enough to get to play with 3 Sun x4500 x86_64 Thumper systems. You may be sitting there saying big deal; I say it's a lot of disk and sweet, sexy Sun hardware.

The Sun x4500 Thumper

I have posted this due to the hard time I had trying to find information on linking the network interfaces and using jumbo frames to maximise the network throughput from your x4500.

I have an x4500 using jumbo frames with two Gig interfaces (e1000g0 and e1000g1), running Solaris 10u6 with an rpool and a big fat data pool I call cesspool. I have shares exported via NFS. Below I will detail my conf and what I have found to be the best performing NFS mounting options from clients.

I did try to do this when the x4500 was on 10u5, but had difficulties – hosts that were not on the same switch as the device were having speed issues with NFS. I contacted Sun and got some things to try, and below is the end conf I have found to work best. Please let me know if you have found better results or success with different configurations. Please note, I am now running Solaris 10u6, and apparently there was a bug with 10u5 and the e1000g driver.

1) Enabling Jumbo Frames

Host (Solaris) Config:

On Solaris two things must be done to enable jumbo frames. Please ensure the switch is configured before enabling the host:

HOSTNAME=god
INTERFACE=e1000g0
SIZE=9000

  1. Enable it on the driver – e.g. for e1000g, edit /kernel/drv/e1000g.conf
    • A reboot will be required if it was not already enabled
  2. Enable jumbo frames with ifconfig
    • From the CLI = ifconfig ${INTERFACE} mtu ${SIZE}
    • At boot = make /etc/hostname.${INTERFACE} contain:
    • ${HOSTNAME} mtu ${SIZE}

    – This has been tested on both Solaris 10u6 and OpenSolaris 2008.11
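Putting the two host-side steps together, a minimal sketch – the MaxFrameSize values for e1000g.conf are from memory, so check the comments in your own e1000g.conf before copying:

INTERFACE=e1000g0
SIZE=9000
HOSTNAME=god

# 1. Driver: edit /kernel/drv/e1000g.conf and raise MaxFrameSize for each
#    instance (0 = standard 1500 byte frames; higher values allow jumbo frames), e.g.:
#    MaxFrameSize=3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3;
#    then reboot if jumbo frames were not already enabled.

# 2. Set the MTU on the running interface
ifconfig ${INTERFACE} mtu ${SIZE}

# 3. Persist it across boots
echo "${HOSTNAME} mtu ${SIZE}" > /etc/hostname.${INTERFACE}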

Switch Config:

system mtu jumbo 9000 (this gets hidden in the IOS defaults)
system mtu routing 1500 (this line is auto-inserted by IOS)

Show system mtu Output:
System MTU size is 1500 bytes
System Jumbo MTU size is 9000 bytes
Routing MTU size is 1500 bytes

Remember to copy run start once happy with config 🙂
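A quick way to prove jumbo frames work end to end is a large, unfragmentable ping from one of the Linux clients (this assumes Linux iputils ping; 8972 = 9000 minus 28 bytes of IP + ICMP headers):

ping -M do -s 8972 god.cooperlees.com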

2) Enabling Aggregated Interfaces

Host (Solaris) Config:

I wrote a script to apply the config. This script assumes you already have /etc/defaultrouter, /etc/netmasks, /etc/resolv.conf and /etc/nsswitch.conf all set correctly.

Here is the script I used to apply the conf:

#!/usr/bin/bash

# Create Link aggr on plumper
# Ether Channel on Switch Ports 2 on each 3750 – 20081223

# Do I want these ?
# -l = LACP mode – active, passive or disabled
# -T time – LACP timer …

ifconfig e1000g0 unplumb
ifconfig e1000g1 unplumb

# Sun’s Suggestion
dladm create-aggr -P L4 -l active -d e1000g0 -d e1000g1 1

# Move hostname file
mv /etc/hostname.e1000g0 /etc/hostname.aggr1

# Check Link
dladm show-aggr 1

# Set device IP # Can set MTU here if jumbo enabled
ifconfig aggr1 plumb x.x.x.x up

# Show me devs / links so I can watch
dladm show-dev -s -i 2
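A couple of checks I would run once the script has finished (hedged – flag support can vary slightly between Solaris 10 updates):

# LACP state of each port in aggregation 1
dladm show-aggr -L 1

# Confirm the IP came up on aggr1 (and the MTU, if jumbo frames are enabled)
ifconfig aggr1

# If using jumbo frames, /etc/hostname.aggr1 can carry the MTU too, e.g.:
#   god mtu 9000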

Switch Config:

# = Insert Integer

Configure a Port Group:

  • interface Port-channel#
    • switchport access vlan #
    • switchport mode access
  • exit
  • port-channel load-balance src-dst-ip

Please configure the ports you want in the channel (4 max) as follows:

# = Insert Integer

  • config term
    • interface INTERFACE
      • channel-group # mode passive
      • channel-protocol lacp
      • switchport access vlan #
      • switchport mode access
      • exit
    • end
  • show run (to verify)

Remember to copy run start once happy with config 🙂
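To confirm the channel came up on the switch side (exact output varies by IOS version, so I won't paste mine):

  • show etherchannel summary (member ports should show as bundled)
  • show lacp neighbor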

3) NFS Sharing with ZFS

This was another silly little mistake I was making: I was turning sharenfs=on for the ZFS file systems I wished to share, and then trying to apply the share properties using the share command and adding entries to the sharetab manually. With ZFS though, all your NFS options should be applied via the sharenfs attribute on the ZFS filesystem, as in the following example (using the cesspool/home filesystem mentioned below):

  • zfs set sharenfs=ro,rw=god.cooperlees.com,root=god.cooperlees.com cesspool/home

These arguments get passed to ‘share’ via ZFS at boot time.
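A quick way to sanity check what ZFS ended up sharing (using the cesspool/home filesystem from the example above):

zfs get sharenfs cesspool/home     # the option string ZFS hands to share
share                              # what is currently exported (from /etc/dfs/sharetab)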

4) NFS Mount Options

Most of my clients (that I have tuned) are Linux boxes running Scientific Linux 5.2 (a Red Hat derivative, similar to CentOS). I have found that once jumbo frames and aggregated interfaces are involved, TCP performs better. By default, TCP is used on modern Linux NFS clients, but on older Linux, IRIX etc. UDP is the default, which will not work once you try to move a large amount of data if the host has a different MTU to the file server. (With old OSes like this running, you can tell I work at a scientific research facility.) Here are some examples of my mount options in /etc/fstab on these boxes:

Modern Linux Machines (CentOS 5, Scientific Linux 5):
god.cooperlees.com:/cesspool/home      /home   nfs     defaults,bg,intr,hard,noacl     0 0

Old Linux Machines (Red Hat 7 etc.):
god.cooperlees.com:/cesspool/home /home          nfs     defaults,bg,intr,hard,tcp 0 0
-No noacl option here (not supported this far back), and since UDP would otherwise be the default, tcp is set explicitly

IRIX 6.5 (yuck – MIPS):
god.cooperlees.com:/cesspool/home /home nfs defaults,rw,sync,proto=tcp
-No noacl here either, and once again UDP would otherwise be the default, hence proto=tcp …
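On a Linux client you can confirm what was actually negotiated (protocol, NFS version, rsize/wsize) once the share is mounted – a quick check, assuming a reasonably modern nfs-utils:

mount /home            # mount via the fstab entry
nfsstat -m             # per-mount options actually in effect
grep nfs /proc/mounts  # the raw option string, if nfsstat -m is unavailable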