iSCSI

iSCSI on RHEL6: Targets, Initiators and SANs

Here are the slides from the presentation I gave in the 2013 Spring Linux Presentation Series at Iowa State University.

iSCSI on RHEL6: Targets, Initiators and SANs (1.5MB PDF)


This adapter takes the guesswork out of iSCSI configuration. Just plug it into your USB port! Apologies to Brian Campbell.

Solved: iSCSI disconnects and timeouts after successful login

Consider the following from /var/log/messages:

iscsid: Connection16:0 to [target: iqn.2012-12.com.example:fooportal, portal: 198.51.100.3,3260] through [iface: default] is operational now
kernel: sde:
kernel: connection16:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4379432620, last ping 4379437620, now 4379442620
kernel: connection16:0: detected conn error (1011)
iscsid: Kernel reported iSCSI connection 16:0 error (1011 - ISCSI_ERR_CONN_FAILED: iSCSI connection failed) state (3)
iscsid: connection16:0 is operational after recovery (1 attempts)
kernel: connection16:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4379445875, last ping 4379450875, now 4379455875
kernel: connection16:0: detected conn error (1011)
iscsid: Kernel reported iSCSI connection 16:0 error (1011 - ISCSI_ERR_CONN_FAILED: iSCSI connection failed) state (3)

As you can see, login to the iSCSI target succeeded. Shortly thereafter, though, the client became unhappy: the 5-second ping timeout expired, the connection failed with error 1011, recovered, and then failed again, over and over.
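
To get a feel for how often a session is flapping, one could simply count the error lines in the log. This is a minimal sketch: count_conn_errors is a hypothetical helper name, and the default path assumes the stock RHEL6 syslog location.

```shell
#!/bin/sh
# count_conn_errors: hypothetical helper, not part of any iSCSI tool.
# Prints how many "detected conn error (1011)" lines a log file contains.
# $1 = log file (assumed default: /var/log/messages)
count_conn_errors() {
    logfile="${1:-/var/log/messages}"
    # -c counts matching lines, -F treats the pattern as a fixed string
    grep -cF 'detected conn error (1011)' "$logfile"
}
```

Running count_conn_errors a few minutes apart shows whether the number is still climbing.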

In my case, the problem ended up being jumbo frames. I diagnosed it with a Wireshark capture of the bonded interface on the client, which showed the message

scsi transfer limited due to allocation_length too small

along with TCP retransmissions such as

[TCP Retransmission] SCSI: Read(10) LUN: 0x01 (LBA: 0x00000000, Len: 8)

Turning off jumbo frames in the bonded interfaces on both ends of the connection solved the problem.
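
A quicker check that does not need a packet capture is to send pings that are not allowed to fragment, sized to fill a jumbo frame. The sketch below assumes a 9000-byte MTU and the Linux iputils ping (where -M do sets the don't-fragment bit); both function names are hypothetical.

```shell
#!/bin/sh
# icmp_payload_for_mtu: hypothetical helper. Largest ICMP echo payload that
# fits in one IPv4 packet of the given MTU (20-byte IP header + 8-byte ICMP header).
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# check_path_mtu: hypothetical helper. Pings HOST with fragmentation forbidden,
# so it only succeeds if packets of the given MTU pass end to end.
# $1 = host, $2 = MTU (assumed default: 9000)
check_path_mtu() {
    host="$1"
    mtu="${2:-9000}"
    ping -M do -c 3 -s "$(icmp_payload_for_mtu "$mtu")" "$host"
}
```

If check_path_mtu 198.51.100.3 1500 gets replies but check_path_mtu 198.51.100.3 9000 does not, something between the two hosts is not passing jumbo frames.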

The root cause was that the switch between the two hosts (an HP ProCurve 2910al) did not have jumbo frames enabled on the default VLAN, so frames larger than the standard 1500-byte MTU never made it through:

# show vlan

Status and Counters - VLAN Information

  Maximum VLANs to support : 256                 

  VLAN ID Name                             | Status     Voice Jumbo
  ------- -------------------------------- + ---------- ----- -----
  1       DEFAULT_VLAN                     | Port-based No    No  

The ultimate solution was to create a separate VLAN on the switch and enable jumbo frames on the new VLAN. After that, everything worked swimmingly.

Using RHEL6 to share RAID volume via iSCSI: the Mystery of the Missing LUN

My use case was pretty simple. I wanted to share out a raw device via iSCSI to a nearby host on the 172.16.2.x network.

In addition to a minimal Red Hat Enterprise Linux 6 (or equivalent) install, a few packages are needed:

# yum install -y iscsi-initiator-utils scsi-target-utils sg3_utils lsof

I knew the device I wanted to share was /dev/sdb by looking at the output from dmesg:

# dmesg | grep sd
...
sd 0:0:1:0: [sdb] 7812456448 512-byte logical blocks: (3.99 TB/3.63 TiB)
...

The target definition in /etc/tgt/targets.conf was simple as well. Isn't everything simple after you've beaten your head against a wall for hours trying to get it working? The following configuration defines one target with one LUN shared as a raw device, which only the initiator at IP address 172.16.2.2 may connect to:
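
A stanza along the following lines matches that description and the tgtadm output shown further down. Treat it as a sketch rather than a verbatim copy: the direct-store directive is an assumption (backing-store would also export a raw device), and write_target_stanza is only a hypothetical helper for printing the text.

```shell
#!/bin/sh
# write_target_stanza: hypothetical helper that prints a targets.conf stanza.
# Assumption: direct-store is the directive in use. The IQN, device, and
# initiator address are the ones that appear in the text.
write_target_stanza() {
    cat <<'EOF'
<target iqn.2012-04.com.example:sharename>
    direct-store /dev/sdb
    initiator-address 172.16.2.2
</target>
EOF
}
```

The output could be appended to /etc/tgt/targets.conf with write_target_stanza >> /etc/tgt/targets.conf.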


Starting up the tgtd service and turning it on permanently led to wonderment and success:

# service tgtd start
# chkconfig tgtd on

Depending on your configuration, you may need to open port 3260 in your firewall.
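
On a stock RHEL6 box that means an iptables rule. Here is a minimal sketch that admits only the one initiator; open_iscsi_port and the DRY_RUN switch are hypothetical names, and it assumes the default iptables service rather than some other firewall manager.

```shell
#!/bin/sh
# open_iscsi_port: hypothetical helper. Lets one initiator reach tgtd on
# TCP port 3260 and saves the rule so it survives a reboot.
# Set DRY_RUN=1 to print the commands instead of running them.
# $1 = initiator IP address
open_iscsi_port() {
    initiator="$1"
    run=""
    [ "${DRY_RUN:-0}" = "1" ] && run="echo"
    $run iptables -I INPUT -p tcp -s "$initiator" --dport 3260 -j ACCEPT
    $run service iptables save
}
```

For the setup described here that would be open_iscsi_port 172.16.2.2, run as root.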

However, after a reboot, only the controller showed up (as LUN 0). LUN 1 had disappeared!

# tgtadm --lld iscsi --op show --mode target
Target 1: iqn.2012-04.com.example:sharename
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET     00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: null
            Backing store path: None
            Backing store flags:
    Account information:
    ACL information:
        172.16.2.2

Why is LUN 1 not showing up? Telling tgt-admin to reparse targets.conf in verbose mode leads to the reason:

# tgt-admin --update ALL -v
# Removing target: iqn.2012-04.com.example:sharename
tgtadm -C 0 --mode target --op delete --tid=1

# Adding target: iqn.2012-04.com.example:sharename
tgtadm -C 0 --lld iscsi --op new --mode target --tid 1 -T iqn.2012-04.com.example:sharename
# Device /dev/sdb is used by the system (mounted, used by swap?).
# Skipping device /dev/sdb - it is in use.
# You can override it with --force or 'allow-in-use yes' config option.
# Note - do so only if you know what you're doing, you may damage your data.

What? I assure you that /dev/sdb is NOT in use by the system. Show me mounts:

# mount

Nope, /dev/sdb does not appear anywhere in the output. Show me a list of open files on /dev/sdb:

# lsof | grep sdb

Nothing. Show me active swap:

# swapon -s

Nothing! Finally, choirboy in the #rhel IRC channel pointed me to the answer: you must create a filter in /etc/lvm/lvm.conf so that LVM leaves the device alone. Here is the relevant section of lvm.conf, showing the new filter:

    # By default we accept every block device:
    #filter = [ "a/.*/" ]
    # Every block device except /dev/sdb, that is.
    filter = [ "r|/dev/sdb|" ]
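
One way to see that the kernel has something stacked on the disk, even when mount, lsof and swapon all come up empty, is to look at its holders directory in sysfs. Device-mapper devices built on top of the disk (which is what activated LVM volumes are) show up there. This is a sketch: list_holders is a hypothetical helper, and its second argument exists only so it can be pointed at a test directory.

```shell
#!/bin/sh
# list_holders: hypothetical helper. Prints the kernel devices (for example
# dm-0) that sysfs reports as stacked on top of a whole disk.
# $1 = disk name without /dev/ (e.g. sdb), $2 = sysfs root (default: /sys)
list_holders() {
    dir="${2:-/sys}/block/$1/holders"
    # No such disk in sysfs: report failure instead of printing an ls error
    [ -d "$dir" ] || return 1
    ls "$dir"
}
```

If list_holders sdb prints any dm-N names, dmsetup ls will show which mapped devices they are. Note that holders of a partition live one level down (under the partition's own directory), so this only covers the whole disk.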

After a restart, LUN 1 persists and the world is once again a happy place to be:

# tgtadm --lld iscsi --op show --mode target
Target 1: iqn.2012-04.com.example:sharename
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET     00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: null
            Backing store path: None
            Backing store flags:
        LUN: 1
            Type: disk
            SCSI ID: IET     00010001
            SCSI SN: 9206CBBG71194900CF07
            Size: 3999978 MB, Block size: 512
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: rdwr
            Backing store path: /dev/sdb
            Backing store flags:
    Account information:
    ACL information:
        172.16.2.2

Edit: multipathd may also be the culprit. Take a look at the world-wide identifier (WWID) of /dev/sdb:

# scsi_id -g -u /dev/sdb
36a4badb051c18d0018e462537ac943d6

Is that identifier also listed under /dev/mapper?

If so, add

find_multipaths         yes

to the defaults section of /etc/multipath.conf, remove the matching line from /etc/multipath/wwids, and restart the multipath daemon:

# service multipathd restart
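
To check whether the disk's WWID is recorded in /etc/multipath/wwids before editing the file by hand, something like the following would do. It is a sketch: wwid_is_listed is a hypothetical helper, and it assumes the usual file layout of one /WWID/ entry per line.

```shell
#!/bin/sh
# wwid_is_listed: hypothetical helper. Succeeds (exit 0) if the WWID appears
# in the multipath wwids file, where entries are written as /WWID/.
# $1 = WWID, $2 = wwids file (default: /etc/multipath/wwids)
wwid_is_listed() {
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF "/$1/" "${2:-/etc/multipath/wwids}"
}
```

For example: wwid_is_listed "$(scsi_id -g -u /dev/sdb)" && echo "multipath knows about this disk".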