Using RHEL6 to share RAID volume via iSCSI: the Mystery of the Missing LUN

My use case was pretty simple. I wanted to share out a raw device via iSCSI to a nearby host on the 172.16.2.x network.

In addition to a minimal Red Hat Enterprise Linux 6 (or equivalent) install, a few packages are needed:

# yum install -y iscsi-initiator-utils scsi-target-utils sg3_utils lsof

I knew the device I wanted to share was /dev/sdb by looking at the output from dmesg:

# dmesg | grep sd
...
sd 0:0:1:0: [sdb] 7812456448 512-byte logical blocks: (3.99 TB/3.63 TiB)
...

The target definition in /etc/tgt/targets.conf was simple as well. Isn't everything simple after you've beaten your head against a wall for hours trying to get it working? The following configuration defines one target with one LUN, shared as a raw device, that only the initiator at IP address 172.16.2.2 may connect to:
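A minimal sketch of such a definition, pieced together from the target name, backing store path, and ACL that appear in the tgtadm output further down. The direct-store directive is an assumption; backing-store is the other common choice for a block device:

```
# /etc/tgt/targets.conf (sketch; direct-store is assumed, not confirmed)
<target iqn.2012-04.com.example:sharename>
    # Export the whole disk as LUN 1
    direct-store /dev/sdb
    # Only this initiator may log in
    initiator-address 172.16.2.2
</target>
```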


Starting up the tgtd service and turning it on permanently led to wonderment and success:

# service tgtd start
# chkconfig tgtd on

Depending on your configuration, you may need to open port 3260 in your firewall.
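On a stock RHEL 6 firewall, one way to do that is a rule in /etc/sysconfig/iptables, placed ahead of the final REJECT line. This is only a sketch: it assumes the default iptables layout and limits access to the 172.16.2.2 initiator used in this example.

```
# /etc/sysconfig/iptables (fragment; assumes the default RHEL 6 rule set)
# Allow iSCSI (TCP 3260) from the one initiator that should see the target
-A INPUT -m state --state NEW -m tcp -p tcp -s 172.16.2.2 --dport 3260 -j ACCEPT
```

Running service iptables restart afterwards loads the new rule.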

However, after a reboot, only the controller showed up (as LUN 0). LUN 1 had disappeared!

# tgtadm --lld iscsi --op show --mode target
Target 1: iqn.2012-04.com.example:sharename
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET     00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: null
            Backing store path: None
            Backing store flags:
    Account information:
    ACL information:
        172.16.2.2

Why is LUN 1 not showing up? Telling tgt-admin to reparse targets.conf in verbose mode reveals the reason:

# tgt-admin --update ALL -v
# Removing target: iqn.2012-04.com.example:sharename
tgtadm -C 0 --mode target --op delete --tid=1

# Adding target: iqn.2012-04.com.example:sharename
tgtadm -C 0 --lld iscsi --op new --mode target --tid 1 -T iqn.2012-04.com.example:sharename
# Device /dev/sdb is used by the system (mounted, used by swap?).
# Skipping device /dev/sdb - it is in use.
# You can override it with --force or 'allow-in-use yes' config option.
# Note - do so only if you know what you're doing, you may damage your data.

What? I assure you that /dev/sdb is NOT in use by the system. Show me mounts:

# mount

Nope, /dev/sdb does not appear anywhere in the output. Show me a list of open files on /dev/sdb:

# lsof | grep sdb

Nothing. Show me active swap:

# swapon -s

Nothing! Finally, choirboy in the #rhel IRC channel pointed me to the answer: you must create a filter in /etc/lvm/lvm.conf so that LVM leaves the device alone. Here is the relevant section of lvm.conf, showing the new filter:

    # By default we accept every block device:
    #filter = [ "a/.*/" ]
    # Every block device except /dev/sdb, that is.
    filter = [ "r|/dev/sdb|" ]
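
The mount, lsof, and swapon checks above do not cover devices stacked on top of the disk by device-mapper or md. The following is a rough sketch of a broader check. The device_claims function and the PROC/SYSFS variables are hypothetical names made up for this example, and the holders directory only shows something if a mapped device is actually active on the disk:

```shell
#!/bin/sh
# Hypothetical helper (not part of tgt or LVM): list what claims a disk.
# PROC and SYSFS are overridable so the function can be tried on copies.
PROC=${PROC:-/proc}
SYSFS=${SYSFS:-/sys}

device_claims() {
    dev=$1
    # Mounted filesystems and active swap on the disk or its partitions.
    grep -s "^/dev/$dev" "$PROC/mounts" "$PROC/swaps" || true
    # device-mapper (LVM, multipath) and md devices stacked on the disk
    # or one of its partitions appear as "holders" in sysfs.
    for h in "$SYSFS/block/$dev"/holders/* "$SYSFS/block/$dev/$dev"*/holders/*; do
        if [ -e "$h" ]; then
            echo "held by: ${h##*/}"
        fi
    done
}

device_claims sdb
```

No output means none of these checks found a claim. A "held by: dm-N" line points at a device-mapper device; dmsetup ls lists the mapped names with their minor numbers, which should correspond to N.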

After a restart, LUN 1 persists and the world is once again a happy place to be:

# tgtadm --lld iscsi --op show --mode target
Target 1: iqn.2012-04.com.example:sharename
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET     00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: null
            Backing store path: None
            Backing store flags:
        LUN: 1
            Type: disk
            SCSI ID: IET     00010001
            SCSI SN: 9206CBBG71194900CF07
            Size: 3999978 MB, Block size: 512
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: rdwr
            Backing store path: /dev/sdb
            Backing store flags:
    Account information:
    ACL information:
        172.16.2.2

Edit: multipathd may also be the culprit. Take a look at the world-wide identifier (WWID) of /dev/sdb:

# scsi_id -g -u /dev/sdb
36a4badb051c18d0018e462537ac943d6

Is that identifier also listed under /dev/mapper?
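
One way to answer that question from a script is sketched below. The wwid_claimed function and the MAPPER_DIR/WWIDS_FILE variables are hypothetical names for this example. The sketch assumes that /etc/multipath/wwids stores each WWID between slashes, and that maps are named after the WWID (with user-friendly names enabled the map would be called mpathX instead, so only the wwids file check would match):

```shell
#!/bin/sh
# Hypothetical helper: does multipath know about this WWID?
# MAPPER_DIR and WWIDS_FILE default to the paths named in the text.
MAPPER_DIR=${MAPPER_DIR:-/dev/mapper}
WWIDS_FILE=${WWIDS_FILE:-/etc/multipath/wwids}

wwid_claimed() {
    wwid=$1
    found=1
    # A map named after the WWID shows up as an entry in /dev/mapper.
    if ls "$MAPPER_DIR" 2>/dev/null | grep -qF "$wwid"; then
        echo "map present in $MAPPER_DIR"
        found=0
    fi
    # Assumed file format: one WWID per line, wrapped in slashes.
    if grep -qsF "/$wwid/" "$WWIDS_FILE"; then
        echo "listed in $WWIDS_FILE"
        found=0
    fi
    return $found
}

# Example with the WWID shown above:
wwid_claimed 36a4badb051c18d0018e462537ac943d6 || echo "not claimed by multipath"
```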

If so, add

find_multipaths         yes

to the defaults section of /etc/multipath.conf, remove the matching line from /etc/multipath/wwids, and restart the multipath daemon:

# service multipathd restart
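
For reference, the defaults section of /etc/multipath.conf might then look like the minimal sketch below. Any other settings already present in that section would stay alongside the new line:

```
# /etc/multipath.conf (fragment)
defaults {
    # Only build multipath maps for devices that really have several paths
    find_multipaths         yes
}
```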

Comments

Any idea on how to share a Raw software RAID device (md0)? I keep getting this error "Config::General: EndBlock "</target>" has no StartBlock statement (level: 1, chunk 2)!"

10x, useful ;) but... what if the next time after a reboot /dev/sdb appears as something else, e.g. /dev/sdX?

Create a udev rule?

I use UDEV rules based on drive model and serial number:

1. Find drive ID_MODEL and ID_SCSI_SERIAL using 'udevadm info --query=property --name=/dev/sdX'

2. Create and edit file /etc/udev/rules.d/70-persistent-drive.rules

3. Add a UDEV rule like:
SUBSYSTEM=="block", ENV{ID_MODEL}=="C300-CTFDDAC128M", ENV{ID_SCSI_SERIAL}=="00000000123456789", SYMLINK+="rdX%n"

Note that I create a new drive designator that is a symlink to the actual /dev/sdX drive, as renaming the actual sdX seems to cause all sorts of problems with sysfs. There's probably a way of doing it safely, but the above works for me, and I just use the fixed rdX symlinks instead of the mobile sdX devices.

Hope that helps.

My dear friend you saved my life. I cannot thank you enough. God bless you :)

Really helpful, thanks!

In my case I found that hddtemp had the drive locked!

Thank you so much for the helpful and well-written explanation. I suspected the trouble I was having with iSCSI was somehow due to LVM, but your post made clear why the problem was happening and how to resolve the issue. I wish I had found your post an hour sooner.