https://bugzilla.redhat.com/show_bug.cgi?id=1416129
Bug ID: 1416129
Summary: can't run iscsid in a docker container
Product: Red Hat Enterprise Linux 7
Version: 7.3
Component: docker
Keywords: Reopened
Severity: medium
Assignee: dwalsh@redhat.com
Reporter: jhunsaker@redhat.com
QA Contact: atomic-bugs@redhat.com
CC: abeausol@redhat.com, admiller@redhat.com, amurdaca@redhat.com, berthiaume_wayne@emc.com, borgan@redhat.com, cleech@redhat.com, david.barnhill@emc.com, dwalsh@redhat.com, fsimonce@redhat.com, golang-updates@lists.fedoraproject.org, jkeck@redhat.com, jsafrane@redhat.com, jslagle@redhat.com, lsm5@redhat.com, mattdm@redhat.com, mgoldman@redhat.com, pedro@aguatechnologies.com, rvykydal@redhat.com, s@shk.io, vaibhav.khanduja@emc.com, vbatts@redhat.com
Depends On: 1100000
Blocks: 1102911
+++ This bug was initially created as a clone of Bug #1100000 +++
Description of problem: Can't start iscsid in a docker container
Version-Release number of selected component (if applicable):
# rpm -q docker-io
docker-io-0.11.1-3.fc20.x86_64
How reproducible: always
Steps to Reproduce:
1. use the published fedora image, docker pull fedora
2. start the container, docker run -t -i fedora /bin/bash
3. install iscsi-initiator-utils
4. try to start iscsid:
bash-4.2# iscsid -f
iscsid: can not bind NETLINK_ISCSI socket
strace also attached
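The bind failure reported above can be probed without iscsid itself. The following is a minimal sketch and not iscsid's actual code. It assumes protocol number 8 for NETLINK_ISCSI (the value defined in linux/netlink.h) and assumes that the daemon binds with multicast group mask 1; `probe_netlink_iscsi` and `ISCSI_GROUP_MASK` are names made up for illustration.

```python
import socket

NETLINK_ISCSI = 8      # protocol number from <linux/netlink.h>
ISCSI_GROUP_MASK = 1   # assumption: the group mask iscsid binds with


def probe_netlink_iscsi(groups=ISCSI_GROUP_MASK):
    """Try to create and bind a NETLINK_ISCSI socket.

    Returns (ok, detail) instead of raising, so the probe can be run
    in any container without crashing.
    """
    if not hasattr(socket, "AF_NETLINK"):
        return False, "AF_NETLINK is not available on this platform"
    try:
        sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, NETLINK_ISCSI)
    except OSError as exc:
        return False, f"socket() failed: {exc}"
    try:
        # Netlink addresses are (pid, groups); pid 0 lets the kernel choose.
        sock.bind((0, groups))
    except OSError as exc:
        return False, f"bind() failed: {exc}"
    finally:
        sock.close()
    return True, "bound NETLINK_ISCSI socket"
```

Running `probe_netlink_iscsi()` in both an unprivileged and a privileged container may help show whether the bind step is what gets refused. A successful bind alone does not mean that a login will work.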
--- Additional comment from James Slagle on 2014-05-21 14:45:57 EDT ---
To give a little more context into what I'm doing, I'm trying to run OpenStack nova compute configured to use the nova-baremetal driver inside a container.
when nova-baremetal provisions a machine it acts as an iscsi initiator and logs into a target that has been created on the machine that is being provisioned. It then dd's the requested image onto the disk.
therefore, aiui, iscsid must be running inside the container where you are also running iscsiadm.
This same thing has also been tried in lxc, with what I expect is the same issue: https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1226855
--- Additional comment from Daniel Walsh on 2014-05-28 12:29:35 EDT ---
Was SELinux involved? If you put the machine into permissive mode does it work? Or try this as a privileged image. Might be something that we are doing to lock the system down.
--- Additional comment from Daniel Walsh on 2014-05-28 12:33:18 EDT ---
What are the permissions on /opt/hello?
ls -ld /opt
ls -ld /opt/hello
--- Additional comment from James Slagle on 2014-05-28 15:01:53 EDT ---
SELinux is already in permissive mode on the Docker host.
I did try in a privileged container, and I get something slightly different. iscsid -f just hangs forever on the command line.
An strace shows (attached) it polling forever on a fd, i had to ctrl-c it in both cases.
--- Additional comment from James Slagle on 2014-05-28 15:02:26 EDT ---
--- Additional comment from James Slagle on 2014-05-28 15:03:16 EDT ---
i think comment 3 was for another bug maybe? anyway, if not:
bash-4.2# ls -ld /opt
drwxr-xr-x. 2 root root 4096 Aug 7 2013 /opt
bash-4.2# ls -ld /opt/hello
ls: cannot access /opt/hello: No such file or directory
--- Additional comment from James Slagle on 2014-05-28 15:08:58 EDT ---
ah....actually, i suspect iscsid running forever in the foreground may indicate it *is* working. sorry, i wasn't thinking that -f was telling it to run in the foreground.
i will see if i can actually connect to a target from the privileged container and report back
--- Additional comment from James Slagle on 2014-05-29 09:34:32 EDT ---
I'm using a privileged container running sshd as the process (so that I can login with a couple of different shells).
I had to add this to my Dockerfile for the container, otherwise iscsid won't start:
VOLUME ["/var/lock/iscsi"]
i start the container with:
docker run --privileged -ti --name initiator -p 8022:22 -d iscsi-initiator
then ssh in, and i start iscsid:
iscsid -d 8 -f
that appears to start fine
then ssh in another session, and i can discover the target (target is actually on the container host):
[root@bcf697cb8673 ~]# iscsiadm -m discovery -t st -p 192.168.122.1
192.168.122.1:3260,1 iqn.2013-07.com.example.storage.ssd1
but when i try to login to the target, iscsid exits (or crashes, hard to tell):
[root@bcf697cb8673 ~]# iscsiadm -m node --targetname iqn.2013-07.com.example.storage.ssd1 --portal 192.168.122.1 --login
Logging in to [iface: default, target: iqn.2013-07.com.example.storage.ssd1, portal: 192.168.122.1,3260] (multiple)
iscsiadm: got read error (0/0), daemon died?
iscsiadm: Could not login to [iface: default, target: iqn.2013-07.com.example.storage.ssd1, portal: 192.168.122.1,3260].
iscsiadm: initiator reported error (18 - could not communicate to iscsid)
iscsiadm: Could not log into all portals
from the ssh session where iscsid is running i see:
iscsid: mgmt_ipc_write_rsp: rsp to fd 5
iscsid: poll result 1
iscsid: mgmt_ipc_write_rsp: rsp to fd 5
iscsid: poll result 1
iscsid: in read_transports
iscsid: Adding new transport tcp
iscsid: Matched transport tcp
iscsid: sysfs_attr_get_value: open '/class/iscsi_transport/tcp'/'handle'
iscsid: sysfs_attr_get_value: new uncached attribute '/sys/class/iscsi_transport/tcp/handle'
iscsid: sysfs_attr_get_value: add to cache '/sys/class/iscsi_transport/tcp/handle'
iscsid: sysfs_attr_get_value: cache '/sys/class/iscsi_transport/tcp/handle' with attribute value '18446744072107593760'
iscsid: sysfs_attr_get_value: open '/class/iscsi_transport/tcp'/'caps'
iscsid: sysfs_attr_get_value: new uncached attribute '/sys/class/iscsi_transport/tcp/caps'
iscsid: sysfs_attr_get_value: add to cache '/sys/class/iscsi_transport/tcp/caps'
iscsid: sysfs_attr_get_value: cache '/sys/class/iscsi_transport/tcp/caps' with attribute value '0x39'
iscsid: Allocted session 0x7f38ceb4f9b0
iscsid: no authentication configured...
iscsid: resolved 192.168.122.1 to 192.168.122.1
iscsid: setting iface default, dev , set ip , hw , transport tcp.
iscsid: get ev context 0x7f38ceb5c470
iscsid: set TCP recv window size to 524288, actually got 425984
iscsid: set TCP send window size to 524288, actually got 425984
iscsid: connecting to 192.168.122.1:3260
iscsid: sched conn context 0x7f38ceb5c470 event 2, tmo 0
iscsid: thread 0x7f38ceb5c470 schedule: delay 0 state 3
iscsid: Setting login timer 0x7f38ceb578e0 timeout 15
iscsid: thread 0x7f38ceb578e0 schedule: delay 60 state 3
iscsid: exec thread 7f38ceb5c470 callback
iscsid: put ev context 0x7f38ceb5c470
iscsid: connected local port 37259 to 192.168.122.1:3260
iscsid: in kcreate_session
iscsid: in __kipc_call
iscsid: in kwritev
iscsid: sendmsg: bug? ctrl_fd 4
Maybe the lines with sysfs_attr_get_value are indicative of something that's needed from /sys still?
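On the sysfs question, the log shows both attribute reads returning values, so the reads themselves appear to succeed. The large `handle` number can look suspicious, but converting it suggests it is simply a 64-bit kernel address printed in decimal. This is a small illustrative sketch; the helper name `as_kernel_pointer` is made up here.

```python
def as_kernel_pointer(decimal_text):
    """Render a sysfs 'handle' value (unsigned 64-bit decimal) as hex."""
    value = int(decimal_text)
    if not 0 <= value < 2**64:
        raise ValueError("not an unsigned 64-bit value")
    return f"0x{value:016x}"


# The handle value taken from the iscsid debug log above.
print(as_kernel_pointer("18446744072107593760"))  # 0xffffffffa0841020
```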
These exact same discovery and login commands work fine running from a libvirt vm connecting to the same target.
On my container host, i do have the correct iscsi kernel modules, and I also see these in the container:
on the host:
[root@teletran-1 docker]# lsmod | grep iscsi
iscsi_tcp 18333 0
libiscsi_tcp 24176 1 iscsi_tcp
libiscsi 54750 2 libiscsi_tcp,iscsi_tcp
scsi_transport_iscsi 97405 4 iscsi_tcp,libiscsi
on the container:
[root@bcf697cb8673 ~]# lsmod | grep iscsi
iscsi_tcp 18333 0
libiscsi_tcp 24176 1 iscsi_tcp
libiscsi 54750 2 libiscsi_tcp,iscsi_tcp
scsi_transport_iscsi 97405 4 iscsi_tcp,libiscsi
i'll attach an strace of the iscsid process that's exiting, if that helps. i can also attach an strace of an iscsid process that shows it working from a libvirt VM, if you think that would be helpful to compare.
--- Additional comment from James Slagle on 2014-05-29 09:37:12 EDT ---
strace of iscsid generated with: strace -f -o iscsid.strace iscsid -d 12 -f
the iscsid process exits when you try to login to an iscsi target from the container.
--- Additional comment from James Slagle on 2014-05-29 10:08:33 EDT ---
note that the output i'm now seeing seems to match very closely what was reported in https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1226855 when this same thing was tried with lxc-tools instead of Docker.
--- Additional comment from Daniel Walsh on 2014-05-29 15:32:31 EDT ---
So this looks to be more specific to iscsid and namespacing than to docker. I think you should open the bug with them and see if they can help.
--- Additional comment from Chris Leech on 2014-06-04 19:44:37 EDT ---
It looks like the iSCSI netlink control interface isn't namespace aware, and the kernel side of the iSCSI initiator rejects messages that don't come from the default network namespace. I suppose it might make sense to track active sessions per netns.
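Whether two processes share a network namespace can be checked from userspace by comparing their `/proc/<pid>/ns/net` links. The sketch below assumes a Linux `/proc` filesystem and enough privilege to read the other process's namespace link; `netns_id` and `shares_netns` are hypothetical helper names.

```python
import os


def netns_id(pid="self"):
    """Return the network-namespace link target of a process, or None.

    On Linux the target has the form 'net:[<inode>]'. None means the
    link could not be read (no such process, no permission, no /proc).
    """
    try:
        return os.readlink(f"/proc/{pid}/ns/net")
    except OSError:
        return None


def shares_netns(pid_a, pid_b="self"):
    """True or False if both namespaces are readable, None otherwise."""
    a, b = netns_id(pid_a), netns_id(pid_b)
    if a is None or b is None:
        return None
    return a == b
```

Note that inside a container PID 1 is normally the container's own init, so comparing against it says nothing about the host. To compare against the initial namespace, the check would have to be run on the host with the PID of a process that lives in the container.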
--- Additional comment from James Slagle on 2014-06-05 07:51:06 EDT ---
--- Additional comment from Radek Vykydal on 2015-03-30 08:50:41 EDT ---
FYI I was playing with running iscsid in superprivileged container: https://github.com/rvykydal/dockerfile-iscsid/tree/master/rhel7
--- Additional comment from Chris Leech on 2015-04-23 13:30:38 EDT ---
I've spent some more time looking at running iscsid in a network namespace container, and there are a number of kernel issues that need to be worked out.
The kernel side of the iSCSI netlink family only listens in the initial network namespace (just like some other storage related netlink families). It's easy enough to add per-namespace kernel sockets, a bit more work to associate network namespaces to iSCSI objects in order to route async event notifications to the right place.
iSCSI makes heavy use of sysfs as well, and many of the iSCSI sysfs devices will need network namespace tags for filtered views of sysfs. I think that makes sense for iscsi_host, iscsi_session, and iscsi_connection. Certainly not for iscsi_transport. Possibly for iscsi_endpoint and iscsi_iface.
Without growing some way to assign an iscsi_host to a net namespace like is done for network devices, this will probably work for dynamically generated hosts (iscsi_tcp) but not for offload hardware.
I can imagine use cases where it might be nice to have multiple iscsid instances in their own containers managing their own set of iSCSI sessions.
I've got some work started, but it's not ready for review and testing just yet.
Without that, I don't see how we can get to multiple working iscsid processes or even a single iscsid running without --net=host
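To see which of the iSCSI sysfs classes named in this comment are visible from a given container, something like the following could be used. It is only a sketch: it assumes all six objects appear under `/sys/class`, and `visible_iscsi_objects` is a hypothetical helper, not part of any iSCSI tool.

```python
import os

# Class names taken from the comment above.
ISCSI_CLASSES = (
    "iscsi_host",
    "iscsi_session",
    "iscsi_connection",
    "iscsi_transport",
    "iscsi_endpoint",
    "iscsi_iface",
)


def visible_iscsi_objects(sysfs_root="/sys"):
    """Map each iSCSI class name to the sorted entries visible under it.

    A value of None means the class directory is absent or unreadable
    from the current namespace.
    """
    result = {}
    for cls in ISCSI_CLASSES:
        path = os.path.join(sysfs_root, "class", cls)
        try:
            result[cls] = sorted(os.listdir(path))
        except OSError:
            result[cls] = None
    return result
```

Comparing the output on the host with the output inside a container gives a rough picture of how much sysfs filtering is in effect.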
--- Additional comment from Fedora End Of Life on 2015-05-29 07:55:11 EDT ---
This message is a reminder that Fedora 20 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 20. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '20'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.
Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 20 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.
Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
--- Additional comment from Vaibhav Khanduja on 2015-07-09 16:29:41 EDT ---
(In reply to Chris Leech from comment #15)
> I've spent some more time looking at running iscsid in a network namespace container, and there are a number of kernel issues that need to be worked out.
>
> The kernel side of the iSCSI netlink family only listens in the initial network namespace (just like some other storage related netlink families). It's easy enough to add per-namespace kernel sockets, a bit more work to associate network namespaces to iSCSI objects in order to route async event notifications to the right place.
>
> iSCSI makes heavy use of sysfs as well, and many of the iSCSI sysfs devices will need network namespace tags for filtered views of sysfs. I think that makes sense for iscsi_host, iscsi_session, and iscsi_connection. Certainly not for iscsi_transport. Possibly for iscsi_endpoint and iscsi_iface.
>
> Without growing some way to assign an iscsi_host to a net namespace like is done for network devices, this will probably work for dynamically generated hosts (iscsi_tcp) but not for offload hardware.
>
> I can imagine use cases where it might be nice to have multiple iscsid instances in their own containers managing their own set of iSCSI sessions.
>
> I've got some work started, but it's not ready for review and testing just yet.
>
> Without that, I don't see how we can get to multiple working iscsid processes or even a single iscsid running without --net=host
do you have working patch? I wanted to test this out to check the working?
--- Additional comment from Chris Leech on 2015-07-13 14:09:21 EDT ---
(In reply to Vaibhav Khanduja from comment #17)
> do you have working patch? I wanted to test this out to check the working?
In progress patches sent via email.
--- Additional comment from paguayo on 2016-04-11 14:39:46 EDT ---
(In reply to Chris Leech from comment #18)
> (In reply to Vaibhav Khanduja from comment #17)
> > do you have working patch? I wanted to test this out to check the working?
>
> In progress patches sent via email.
Any progress on this patch?
--- Additional comment from Fedora End Of Life on 2016-07-19 07:32:46 EDT ---
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.
Thank you for reporting this bug and we are sorry it could not be fixed.
--- Additional comment from Daniel Walsh on 2016-07-19 10:18:03 EDT ---
Is this something we should continue to care about?
--- Additional comment from Jan Kurik on 2016-07-26 00:08:27 EDT ---
This bug appears to have been reported against 'rawhide' during the Fedora 25 development cycle. Changing version to '25'.
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1100000
[Bug 1100000] can't run iscsid in a docker container
https://bugzilla.redhat.com/show_bug.cgi?id=1102911
[Bug 1102911] can't run iscsid in a docker container
Red Hat Bugzilla Rules Engine rule-engine@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |Extras
Daniel Walsh dwalsh@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|docker                      |kernel
           Assignee|dwalsh@redhat.com           |kernel-mgr@redhat.com
         QA Contact|atomic-bugs@redhat.com      |kernel-qe@redhat.com
Red Hat Bugzilla bugzilla@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Group|                            |private
--- Comment #3 from Daniel Walsh dwalsh@redhat.com ---
This is a kernel issue in that iscsid can not be run in a non host network namespace.
--- Comment #4 from Chris Leech cleech@redhat.com ---
(In reply to Daniel Walsh from comment #3)
> This is a kernel issue in that iscsid can not be run in a non host network namespace.
That's correct. I have patches to fix the iSCSI netlink code in kernel to run from a network namespace, at least for iscsi_tcp, with session visibility limited per-namespace. There's some sysfs visibility issues that need fixing, for objects that were implemented as bus devices that need migrating to class code for the namespace filtering support. And offloading HBAs would still need to be managed from the root network namespace unless additional work was done to implement a way to assign HBA hosts to a namespace as well.
--- Comment #5 from Chris Leech cleech@redhat.com ---
(In reply to Chris Leech from comment #4)
> (In reply to Daniel Walsh from comment #3)
> > This is a kernel issue in that iscsid can not be run in a non host network namespace.
>
> That's correct. I have patches to fix the iSCSI netlink code in kernel to run from a network namespace, at least for iscsi_tcp, with session visibility limited per-namespace. There's some sysfs visibility issues that need fixing, for objects that were implemented as bus devices that need migrating to class code for the namespace filtering support. And offloading HBAs would still need to be managed from the root network namespace unless additional work was done to implement a way to assign HBA hosts to a namespace as well.
Oh, and while this could get the iSCSI session management to be namespace aware, all of the attached SCSI devices (and host/target objects) would remain visible to the entire system.
Marcelo Ricardo Leitner mleitner@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Sub Component|                            |Networking
           Priority|unspecified                 |medium
                 CC|                            |atragler@redhat.com,
                   |                            |kzhang@redhat.com,
                   |                            |mleitner@redhat.com
           Assignee|kernel-mgr@redhat.com       |rkhan@redhat.com
         QA Contact|kernel-qe@redhat.com        |network-qe@redhat.com
Marcelo Ricardo Leitner mleitner@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Sub Component|Networking                  |Storage Storage Drivers
           Assignee|rkhan@redhat.com            |revers@redhat.com
         QA Contact|network-qe@redhat.com       |storage-qe@redhat.com

--- Comment #6 from Marcelo Ricardo Leitner mleitner@redhat.com ---
Moving to storage, this is iscsi driver in the kernel.
Chris Leech cleech@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|revers@redhat.com           |cleech@redhat.com
Chris Leech cleech@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |1422628
Chris Leech cleech@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |1429639
Chris Leech cleech@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|1429639                     |
Chris Williams cww@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |1420851
yanfu,wang yanwang@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?(cleech@redhat.com)
Martin Hoyer mhoyer@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mhoyer@redhat.com
         QA Contact|storage-qe@redhat.com       |mhoyer@redhat.com
Derrick Ornelas dornelas@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dornelas@redhat.com
             Blocks|                            |1186913
Chris Leech cleech@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
Chris Leech cleech@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |POST
Chris Leech cleech@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|POST                        |ASSIGNED

--- Comment #10 from Chris Leech cleech@redhat.com ---
Upstream acceptance is held up with an issue surrounding namespace removal with active iSCSI sessions. My current patchset leaves iSCSI active, and holds the namespace alive but without any referencing processes that would allow it to be recovered for manual cleanup.
Unfortunately this isn't ready for 7.5, and I'll need to postpone to continue working on it.
Bug 1416129 depends on bug 1100000, which changed state.

Bug 1100000 Summary: can't run iscsid in a docker container
https://bugzilla.redhat.com/show_bug.cgi?id=1100000

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |CLOSED
         Resolution|---                         |EOL
loberman loberman@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |loberman@redhat.com
             Blocks|                            |1546181
Matthew Owens mowens@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mowens@redhat.com
Rob Evers revers@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |revers@redhat.com
             Blocks|1546181                     |
--- Comment #14 from Chris Leech cleech@redhat.com ---
patches not resolved upstream, deferring