Currently, the vdsm daemon script starts (and respawns) the vdsm process as the vdsm user, and during startup vdsm starts the super vdsm (svdsm) process as root.
- when vdsm dies, the svdsm process is supposed to notice and exit by itself.
- if svdsm dies, then on the next proxy call vdsm is supposed to verify that the old instance is dead, start a new svdsm instance, and call it.
This is the super vdsm and vdsm startup design in general.
The current implementation has some edge cases that we need to handle:
- vdsm can try to communicate with an old instance of svdsm.
- vdsm can kill the wrong process, one that was assigned the last svdsm pid, before starting the new instance of svdsm.
- vdsm can try to communicate with svdsm before svdsm starts to serve requests.
And I guess there are many more possible scenarios that can cause bugs.
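The "wrong process" case can be narrowed by checking what a pid is actually running before signalling it. The following is a minimal sketch, not vdsm code: `pid_runs` is a hypothetical helper, it is Linux-specific because it reads /proc, and the marker string is whatever identifies the svdsm command line on a given host. It only shrinks the race window; the pid could still be recycled between the check and the kill.

```python
def pid_runs(pid, marker):
    """Return True if process `pid` exists and `marker` appears in its argv.

    Hypothetical helper for illustration; relies on Linux's /proc.
    """
    try:
        with open("/proc/%d/cmdline" % pid, "rb") as f:
            args = f.read().split(b"\0")   # argv entries are NUL-separated
    except OSError:
        return False                       # no such process (or not readable)
    needle = marker.encode()
    return any(needle in arg for arg in args)
```

A caller would run something like `pid_runs(old_pid, "supervdsmServer")` and only send a signal when it returns True.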
It does not seem to make sense for vdsm to manage a root process and be able to kill it.
What if svdsm were the manager of vdsm:
- the daemon script starts an svdsm instance instead of vdsm
- svdsm forks vdsm and drops its privileges (uid set to vdsm)
- vdsm receives the svdsm pid as a parameter, for sudo requests
Do you mean vdsm can kill svdsm? I'd suggest svdsm export a getProxy().killSvdsm() function.
- when vdsm dies, svdsm will start a new instance of vdsm automatically.
The supervdsm poke should be changed to periodically wait() and restart,
and note the crash reason in syslog.
- svdsm is started with respawn, so when svdsm dies, vdsm dies too as its child process, and another instance of svdsm starts automatically and starts a new instance of vdsm.
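The proposed flow (start the child, drop its privileges, wait() for it, log the exit, restart) can be sketched as below. This is an illustration under assumptions, not the vdsm implementation: `supervise`, `drop_privileges`, `max_restarts` and the `log` callable are made-up names, Python 3 is used rather than the 2.7 of the thread, and a real svdsm would log to syslog and loop indefinitely. A real privilege drop would also need to reset supplementary groups.

```python
import os
import subprocess
import sys


def drop_privileges(uid, gid):
    """Return a function that switches the child to uid/gid before exec."""
    def _preexec():
        os.setgid(gid)   # group first: after setuid we may no longer be allowed
        os.setuid(uid)
    return _preexec


def supervise(argv, max_restarts, uid=None, gid=None, log=None):
    """Run argv, wait() for it, and restart it up to max_restarts times.

    Returns the list of exit statuses seen (one per run).
    """
    if log is None:
        log = lambda msg: sys.stderr.write(msg + "\n")
    preexec = drop_privileges(uid, gid) if uid is not None else None
    statuses = []
    for _ in range(max_restarts + 1):          # first start plus the restarts
        child = subprocess.Popen(argv, preexec_fn=preexec)
        status = child.wait()                  # blocks until the child exits
        statuses.append(status)
        log("child pid %d exited with status %d" % (child.pid, status))
    return statuses
```

Passing `uid` and `gid` only works when the supervisor itself runs as root, which is the situation the proposal describes.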
Use case: svdsm receives SIGTERM.
(1) If svdsm waits to join vdsm in a thread: vdsm, as its child process, should receive a signal to terminate. If vdsm is started with Process() and supervdsm installs its own SIGTERM handler (one that kills vdsm), supervdsm will never join vdsm and exit. It will just hang there.
(2) If svdsm does not wait: vdsm will still be alive, so svdsm will need to kill vdsm at its next startup, or starting a new vdsm will fail to bind the xmlrpc server with: error: [Errno 98] Address already in use
Also, I am not sure whether it is safe for svdsm to terminate vdsm, because vdsm may be in the middle of other activities.
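One way to avoid both the hang in (1) and the leftover process in (2) is to bound the wait: forward SIGTERM, give the child a grace period, then escalate. This is a sketch under assumptions, not a tested fix for vdsm: `stop_child`, `install_sigterm_forwarder` and `grace_seconds` are hypothetical names, and it uses Python 3's subprocess rather than Process(). It does not answer the concern about interrupting vdsm mid-operation; the grace period only gives vdsm a chance to finish.

```python
import signal
import subprocess
import sys


def stop_child(child, grace_seconds):
    """Send SIGTERM to child; if it has not exited in time, send SIGKILL."""
    child.terminate()                          # SIGTERM
    try:
        return child.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        child.kill()                           # SIGKILL cannot be ignored
        return child.wait()


def install_sigterm_forwarder(child, grace_seconds=5.0):
    """Make this process stop `child` and then exit when it gets SIGTERM."""
    def _handler(signum, frame):
        stop_child(child, grace_seconds)
        sys.exit(0)
    signal.signal(signal.SIGTERM, _handler)    # must run in the main thread
```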
Also, svdsm can initialize whatever vdsm needs but cannot do as the vdsm user (check log permissions, clean old temporary files, and so on, if needed).
Royce Lv has already started to work on such a design here: http://gerrit.ovirt.org/#/c/4145/
So my previous choice was to:
1) As root, create a new process for supervdsmServer
2) Drop privileges in the parent
3) Start vdsm in the parent
Then, if vdsm detects a problem with supervdsm, it can just exit. This will cause respawn to restart everything again. So if either process (vdsm, or supervdsm) dies, the whole thing will be respawned.
See the draft here, and let's discuss it: http://www.ovirt.org/Normalize_vdsm_start_up_process
I also agree with splitting these two into dependent services, just like vdsmd and libvirtd. This will involve handling the case where supervdsm dies and vdsm reconnects; if we use a key for that, it will be a security issue. So for now we keep the key as a local variable. If we just use a child/parent pipe, there is similar restart logic to handle.
I want to update it and push it forward.
I would like to hear more opinions and points of view on this change,
Thanks.
on 01/04/2013 21:08, Royce Lv wrote:
I also agree with splitting these two into dependent services, just like vdsmd and libvirtd. This will involve handling the case where supervdsm dies and vdsm reconnects; if we use a key for that, it will be a security issue. So for now we keep the key as a local variable. If we just use a child/parent pipe, there is similar restart logic to handle.
The auth key of the Python manager mechanism is good, but currently when VDSM starts super VDSM, it passes the key on the command line. You can get the key from "ps aux | grep supervdsmServer". This is not secure at all. You can just start a Python interpreter and try this.
# PYTHONPATH=/usr/share/vdsm python
Python 2.7.3 (default, Jul 24 2012, 10:05:38)
[GCC 4.7.0 20120507 (Red Hat 4.7.0-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import supervdsm
>>> m = supervdsm._SuperVdsmManager(address='/var/run/vdsm/svdsm.sock', authkey='5e21c5e1-0050-4eca-85a0-098433f3a820')
>>> m.register('instance')
>>> m.register('open')
>>> m.connect()
>>> s = m.instance()
>>> s.ping()
True
Here the auth key is copied from the output of ps. In fact the super VDSM server is protected not by the auth key but by the "srwxr-xr-x vdsm:kvm" permissions of /var/run/vdsm/svdsm.sock: ordinary users cannot write to this socket. The auth key is generated by VDSM each time it launches a new super VDSM server instance. I think the most useful property of this auth key is that, after a total restart, it prevents VDSM from connecting to a previous super VDSM server that is stuck but has not yet died; in that case we will get "Authentication Error" and kill it explicitly.
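The stale-instance behaviour described above can be reproduced with the standard library alone. This is a small sketch, assuming Python 3 (the thread uses 2.7) and a throwaway socket path; `DemoManager`, `can_connect` and `run_demo` are illustrative names and do not exist in vdsm. A client holding the key of a different instance is rejected with multiprocessing.AuthenticationError, while the matching key connects.

```python
import multiprocessing
import os
import tempfile
import threading
from multiprocessing.managers import BaseManager


class DemoManager(BaseManager):
    """Stand-in for the real manager class; not vdsm code."""


def can_connect(address, authkey):
    """Return True if the manager at `address` accepts `authkey`."""
    client = DemoManager(address=address, authkey=authkey)
    try:
        client.connect()
    except multiprocessing.AuthenticationError:
        return False                 # server rejected our key
    return True


def run_demo():
    address = os.path.join(tempfile.mkdtemp(), "demo.sock")
    # The server only knows the key generated for this instance.
    server = DemoManager(address=address,
                         authkey=b"key-of-this-instance").get_server()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return (can_connect(address, b"key-of-this-instance"),
            can_connect(address, b"key-of-a-previous-instance"))
```

Calling `run_demo()` returns one result for the matching key and one for the mismatched key.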
The Python manager framework starts a new thread for each request, which means the super VDSM server serves each call in a new thread. So even if an operation causes a thread to get stuck, the super VDSM server is still functional. If some operation could cause the whole super VDSM server process to get stuck, the server should fork a child and delegate the operation to the child. Anyway, what operation would get the whole process stuck?
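Delegating a risky operation to a child can be sketched as follows. This is only an illustration of the idea, assuming the operation can be expressed as an external command; `run_delegated` is a hypothetical helper, and it uses Python 3's subprocess.run, which kills the child when the timeout expires.

```python
import subprocess


def run_delegated(argv, timeout_seconds):
    """Run argv in a child process so a hang cannot block the caller.

    Returns (returncode, stdout_bytes), or None if the child was killed
    because it exceeded timeout_seconds.
    """
    try:
        done = subprocess.run(argv, stdout=subprocess.PIPE,
                              timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return None                  # run() has already killed the child
    return done.returncode, done.stdout
```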
I think splitting VDSM and super VDSM into two services and delegating everything to systemd is simple and robust, just like libvirtd and VDSM. The auth key problem you mentioned also applies to connecting to libvirtd; we can just follow the existing solution for it.
----- Original Message -----
From: "Zhou Zheng Sheng" zhshzhou@linux.vnet.ibm.com To: vdsm-devel@lists.fedorahosted.org Sent: Sunday, January 6, 2013 11:03:59 AM Subject: Re: [vdsm] starting up vdsm and svdsm
I don't understand this auth key thing. Why is it required? Shouldn't it be sufficient to allow only vdsm user to interact with svdsm?
Thanks, Alon.
on 01/06/2013 17:07, Alon Bar-Lev wrote:
I don't understand this auth key thing. Why is it required? Shouldn't it be sufficient to allow only vdsm user to interact with svdsm?
Thanks, Alon.
The auth key is not very useful. It is passed in the command-line arguments of the super VDSM server, which is very insecure.
By "follow the existing solution", I mean that libvirtd refers to a SASL DB for passwords, and VDSM refers to /etc/pki/vdsm/keys/libvirt_password when connecting to libvirtd.
I agree with allowing only the vdsm user to access svdsm.sock and forgetting the auth key, because saving the auth key in a file readable only by the vdsm user does not improve security at all. If someone can access svdsm.sock, he can always access libvirt_password. libvirtd is meant to be used by many clients, so its Unix socket file cannot be restricted to the vdsm user only; it needs a password for each user in the SASL DB. The super VDSM server is only for VDSM itself, so restricting access to svdsm.sock is enough, and no auth key is needed.
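Restricting the socket file itself might look like the sketch below. It assumes Linux semantics, where connecting to a Unix socket requires write permission on its file; `listen_private` is a hypothetical helper, not vdsm code. In the real setup the server runs as root, so the file would additionally have to be chown'ed to the vdsm user, which is left out here because it needs root.

```python
import os
import socket


def listen_private(path):
    """Bind a Unix stream socket at `path` that only its owner can use."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    old_umask = os.umask(0o177)      # socket file is created with mode 0600
    try:
        server.bind(path)            # creates the socket file
    finally:
        os.umask(old_umask)          # restore the process-wide umask
    server.listen(1)
    return server
```

Setting the umask before bind() avoids the short window in which a bind() followed by chmod() would leave the socket open to other users.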
----- Original Message -----
From: "Zhou Zheng Sheng" zhshzhou@linux.vnet.ibm.com To: "Alon Bar-Lev" alonbl@redhat.com Cc: vdsm-devel@lists.fedorahosted.org Sent: Sunday, January 6, 2013 11:25:39 AM Subject: Re: [vdsm] starting up vdsm and svdsm
Great.
BTW: The auth key is not required even if you use multiple local users, as a Unix socket can ask for the identity of the other party [1].
Alon
[1] http://linux.die.net/man/7/unix SCM_CREDENTIALS
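As a rough illustration of asking for the peer's identity: the sketch below uses the SO_PEERCRED socket option, which is documented on the same unix(7) page, rather than an SCM_CREDENTIALS ancillary message, because it needs less code. It is Linux-specific, and `peer_credentials` is a hypothetical helper name.

```python
import socket
import struct


def peer_credentials(sock):
    """Return (pid, uid, gid) of the process at the other end of a
    connected Unix socket. Linux only (uses SO_PEERCRED)."""
    size = struct.calcsize("3i")     # struct ucred: pid, uid, gid
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return struct.unpack("3i", raw)
```

A server could call this on each accepted connection and refuse any peer whose uid is not the vdsm user's.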