hi!
I'm having problems with enumerate = true
If I query a single user with id <user> or getent passwd <user> I get a successful response, but I can't get a full list with
getent group
or getent passwd
I get this in log:
(Mon Apr 7 13:05:42 2014) [sssd[nss]] [sss_cmd_get_version] (0x0200): Received client version [1].
(Mon Apr 7 13:05:42 2014) [sssd[nss]] [sss_cmd_get_version] (0x0200): Offered version [1].
(Mon Apr 7 13:05:42 2014) [sssd[nss]] [nss_cmd_setgrent_send] (0x0100): Received setgrent request
(Mon Apr 7 13:05:42 2014) [sssd[nss]] [nss_cmd_getgrent] (0x0100): Requesting info for all groups
(Mon Apr 7 13:05:42 2014) [sssd[nss]] [nss_cmd_getgrent] (0x0100): Requesting info for all groups
(Mon Apr 7 13:05:42 2014) [sssd[nss]] [nss_cmd_endgrent] (0x0100): Terminating request info for all groups
(Mon Apr 7 13:05:42 2014) [sssd[nss]] [client_recv] (0x0200): Client disconnected!
(Mon Apr 7 13:05:46 2014) [sssd[nss]] [sss_cmd_get_version] (0x0200): Received client version [1].
(Mon Apr 7 13:05:46 2014) [sssd[nss]] [sss_cmd_get_version] (0x0200): Offered version [1].
(Mon Apr 7 13:05:46 2014) [sssd[nss]] [nss_cmd_setpwent_send] (0x0100): Received setpwent request
(Mon Apr 7 13:05:46 2014) [sssd[nss]] [nss_cmd_getpwent] (0x0100): Requesting info for all accounts
(Mon Apr 7 13:05:46 2014) [sssd[nss]] [nss_cmd_getpwent] (0x0100): Requesting info for all accounts
(Mon Apr 7 13:05:46 2014) [sssd[nss]] [nss_cmd_endpwent] (0x0100): Terminating request info for all accounts
(Mon Apr 7 13:05:46 2014) [sssd[nss]] [client_recv] (0x0200): Client disconnected!
this is my conf:
[sssd]
config_file_version = 2
reconnection_retries = 3
sbus_timeout = 30
services = nss, pam
domains = XXX.net

[nss]
filter_groups = root
filter_users = root
reconnection_retries = 3
debug_level = 5

[pam]
reconnection_retries = 3

[domain/XXX.net]
enumerate = true
cache_credentials = true
debug_level = 5
min_id = 500

id_provider = ldap
auth_provider = ldap
chpass_provider = ldap

ldap_schema = rfc2307bis

ldap_uri = ldap://XXX.XXX.net
ldap_search_base = dc=example,dc=global

ldap_tls_reqcert = demand
# ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
ldap_tls_cacertdir = /etc/ssl/certs
some more info:
# sssd --version
1.9.2
# lsb_release -a
LSB Version: :base-4.0-ia32:base-4.0-noarch:core-4.0-ia32:core-4.0-noarch
Distributor ID: CentOS
Description: CentOS release 6.5 (Final)
Release: 6.5
Codename: Final
anyone facing same problem?
abosch
On Mon, Apr 07, 2014 at 11:08:00AM +0200, Angel Bosch wrote:
hi!
I'm having problems with enumerate = true
Can you see the enumeration task in the sssd domain log? It should run after startup (with some delay in order not to interfere with system startup) and download all users and groups.
I'd start the investigation there, the config file looks OK.
btw in general I don't recommend enumeration if your directory is very large (thousands or tens of thousands of entries); sssd might end up thrashing the disk while saving so many entries.
Can you see the enumeration task in the sssd domain log? It should run after startup (with some delay in order not to interfere with system startup) and download all users and groups.
I'd start the investigation there, the config file looks OK.
ok, after raising log level to 7 (thanks for the tip) I think I've found the problem:
(Mon Apr 7 13:36:54 2014) [sssd[be[XXX.net]]] [sdap_get_generic_ext_done] (0x0400): Search result: Administrative limit exceeded(11), no errmsg set
I must use a binddn user without restrictions.
btw in general I don't recommend enumeration if your directory is very large (thousands or tens of thousands of entries); sssd might end up thrashing the disk while saving so many entries.
I'm aware of that but I need some machines to be able to enumerate all users for some cron scripts.
abosch
On Mon, Apr 07, 2014 at 11:40:58AM +0200, Angel Bosch wrote:
ok, after raising log level to 7 (thanks for the tip) I think I've found the problem:
(Mon Apr 7 13:36:54 2014) [sssd[be[XXX.net]]] [sdap_get_generic_ext_done] (0x0400): Search result: Administrative limit exceeded(11), no errmsg set
I must use a binddn user without restrictions.
As an alternative you can try to lower ldap_page_size where the default is 1000.
HTH
bye, Sumit
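For anyone chasing the same symptom, here is a small shell sketch for spotting failed LDAP searches in the domain log. It assumes the log lines look like the sdap_get_generic_ext_done line quoted above and that successful searches are logged as "Success(0)"; the function name failed_ldap_searches is purely illustrative and does not come from SSSD.

```shell
#!/bin/sh
# Print LDAP search results in an SSSD domain log that did not succeed.
# Assumption: successful searches are logged as "Search result: Success(0)".
failed_ldap_searches() {
    # $1: path to the domain log (debug_level must be high enough
    #     for search results to be logged at all)
    grep 'sdap_get_generic_ext_done' "$1" \
        | grep 'Search result:' \
        | grep -v 'Success(0)'
}
```

It would be called with the domain log as its only argument, for example `failed_ldap_searches /var/log/sssd/sssd_XXX.net.log` (the path is the usual location on CentOS, but treat it as an assumption). An "Administrative limit exceeded(11)" line in the output points at a server-side limit, which is where a less restricted bind user or a smaller ldap_page_size may help.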
sssd-users mailing list sssd-users@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-users
On (07/04/14 12:56), Angel Bosch wrote:
As an alternative you can try to lower ldap_page_size where the default is 1000.
thx for the tip, I'll try that.
I'm having a lot of warnings like this:
[sysdb_store_user] (0x0080): A user with the same UID [593663] was removed from the cache
It means that at least two users have the same UID. You should fix your LDAP server. From a security point of view, it is not a good idea to have different users with the same UID.
LS
On Mon, Apr 07, 2014 at 01:07:59PM +0200, Lukas Slebodnik wrote:
It means that at least two users have the same UID. You should fix your LDAP server. From a security point of view, it is not a good idea to have different users with the same UID.
LS
And in general enumeration will not work correctly with ID duplicates.
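One way to hunt for the duplicates is to ask the directory itself, because the SSSD cache keeps only one entry per UID. The sketch below reads LDIF on stdin and prints every uidNumber that appears more than once. It assumes the numeric ID is stored in the uidNumber attribute (the usual case for rfc2307bis); dup_uid_numbers is a hypothetical helper name.

```shell
#!/bin/sh
# Read LDIF on stdin and print every uidNumber value that occurs
# more than once. Assumption: the numeric UID lives in "uidNumber".
dup_uid_numbers() {
    awk '/^uidNumber: / { print $2 }' | sort -n | uniq -d
}
```

It could be fed from something like `ldapsearch -x -LLL -H ldap://XXX.XXX.net -b dc=example,dc=global '(objectClass=posixAccount)' uidNumber | dup_uid_numbers`. Keep in mind that such a search is subject to the same server-side limits as the enumeration, so a restricted bind may not see every entry.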
On Mon, Apr 07, 2014 at 12:56:21PM +0200, Angel Bosch wrote:
As an alternative you can try to lower ldap_page_size where the default is 1000.
thx for the tip, I'll try that.
I'm having a lot of warnings like this:
[sysdb_store_user] (0x0080): A user with the same UID [593663] was removed from the cache
is that normal?
I'm sorry, but no. If you started with an empty cache, this indicates that multiple LDAP users have the same value in the UID attribute, which is bad, because it should be unique. Maybe you have to choose a more specific search base?
bye, Sumit
ok, I've fixed duplication problems. It was just some miscreated users. No more messages regarding duplicated uid.
I've removed /var/lib/sss/db and restarted sssd.
Now I can see sssd_be process eating 100% cpu non-stop and I get 3 errors every 2 minutes in domain log:
(Wed Apr 9 10:23:58 2014) [sssd[be[xxx.net]]] [fo_add_server] (0x0080): Adding new server 'xxx.net', to service 'LDAP'
(Wed Apr 9 10:23:58 2014) [sssd[be[xxx.net]]] [be_process_init] (0x0080): No SUDO module provided for [xxx.xxx.net] !!
(Wed Apr 9 10:23:58 2014) [sssd[be[xxx.net]]] [get_single_value_as_string] (0x0080): More than one value found.
I understand that first message is just informative. Second one is about sudo module that I don't use, so I guess there's no worry about that either. But last message is strange and I'm not sure if I should worry about that one.
I also get this on /var/log/messages
Apr 9 12:17:57 ostra sssd: Starting up
Apr 9 12:17:58 ostra sssd[be[xxx.net]]: Starting up
Apr 9 12:17:58 ostra sssd[nss]: Starting up
Apr 9 12:17:58 ostra sssd[pam]: Starting up
Apr 9 12:19:58 ostra sssd[be[xxx.net]]: Starting up
Apr 9 12:21:58 ostra sssd[be[xxx.net]]: Starting up
Apr 9 12:23:58 ostra sssd[be[xxx.net]]: Starting up
Apr 9 12:25:58 ostra sssd[be[xxx.net]]: Starting up
Apr 9 12:27:58 ostra sssd[be[xxx.net]]: Starting up
Apr 9 12:29:58 ostra sssd[be[xxx.net]]: Starting up
Apr 9 12:31:58 ostra sssd[be[xxx.net]]: Starting up
Apr 9 12:33:58 ostra sssd[be[xxx.net]]: Starting up
Apr 9 12:35:58 ostra sssd[be[xxx.net]]: Starting up
Apr 9 12:37:58 ostra sssd[be[xxx.net]]: Starting up
How long should it take sssd_be to be started?
abosch
On (09/04/14 10:40), Angel Bosch wrote:
How long should it take sssd_be to be started?
abosch
The problem is enumeration.
I would say you have a lot of users in your LDAP. sssd_be was fetching information from LDAP (the reason for the high CPU usage). It took a very long time and sssd_be didn't have time to reply with "pong" to the main process. Therefore sssd_be was restarted.
You can try to increase the timeout from the default value of 10 seconds to 15 or 20. Please do not use a very big value, because it can have negative consequences for other processes.
LS
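A rough way to confirm that the backend is being restarted over and over is to count its "Starting up" lines in syslog. The sketch below assumes the syslog format shown in the /var/log/messages excerpt above; count_be_restarts is a made-up helper name, not an SSSD tool.

```shell
#!/bin/sh
# Count how often an sssd_be backend logged "Starting up" in a syslog file.
# Fixed-string matching (-F) avoids having to escape the square brackets.
count_be_restarts() {
    # $1: syslog file, e.g. /var/log/messages
    grep -F 'sssd[be[' "$1" | grep -c -F ']]: Starting up'
}
```

If `count_be_restarts /var/log/messages` keeps growing by one roughly every two minutes, the backend is probably missing its heartbeat as described above. As far as I can tell, the heartbeat setting is the `timeout` option in the [domain/...] section of sssd.conf, but please verify that against the sssd.conf man page for your version before relying on it.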
ok, finally got it working.
now I see lot of errors like this:
(Wed Apr 9 11:13:18 2014) [sssd[be[xxx.net]]] [sdap_save_grpmem] (0x0040): Failed to save user maquines
(Wed Apr 9 11:13:18 2014) [sssd[be[xxx.net]]] [sdap_save_groups] (0x0040): Failed to store group 5 members.
does it mean I have some other conflict with my objects?
abosch
On (09/04/14 11:16), Angel Bosch wrote:
now I see lot of errors like this:
(Wed Apr 9 11:13:18 2014) [sssd[be[xxx.net]]] [sdap_save_grpmem] (0x0040): Failed to save user maquines
(Wed Apr 9 11:13:18 2014) [sssd[be[xxx.net]]] [sdap_save_groups] (0x0040): Failed to store group 5 members.
It is impossible to say from these two lines. We need more context or, better, the whole domain log file.
LS
On Wed, Apr 09, 2014 at 11:24:33AM +0200, Lukas Slebodnik wrote:
It is impossible to say from these two lines. We need more context or, better, the whole domain log file.
yes, more context would be needed.
Additionally, please check whether e.g. the user maquines is in all expected groups. If yes, then these messages might just be a side effect of enumeration. If SSSD tries to add a user to a group where it is already a member, an error code indicating this might be returned.
bye, Sumit
something is wrong. maquines is a group, not a user:
# getent group maquines
maquines:*:92011:
# getent passwd maquines
#
so I've dug a little bit and found that past a certain point sssd detects some groups as users.
after increasing verbosity, the first error I found is:
(Wed Apr 9 14:32:14 2014) [sssd[be[xxx.net]]] [sysdb_search_group_by_name] (0x0400): No such entry
(Wed Apr 9 14:32:14 2014) [sssd[be[xxx.net]]] [sysdb_add_group] (0x0400): Error: 17 (El fitxer ja existeix)
(Wed Apr 9 14:32:14 2014) [sssd[be[xxx.net]]] [sysdb_search_group_by_gid] (0x0400): No such entry
this (El fitxer ja existeix) means (File already exists).
then I have some "Failed to save user" and "Failed to store group" errors.
I've uploaded 130 lines here: http://paste.ubuntu.com/7226080/
abosch
On Wed, Apr 09, 2014 at 02:56:42PM +0200, Angel Bosch wrote:
I've uploaded 130 lines here: http://paste.ubuntu.com/7226080/
abosch
Did you clear the cache after pruning the duplicates from the directory?
Are there maybe some groups that are named the same?
On Wed, Apr 09, 2014 at 11:16:41AM +0200, Angel Bosch wrote:
now I see lot of errors like this:
(Wed Apr 9 11:13:18 2014) [sssd[be[xxx.net]]] [sdap_save_grpmem] (0x0040): Failed to save user maquines
(Wed Apr 9 11:13:18 2014) [sssd[be[xxx.net]]] [sdap_save_groups] (0x0040): Failed to store group 5 members.
Is this all the logs say even with a high log level? I think the code is not particularly verbose, unfortunately.
Pavel R. will send a patch to include more debug messages, at least in master, though.
On Mon, 2014-04-07 at 11:40 +0200, Angel Bosch wrote:
I'm aware of that but I need some machines to be able to enumerate all users for some cron scripts.
Note that if the cron jobs you need to run are for a specific subset of users, you could explicitly query those users in a root cron job :) instead of turning on enumeration.
If those users are in a specific group it is quite simple:
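pull-users.sh:
#!/bin/bash
# Look up each member of mycrongroup individually, so the users are
# resolved without turning on enumeration.
IFS=","
users=$(getent group mycrongroup | cut -d ":" -f 4)
for u in $users; do
    getent passwd "$u"
done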
HTH, Simo.
On (07/04/14 11:08), Angel Bosch wrote:
hi!
I'm having problems with enumerate = true
Could you attach the full log files (sssd_nss.log and sssd_XXX.net.log)? It would be better if you used debug_level 7 (instead of 5).
LS