Hi!
We are running a CentOS6 server using SSSD that connects to 389DS containing 70k user entries. Both servers are fully updated. SSSD and 389DS package versions:
sssd-1.9.2-129.el6_5.4.x86_64
389-ds-base-1.2.11.15-31.el6_5.x86_64
Authconfig was used to enable sssd:
authconfig --enablesssd --enablesssdauth --ldapbasedn=dc=users,dc=company,dc=tld --enableshadow --enablemkhomedir --enablelocauthorize --update
PAM and NSS configs were updated as well. I have attached our sssd.conf.
The setup itself works and users can authenticate, but we are concerned about the performance. At first we tried with enumeration enabled, but responsiveness dropped significantly during enumeration: a simple getent -s sss passwd USERNAME took more than 15 seconds. Result paging did not help.
Next we turned enumeration off and deleted the cache for a clean start. We tried simple getent requests with 1000 random usernames taken from a file. We ran the bash script consecutively a few times. The results:
- run 1: 0m10.831s
- run 2: 0m20.914s
- run 3: 0m31.422s
and so on. Each run took about 10 seconds more than the previous one. During the test sssd_be was using 100% of one core. During this time 389DS was practically idling. Its load (CPU, I/O) hardly showed any change.
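A test like the one described could be sketched as follows. This is only a minimal sketch, not the script that was actually used; the file name usernames.txt and the function name lookup_all are assumptions made for illustration.

```shell
# Hypothetical helper: look up every username listed in a file
# (one name per line) through the sss NSS service only.
lookup_all() {
    file=$1
    count=0
    while IFS= read -r name; do
        # -s sss restricts the lookup to SSSD, bypassing local files.
        # Output and errors are discarded; only the timing matters here.
        getent -s sss passwd "$name" > /dev/null 2>&1
        count=$((count + 1))
    done < "$file"
    # Report how many lookups were attempted.
    echo "$count"
}
```

Running "time lookup_all usernames.txt" several times in a row, starting from an empty cache, would then give per-run timings comparable to the ones listed above.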
What could be the reason for this performance issue? How would we best go about tuning this system?
Regards, Mitja
On Thu, Jan 16, 2014 at 11:29:53AM +0100, Mitja Mihelič wrote:
and so on. Each run took about 10 seconds more than the previous one. During the test sssd_be was using 100% of one core. During this time 389DS was practically idling. Its load (CPU, I/O) hardly showed any change.
Do you also see bad performance when the cache is not empty? (For instance, when many users log in in the morning at the start of the day.)
What could be the reason for this performance issue? How would we best go about tuning this system?
One thing that might help would be fine-tuning the cache expiration, in particular entry_cache_nowait_percentage. This midpoint refresh would allow the SSSD to return data from cache right away while issuing an update in the background.
Some users have also symlinked the cache to /dev/shm, but putting the cache to ramdisk obviously removes it on reboot.
On (16/01/14 12:06), Jakub Hrozek wrote:
Some users have also symlinked the cache to /dev/shm, but putting the cache to ramdisk obviously removes it on reboot.
Here is my line in /etc/fstab (I had it for testing purposes):
tmpfs /var/lib/sss/db/ tmpfs size=300M,mode=0700,noauto,rootcontext=system_u:object_r:sssd_var_lib_t:s0 0 0
LS
On 16. 01. 2014 12:09, Lukas Slebodnik wrote:
Do you also see bad performance when the cache is not empty? (For instance, when many users log in in the morning at the start of the day.)
No. When the entries are cached the performance is as we expected. A script requesting info about 1000 cached users completes in about two seconds consistently.
One thing that might help would be fine-tuning the cache expiration, in particular entry_cache_nowait_percentage. This midpoint refresh would allow the SSSD to return data from cache right away while issuing an update in the background.
We had not explicitly set entry_cache_timeout for the LDAP domain, so we assumed the default value of 5400 seconds was in use. In the NSS section entry_cache_nowait_percentage was set to 50.
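As a worked example of the midpoint refresh with the values mentioned above: assuming the refresh point is simply the given percentage of the entry timeout, as the sssd.conf man page describes, an entry older than the computed age would still be returned from the cache immediately while an update is issued in the background. The variable refresh_after is a name chosen for illustration.

```shell
# Values taken from the message above.
entry_cache_timeout=5400             # seconds (default for the domain)
entry_cache_nowait_percentage=50     # percent (set in the [nss] section)

# Age in seconds after which a cached entry is returned right away
# but also refreshed in the background.
refresh_after=$((entry_cache_timeout * entry_cache_nowait_percentage / 100))
echo "$refresh_after"
```

With these settings the background refresh would start for entries older than 2700 seconds, that is 45 minutes.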
What exactly does memcache_timeout do? The man page explanation is not quite clear to me.
Some users have also symlinked the cache to /dev/shm, but putting the cache to ramdisk obviously removes it on reboot.
Non-persistent cache should not be a problem. We could do an enumeration at boot to fill/seed the cache and then disable it. Moving the cache to tmpfs reduced the load from 100% to just below 95%.
We are also using an SSL connection to 389DS. Could that have such an impact on CPU resources?
_______________________________________________
sssd-users mailing list
sssd-users@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/sssd-users
On 01/16/2014 05:29 AM, Mitja Mihelič wrote:
What could be the reason for this performance issue? How would we best go about tuning this system?
Regards, Mitja
Can it be due to group membership refresh? Do you have a group that all 70K users are in?
On 16. 01. 2014 14:29, Dmitri Pal wrote:
Can it be due to group membership refresh? Do you have a group that all 70K users are in?
All users except 27 out of 70k are members of the same group. The group is defined locally in /etc/group. In /etc/nsswitch.conf we have group: files
Regards, Mitja
On Thu, Jan 16, 2014 at 04:16:32PM +0100, Mitja Mihelič wrote:
All users except 27 out of 70k are members of the same group. The group is defined locally in /etc/group. In /etc/nsswitch.conf we have group: files
Regards, Mitja
Right. Even if they were members of an LDAP group, 'getent passwd' shouldn't be the culprit; usually it is 'getent group' or 'id' that is slow.