On 06/02/2014 07:51 AM, John Hodrien wrote:
On Mon, 2 Jun 2014, Stephen Gallagher wrote:
This is the real problem. If SSSD can route to the IP address, then we have to proceed assuming that the LDAP server should be available (thereby attempting to connect to it and perform online authentication). There's really no way to determine ahead of time whether the service is "supposed" to be available.
You may want to play with the option 'ldap_opt_timeout' (see sssd-ldap(5)). It controls how long the OpenLDAP client libraries will wait for a response (in your case, how long it will wait while the packets are dropped). It defaults to 6s.
This should be a one-off hit though, right? If I discover the LDAP server is offline, I should remember this, admittedly recheck periodically, but never cause another delay waiting for it to spring back into life. Given the way some of these laptops are used, I'd even quite like to configure it to default to this state.
When I last tried this (which was a while ago) these delays would happen repeatedly, so the setup was unusable, and I had to ditch sssd on the laptop.
Well, in most common cases, the LDAP server is unresolvable when not on the VPN/inside the network, so SSSD immediately detects that it can't get there and the delay is unnoticeable.
It's those cases where the server is addressable but unresponsive that are much harder to handle.
Right now, we have a two-minute sleep between attempts to go online again. (I think I saw a patch go in for 1.12 that makes this configurable.) That's mostly so that we catch cases where you've connected to the VPN but, for one reason or another, SSSD doesn't get notified that the network state changed (there are lots of edge cases that cause this).
I am not 100% sure that the LDAP server being unresponsive is the cause... Once I have the logs I will know more!
But isn't this a design flaw of the LDAP connectivity test? If connectivity is tested only after some application or the system requests information from SSSD, and the server is unresponsive, this causes a long and unpleasant delay because the request is kept pending until the connection times out.
Hence, I'd suggest that SSSD periodically test the LDAP connection in the background (or after a network state change) *without* an actual request triggering this. As long as the LDAP server is unreachable or unresponsive, SSSD should stay in offline mode and answer requests right away with cached results.
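The behaviour suggested above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not how SSSD is implemented: the class name `OfflineAwareCache`, the `probe` and `fetch` callables, and the 120-second `retry_interval` (borrowed from the two-minute sleep mentioned earlier in the thread) are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional


class OfflineAwareCache:
    """Toy model: only a background probe can bring the cache online;
    requests are answered from the cache whenever it is offline."""

    def __init__(self, probe: Callable[[], bool], fetch: Callable[[str], str],
                 retry_interval: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._probe = probe                # hypothetical cheap connectivity check
        self._fetch = fetch                # hypothetical (possibly slow) online lookup
        self._retry_interval = retry_interval
        self._clock = clock
        self._cache: Dict[str, str] = {}
        self._online = False               # assumption: start in offline mode
        self._next_probe = 0.0

    def background_tick(self, force: bool = False) -> None:
        """Run from a timer, or with force=True after a network state change.
        Never called from a request path."""
        now = self._clock()
        if not force and now < self._next_probe:
            return
        self._online = self._probe()
        self._next_probe = now + self._retry_interval

    def lookup(self, key: str) -> Optional[str]:
        """While offline, return the cached value (or None) without any network wait."""
        if self._online:
            try:
                self._cache[key] = self._fetch(key)
            except OSError:
                # The server went away mid-request: remember that, so only this
                # one request pays for the failure, then fall back to the cache.
                self._online = False
        return self._cache.get(key)
```

In this model a request can still hit one slow failure if the server disappears between two probes, but every later request is served from the cache until a background probe succeeds again.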
Joschi Brauchle
On Mon, 2014-06-02 at 17:36 +0200, Joschi Brauchle wrote:
> But isn't this a design flaw of the LDAP connectivity test? If connectivity is tested only after some application or the system requests information from SSSD, and the server is unresponsive, this causes a long and unpleasant delay because the request is kept pending until the connection times out.
> Hence, I'd suggest that SSSD periodically test the LDAP connection in the background (or after a network state change) *without* an actual request triggering this. As long as the LDAP server is unreachable or unresponsive, SSSD should stay in offline mode and answer requests right away with cached results.
SSSD should already do this by way of the midway refresh feature. However, I am not sure it works as expected when the fast cache is in use.
You can temporarily work around this by having a background script (cron?) that regularly runs 'getent passwd username', so that hopefully users will almost always hit sssd when it is already offline.
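A minimal sketch of such a background script, assuming Python is available on the laptop. The function name `warm_sssd_cache`, the injectable `run` parameter, and the 30-second timeout are illustrative choices, not anything SSSD provides; the only external command used is 'getent passwd <name>' as suggested above.

```python
import subprocess
from typing import Callable, Iterable, List


def warm_sssd_cache(usernames: Iterable[str], timeout: float = 30.0,
                    run: Callable[..., "subprocess.CompletedProcess"] = subprocess.run
                    ) -> List[str]:
    """Run 'getent passwd <name>' for each user so that this script, rather than
    an interactive login, absorbs any delay while SSSD notices it is offline.
    Returns the names that did not resolve or that timed out."""
    failed: List[str] = []
    for name in usernames:
        try:
            result = run(["getent", "passwd", name],
                         stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
                         timeout=timeout, check=False)
        except (subprocess.TimeoutExpired, OSError):
            # Timed out, or getent could not be started at all.
            failed.append(name)
            continue
        if result.returncode != 0:
            failed.append(name)
    return failed
```

A cron entry would then call this function every few minutes with the user names that matter on that machine; the interval is a trade-off between extra load and how stale the offline state can get.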
Simo.
On Mon, Jun 02, 2014 at 11:58:23AM -0400, Simo Sorce wrote:
> SSSD should already do this by way of the midway refresh feature. However, I am not sure it works as expected when the fast cache is in use.
Correct, but the midpoint refresh only works this way if you're between 50% (by default) and 100% of cache validity. Once you're past cache expiration completely, you trigger a back end lookup. Also for initgroups during login, we retry online more aggressively.
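The rule described above can be written down as a small decision function. This is a sketch of the described behaviour only, not SSSD code: the names `Action` and `cache_action` are hypothetical, and the 0.5 default mirrors the "50% (by default)" figure (which I believe corresponds to the 'entry_cache_nowait_percentage' option in sssd.conf(5), but treat that mapping as an assumption). Treating an entry whose age exactly equals its lifetime as expired is also an assumption.

```python
from enum import Enum


class Action(Enum):
    SERVE_CACHED = "serve the cached entry"
    SERVE_AND_REFRESH = "serve the cached entry and refresh it in the background"
    BLOCKING_LOOKUP = "wait for a back end lookup"


def cache_action(age: float, lifetime: float, nowait_fraction: float = 0.5) -> Action:
    """Decide how to answer a request for an entry that is `age` seconds old
    and valid for `lifetime` seconds."""
    if lifetime <= 0:
        raise ValueError("lifetime must be positive")
    if age >= lifetime:
        # Past cache expiration: the request has to wait for the back end.
        return Action.BLOCKING_LOOKUP
    if age >= lifetime * nowait_fraction:
        # Between the midpoint and expiry: answer now, refresh out of band.
        return Action.SERVE_AND_REFRESH
    return Action.SERVE_CACHED
```

Only the last branch of the lifetime, after expiry, can block a request, which is why a periodic lookup that keeps entries inside the midpoint window helps.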
> You can temporarily work around this by having a background script (cron?) that regularly runs 'getent passwd username', so that hopefully users will almost always hit sssd when it is already offline.
Feature requests like Joschi's are more and more frequent, and I'm thinking we should extend the periodic background refresh task to cover users and groups as well.
IIRC the current version only supports netgroups (mostly because netgroups can't be enumerated).
> Simo.
> -- Simo Sorce * Red Hat, Inc * New York
sssd-users mailing list sssd-users@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-users
On Mon, 2014-06-02 at 18:47 +0200, Jakub Hrozek wrote:
> Feature requests like Joschi's are more and more frequent, and I'm thinking we should extend the periodic background refresh task to cover users and groups as well.
> IIRC the current version only supports netgroups (mostly because netgroups can't be enumerated).
+1 as an option, it would be valuable.
Simo.
On Mon, Jun 02, 2014 at 01:28:35PM -0400, Simo Sorce wrote:
> +1 as an option, it would be valuable.
> Simo.
OK, I couldn't find the relevant upstream ticket, so I filed this one: https://fedorahosted.org/sssd/ticket/2346