sssd states service is not responding to pings - sssd-users - Fedora mailing-lists

18 Jan 2017


Hello all, hope all is well.
I'm seeing an odd issue on a host. Periodically sssd states that it can't ping
the domain (well, the service named the same as the domain) and then shuts
that service down and restarts it. Users can't authenticate or log in until
the service restarts, so this effectively restricts access to the host to
filtered users like root.
The Windows DCs are available the whole time, and I see nothing untoward in a
pcap taken during that time. Also, since we've been having these issues, the
host has not been used for prod duty, so it is lightly loaded during these
sssd disconnects. Really see heavy traffic to the DCs during the issue.
What does it mean for a service ping to time out in sssd speak? Is this a
service on the D-Bus?
I've posted snippets from journalctl, the sssd logs, and sssd.conf below.
Thanks in advance; any and all help would be appreciated.
The host is Ubuntu Xenial.
From journalctl, for the event today:
Jan 18 12:51:55 X sssd[41083]: Killing service [foo], not responding to pings!
Jan 18 12:52:08 X sshd[104273]: fatal: Access denied for user srv_ti by PAM account configuration [preauth]
Jan 18 12:52:52 X sshd[104298]: Connection closed by 99.99.99.99 port 60245 [preauth]
Jan 18 12:52:55 X sssd[41083]: [foo][41084] is not responding to SIGTERM. Sending SIGKILL.
Jan 18 12:52:55 X sssd[be[104300]: Starting up
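
As a quick sanity check on the journalctl timestamps above, here is a minimal Python sketch that measures the gap between the "Killing service" line and the SIGKILL line, and confirms the denied login falls inside that window. The timestamps are copied from the excerpt; the year 2017 and the variable names are only for illustration.

```python
from datetime import datetime

# Timestamps copied from the journalctl excerpt (year assumed to be 2017).
kill_requested = datetime(2017, 1, 18, 12, 51, 55)  # "Killing service [foo]"
denied_login = datetime(2017, 1, 18, 12, 52, 8)     # "Access denied for user srv_ti"
sigkill_sent = datetime(2017, 1, 18, 12, 52, 55)    # "Sending SIGKILL."


def seconds_between(start, end):
    """Whole seconds elapsed from start to end."""
    return int((end - start).total_seconds())


grace = seconds_between(kill_requested, sigkill_sent)
print(grace)  # 60
print(kill_requested < denied_login < sigkill_sent)  # True
```

So the backend sat unresponsive for a full 60 seconds between the two signals, and the failed login happened 13 seconds into that window.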
sssd log from today, with debug_level = 9 set in sssd.conf:
...
...
...
...
...
...
(Wed Jan 18 12:51:05 2017) [sssd] [ping_check] (0x2000): Service foo replied to ping
(Wed Jan 18 12:51:05 2017) [sssd] [sbus_remove_timeout] (0x2000): 0xe15880
(Wed Jan 18 12:51:05 2017) [sssd] [sbus_dispatch] (0x4000): dbus conn: 0xe109f0
(Wed Jan 18 12:51:05 2017) [sssd] [sbus_dispatch] (0x4000): Dispatching.
(Wed Jan 18 12:51:05 2017) [sssd] [ping_check] (0x2000): Service nss replied to ping
(Wed Jan 18 12:51:05 2017) [sssd] [sbus_remove_timeout] (0x2000): 0xe14540
(Wed Jan 18 12:51:05 2017) [sssd] [sbus_dispatch] (0x4000): dbus conn: 0xe11ac0
(Wed Jan 18 12:51:05 2017) [sssd] [sbus_dispatch] (0x4000): Dispatching.
(Wed Jan 18 12:51:05 2017) [sssd] [ping_check] (0x2000): Service pam replied to ping
(Wed Jan 18 12:51:15 2017) [sssd] [service_send_ping] (0x2000): Pinging foo
(Wed Jan 18 12:51:15 2017) [sssd] [sbus_add_timeout] (0x2000): 0xe14540
(Wed Jan 18 12:51:15 2017) [sssd] [service_send_ping] (0x2000): Pinging nss
(Wed Jan 18 12:51:15 2017) [sssd] [sbus_add_timeout] (0x2000): 0xe15880
(Wed Jan 18 12:51:15 2017) [sssd] [service_send_ping] (0x2000): Pinging pam
(Wed Jan 18 12:51:15 2017) [sssd] [sbus_add_timeout] (0x2000): 0xe0d600
(Wed Jan 18 12:51:15 2017) [sssd] [sbus_remove_timeout] (0x2000): 0xe15880
(Wed Jan 18 12:51:15 2017) [sssd] [sbus_dispatch] (0x4000): dbus conn: 0xe109f0
(Wed Jan 18 12:51:15 2017) [sssd] [sbus_dispatch] (0x4000): Dispatching.
(Wed Jan 18 12:51:15 2017) [sssd] [ping_check] (0x2000): Service nss replied to ping
(Wed Jan 18 12:51:15 2017) [sssd] [sbus_remove_timeout] (0x2000): 0xe0d600
(Wed Jan 18 12:51:15 2017) [sssd] [sbus_dispatch] (0x4000): dbus conn: 0xe11ac0
(Wed Jan 18 12:51:15 2017) [sssd] [sbus_dispatch] (0x4000): Dispatching.
(Wed Jan 18 12:51:15 2017) [sssd] [ping_check] (0x2000): Service pam replied to ping
(Wed Jan 18 12:51:25 2017) [sssd] [service_send_ping] (0x2000): Pinging foo
(Wed Jan 18 12:51:25 2017) [sssd] [sbus_add_timeout] (0x2000): 0xe0d600
(Wed Jan 18 12:51:25 2017) [sssd] [service_send_ping] (0x2000): Pinging nss
(Wed Jan 18 12:51:25 2017) [sssd] [sbus_add_timeout] (0x2000): 0xe15880
(Wed Jan 18 12:51:25 2017) [sssd] [service_send_ping] (0x2000): Pinging pam
(Wed Jan 18 12:51:25 2017) [sssd] [sbus_add_timeout] (0x2000): 0xe09430
(Wed Jan 18 12:51:25 2017) [sssd] [sbus_remove_timeout] (0x2000): 0xe14540
(Wed Jan 18 12:51:25 2017) [sssd] [sbus_dispatch] (0x4000): dbus conn: 0xe0c370
(Wed Jan 18 12:51:25 2017) [sssd] [sbus_dispatch] (0x4000): Dispatching.
(Wed Jan 18 12:51:25 2017) [sssd] [ping_check] (0x0020): A service PING timed out on [foo]. Attempt [0]
(Wed Jan 18 12:51:25 2017) [sssd] [sbus_remove_timeout] (0x2000): 0xe15880
(Wed Jan 18 12:51:25 2017) [sssd] [sbus_dispatch] (0x4000): dbus conn: 0xe109f0
(Wed Jan 18 12:51:25 2017) [sssd] [sbus_dispatch] (0x4000): Dispatching.
(Wed Jan 18 12:51:25 2017) [sssd] [ping_check] (0x2000): Service nss replied to ping
(Wed Jan 18 12:51:25 2017) [sssd] [sbus_remove_timeout] (0x2000): 0xe09430
(Wed Jan 18 12:51:25 2017) [sssd] [sbus_dispatch] (0x4000): dbus conn: 0xe11ac0
(Wed Jan 18 12:51:25 2017) [sssd] [sbus_dispatch] (0x4000): Dispatching.
(Wed Jan 18 12:51:25 2017) [sssd] [ping_check] (0x2000): Service pam replied to ping
(Wed Jan 18 12:51:35 2017) [sssd] [service_send_ping] (0x2000): Pinging foo
(Wed Jan 18 12:51:35 2017) [sssd] [sbus_add_timeout] (0x2000): 0xe09430
(Wed Jan 18 12:51:35 2017) [sssd] [service_send_ping] (0x2000): Pinging nss
(Wed Jan 18 12:51:35 2017) [sssd] [sbus_add_timeout] (0x2000): 0xe15880
(Wed Jan 18 12:51:35 2017) [sssd] [service_send_ping] (0x2000): Pinging pam
(Wed Jan 18 12:51:35 2017) [sssd] [sbus_add_timeout] (0x2000): 0xe14540
(Wed Jan 18 12:51:35 2017) [sssd] [sbus_remove_timeout] (0x2000): 0xe15880
(Wed Jan 18 12:51:35 2017) [sssd] [sbus_dispatch] (0x4000): dbus conn: 0xe109f0
(Wed Jan 18 12:51:35 2017) [sssd] [sbus_dispatch] (0x4000): Dispatching.
(Wed Jan 18 12:51:35 2017) [sssd] [ping_check] (0x2000): Service nss replied to ping
(Wed Jan 18 12:51:35 2017) [sssd] [sbus_remove_timeout] (0x2000): 0xe14540
(Wed Jan 18 12:51:35 2017) [sssd] [sbus_dispatch] (0x4000): dbus conn: 0xe11ac0
(Wed Jan 18 12:51:35 2017) [sssd] [sbus_dispatch] (0x4000): Dispatching.
(Wed Jan 18 12:51:35 2017) [sssd] [ping_check] (0x2000): Service pam replied to ping
(Wed Jan 18 12:51:35 2017) [sssd] [sbus_remove_timeout] (0x2000): 0xe0d600
(Wed Jan 18 12:51:35 2017) [sssd] [sbus_dispatch] (0x4000): dbus conn: 0xe0c370
(Wed Jan 18 12:51:35 2017) [sssd] [sbus_dispatch] (0x4000): Dispatching.
(Wed Jan 18 12:51:35 2017) [sssd] [ping_check] (0x0020): A service PING timed out on [foo]. Attempt [1]
(Wed Jan 18 12:51:45 2017) [sssd] [service_send_ping] (0x2000): Pinging foo
(Wed Jan 18 12:51:45 2017) [sssd] [sbus_add_timeout] (0x2000): 0xe0d600
(Wed Jan 18 12:51:45 2017) [sssd] [service_send_ping] (0x2000): Pinging nss
(Wed Jan 18 12:51:45 2017) [sssd] [sbus_add_timeout] (0x2000): 0xe14540
(Wed Jan 18 12:51:45 2017) [sssd] [service_send_ping] (0x2000): Pinging pam
(Wed Jan 18 12:51:45 2017) [sssd] [sbus_add_timeout] (0x2000): 0xe15880
(Wed Jan 18 12:51:45 2017) [sssd] [sbus_remove_timeout] (0x2000): 0xe09430
(Wed Jan 18 12:51:45 2017) [sssd] [sbus_dispatch] (0x4000): dbus conn: 0xe0c370
(Wed Jan 18 12:51:45 2017) [sssd] [sbus_dispatch] (0x4000): Dispatching.
(Wed Jan 18 12:51:45 2017) [sssd] [ping_check] (0x0020): A service PING timed out on [foo]. Attempt [2]
(Wed Jan 18 12:51:45 2017) [sssd] [sbus_remove_timeout] (0x2000): 0xe14540
(Wed Jan 18 12:51:45 2017) [sssd] [sbus_dispatch] (0x4000): dbus conn: 0xe109f0
(Wed Jan 18 12:51:45 2017) [sssd] [sbus_dispatch] (0x4000): Dispatching.
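
At debug level 9 the timeouts are buried in sbus noise, so a small filter helps. Below is a rough Python sketch that pulls only the ping_check timeout entries out of a log. The sample text is copied from the excerpt above; the function and pattern names are made up for illustration, and the regex tolerates a message being wrapped across two lines by a mail client.

```python
import re

# A few lines copied from the excerpt, still wrapped the way they were pasted.
SAMPLE = """\
(Wed Jan 18 12:51:25 2017) [sssd] [ping_check] (0x0020): A service PING
timed out on [foo]. Attempt [0]
(Wed Jan 18 12:51:25 2017) [sssd] [ping_check] (0x2000): Service nss
replied to ping
(Wed Jan 18 12:51:35 2017) [sssd] [ping_check] (0x0020): A service PING
timed out on [foo]. Attempt [1]
(Wed Jan 18 12:51:45 2017) [sssd] [ping_check] (0x0020): A service PING
timed out on [foo]. Attempt [2]
"""

# \s+ also matches a newline, so a wrapped message is still recognised.
TIMEOUT_RE = re.compile(
    r"\((?P<when>[^)]+)\)\s+\[sssd\]\s+\[ping_check\]\s+\(0x[0-9a-f]+\):\s+"
    r"A\s+service\s+PING\s+timed\s+out\s+on\s+\[(?P<service>[^\]]+)\]\.\s+"
    r"Attempt\s+\[(?P<attempt>\d+)\]"
)


def ping_timeouts(log_text):
    """Return (timestamp, service, attempt) for every ping timeout found."""
    return [
        (m.group("when"), m.group("service"), int(m.group("attempt")))
        for m in TIMEOUT_RE.finditer(log_text)
    ]


for when, service, attempt in ping_timeouts(SAMPLE):
    print(when, service, attempt)
```

On the sample this lists three timeouts, all for foo, while the nss and pam replies are ignored.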
This also happened this past Monday evening:
...
...
...
(Mon Jan 16 19:22:30 2017) [sssd] [ping_check] (0x0020): A service PING timed out on [foo]. Attempt [0]
(Mon Jan 16 19:22:40 2017) [sssd] [ping_check] (0x0020): A service PING timed out on [foo]. Attempt [1]
(Mon Jan 16 19:22:50 2017) [sssd] [ping_check] (0x0020): A service PING timed out on [foo]. Attempt [2]
(Mon Jan 16 19:23:00 2017) [sssd] [tasks_check_handler] (0x0020): Killing service [foo], not responding to pings!
(Mon Jan 16 19:23:00 2017) [sssd] [ping_check] (0x0020): A service PING timed out on [foo]. Attempt [3]
(Mon Jan 16 19:23:10 2017) [sssd] [ping_check] (0x0020): A service PING timed out on [foo]. Attempt [4]
(Mon Jan 16 19:24:00 2017) [sssd] [mt_svc_sigkill] (0x0010): [foo][2084] is not responding to SIGTERM. Sending SIGKILL.
(Mon Jan 16 19:24:00 2017) [sssd] [mt_svc_exit_handler] (0x0040): Child [foo] terminated with signal [9]
(Mon Jan 16 19:24:00 2017) [sssd] [mt_svc_restart] (0x0400): Scheduling service foo for restart 1
(Mon Jan 16 19:24:00 2017) [sssd] [get_ping_config] (0x0100): Time between service pings for [foo]: [10]
(Mon Jan 16 19:24:00 2017) [sssd] [get_ping_config] (0x0100): Time between SIGTERM and SIGKILL for [foo]: [60]
(Mon Jan 16 19:24:00 2017) [sssd] [start_service] (0x0100): Queueing service foo for startup
(Mon Jan 16 19:24:00 2017) [sssd] [sbus_server_init_new_connection] (0x0200): Entering.
(Mon Jan 16 19:24:00 2017) [sssd] [sbus_server_init_new_connection] (0x0200): Adding connection 0x1588b70.
(Mon Jan 16 19:24:00 2017) [sssd] [sbus_init_connection] (0x0400): Adding connection 0x1588b70
(Mon Jan 16 19:24:00 2017) [sssd] [sbus_server_init_new_connection] (0x0200): Got a connection
(Mon Jan 16 19:24:00 2017) [sssd] [monitor_service_init] (0x0400): Initializing D-BUS Service
(Mon Jan 16 19:24:00 2017) [sssd] [sbus_opath_hash_add_iface] (0x0400): Registering interface org.freedesktop.sssd.monitor with path /org/freedesktop/sssd/monitor
(Mon Jan 16 19:24:00 2017) [sssd] [sbus_conn_register_path] (0x0400): Registering object path /org/freedesktop/sssd/monitor with D-Bus connection
(Mon Jan 16 19:24:00 2017) [sssd] [sbus_opath_hash_add_iface] (0x0400): Registering interface org.freedesktop.DBus.Properties with path /org/freedesktop/sssd/monitor
(Mon Jan 16 19:24:00 2017) [sssd] [sbus_opath_hash_add_iface] (0x0400): Registering interface org.freedesktop.DBus.Introspectable with path /org/freedesktop/sssd/monitor
(Mon Jan 16 19:24:00 2017) [sssd] [client_registration] (0x0100): Received ID registration: (%BE_foo,1)
(Mon Jan 16 19:24:00 2017) [sssd] [mark_service_as_started] (0x0200): Marking foo as started.
(Mon Jan 16 19:24:00 2017) [sssd] [mark_service_as_started] (0x0080): Invalid parent pid: 1963
...
...
...
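
The Monday timestamps line up with the two values get_ping_config prints. The Python sketch below projects the kill and SIGKILL times from the first timeout. It is only arithmetic on the logged numbers: the 10 and 60 come from the log, while the assumption that the monitor gives up at Attempt [3] is inferred from where the "Killing service" line appears, not taken from the sssd source.

```python
from datetime import datetime, timedelta

# Values reported by get_ping_config in the log above.
PING_INTERVAL = 10  # "Time between service pings for [foo]: [10]"
FORCE_TIMEOUT = 60  # "Time between SIGTERM and SIGKILL for [foo]: [60]"
# Assumption: the monitor kills the service when it reaches Attempt [3],
# which is where "Killing service [foo]" shows up in the log.
KILL_ATTEMPT = 3


def projected_timeline(first_timeout):
    """Return (kill time, SIGKILL time) projected from the Attempt [0] time."""
    kill_at = first_timeout + timedelta(seconds=PING_INTERVAL * KILL_ATTEMPT)
    sigkill_at = kill_at + timedelta(seconds=FORCE_TIMEOUT)
    return kill_at, sigkill_at


first = datetime(2017, 1, 16, 19, 22, 30)  # Attempt [0] on Monday
kill_at, sigkill_at = projected_timeline(first)
print(kill_at.time(), sigkill_at.time())  # 19:23:00 19:24:00
print((sigkill_at - first).total_seconds())  # 90.0
```

The same arithmetic fits the Jan 18 event: first timeout at 12:51:25, kill at 12:51:55, SIGKILL at 12:52:55. Under these settings the backend is unavailable for at least 90 seconds from the first missed ping.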
sssd.conf
[sssd]
config_file_version = 2
debug_level = 9
reconnection_retries = 3
sbus_timeout = 30
services = nss, pam
domains = foo
[nss]
filter_groups = root,
filter_users = root,
reconnection_retries = 3
[pam]
reconnection_retries = 3
[domain/foo]
enumerate = False
id_provider = ad
chpass_provider = ad
auth_provider = ad
min_id = 1000
ad_hostname = X.us.foo.com
ad_domain = us.foo.com
dyndns_update = false
ldap_id_mapping = false
ldap_user_home_directory = unixHomeDirectory
ldap_user_object_class = user
ldap_group_object_class = top
ldap_group_nesting_level = 5
ldap_group_name = sAMAccountName
ldap_group_search_base = ou=accounts,dc=us,dc=foo,dc=com?subtree?&(objectClass=top)(!(objectClass=computer))(gidnumber=*)(|(groupType<=0)(&(objectClass=user)(objectCategory=person)(uidNumber=*)))
access_provider = simple
simple_allow_users = appadmin,srv_ti,
simple_allow_groups = SG-MCServices,SG-MTO-SE-Dev,
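
Note that the config above sets no ping-related options, so the 10 and 60 second values in the log are presumably defaults. The Python sketch below reads a trimmed copy of the config and reports those two settings for the domain. It assumes the option names `timeout` and `force_timeout` as described in the sssd.conf man page, and that they are honoured in the [domain/foo] section; both are assumptions to verify against the installed version. The fallback values are the ones seen in the log.

```python
import configparser

# Trimmed copy of the sssd.conf above; only the sections are needed here.
SAMPLE_CONF = """\
[sssd]
config_file_version = 2
services = nss, pam
domains = foo

[domain/foo]
id_provider = ad
access_provider = simple
"""

# Fallbacks taken from the get_ping_config lines in the log.
# Option names are assumed from the sssd.conf man page.
DEFAULTS = {"timeout": 10, "force_timeout": 60}


def ping_settings(conf_text, domain):
    """Return the ping-related settings for a domain, using log-observed fallbacks."""
    # interpolation=None so a literal % in a value is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    section = "domain/" + domain
    return {
        name: parser.getint(section, name, fallback=default)
        for name, default in DEFAULTS.items()
    }


print(ping_settings(SAMPLE_CONF, "foo"))  # {'timeout': 10, 'force_timeout': 60}
```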