Hi folks,
I recently posted a thread to pki-users, https://www.redhat.com/archives/pki-users/2018-December/msg00003.html . Working with 'cipherboy' on IRC in #dogtag-pki, we narrowed the issue down to the searches that Dogtag performs against a VLV index/search. These are being paginated to 2,000 entries. I've since, opened up: https://bugzilla.redhat.com/show_bug.cgi?id=1658280 but, haven't been able to figure out a solution. I'm hoping that someone on this list might be able to troubleshoot further with me on this.
Has anyone looked at this before? So far, I'm unable to determine where the 2,000 paging size is getting set.
The linked bugzilla issue should have a bunch of necessary details but, I'm happy to provide any additional details that might help.
Thanks, Jared
On 12/20/18 3:49 PM, Jared Ledvina via FreeIPA-users wrote:
Hi folks,
I recently posted a thread to pki-users, https://www.redhat.com/archives/pki-users/2018-December/msg00003.html . Working with 'cipherboy' on IRC in #dogtag-pki, we narrowed the issue down to the searches that Dogtag performs against a VLV index/search. These are being paginated to 2,000 entries. I've since, opened up: https://bugzilla.redhat.com/show_bug.cgi?id=1658280 but, haven't been able to figure out a solution. I'm hoping that someone on this list might be able to troubleshoot further with me on this.
Has anyone looked at this before? So far, I'm unable to determine where the 2,000 paging size is getting set.
The linked bugzilla issue should have a bunch of necessary details but, I'm happy to provide any additional details that might help.
Thanks, Jared
Hi,
the size and time limits can be defined at multiple levels:
* per user: dn: uid=<user>,cn=users,cn=accounts,$BASEDN nsSizeLimit: nsTimeLimit: nsLookThroughLimit: nsPagedLookThroughLimit: nsPagedSizeLimit: nsIdleTimeout: nsIDListScanLimit: nsPagedIDListScanLimit: The meaning for each attribute can be found in [1].
* if it's not defined at the user level, the global ds settings apply dn: cn=config nsslapd-sizelimit: nsslapd-timelimit: nsslapd-pagedsizelimit: nsslapd-idletimeout: nsslapd-anonlimitsdn: cn=anonymous-limits,cn=etc,$BASEDN (points to an entry containing limits for anonymous operations) The meaning for each attribute can be found in [1].
* settings per DS database: dn: cn=config,cn=ldbm database,cn=plugins,cn=config nsslapd-lookthroughlimit: nsslapd-idlistscanlimit: nsslapd-pagedlookthroughlimit: nsslapd-pagedidlistscanlimit: nsslapd-rangelookthroughlimit: The meaning for each attribute can be found in [2].
* settings at IPA level: dn: cn=ipaconfig,cn=etc,$BASEDN ipaSearchRecordsLimit: ipaSearchTimeLimit:
If you find your 2000 value in one of those attributes, you will be able to understand which limit applies.
HTH, flo
[1] https://access.redhat.com/documentation/en-us/red_hat_directory_server/10/ht... [2] https://access.redhat.com/documentation/en-us/red_hat_directory_server/10/ht... [3] https://access.redhat.com/documentation/en-us/red_hat_directory_server/10/ht...
Hi Florence,
Thanks for the reply! So, I've been looking at those and I currently, don't have any limit that I can find configured to 2,000 entries.
Current setup: https://paste.fedoraproject.org/paste/75jhSM1qonlQB-Uqtgug-Q
However, with those set, and after restarting ipa (to make sure any setting that requires a restart is in-place), we still see this: # ipa cert-find --hosts=testhost-abc-notreal-1.ops.example.com ---------------------- 0 certificates matched ---------------------- ---------------------------- Number of entries returned 0 ----------------------------
# /var/log/pki/pki-tomcat/ca/debug [20/Dec/2018:17:19:59][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList.getEntries() [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: getEntries: exception java.lang.ClassCastException: netscape.ldap.LDAPException cannot be cast to netscape.ldap.LDAPEntry [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList: entries: 2000 [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList.getPage(5998) [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList.getEntries() [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: getEntries: exception java.lang.ClassCastException: netscape.ldap.LDAPException cannot be cast to netscape.ldap.LDAPEntry [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList: entries: 2000 [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList.getPage(7997) [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList.getEntries() [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList: entries: 1842
Any other ideas on what is causing these entries to be paginated at 2,000 entries?
As I mentioned on https://bugzilla.redhat.com/show_bug.cgi?id=1658280 as well, sizeLimit and timeLimit in the LDAP request being sent to 389-ds as 0/unset (it's not clear if wireshark is changing that to 0).
Thanks, Jared
On 12/20/18 6:33 PM, Jared Ledvina via FreeIPA-users wrote:
Hi Florence,
Thanks for the reply! So, I've been looking at those and I currently, don't have any limit that I can find configured to 2,000 entries.
Current setup: https://paste.fedoraproject.org/paste/75jhSM1qonlQB-Uqtgug-Q
However, with those set, and after restarting ipa (to make sure any setting that requires a restart is in-place), we still see this:
# ipa cert-find --hosts=testhost-abc-notreal-1.ops.example.com
0 certificates matched
Number of entries returned 0
# /var/log/pki/pki-tomcat/ca/debug [20/Dec/2018:17:19:59][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList.getEntries() [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: getEntries: exception java.lang.ClassCastException: netscape.ldap.LDAPException cannot be cast to netscape.ldap.LDAPEntry [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList: entries: 2000 [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList.getPage(5998) [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList.getEntries() [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: getEntries: exception java.lang.ClassCastException: netscape.ldap.LDAPException cannot be cast to netscape.ldap.LDAPEntry [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList: entries: 2000 [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList.getPage(7997) [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList.getEntries() [20/Dec/2018:17:20:00][ajp-bio-127.0.0.1-8009-exec-3]: DBVirtualList: entries: 1842
Any other ideas on what is causing these entries to be paginated at 2,000 entries?
Hi,
Sorry I didn't spot it earlier, but in the bugzilla you opened [1], the tcpdump output displays a VLV Request control. The page size is set in this control (see the RFC [2] for the control definition). This control is used with the Server-Side Sort Control and specifies a number of entries to be returned and a starting index. If you feel like inspecting dogtag code, you can look for LDAPVirtualListControl objects.
HTH, flo
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1658280 [2] https://tools.ietf.org/html/draft-ietf-ldapext-ldapv3-vlv-09
As I mentioned on https://bugzilla.redhat.com/show_bug.cgi?id=1658280 as well, sizeLimit and timeLimit in the LDAP request being sent to 389-ds as 0/unset (it's not clear if wireshark is changing that to 0).
Thanks, Jared _______________________________________________ FreeIPA-users mailing list -- freeipa-users@lists.fedorahosted.org To unsubscribe send an email to freeipa-users-leave@lists.fedorahosted.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedorahosted.org/archives/list/freeipa-users@lists.fedorahoste...
Hi Florence,
Thanks again for getting back to me! I took another look at this, I was able to re-grab the LDAP access logs. The issue here is reproducible simply by running 'ipa cert-find --host=$fqdn', the host doesn't even have to exist. When that runs, from the 389-ds access logs we see this:
[21/Dec/2018:17:13:32.314209537 +0000] conn=19 op=38 SRCH base="ou=certificateRepository,ou=ca,o=ipaca" scope=1 filter="(certStatus=*)" attrs=ALL [21/Dec/2018:17:13:32.314371156 +0000] conn=19 op=38 SORT serialno [21/Dec/2018:17:13:32.314380526 +0000] conn=19 op=38 VLV 0:2147483647:A 9838:9838 (0) [21/Dec/2018:17:13:32.314521575 +0000] conn=19 op=38 RESULT err=0 tag=101 nentries=1 etime=0.0000383091 [21/Dec/2018:17:13:32.317496089 +0000] conn=19 op=39 SRCH base="ou=certificateRepository,ou=ca,o=ipaca" scope=1 filter="(certStatus=*)" attrs=ALL [21/Dec/2018:17:13:32.323799789 +0000] conn=19 op=39 SORT serialno [21/Dec/2018:17:13:32.323808123 +0000] conn=19 op=39 VLV 0:2147483647:0:9838 1:9838 (0) [21/Dec/2018:17:13:32.446840192 +0000] conn=19 op=39 RESULT err=4 tag=101 nentries=2000 etime=0.0132104757 [21/Dec/2018:17:13:32.488055654 +0000] conn=19 op=40 SRCH base="ou=certificateRepository,ou=ca,o=ipaca" scope=1 filter="(certStatus=*)" attrs=ALL [21/Dec/2018:17:13:32.493148488 +0000] conn=19 op=40 SORT serialno [21/Dec/2018:17:13:32.493163030 +0000] conn=19 op=40 VLV 0:2147483647:1999:9838 2000:9838 (0) [21/Dec/2018:17:13:32.601344856 +0000] conn=19 op=40 RESULT err=4 tag=101 nentries=2000 etime=0.0154444399 [21/Dec/2018:17:13:32.665053088 +0000] conn=19 op=41 SRCH base="ou=certificateRepository,ou=ca,o=ipaca" scope=1 filter="(certStatus=*)" attrs=ALL [21/Dec/2018:17:13:32.668917938 +0000] conn=19 op=41 SORT serialno [21/Dec/2018:17:13:32.668928663 +0000] conn=19 op=41 VLV 0:2147483647:3998:9838 3999:9838 (0) [21/Dec/2018:17:13:32.765771934 +0000] conn=19 op=41 RESULT err=4 tag=101 nentries=2000 etime=0.0164377186 [21/Dec/2018:17:13:32.850446325 +0000] conn=19 op=42 SRCH base="ou=certificateRepository,ou=ca,o=ipaca" scope=1 filter="(certStatus=*)" attrs=ALL [21/Dec/2018:17:13:32.854495450 +0000] conn=19 op=42 SORT serialno [21/Dec/2018:17:13:32.854513780 +0000] conn=19 op=42 VLV 0:2147483647:5997:9838 5998:9838 (0) [21/Dec/2018:17:13:32.933633117 +0000] conn=19 op=42 RESULT err=4 tag=101 nentries=2000 etime=0.0167819330 [21/Dec/2018:17:13:33.007060840 +0000] conn=19 op=43 SRCH base="ou=certificateRepository,ou=ca,o=ipaca" scope=1 filter="(certStatus=*)" attrs=ALL [21/Dec/2018:17:13:33.008414443 +0000] conn=19 op=43 SORT serialno [21/Dec/2018:17:13:33.008424406 +0000] conn=19 op=43 VLV 0:2147483647:7996:9838 7997:9838 (0) [21/Dec/2018:17:13:33.112973364 +0000] conn=19 op=43 RESULT err=0 tag=101 nentries=1842 etime=0.1820700649
I then went ahead and tried to create an ldapsearch that would reproduce this condition, I used the following: ldapsearch -E 'vlv=0/2147483647/0/9838' -E 'sss=serialno' -LLL -o ldif-wrap=no -D 'cn=Directory Manager' -W -h $(hostname -f) -s one -b 'ou=certificateRepository,ou=ca,o=ipaca' '(certStatus=*)'
And then from the LDAP access logs with that command we se: [21/Dec/2018:17:12:42.100058456 +0000] conn=162 fd=135 slot=135 connection from 172.16.2.104 to 172.16.2.104 [21/Dec/2018:17:12:42.100331492 +0000] conn=162 op=0 BIND dn="cn=Directory Manager" method=128 version=3 [21/Dec/2018:17:12:42.100393997 +0000] conn=162 op=0 RESULT err=0 tag=97 nentries=0 etime=0.0000288246 dn="cn=directory manager" [21/Dec/2018:17:12:42.100511746 +0000] conn=162 op=1 SRCH base="ou=certificateRepository,ou=ca,o=ipaca" scope=1 filter="(certStatus=*)" attrs=ALL [21/Dec/2018:17:12:42.106792071 +0000] conn=162 op=1 SORT serialno [21/Dec/2018:17:12:42.106800194 +0000] conn=162 op=1 VLV 0:2147483647:0:9838 1:9838 (0) [21/Dec/2018:17:12:48.563464695 +0000] conn=162 op=1 RESULT err=0 tag=101 nentries=9838 etime=6.1537001563 [21/Dec/2018:17:12:59.836640254 +0000] conn=162 op=-1 fd=135 closed - B1
To call out explicitly, my LDAP search logs: [21/Dec/2018:17:12:42.100511746 +0000] conn=162 op=1 SRCH base="ou=certificateRepository,ou=ca,o=ipaca" scope=1 filter="(certStatus=*)" attrs=ALL [21/Dec/2018:17:12:42.106792071 +0000] conn=162 op=1 SORT serialno [21/Dec/2018:17:12:42.106800194 +0000] conn=162 op=1 VLV 0:2147483647:0:9838 1:9838 (0) [21/Dec/2018:17:12:48.563464695 +0000] conn=162 op=1 RESULT err=0 tag=101 nentries=9838 etime=6.1537001563
and the dogtag one has the same lines: [21/Dec/2018:17:13:32.317496089 +0000] conn=19 op=39 SRCH base="ou=certificateRepository,ou=ca,o=ipaca" scope=1 filter="(certStatus=*)" attrs=ALL [21/Dec/2018:17:13:32.323799789 +0000] conn=19 op=39 SORT serialno [21/Dec/2018:17:13:32.323808123 +0000] conn=19 op=39 VLV 0:2147483647:0:9838 1:9838 (0) [21/Dec/2018:17:13:32.446840192 +0000] conn=19 op=39 RESULT err=4 tag=101 nentries=2000 etime=0.0132104757
However, it seems like the dogtag one, has an nentries=2000, unlike mine, which returns everything, even though the VLV line is the same (maybe 389-ds doesn't log everything for the request?)
As for the Dogtag code, I can only find one use of 'LDAPVirtualListControl' and that's in https://github.com/dogtagpki/pki/blob/DOGTAG_10_5_1_FEDORA_27/base/server/cm...
The only piece of the Dogtag code that stands out to me is 'mMaxReturns' set to 2000 here https://github.com/dogtagpki/pki/blob/DOGTAG_10_5_1_FEDORA_27/base/server/cm.... It looks like it does try to read that in from some config via, https://github.com/dogtagpki/pki/blob/DOGTAG_10_5_1_FEDORA_27/base/server/cm... but, I can't figure out where that would be or how to set it.
Anyway, that's about as far as I've gotten today. If you or anyone on this list is more familiar with the Dogtag codebase and sees something I'm missing. Please let me know, more than happy to throw anything at the wall and see what sticks as the host cleanup in production that we're running is going to go for a few weeks at this speed.
Thanks again, Jared
freeipa-users@lists.fedorahosted.org