On Mar 3, 2014, at 11:01 AM, Rich Megginson <rmeggins@redhat.com> wrote:

On 03/03/2014 11:38 AM, Russell Beall wrote:

On Mar 3, 2014, at 6:39 AM, Rich Megginson <rmeggins@redhat.com> wrote:

On 02/28/2014 05:26 PM, Russell Beall wrote:
This has led me to finally discover the true bottleneck in the indexing of one particular attribute.  The attribute is a custom attribute similar to memberOf.  When converting from SJES, I took the syntax, which was DN syntax, and changed the matching rule in the schema to distinguishedNameMatch, so the index of this attribute was built on that.  Apparently, even though the values are (almost all) DN values, this particular index type must be incompatible.  When I switched from distinguishedNameMatch to caseExactMatch, performance improved drastically (I'm now switching to caseIgnoreMatch to see how that fares).

Are these values DNs or not?  If they are DNs, they should use distinguishedNameMatch, and it looks like the real problem is DN normalization/comparison/indexing speed.

Most of the values are DNs corresponding to group entries.  We added an enhancement wherein some of the values are tagged with a "scope", meaning that the group membership applies specifically to a particular role.  This way, all roles can be filtered and passed to applications using the base membership DN value.  This enhancement was probably breaking the distinguishedNameMatch logic.  I'd be curious to see the difference if all values really were DNs, but I'm not sure at this point if I will have time to run that test…

And yes, caseExactMatch was something I tried because the ACIs use exact case and I wanted to see the performance at its best.  I switched to caseIgnoreMatch and the performance was about the same.
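For reference, the schema change amounts to something like the following LDIF sketch.  The attribute name and OID here are placeholders, not our actual custom attribute:

```ldif
# Hypothetical sketch: 'exampleMemberOf' and the OID are placeholders.
# With caseIgnoreMatch as the EQUALITY rule, the syntax is Directory
# String (...1.15) rather than DN syntax (...1.12), which also
# accommodates scope-tagged values that aren't strict DNs.
dn: cn=schema
changetype: modify
add: attributeTypes
attributeTypes: ( 1.3.6.1.4.1.99999.1.1 NAME 'exampleMemberOf'
  DESC 'memberOf-like attribute; values are mostly group DNs, some scope-tagged'
  EQUALITY caseIgnoreMatch
  SYNTAX 1.3.6.1.4.1.1466.115.121.1.15 )
```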

Even with the performance enhancement, the check of this multi-valued attribute with lots of values is still the key bottleneck in ACI processing.  I get a 10-fold improvement in processing time when that check is stripped out and only used for the one service account I'm testing with.  I'm not sure if there is any way to improve upon that, but I haven't finished playing with things like nsslapd-idlistscanlimit.  Hopefully I can bump up the performance some more because these machines really should way outperform the old Sun machines rather than appearing to perform very similarly.
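For anyone following along, nsslapd-idlistscanlimit is tunable under cn=config.  A sketch, with an illustrative value (the right number depends on the deployment):

```ldif
# Sketch only: 20000 is an illustrative value (the default is 4000).
# Raising it lets searches keep using an index even when a key matches
# many entries, instead of falling back to an unindexed scan.
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-idlistscanlimit
nsslapd-idlistscanlimit: 20000
```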

Yeah, Sun optimized their DS for very large group entries, and we didn't.

We have a proposed fix for 1.2.11 which should drastically speed up performance of large groups.

https://fedorahosted.org/389/ticket/346

Looks like a lot of work went into fixing this.  I have been tracking that ticket on and off since it was created, but I haven't retested lately.  On the new servers I'm currently using:
        389-Directory/1.2.11.15 B2013.312.1642

I downloaded this one some time back to test the memory usage patch in my dev VM:
        389-Directory/1.2.11.23 B2013.275.1555

Would I need to upgrade to a different version to test or see performance enhancements on group sizes (hopefully applying to any generic attribute with lots of values)?

Thanks,
Russ.




Regards,
Russ.

Note that if these values are really DNs, and you cannot control the format in which clients send them, you may run into problems using caseExactMatch.


Now we are on par with the old Sun servers and no longer clogging up when many threads are running simultaneously.

Thank you so much, Rich and Ludwig, for helping me dig through this!

Thanks,
Russ.

On Feb 28, 2014, at 12:50 AM, Ludwig Krispenz <lkrispen@redhat.com>
 wrote:

It seems this means that the ACI processing is not correctly using the indexes, or I didn't create them properly.
ACI processing is done per entry, so you have already looked up all the entries (which could use some indexes) and now perform evaluation on a single entry.  If you have ACIs with groupdns and you have very large groups, the check whether the bound user is a member of the group can become a severe performance problem.

It could mean
1) the aci performance is not related to indexing
2) we don't know what indexes it is using

I have run db2index.pl and it successfully reprocessed the entire set of indexes.  I also ran dbverify to successful completion.  I can tell that the indexes are being used in simple searches (based on the fact that I get the "Administrative Limit Exceeded" error when there is no index).
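For the record, an index definition entry looks roughly like this; the attribute and backend names (exampleMemberOf, userRoot) are placeholders for the actual ones:

```ldif
# Hypothetical sketch: cn=exampleMemberOf and the userRoot backend name
# are placeholders.  After adding the entry, db2index.pl must be run to
# actually build the index for existing entries.
dn: cn=exampleMemberOf,cn=index,cn=userRoot,cn=ldbm database,cn=plugins,cn=config
changetype: add
objectClass: top
objectClass: nsIndex
cn: exampleMemberOf
nsSystemIndex: false
nsIndexType: eq
nsIndexType: pres
```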

Does this information point to anything that I should look into further?

If the aci processing is doing internal searches that don't show up in logconv.pl, then turning on access logging for internal searches should show those unindexed internal searches, which should show up using logconv.pl


Thanks,
Russ.


On Feb 27, 2014, at 1:19 PM, Rich Megginson <rmeggins@redhat.com> wrote:

On 02/27/2014 12:49 PM, Russell Beall wrote:
Hi Rich,

Thanks for the data.  I've been continuing to experiment and work on this and especially making sure that everything that might be used in the ACIs is indexed.  All the indexes appear to be in order, but I am confused by one thing…  It looks like there is no entryDN index and only an entryrdn index.

Correct.

This new format will be fully workable for complicated dn lookups in the ACIs, correct?  (We have a lot of "groupdn=" and "userdn=" restrictions).

Correct.  groupdn= and userdn= do not use the entrydn index.


There is no single ACI which degrades performance, but I did notice that when adding back certain ACIs, performance degrades faster than should be expected for the cost of processing only one additional ACI.  I believe there is likely a problem with the indexes as you suggested, but it is hiding well...

You could enable access logging of internal operations.
https://access.redhat.com/site/documentation/en-US/Red_Hat_Directory_Server/9.0/html/Configuration_Command_and_File_Reference/Core_Server_Configuration_Reference.html#cnconfig-nsslapd_accesslog_level
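A sketch of turning that on (4 is the internal-operations level, added to the default access level of 256):

```ldif
# Log internal operations in the access log: 256 (default access
# logging) + 4 (internal access operations) = 260.
dn: cn=config
changetype: modify
replace: nsslapd-accesslog-level
nsslapd-accesslog-level: 260
```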

An additional symptom which may point to a server configuration problem is a strange inability to import or reindex quickly.  My small dev VM can import several hundred entries per second, but this server will only import or reindex at a rate of 18-30 records per second.  I've ensured that there is plenty of import memory as well as cachememsize, which should enable a very speedy import, but even though all 32 cores are burning bright, the import speed seems incredibly slow.  (This is of course after all indexes are created and it is indexing while importing.  Import speed with no indexes is fairly fast.)
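The import and entry-cache settings I'm referring to are along these lines; the sizes shown are illustrative, not our actual values, and userRoot stands in for the real backend name:

```ldif
# Sketch with placeholder sizes.  nsslapd-import-cachesize is used only
# during import; nsslapd-cachememsize is the per-backend entry cache.
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-import-cachesize
nsslapd-import-cachesize: 2147483648

dn: cn=userRoot,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-cachememsize
nsslapd-cachememsize: 4294967296
```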

Any obvious clues I'm missing?

No, not sure what's going on.


Thanks,
Russ.

On Feb 19, 2014, at 4:08 PM, Rich Megginson <rmeggins@redhat.com> wrote:

On 02/19/2014 04:56 PM, Russell Beall wrote:
Hi all,

We've just set up monster-sized server nodes to run 389 as a replacement to Sun DS.  I've been running my tests and I am pleased to report that the memory issue seems to be in check with growth only up to double the initial memory usage after large quantities of ldapmodify calls.  We have plenty of room in these boxes to accommodate caching the entire database.

The key blocker on this is still the ACL processing times for which I have been unable to find a decent resolution.  We have 135 ACIs at the root of the suffix.  When I comment out most of them but leave one service account active, processing times are very nicely fast.  When I leave them all on, that same service account takes 2.5 seconds to respond when only one request is pending.  A new kink in the puzzle here which is probably going to be a deal breaker is that if I run the same request on multiple threads, each thread takes proportionately longer to respond depending on the number of threads.  If I have 12 threads going doing a simple lookup, each thread responds in 45-55 seconds.  If I have 24 threads going, each thread takes 1m45s - 1m55s to respond.  The box has 32 cores available.  While processing, each thread is burning 100% of an available CPU thread for the entire time.  Theoretically when up to 32 requests are simultaneously processing, each thread should return in 2.5 seconds just as if it were one thread.

Note that the directory server performance does not scale linearly with the number of cores.  At some point you will run into thread contention.  Also things like replication compete for thread resources.


Since all threads are burning 100% the entire time, it doesn't seem like that would be caused by simple thread locking where some threads are waiting for others.

No, see below.


I'm thinking the system is not properly configured in some way and there is a system bottleneck blocking the processing.  While burning CPU, very little of the usage is attributed to user time; most of it shows up as system time.  Is this normal, or is it indicative of some system layer bottlenecking the processing?

Sounds like the ACI may be doing some sort of unindexed internal search.

Have you narrowed it down to a particular ACI that is causing the problem?

Another question I posed earlier is whether or not it is possible to replicate three subtrees independently and then keep the aci entry at the root suffix independent so it can be set separately for multiple downstream replicants.  That way we could possibly subdivide the service accounts across different nodes.  Is that possible?

No.


Thanks,
Russ.

==============================
Russell Beall
Systems Programmer IV
Enterprise Identity Management
University of Southern California
==============================





--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users
