We have just released version 10.1.0.0 of the Ping Identity Directory Server. See the release notes for a complete overview of changes, but here’s my summary:
Summary of New Features and Enhancements
- Added the ability to include presence components in composite index filter patterns [more information]
- Added the ability to include approximate-match components in composite index filter patterns [more information]
- Added the ability to include static equality components in composite index filter patterns [more information]
- Added the ability to stream search results directly from a composite index [more information]
- Added support for caching the candidate set for searches using the simple paged results control [more information]
- Improved Directory Proxy Server’s handling of requests with the simple paged results control [more information]
- Updated the access control handler to provide enhanced support for controlling which attributes may be included in add requests [more information]
- Added support for a verify password extended operation [more information]
- Added support for collation matching rules for improved extensible matching support for non-English values [more information]
- Added a new compare-ldap-schemas tool [more information]
- Reduced the performance impact of exploded index cleanup [more information]
- Improved warnings about high index entry limits for attribute indexes [more information]
- Improved overall write performance and reduced the number of outliers for write operations with higher response times
- Improved performance when applying changes via replication
- Improved performance when retrieving the database environment monitor entry
- Improved the efficiency of replicating server schema information between servers
- Reduced the default size of messages used in the course of monitoring replication
- Reduced the amount of memory that the server needs to cache information about dynamic groups
- Enabled the expensive operations logger by default so that information about any operations taking longer than 1 second to complete will be written to logs/expensive-ops
- Added the ability to include extended information about the associated connection in access log messages about requested operations
- Added the ability to exclude certain kinds of messages from the server error log, based on message category, severity, message ID, and message content
- Added the ability to define Prometheus metrics for Boolean monitor attributes by using a value of 1 for true and 0 for false
- Improved the logic used to determine whether a given replica should be considered obsolete
- Added an --ignoreDuplicateAttributeValues argument to the import-ldif command, which will allow it to successfully import entries that have duplicate values for the same attribute (with only one copy of each attribute value)
- Updated the interactive setup process so that the default response to the question about priming the contents of the backend into the cache during server startup is now to disable priming rather than to enable it
- Updated the server so that it will now only retain the last 100 copies of former configurations by default
- Added a new repair-topology-listener-certificates tool that can be used to recover from issues related to improperly updating certificates that the server uses for TLS communication
- Improved the efficiency of the Directory Proxy Server’s replication backlog health check
- Updated the export-reversible-passwords tool to make it possible to include only entries below a specified set of base DNs, or to exclude entries from a specified set of base DNs
- Added a subtree-modify-dn-size-limit property to the backend configuration that can be used to limit the size of subtree move and rename operations, and these operations are now limited by default to subtrees with no more than 100 entries
- Added the ability to specify the key wrapping transformation that the PKCS #11 cipher stream provider should use to protect the contents of the encryption settings database
- Updated the Synchronization Server to support synchronizing USER.LOCKED and USER.UNLOCKED events from the PingOne service
- Added the ability to obscure sensitive producer property values when using the Kafka sync destination
Summary of Bug Fixes
- Fixed an issue that could cause inconsistency in entryUUID values across replicas in servers configured with a custom password validator created with the Server SDK
- Fixed an issue that could allow insufficiently authorized clients to use the get password policy state issues request control through the Directory Proxy Server
- Fixed an issue in which manage-profile replace-profile could apply configuration changes in an incorrect order
- Fixed an issue that could cause dsreplication status to fail after disabling replication
- Fixed an issue that could cause dsreplication enable to report an error when run in interactive mode
- Fixed an issue that could cause the server to store multiple duplicate copies of the values of some attributes in which the associated attribute type has one or more subordinate types
- Fixed an issue that could prevent the server from adding real attribute values to a replicated entry that already had virtual values for the same attribute
- Fixed an issue that could prevent the server from adding or modifying entries that matched the criteria for an untrusted composite index if debug logging was enabled
- Fixed an issue that prevented the server from properly using a virtual list view index to process an applicable search using an extensible matching filter
- Fixed an issue in which the server could have incorrectly reported that the underlying JVM did not provide support for strong encryption (e.g., 256-bit AES)
- Fixed an issue that could result in increased memory pressure, and potential out-of-memory errors, when running in FIPS-compliant mode as a result of a quirk in the Bouncy Castle implementation for the AES cipher
- Fixed an issue that could cause the server to add duplicate entries to the configuration when setting up the server in FIPS 140-2-compliant mode
- Fixed a rare issue in which the server could report an error on startup when one or more replicas were not online
- Fixed an issue in which the Synchronization Server would not properly encode certain UTF-8 characters when constructing a URI for interacting with a source or destination server
- Fixed an issue in which the Synchronization Server could incorrectly omit certain attributes when synchronizing from the PingOne service in modified-attributes-only mode
- Fixed an issue in which the Synchronization Server could incorrectly omit certain escape characters in search filters sent to the PingOne service
- Fixed an issue in which the Active Directory Password Synchronization Agent did not properly handle the case in which multiple users in a forest had the same sAMAccountName
- Cleaned up an error message that may be used when attempting to generate a Delegated Admin report with an invalid SCIM filter
Composite Index Improvements
We have made a number of improvements in our support for composite indexes. These improvements basically fall into two categories: new types of components that you can use in filter patterns, and the ability to stream results as the server is reading an index.
New Composite Index Filter Component Types
We have added support for new types of components in composite index filter patterns to make it possible to replace more types of attribute indexes with composite indexes. This is especially useful for cases in which you have attribute indexes that match a large number of entries and have a high index entry limit, which causes them to be maintained as exploded indexes. Composite indexes are better than exploded indexes in pretty much every way, with better read performance, basically equivalent write performance, and more compact storage.
The new component types that we support in composite index filter patterns are listed below, with a brief filter-construction sketch after the list:
- Presence components, like “(attributeName=*)”. These components will match any entry that has the specified attribute, regardless of what value(s) it has.
- Approximate-match components, like “(attributeName~=?)”. These components can be used to match entries that have a value that is approximately equal to a given value. In most cases, the server treats “approximately equal to” as meaning “sounds like”.
- Equality components with a static value rather than a wildcard, for example, “(attributeName=specificValue)”. This can be useful in cases where you want to limit an index to only matching entries with a specific value for a given attribute and you don’t care about indexing other values for that attribute.
These are in addition to the other filter pattern component types that we already support, including:
- Equality components with a wildcard, like “(attributeName=?)”. These can be used to match entries that have a given value for a particular attribute. In addition, if the filter pattern is only an equality wildcard component, or if it’s the last component of an AND filter pattern, then you can also use them for ordering matching in greater-or-equal and less-or-equal filters, or in substring filters with at least a subInitial (starts with) component.
- Substring components with a wildcard, like “(attributeName=*?*)”. You can only have one of these components in a filter pattern, and if it’s an AND filter pattern, then it must be the last component in that pattern. These can be used to perform substring matching against attribute values, although they are primarily intended for use with substring filters that don’t include a subInitial component, since equality components are better for those.
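To make these concrete, here’s a minimal sketch using the UnboundID LDAP SDK that constructs search filters of each of these forms. The attribute names and values are hypothetical, and this only illustrates the client-side filters; the corresponding filter patterns would be configured on the composite index itself:

```java
import com.unboundid.ldap.sdk.Filter;

public final class CompositeIndexFilterExamples
{
  public static void main(final String... args)
  {
    // A presence filter, corresponding to a "(accountStatus=*)" pattern.
    final Filter presenceFilter = Filter.createPresenceFilter("accountStatus");

    // An approximate-match filter, corresponding to a "(givenName~=?)"
    // pattern, where the "?" is replaced by a real value at search time.
    final Filter approximateFilter =
         Filter.createApproximateMatchFilter("givenName", "Jon");

    // An equality filter with a static value, corresponding to a
    // "(objectClass=person)" pattern.
    final Filter staticEqualityFilter =
         Filter.createEqualityFilter("objectClass", "person");

    // An AND filter that combines a static equality component with a
    // wildcard equality component, corresponding to a pattern like
    // "(&(objectClass=person)(employeeNumber=?))".
    final Filter andFilter = Filter.createANDFilter(
         staticEqualityFilter,
         Filter.createEqualityFilter("employeeNumber", "12345"));

    System.out.println(presenceFilter);    // (accountStatus=*)
    System.out.println(approximateFilter); // (givenName~=Jon)
    System.out.println(andFilter); // (&(objectClass=person)(employeeNumber=12345))
  }
}
```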
Streaming Results Directly From a Composite Index
Normally, when the server is processing an indexed search request, it will first use the applicable indexes to come up with a candidate set containing the IDs of all of the entries that have the potential to match the search criteria, and then it will iterate through that candidate set, retrieving the entries, performing any additional processing needed to make sure they actually match the search criteria, and then returning them to the client. This works really well in the vast majority of cases, but it’s not necessarily the ideal approach to use in cases where the search criteria matches a huge number of entries. In that case, the resulting candidate set could consume a substantial amount of memory, and the server can’t start returning matching entries until it has identified all of the potential candidates.
In the 10.1 release, we’ve updated the server so that it can skip the process of building the candidate set in certain cases. When it does this, it will simply iterate through the pages of the composite index one-by-one, retrieving each of the entries referenced on that page and returning them to the client. This means that it doesn’t have to hold the entire candidate set in memory, and it can start returning matching entries right away.
The server can stream results from a composite index under the following conditions:
- The search request must have a wholeSubtree scope. We won’t attempt to stream results for searches with a baseObject, singleLevel, or subordinateSubtree scope.
- The search request can only include a limited set of controls. Most notably, it can’t be used with controls that attempt to alter the order or set of entries in the result set, like the server-side sort, simple paged results, or virtual list view controls. The request controls that are compatible with streaming include:
- Access log field
- Account usable
- Administrative operation
- Assertion
- Get server ID
- Intermediate client
- Join
- LDAP subentries
- Permit unindexed search
- Proxied authorization (either v1 or v2)
- Reject unindexed search
- The filter used in the search request must be one of the following:
- A simple presence filter
- A simple equality filter
- A simple approximate-match filter
- An AND filter that only contains some combination of presence, equality, and/or approximate-match components, and where each of the components targets a different attribute type
- The filter used in the search request must directly correspond to the filter pattern used in the composite index. There can’t be any extra components in the search filter that aren’t covered by the composite index filter pattern.
- The scope of the search request must directly correspond to the scope of the composite index. If the composite index is defined with a base DN pattern, then the base DN of the search request must match that base DN pattern (and must not be subordinate to an entry that matches the pattern). If the composite index is not defined with a base DN pattern, then the base DN of the search request must be the base DN for the backend.
- The server must not believe that the target index key has exceeded the index entry limit.
Basically, it means that it must be possible to perfectly satisfy the search using exactly one composite index record. It can’t require iterating across multiple records (which means that it can’t be used for searches with greater-or-equal, less-or-equal, or substring components), and that index record can’t include any entries that don’t match the search criteria (which means that it can’t be used for searches whose filter is more specific than the filter pattern).
The server will automatically use streaming for search requests that meet all of the necessary conditions, so you don’t need to do anything to enable it. For search requests that don’t meet these requirements, the server will fall back to the traditional approach of building a candidate set and then working through it to return matching entries.
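As an illustration, here’s a minimal sketch using the UnboundID LDAP SDK of a search that should be eligible for streaming, assuming a hypothetical composite index with a filter pattern of “(&(objectClass=person)(mail=*))” in a backend whose base DN is “dc=example,dc=com” (the connection details are hypothetical, too). Using a search result listener highlights the benefit: entries can be processed as they arrive, without waiting for a full candidate set:

```java
import com.unboundid.ldap.sdk.LDAPConnection;
import com.unboundid.ldap.sdk.SearchRequest;
import com.unboundid.ldap.sdk.SearchResultEntry;
import com.unboundid.ldap.sdk.SearchResultListener;
import com.unboundid.ldap.sdk.SearchResultReference;
import com.unboundid.ldap.sdk.SearchScope;

public final class StreamingSearchExample
{
  public static void main(final String... args) throws Exception
  {
    final LDAPConnection connection =
         new LDAPConnection("ds.example.com", 389);
    try
    {
      // The scope is wholeSubtree, the base DN is the backend base DN, the
      // filter exactly mirrors the index filter pattern, and no incompatible
      // controls are attached, so the server should be able to stream.
      final SearchRequest searchRequest = new SearchRequest(
           new SearchResultListener()
           {
             @Override
             public void searchEntryReturned(final SearchResultEntry entry)
             {
               // Process each entry as it is returned.
               System.out.println(entry.getDN());
             }

             @Override
             public void searchReferenceReturned(
                  final SearchResultReference reference)
             {
               // No references are expected in this example.
             }
           },
           "dc=example,dc=com",
           SearchScope.SUB,
           "(&(objectClass=person)(mail=*))");

      connection.search(searchRequest);
    }
    finally
    {
      connection.close();
    }
  }
}
```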
Improvements in Simple Paged Results Control Support
By default, when the server processes an indexed search operation that includes the simple paged results control, it recompiles the ID list for the entire set of entries that have the potential to match the search criteria for each of the requests used to obtain pages of the result set. To avoid that repeated work, we have added a new simple-paged-results-id-set-cache-duration property in the backend configuration that can be used to enable caching for this candidate set. With caching enabled, the server only needs to compute the candidate set once at the start of the search, and it can then use the cached ID set for subsequent pages, as long as the client doesn’t wait too long between requests to retrieve them.
Note that the caching mechanism should only be enabled in environments in which all servers support this ability, so don’t turn it on until all servers in the topology have been updated to version 10.1 or later. Also note that it works best if clients consistently send requests to retrieve pages from the result set from the same Directory Server (or Directory Proxy Server) instance.
In addition, we have improved the logic that the Directory Proxy Server uses when forwarding requests that include the simple paged results control to backend servers. Previously, the presence or absence of this control did not affect the Directory Proxy Server’s choice of which backend server to use when handling the request, but it will now try to consistently route requests to retrieve all pages of the search from the same server. This makes it better able to take advantage of the Directory Server’s new ability to cache the candidate ID set, and it can also help avoid issues in which entries may have been returned in different orders for requests sent to different backend servers.
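Note that the candidate set caching is transparent to clients: the client side remains the standard simple paged results loop, like this minimal UnboundID LDAP SDK sketch (with a hypothetical base DN and filter). On the server side, enabling the cache is just a matter of setting the simple-paged-results-id-set-cache-duration backend property:

```java
import com.unboundid.asn1.ASN1OctetString;
import com.unboundid.ldap.sdk.LDAPConnection;
import com.unboundid.ldap.sdk.SearchRequest;
import com.unboundid.ldap.sdk.SearchResult;
import com.unboundid.ldap.sdk.SearchResultEntry;
import com.unboundid.ldap.sdk.SearchScope;
import com.unboundid.ldap.sdk.controls.SimplePagedResultsControl;

public final class PagedSearchExample
{
  public static void main(final String... args) throws Exception
  {
    final LDAPConnection connection =
         new LDAPConnection("ds.example.com", 389);
    try
    {
      final SearchRequest searchRequest = new SearchRequest(
           "dc=example,dc=com", SearchScope.SUB, "(objectClass=person)");

      ASN1OctetString resumeCookie = null;
      while (true)
      {
        // Request up to 100 entries per page, resuming from the cookie
        // returned with the previous page. Consistently sending the page
        // requests to the same server instance lets it take the best
        // advantage of the cached candidate ID set.
        searchRequest.setControls(
             new SimplePagedResultsControl(100, resumeCookie));

        final SearchResult searchResult = connection.search(searchRequest);
        for (final SearchResultEntry entry : searchResult.getSearchEntries())
        {
          System.out.println(entry.getDN());
        }

        final SimplePagedResultsControl responseControl =
             SimplePagedResultsControl.get(searchResult);
        if ((responseControl != null) &&
             responseControl.moreResultsToReturn())
        {
          resumeCookie = responseControl.getCookie();
        }
        else
        {
          break;
        }
      }
    }
    finally
    {
      connection.close();
    }
  }
}
```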
Improved Access Control Support for Add Operations
Historically, granting someone the “add” access control right has always given them the ability to add an entry with any set of attributes, without regard for the targetattr keyword, which can yield unexpected behavior in cases where administrators expect the ability to restrict the attributes that someone can include in an add request. To address this, we have added a new evaluate-target-attribute-rights-for-add-operations property to the access control handler configuration, which will cause the server to consider the targetattr element for any ACIs that grant the right to add entries. Note that this property is set to false by default to preserve backward compatibility.
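To illustrate, here’s a sketch (via the UnboundID LDAP SDK) that installs a hypothetical ACI granting an application the add right for user entries. With evaluate-target-attribute-rights-for-add-operations set to true, the targetattr list would also constrain which attributes those added entries may contain. All DNs and attribute names below are made up, and the ACI syntax should be checked against the server’s access control documentation:

```java
import com.unboundid.ldap.sdk.LDAPConnection;
import com.unboundid.ldap.sdk.Modification;
import com.unboundid.ldap.sdk.ModificationType;

public final class AddACIExample
{
  public static void main(final String... args) throws Exception
  {
    // Hypothetical ACI granting the provisioning app the add right. With
    // evaluate-target-attribute-rights-for-add-operations set to true in
    // the access control handler configuration, the targetattr list also
    // limits which attributes may appear in entries that the app adds.
    final String aci =
         "(targetattr=\"objectClass || uid || cn || sn || givenName || " +
         "mail || userPassword\")" +
         "(version 3.0; acl \"Provisioning app may add basic user entries\"; " +
         "allow (add) " +
         "userdn=\"ldap:///cn=Provisioner,ou=Apps,dc=example,dc=com\";)";

    final LDAPConnection connection = new LDAPConnection(
         "ds.example.com", 389, "cn=Directory Manager", "password");
    try
    {
      connection.modify("ou=People,dc=example,dc=com",
           new Modification(ModificationType.ADD, "aci", aci));
    }
    finally
    {
      connection.close();
    }
  }
}
```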
The Verify Password Extended Operation
When processing a bind request, the server is careful to not expose any information that could help the client (which may be a malicious user or application) identify the reason for the authentication failure, and especially not whether the provided credentials are correct for the target user. If the server knows that their account isn’t in a usable state (for example, if the account is administratively disabled, or if it’s been locked because of too many failed attempts or because it’s been unused for too long), then the server won’t even attempt to verify the password.
However, some applications do attempt to determine the reason for the authentication failure, whether by using the get password policy state issues request control, or by retrieving the entry and using the ds-pwp-state-json attribute to identify any issues that may make the account unusable. This is usually done under the guise of customizing the page returned in response to the failure with options that can help them succeed, like allowing them to unlock their account or reset their password if they’ve been locked out as a result of too many failed attempts, even though it leaks information about the reason for the authentication failure. And some customers have even expressed an interest in doing this only if the provided password was actually correct for the user. This is an absolutely terrible idea, because it directly circumvents the entire point of account lockout, giving an attacker unlimited attempts to guess a user’s password. This is something that we will never implement in the Ping Identity Directory Server.
Nevertheless, because some organizations seem absolutely dead-set on backdooring their security configuration, we have introduced a new extended operation in the server that can be used to determine whether a proposed password is correct for a user without performing any other password policy processing. It doesn’t care if the account is locked or disabled, if the password is expired, or if there’s any other reason that the user wouldn’t be able to actually authenticate. Similarly, it doesn’t cause any updates to the account as a result of the validation attempt. For example, if the password provided is not correct, it does not count as a failed authentication attempt toward a lockout.
Because this is an obviously dangerous feature that should definitely not be exposed to regular clients, there are a number of safeguards in place to prevent it from being made available to malicious clients. These include:
- The extended operation handler is not defined in the server configuration by default. An administrator must explicitly configure and enable it for the verify password operation to be available.
- The requester must have access control permission to use the extended operation. The server does not have any ACIs that grant access to this operation in the out-of-the-box configuration (although this restriction does not apply to clients with the bypass-acl privilege, since they aren’t subject to access control restrictions).
- The requester must have the permit-verify-password-request privilege. No one has this privilege by default, even root users and topology administrators.
- The request must be issued over a secure connection.
If all of these conditions are satisfied, then the client can send a verify password extended request to determine whether a provided password is correct for a given user. The server will return a compareTrue result if the password is correct, compareFalse if it’s not, or it will use some other result code if it couldn’t make the determination for some reason.
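From the client side, a request might look like the following minimal sketch, which assumes the VerifyPasswordExtendedRequest class that the UnboundID LDAP SDK provides for this operation (check the SDK documentation for the exact class and package name). The connection details, DNs, and passwords are hypothetical, and note that the connection must be secure:

```java
import com.unboundid.ldap.sdk.ExtendedResult;
import com.unboundid.ldap.sdk.LDAPConnection;
import com.unboundid.ldap.sdk.ResultCode;
import com.unboundid.ldap.sdk.unboundidds.extensions.
            VerifyPasswordExtendedRequest;
import com.unboundid.util.ssl.SSLUtil;
import com.unboundid.util.ssl.TrustStoreTrustManager;

public final class VerifyPasswordExample
{
  public static void main(final String... args) throws Exception
  {
    // The request must be issued over a secure connection, by an account
    // that has the permit-verify-password-request privilege and access
    // control permission to use the operation.
    final SSLUtil sslUtil = new SSLUtil(
         new TrustStoreTrustManager("/path/to/truststore"));
    final LDAPConnection connection = new LDAPConnection(
         sslUtil.createSSLSocketFactory(), "ds.example.com", 636,
         "cn=Password Verifier,ou=Apps,dc=example,dc=com", "app-password");

    try
    {
      final ExtendedResult result = connection.processExtendedOperation(
           new VerifyPasswordExtendedRequest(
                "uid=jdoe,ou=People,dc=example,dc=com", "proposed-password"));

      if (result.getResultCode() == ResultCode.COMPARE_TRUE)
      {
        System.out.println("The password is correct.");
      }
      else if (result.getResultCode() == ResultCode.COMPARE_FALSE)
      {
        System.out.println("The password is not correct.");
      }
      else
      {
        System.out.println("Unable to determine: " + result.getResultCode());
      }
    }
    finally
    {
      connection.close();
    }
  }
}
```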
Collation Matching Rules
LDAP schema tends to be designed with a fairly English-centric mindset. Most attributes meant to hold textual values use matching rules that work well for ASCII values, but not necessarily as well for values that contain non-ASCII characters. For example, a search request with a filter of “(givenName=Francois)” won’t match an entry with a givenName value of “François”, and a search request with a filter of “(givenName=François)” won’t match an entry with a givenName value of “Francois”.
And on top of that, there can be multiple ways of encoding some non-ASCII characters. For example, the “ç” character can be encoded in UTF-8 using either the bytes 0xC3A7 or the bytes 0x63CCA7, and values encoded in one form are not automatically considered equivalent to values encoded in the other form.
To properly handle this scenario, you need to use alternative matching rules that are specifically designed to work with values in the language that you’re trying to match. These are called collation matching rules, and we’ve just updated the server to support a whole bunch of them. See the documentation for details about the locales and languages that we support, and for how to use extensible matching filters to perform better language-aware matching.
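For example, a client could request collation-based matching with an extensible matching filter like the one in this minimal UnboundID LDAP SDK sketch. The “fr” matching rule name is a placeholder for whatever name or OID the server actually assigns to the French collation rule (see the documentation for the real identifiers):

```java
import com.unboundid.ldap.sdk.Filter;

public final class CollationFilterExample
{
  public static void main(final String... args)
  {
    // An extensible matching filter that asks the server to compare
    // givenName values using a hypothetical collation matching rule
    // named "fr". With a language-aware rule, an assertion value of
    // "Francois" could also match stored values like "François".
    final Filter collationFilter = Filter.createExtensibleMatchFilter(
         "givenName", // attribute name
         "fr",        // matching rule ID (placeholder)
         false,       // do not match against DN attributes
         "Francois"); // assertion value

    // Prints "(givenName:fr:=Francois)".
    System.out.println(collationFilter);
  }
}
```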
The compare-ldap-schemas Tool
We have added a new compare-ldap-schemas command-line tool that can be used to examine schema definitions in two LDAP servers and identify differences between them, like definitions that only exist in one server, or definitions that exist in both servers but differ between them. You can choose to only look at elements of certain types, can include or exclude elements with a given name prefix, or can include or exclude elements with given extension values. You can also optionally ignore differences that only affect element descriptions.
Improvements Around Exploded Attribute Indexes
Traditionally, attribute indexes are stored so that each key has a single database record. For example, in an equality attribute index, every unique value for that attribute will have its own database record, with the key being the normalized representation of that value, and the data being a list of the entry IDs for all entries that contain that specific value. This is great when using the index to process search requests because it only takes a single database read to identify all entries that contain the associated attribute value. However, as the number of entries with that attribute value increases, it becomes more expensive to update that index record because it gets bigger and bigger, and the server needs to rewrite the entire ID list any time it changes. It can also increase the overall size of the database and the amount of work that the cleaner needs to do.
To help avoid these write performance issues, once an attribute index key matches at least 50,000 entries, the server converts it into an exploded form. Instead of a single database record whose value is a list of all associated entry IDs, it is converted into multiple records, with a separate record for each of the entry IDs. This dramatically improves performance for write operations that involve updating that index key, but that makes it more expensive to retrieve the entire ID set when it’s needed for a search operation.
Further, in the event that the number of entries matching that key exceeds the index entry limit, the server needs to remove all of the associated database records. It does this in a background thread so that it doesn’t tie up the worker thread processing the operation that caused the limit to be exceeded. Nevertheless, we have observed that this background cleanup processing can have a notable performance impact for other writes attempted while it’s in progress. To help alleviate that, we have introduced rate limiting for that cleanup processing so that it is less likely to affect the performance of other write operations.
Even so, if you have index keys that are expected to match a large number of entries, we strongly recommend that you use a composite index rather than an attribute index. Composite indexes can provide much better overall read and write performance in cases like this, and they don’t require the same expensive background cleanup processing for keys that have exceeded the index entry limit. To help reinforce this recommendation, we have updated the server so that it will write warning messages to the server’s error log for any attribute indexes that are configured with an index entry limit of 100,000 or more, and so that it will generate administrative alerts on startup for any attribute indexes that are configured with an index entry limit of 1,000,000 or more. In addition, both dsconfig and the Administration Console will now display a notice recommending composite indexes over attribute indexes when altering the index entry limit for an attribute index.