LDAP SDK Features: LDIF Processing

The LDAP Data Interchange Format (LDIF) is a standard way of representing directory data in plain text files, as defined in RFC 2849. LDIF records can represent entire entries, or they can represent changes to be made to directory contents (adding new entries, or removing, modifying, or renaming existing entries). The UnboundID LDAP SDK for Java provides rich support for reading, writing, and otherwise working with LDIF records. This post describes the capabilities that it offers in this area.

Reading LDIF Records

The com.unboundid.ldif.LDIFReader class provides methods for reading data from LDIF files or from input streams. Some of the methods it offers are:

readEntry() — This reads the next entry from the LDIF file or input stream. This should be used when reading LDIF data known to contain only entries.
readChangeRecord() — This reads the next change record from the LDIF file or input stream. This should only be used when reading LDIF data known to contain only change records.
readLDIFRecord() — This reads the next entry or change record from the LDIF file or input stream (both entries and change records implement the LDIFRecord interface). This should be used when reading LDIF data that may contain a mix of entries and change records.
decodeEntry(String...) — This decodes the contents of the provided array as an LDIF entry. The elements of the array should be the lines of the LDIF entry to be parsed. This is a static method and doesn’t require an LDIFReader object to be created to be able to use it.
decodeChangeRecord(String...) — This decodes the contents of the provided array as an LDIF change record. The elements of the array should be the lines of the LDIF change record to be parsed. This is a static method and doesn’t require an LDIFReader object to be created to be able to use it.
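
As a minimal sketch of the two static methods above, the following decodes an entry and a delete change record directly from their LDIF lines, without creating an LDIFReader. The class name StaticDecodeSketch and the sample DNs are made up for illustration.

```java
import com.unboundid.ldap.sdk.Entry;
import com.unboundid.ldif.LDIFChangeRecord;
import com.unboundid.ldif.LDIFException;
import com.unboundid.ldif.LDIFReader;

// Hypothetical helper class; only the LDIFReader calls come from the SDK.
final class StaticDecodeSketch {
  // Decodes an entry from its LDIF lines (one array element per line).
  static Entry decodeSampleEntry() throws LDIFException {
    return LDIFReader.decodeEntry(
         "dn: uid=jdoe,ou=People,dc=example,dc=com",
         "objectClass: top",
         "objectClass: person",
         "uid: jdoe",
         "cn: John Doe",
         "sn: Doe");
  }

  // Decodes a change record; a delete needs only the DN and the changetype.
  static LDIFChangeRecord decodeSampleDelete() throws LDIFException {
    return LDIFReader.decodeChangeRecord(
         "dn: uid=jdoe,ou=People,dc=example,dc=com",
         "changetype: delete");
  }

  public static void main(String[] args) throws LDIFException {
    System.out.println(decodeSampleEntry().getAttributeValue("cn"));
    System.out.println(decodeSampleDelete().getChangeType());
  }
}
```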

When reading and parsing LDIF records, if an invalid record is encountered then the LDIF reader will throw an LDIFException. It will include a message about the problem encountered, the line number in the LDIF data on which the problem was encountered, and a flag indicating whether or not it is possible to continue reading data from the LDIF source. Methods used to read LDIF data from a file or input stream may also throw an IOException.
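
A read loop that takes this error handling into account might look like the sketch below. It reads from an in-memory stream so that it is self-contained; the class name ReadLoopSketch and the choice to log and skip bad records are assumptions for illustration, not a recommended policy.

```java
import com.unboundid.ldap.sdk.Entry;
import com.unboundid.ldif.LDIFException;
import com.unboundid.ldif.LDIFReader;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class illustrating a tolerant read loop.
final class ReadLoopSketch {
  // Returns the DNs of all entries that could be read from the LDIF text.
  static List<String> readDNs(String ldif) throws IOException {
    List<String> dns = new ArrayList<>();
    LDIFReader reader = new LDIFReader(
         new ByteArrayInputStream(ldif.getBytes(StandardCharsets.UTF_8)));
    try {
      while (true) {
        try {
          Entry entry = reader.readEntry();
          if (entry == null) {
            break; // null signals the end of the LDIF data
          }
          dns.add(entry.getDN());
        } catch (LDIFException e) {
          System.err.println("Invalid record near line " +
               e.getLineNumber() + ": " + e.getMessage());
          if (!e.mayContinueReading()) {
            break; // the reader cannot recover from this problem
          }
        }
      }
    } finally {
      reader.close();
    }
    return dns;
  }
}
```

An IOException is allowed to propagate here because it indicates a problem with the underlying stream rather than with a single record.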

Writing LDIF Records

The com.unboundid.ldif.LDIFWriter class provides methods for writing LDIF content to files or output streams. Some of those methods include:

writeEntry(Entry) — This method writes the provided entry to the LDIF file or output stream.
writeEntry(Entry,String) — This method writes the provided entry to the LDIF file or output stream, immediately preceded by the specified comment.
writeChangeRecord(LDIFChangeRecord) — This method writes the provided change record to the LDIF file or output stream.
writeChangeRecord(LDIFChangeRecord,String) — This method writes the provided change record to the LDIF file or output stream, immediately preceded by the specified comment.
writeLDIFRecord(LDIFRecord) — This method writes the given LDIF record (which may either be an entry or change record) to the LDIF file or output stream.
writeLDIFRecord(LDIFRecord,String) — This method writes the given LDIF record (which may either be an entry or change record) to the LDIF file or output stream, immediately preceded by the specified comment.

Methods used for writing LDIF data may throw an IOException.
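
The following sketch writes a single entry, preceded by a comment, to an in-memory output stream and returns the resulting LDIF text. The class name WriteSketch and the sample entry are hypothetical; a real program would more likely write to a file.

```java
import com.unboundid.ldap.sdk.Entry;
import com.unboundid.ldif.LDIFException;
import com.unboundid.ldif.LDIFWriter;

import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical helper class; only the Entry and LDIFWriter calls come from the SDK.
final class WriteSketch {
  // Writes one entry with a leading comment and returns the LDIF as a string.
  static String writeWithComment() throws IOException, LDIFException {
    Entry entry = new Entry(
         "dn: dc=example,dc=com",
         "objectClass: top",
         "objectClass: domain",
         "dc: example");

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    LDIFWriter writer = new LDIFWriter(out);
    try {
      writer.writeEntry(entry, "The base entry for the example tree");
    } finally {
      writer.close(); // flushes buffered output and closes the stream
    }
    return out.toString("UTF-8");
  }

  public static void main(String[] args) throws Exception {
    System.out.print(writeWithComment());
  }
}
```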

Parallel LDIF Processing

We’ve worked hard to make the process of reading and writing LDIF data as fast as possible. While it’s pretty fast on its own as a serial process, portions of the processing may be parallelized for better performance on multi-CPU and/or multi-core systems.

When reading LDIF data, there are two phases of processing: the process of reading raw data from the LDIF file or input stream, and the process of decoding that data as an entry or change record. The first phase must be performed serially, but the second can be parallelized and multiple concurrent threads may be used to decode LDIF records. As a result, when the LDIF reader is configured to use multiple concurrent threads, it may be the case that the limiting factor is the speed at which data can be read from the disk or input stream.

In order to use parallelism in the LDIF reader, it is only necessary to specify the number of additional threads to use when parsing entries or change records at the time the reader is created. This parallelism will be provided completely behind the scenes, so that the caller may continue to use the readEntry(), readChangeRecord(), and readLDIFRecord() methods just as if all processing were performed serially.

Of course, introducing parallelism in the LDIF reader would be of limited usefulness if it introduced the possibility that entries or change records might be made available to the caller in a different order than they were contained in the original LDIF data. As a result, the LDIF reader will ensure that the order of the data is preserved so that the only perceptible change between serial and parallel processing is the relative speed with which LDIF records may be made available to the caller.
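
As a rough sketch, enabling parallel parsing only changes how the reader is constructed; the read loop stays the same. The class name ParallelReadSketch is made up, and the thread count is simply whatever the caller passes in.

```java
import com.unboundid.ldap.sdk.Entry;
import com.unboundid.ldif.LDIFException;
import com.unboundid.ldif.LDIFReader;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class illustrating a reader with extra parse threads.
final class ParallelReadSketch {
  // Reads every entry using the given number of parse threads and returns
  // the DNs in the order in which the reader handed the entries back.
  static List<String> readDNs(String ldif, int parseThreads)
       throws IOException, LDIFException {
    LDIFReader reader = new LDIFReader(
         new ByteArrayInputStream(ldif.getBytes(StandardCharsets.UTF_8)),
         parseThreads);
    List<String> dns = new ArrayList<>();
    try {
      Entry entry;
      while ((entry = reader.readEntry()) != null) {
        dns.add(entry.getDN());
      }
    } finally {
      reader.close();
    }
    return dns;
  }
}
```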

The LDIF writer also provides support for parallel processing, in which case multiple entries or change records may be formatted into their LDIF representations in parallel and then written serially to the LDIF file or output stream. As with the LDIF reader, parallelism may be enabled by specifying the number of threads to use for that processing at the time that the LDIF writer is created. However, the LDIF writer will only perform parallel processing if the entries or change records are provided to it using the writeLDIFRecords(List<? extends LDIFRecord>) method. The order in which the records appear in the list will be preserved when they are written to the LDIF file or output stream.
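
A minimal sketch of parallel writing follows. It builds a list of generated entries and hands the whole list to writeLDIFRecords. The class name ParallelWriteSketch and the generated ou=unitN entries are assumptions for illustration only.

```java
import com.unboundid.ldap.sdk.Attribute;
import com.unboundid.ldap.sdk.Entry;
import com.unboundid.ldif.LDIFWriter;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class illustrating a writer with formatting threads.
final class ParallelWriteSketch {
  // Generates the requested number of entries, writes them as one batch,
  // and returns the resulting LDIF text.
  static String writeInParallel(int count, int threads)
       throws IOException, InterruptedException {
    List<Entry> entries = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      entries.add(new Entry(
           "ou=unit" + i + ",dc=example,dc=com",
           new Attribute("objectClass", "top", "organizationalUnit"),
           new Attribute("ou", "unit" + i)));
    }

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    LDIFWriter writer = new LDIFWriter(out, threads);
    try {
      // Only this list-based method lets the writer format records in parallel.
      writer.writeLDIFRecords(entries);
    } finally {
      writer.close();
    }
    return out.toString("UTF-8");
  }
}
```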

Transforming Entries Read from LDIF

When reading entries from an LDIF file or input stream, it may be useful to alter those entries in some way before they are made available to the caller, and in some cases it may be desirable to exclude entries altogether. Either or both of these may be performed by providing a class which implements the com.unboundid.ldif.LDIFReaderEntryTranslator interface to the LDIF reader. If an entry translator is provided, then whenever an entry is read, the translate(Entry,long) method will be invoked to allow the translator to return a modified version of the provided entry, a completely new entry, or null to indicate that the entry should be omitted from the data made available to callers.

If the LDIF reader has been configured to perform parallel processing, then the entry translator will be invoked during the parallel portion of that processing. As a result, it may be faster to perform this transformation using the LDIFReaderEntryTranslator interface than to perform it separately after the caller has retrieved the entry using a method like readEntry(). This does require the translator to be threadsafe, and it cannot depend on the order in which entries are processed.
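
The sketch below shows one possible translator. It drops every entry that does not have the person object class and strips userPassword from the entries it keeps. The class names and the filtering rule are hypothetical. The translator holds no state, which is one simple way to keep it threadsafe.

```java
import com.unboundid.ldap.sdk.Entry;
import com.unboundid.ldif.LDIFException;
import com.unboundid.ldif.LDIFReader;
import com.unboundid.ldif.LDIFReaderEntryTranslator;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class illustrating an entry translator.
final class TranslatorSketch {
  // Stateless translator: safe to invoke from several parse threads at once.
  static final class PeopleOnly implements LDIFReaderEntryTranslator {
    @Override
    public Entry translate(Entry original, long firstLineNumber) {
      if (!original.hasObjectClass("person")) {
        return null; // omit this entry from what the caller sees
      }
      Entry copy = original.duplicate();
      copy.removeAttribute("userPassword");
      return copy;
    }
  }

  // Reads all entries through the translator using two parse threads.
  static List<Entry> readFiltered(String ldif)
       throws IOException, LDIFException {
    LDIFReader reader = new LDIFReader(
         new ByteArrayInputStream(ldif.getBytes(StandardCharsets.UTF_8)),
         2, new PeopleOnly());
    List<Entry> entries = new ArrayList<>();
    try {
      Entry entry;
      while ((entry = reader.readEntry()) != null) {
        entries.add(entry);
      }
    } finally {
      reader.close();
    }
    return entries;
  }
}
```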

Obtaining LDIF Representations of LDAP SDK Objects

There are a number of objects provided as part of the LDAP SDK which can be represented in LDIF form without the need for an LDIF writer. This includes the following types of objects:

com.unboundid.ldap.sdk.Entry — This can be represented as an LDIF entry.
com.unboundid.ldap.sdk.AddRequest — This can be represented as an LDIF add change record.
com.unboundid.ldap.sdk.DeleteRequest — This can be represented as an LDIF delete change record.
com.unboundid.ldap.sdk.ModifyRequest — This can be represented as an LDIF modify change record.
com.unboundid.ldap.sdk.ModifyDNRequest — This can be represented as an LDIF modify DN change record.

All of these objects provide the following methods that allow them to be represented in LDIF form:

toLDIF() — This returns a string array whose elements comprise the lines of the LDIF representation of the associated record.
toLDIFString() — This returns a string containing the LDIF representation of the associated record, including line breaks.

In addition, all of the above objects except Entry provide a toLDIFChangeRecord() method that allows them to be converted to the appropriate type of LDIF change record.
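
To illustrate these methods, the sketch below obtains the LDIF form of an entry and of a delete request without using an LDIF writer. The class name ToLdifSketch and the sample DN are made up for the example.

```java
import com.unboundid.ldap.sdk.Attribute;
import com.unboundid.ldap.sdk.DeleteRequest;
import com.unboundid.ldap.sdk.Entry;
import com.unboundid.ldif.LDIFChangeRecord;

// Hypothetical helper class; the toLDIF-style calls come from the SDK.
final class ToLdifSketch {
  // Builds a small entry programmatically rather than from LDIF lines.
  static Entry sampleEntry() {
    return new Entry(
         "dc=example,dc=com",
         new Attribute("objectClass", "top", "domain"),
         new Attribute("dc", "example"));
  }

  // Returns the LDIF text of a delete change record for the given DN.
  static String deleteAsLdif(String dn) {
    return new DeleteRequest(dn).toLDIFString();
  }

  public static void main(String[] args) {
    Entry entry = sampleEntry();

    // One array element per LDIF line.
    for (String line : entry.toLDIF()) {
      System.out.println(line);
    }

    // A single string that already contains the line breaks.
    System.out.print(deleteAsLdif(entry.getDN()));

    // Requests can also be converted to change record objects.
    LDIFChangeRecord record =
         new DeleteRequest(entry.getDN()).toLDIFChangeRecord();
    System.out.println(record.getChangeType());
  }
}
```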

Using LDIF Records to Process Operations

The LDAP SDK also provides support for processing operations based on the LDIF representation. There are two primary ways that this may be accomplished:

All types of LDIF change records provide a processChange(LDAPConnection) method which can be used to directly process an operation from that change record. Each change record type also provides a method for converting that change record object to an LDAP request (e.g., the LDIFAddChangeRecord class provides a toAddRequest() method to convert the change record to an equivalent add request).
Entries can be created from their LDIF representation using the Entry(String...) constructor, where the provided string array contains the lines that comprise the LDIF representation of that entry. Add requests can also be created from the LDIF representation of the entry to add using the AddRequest(String...) constructor. Modify requests can also be created from the LDIF representation of that modify request using the ModifyRequest(String...) constructor.
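
Both approaches can be sketched as follows. The class and method names are hypothetical, and the apply method assumes the caller already holds an established LDAPConnection, so it is shown but not run here.

```java
import com.unboundid.ldap.sdk.AddRequest;
import com.unboundid.ldap.sdk.LDAPConnection;
import com.unboundid.ldap.sdk.LDAPException;
import com.unboundid.ldap.sdk.LDAPResult;
import com.unboundid.ldap.sdk.ModifyRequest;
import com.unboundid.ldif.LDIFAddChangeRecord;
import com.unboundid.ldif.LDIFChangeRecord;
import com.unboundid.ldif.LDIFException;
import com.unboundid.ldif.LDIFReader;

// Hypothetical helper class illustrating LDIF-driven operations.
final class LdifOperationSketch {
  // Processes a change record directly over an existing connection.
  static LDAPResult apply(LDIFChangeRecord record, LDAPConnection connection)
       throws LDAPException {
    return record.processChange(connection);
  }

  // Converts an add change record into an equivalent add request.
  static AddRequest addFromChangeRecord() throws LDIFException {
    LDIFChangeRecord record = LDIFReader.decodeChangeRecord(
         "dn: ou=People,dc=example,dc=com",
         "changetype: add",
         "objectClass: top",
         "objectClass: organizationalUnit",
         "ou: People");
    return ((LDIFAddChangeRecord) record).toAddRequest();
  }

  // Creates an add request from the LDIF lines of the entry to add.
  static AddRequest addFromLdif() throws LDIFException {
    return new AddRequest(
         "dn: dc=example,dc=com",
         "objectClass: top",
         "objectClass: domain",
         "dc: example");
  }

  // Creates a modify request from the LDIF lines of a modify change record.
  static ModifyRequest modifyFromLdif() throws LDIFException {
    return new ModifyRequest(
         "dn: dc=example,dc=com",
         "changetype: modify",
         "replace: description",
         "description: This is the new description.");
  }
}
```

Converting to a request object first is useful when the request needs further adjustment, such as attaching controls, before it is sent.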
