
I have an LDIF file. I want to extract information from it: for example, all objects where a certain attribute has a specific value, or the value of a specific attribute across all objects. This should be efficient even if the LDIF file is a gigabyte in size.
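To make the two kinds of query concrete, here is a minimal sketch in plain Python with no LDAP library. It assumes a simplified reading of LDIF: records separated by blank lines, continuation lines starting with a single space, `#` comments, and `attr:: value` for base64. The names `iter_entries`, `find_entries`, and `attribute_values` are hypothetical helpers made up for illustration, and `attr:< url` values and change records are not handled.

```python
import base64
from typing import Dict, Iterable, Iterator, List

Entry = Dict[str, List[str]]


def unfold(lines: Iterable[str]) -> Iterator[str]:
    """Join folded lines: a line starting with one space continues the previous one."""
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(" ") and current:
            current += line[1:]          # drop the single folding space
            continue
        if current is not None:
            yield current
        current = line
    if current is not None:
        yield current


def iter_entries(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield one entry at a time, so only the current entry is held in memory."""
    entry: Entry = {}
    for line in unfold(lines):
        if not line:                     # a blank line ends the current record
            if entry:
                yield entry
                entry = {}
            continue
        if line.startswith("#"):         # comment line
            continue
        name, sep, rest = line.partition(":")
        if not sep:
            continue                     # not an "attr: value" line; skipped in this sketch
        name = name.lower()              # attribute names are case-insensitive
        if name == "version" and not entry:
            continue                     # leading "version: 1" header
        if rest.startswith(":"):         # "attr:: <base64>"
            value = base64.b64decode(rest[1:].strip()).decode("utf-8", "replace")
        else:
            value = rest.lstrip(" ")
        entry.setdefault(name, []).append(value)
    if entry:                            # file did not end with a blank line
        yield entry


def find_entries(lines: Iterable[str], attr: str, value: str) -> Iterator[Entry]:
    """All entries where attribute `attr` has `value` among its values."""
    attr = attr.lower()
    return (e for e in iter_entries(lines) if value in e.get(attr, []))


def attribute_values(lines: Iterable[str], attr: str) -> Iterator[str]:
    """Every value of attribute `attr` across all entries."""
    attr = attr.lower()
    for e in iter_entries(lines):
        yield from e.get(attr, [])
```

An open file object can be passed as `lines`, since iterating over it yields one line at a time. Memory stays bounded this way, but every query is a full linear scan of the file, which is exactly the cost I would like to avoid for repeated queries.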

The obvious approach would be to import it into OpenLDAP, but I seem to be missing some schema information. My LDAP skills are limited, and I always get this error:

$ slapadd -n 0 -F /etc/ldap/slapd.d -l config.ldif
5b1b9f46 PROXIED attributeDescription "DC" inserted.
5b1b9f46 <= str2entry: str2ad(instanceType): attribute type undefined
slapadd: could not parse entry (line=1)
_                       0.06% eta   none elapsed            none spd   7.1 M/s
Closing DB...

My research suggests that it is non-trivial to import this LDIF file with OpenLDAP.

Another idea would be to import it into SQL somehow. There are apparently several ways to store hierarchical data, but none of them seems to yield cheap queries. Since the data is essentially read-only, I don't care if inserts are expensive.

I'm using Linux and I'd prefer solutions in Python.

Any ideas?

Volker
  • I may simply use Python's `ldif.LDIFRecordList`... unless the performance is too poor on large LDIF files. I need to get my hands on a huge one first to test this. – Volker Jun 09 '18 at 17:38

1 Answer


As the author of python-ldap's LDIF parser, I'd say it's too slow and would also consume too much memory for data this large.

I'd recommend working out which schema is needed and loading the data into OpenLDAP with slapadd. It's probably a bit of work, but definitely not rocket science.

Michael Ströder