I have an LDIF file and want to extract information from it, for example all objects where a certain attribute has a specific value, or the value of a specific attribute across all objects. This should be efficient even if the LDIF file is a gigabyte in size.
The obvious approach would be to import it into OpenLDAP, but I seem to be missing some schema information. My LDAP skills are limited, and I always get this error:
$ slapadd -n 0 -F /etc/ldap/slapd.d -l config.ldif
5b1b9f46 PROXIED attributeDescription "DC" inserted.
5b1b9f46 <= str2entry: str2ad(instanceType): attribute type undefined
slapadd: could not parse entry (line=1)
_ 0.06% eta none elapsed none spd 7.1 M/s
Closing DB...
My research suggests that it is non-trivial to import this LDIF file with OpenLDAP.
Another idea would be to import it into an SQL database. There are apparently several ways to store hierarchical data, but none of them seems to yield cheap queries. The data is essentially read-only, so I don't care if inserts are expensive.
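Since the two queries I describe only look at attributes and not at the tree structure, a flat table with one row per (DN, attribute, value) might already be enough. Here is a minimal sketch of that idea using SQLite. The table name `attr`, the index name, and the helper functions are hypothetical, and it assumes the entries have already been parsed into `(dn, {attribute: [values]})` pairs and that all values are text.

```python
import sqlite3


def build_index(conn, entries):
    """Load (dn, {attribute: [values]}) pairs into one flat table.

    Assumption: attribute names are compared case-insensitively, so they
    are stored lowercased. Values are stored as-is.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attr ("
        "dn TEXT NOT NULL, name TEXT NOT NULL, value TEXT NOT NULL)"
    )
    with conn:  # one transaction for the whole bulk load
        conn.executemany(
            "INSERT INTO attr (dn, name, value) VALUES (?, ?, ?)",
            (
                (dn, name.lower(), value)
                for dn, attrs in entries
                for name, values in attrs.items()
                for value in values
            ),
        )
    # Build the index after loading; inserts may be slow, lookups should not be.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS attr_name_value ON attr (name, value)"
    )
    conn.commit()


def dns_where(conn, name, value):
    """Return the DNs of all objects where attribute `name` has `value`."""
    rows = conn.execute(
        "SELECT DISTINCT dn FROM attr WHERE name = ? AND value = ?",
        (name.lower(), value),
    )
    return [row[0] for row in rows]


def values_of(conn, name):
    """Return (dn, value) pairs for attribute `name` across all objects."""
    return conn.execute(
        "SELECT dn, value FROM attr WHERE name = ?", (name.lower(),)
    ).fetchall()
```

With this layout both queries are served by the `(name, value)` index. Subtree queries would need something extra, since a DN suffix match cannot use that index.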
I'm using Linux and I'd prefer solutions in Python.
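To make the two queries concrete, this is a minimal sketch of a plain streaming scan that needs no LDAP server and no schema. All function names are hypothetical. It assumes a content-only LDIF (no `changetype` records), UTF-8 text, and it handles line folding and base64 (`attr:: ...`) values but not URL values (`attr:< ...`).

```python
import base64
import io


def unfold(lines):
    """Join LDIF continuation lines (those starting with a single space)."""
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(" ") and current is not None:
            current += line[1:]
            continue
        if current is not None:
            yield current
        current = line
    if current is not None:
        yield current


def iter_entries(lines):
    """Yield (dn, attrs) per record, one at a time, without loading the file.

    attrs maps the lowercased attribute name to a list of string values.
    """
    dn, attrs = None, {}
    for line in unfold(lines):
        if line.startswith("#"):
            continue  # comment
        if not line:  # a blank line ends the current record
            if dn is not None:
                yield dn, attrs
            dn, attrs = None, {}
            continue
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not an attribute line; skipped in this sketch
        if rest.startswith(":"):  # "name:: <base64>"
            value = base64.b64decode(rest[1:].strip()).decode("utf-8", "replace")
        else:
            value = rest.lstrip(" ")
        if name.lower() == "dn":
            dn = value
        else:
            attrs.setdefault(name.lower(), []).append(value)
    if dn is not None:
        yield dn, attrs


def find_by_value(entries, attr, wanted):
    """Yield the DN of every object where `attr` has the value `wanted`."""
    attr = attr.lower()
    return (dn for dn, attrs in entries if wanted in attrs.get(attr, []))


def values_of(entries, attr):
    """Yield (dn, value) for every value of `attr` across all objects."""
    attr = attr.lower()
    for dn, attrs in entries:
        for value in attrs.get(attr, []):
            yield dn, value


# Made-up sample data, only to show the call pattern.
SAMPLE = """\
version: 1

dn: CN=Alice,DC=example,DC=com
objectClass: user
instanceType: 4
description: a long value that is
  folded onto a second line

dn: CN=Bob,DC=example,DC=com
objectClass: user
instanceType: 0
cn:: Qm9i
"""

if __name__ == "__main__":
    for dn in find_by_value(iter_entries(io.StringIO(SAMPLE)), "instanceType", "4"):
        print(dn)
```

For a real file, an open file object would be passed instead of the `io.StringIO` sample. Memory use stays flat because entries are yielded one at a time, but every query rescans the whole file, which is why I'm looking for something indexed.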
Any ideas?