iterate linux file ldif

Question

I'm trying to retrieve the users that have a specific attribute from an LDIF file.

The input file will look like:

# entry-id: 2
dn: uid=xxx,ou=xx,cn=xx,o=xx,c=xx,o=xx
uid: xxx
cn: Paul
SUKsoft: Windows
SUKsoft: Linux
...
# entry-id: 3
dn: uid=yyy,ou=yy,cn=yy,o=yy,c=yy,o=yy
uid: yy
cn: Jones
SUKsoft: Windows
...

# entry-id: 3
dn: uid=zzz,ou=zz,cn=zz,o=zz,c=zz,o=zz
uid: zz
cn: John
SUKsoft: Linux
...
# entry-id: 4
dn: uid=www,ou=ww,cn=ww,o=ww,c=ww,o=ww
uid: ww
cn: John2

...
# entry-id: 5
dn: uid=mmm,ou=mm,cn=mm,o=mm,c=mm,o=mm
uid: mm
cn: John3
SUKsoft: Linux
...

The result file should contain only the users that have the SUKsoft: Windows attribute:

uid|cn
xx|Paul
yy|Jones

I don't have much experience with the Linux bash shell. I am trying to iterate over the file first to extract the SUKsoft and uid attributes, and then process the result again to compose the final file, keeping just the uid of the entries that have SUKsoft: Windows. My first step is below:

# grep reads the file directly, so no read loop is needed
grep -E -w '^uid|SUKsoft' 1.txt > output.txt

Now the output looks like:

uid: xxx
SUKsoft: Windows
SUKsoft: Linux
uid: yy
SUKsoft: Windows
uid: zz
SUKsoft: Linux
uid: ww
uid: mm
SUKsoft: Linux

Now I would like to process this file, reading from each uid line until I find a SUKsoft: Windows line, and copy the matching uids to the final file.

Could you help me here please?

Thanks

Regards

sjnarv · Answer 1 · 2015-07-14T15:37:33.277

For something very quick and dirty, I'd go for awk instead:

#!/usr/bin/env bash

awk -F ': ' '
    BEGIN               { print "uid|cn" }
    $1 == "uid"         { uid = $2 }
    $1 == "cn"          { cn  = $2 }
    /SUKsoft: Windows/  { print uid "|" cn }
' "$@"

But the above is crude, and makes assumptions about the structure of the input LDIF file: that fields appear in a usable order (uid and cn lines before SUKsoft, etc.).

Hardening left as a further exercise, I guess.

Edit: one such hardening. Track a bit of state, clearing uid and cn variables at the start of entry ("dn"), printing a SUKsoft: Windows entry only if both uid and cn have been seen.

#!/usr/bin/env bash

awk -F ': ' '
    BEGIN               { print "uid|cn" }
    $1 == "dn"          { uid = cn = "" }
    $1 == "uid"         { uid = $2 }
    $1 == "cn"          { cn  = $2 }
    /SUKsoft: Windows/  { if (uid != "" && cn != "") { print uid "|" cn } }
' "$@"

Note that if crude approaches like this are expected to handle arbitrary LDIF, these approaches should simply be abandoned and an LDIF parser be used.
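
Short of a full parser, if the only problem is that attributes appear in a different order within an entry, one possible variation is to remember what was seen and print only once the entry is complete. The following is a minimal sketch under the same assumptions as above: every entry starts with a "dn" line, and values are plain "name: value" lines (no folded or base64 "::" values). The function name ldif_windows_users and the awk helper emit are made up for this illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the scripts above): prints uid|cn for every
# entry that contains a "SUKsoft: Windows" line, wherever it sits in the entry.
# Assumes each entry begins with a "dn: ..." line.
ldif_windows_users() {
    awk -F ': ' '
        # Print the buffered entry if it was flagged, then reset the state.
        function emit() {
            if (win && uid != "" && cn != "") print uid "|" cn
            uid = cn = ""
            win = 0
        }
        BEGIN                                { print "uid|cn" }
        $1 == "dn"                           { emit() }   # new entry: settle the previous one
        $1 == "uid"                          { uid = $2 }
        $1 == "cn"                           { cn  = $2 }
        $1 == "SUKsoft" && $2 == "Windows"   { win = 1 }
        END                                  { emit() }   # settle the last entry
    ' "$@"
}
```

Sourced into a shell, it would be called as `ldif_windows_users 1.txt > result.txt`; with no file argument it reads standard input. Sorting the file first is not needed, because nothing is printed until the next "dn" line (or end of input) is reached.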

Thanks for the answers. As you said, the order is not always the same, so the solution is not valid :( Any idea how we can sort the file before processing it? — Endika, Jul 14 '15 at 12:08
The first solution is "valid" in that it produces your expected output on the sample input you gave. If general input will be more varied, you need to give examples. If your input could be arbitrary LDIF, you should say so: simple regexp-based approaches will not be sufficient. That said, I've edited the answer to track slightly more state. — sjnarv, Jul 14 '15 at 15:38
