
I'm trying to parse an .ldif file but can't get the desired output. Any help is much appreciated.

Here is what I'm doing in Python:

lines = open("use.ldif", "r").read().split("\n")
for i, line in enumerate(lines):
   if not line.find(":"):
      lines[i-1] = lines[-1].strip() + line
      lines.pop(i)

open("user_modified.ldif", "w").write("\n".join(lines)+"\n")

use.ldif (input file)

dn: cnh
changetype: add
objectclass: inetOrgPerson
objectclass: cdsUser
objectclass: organizationalPerson
objectclass: Person
objectclass: n
objectclass: Top
objectclass: cd
objectclass: D
objectclass: nshd shdghsf shgdhfjh jhghhghhgh
 hjgfhgfghfhg
street: shgdhgf

dn: cnh
changetype: add
objectclass: inetOrgPerson
objectclass: hjgfhgfghfhg
street: shgdhgf kjsgdhgsjhg shdghsgjfhsfsf
 jgsdhsh
company: xyz

user_modified.ldif (Output from my code)

The output is identical to the input; nothing is modified. I suspect it's because I'm using split("\n"), but I can't think of what else to do.

desired output

dn: cnh
changetype: add
objectclass: inetOrgPerson
objectclass: cdsUser
objectclass: organizationalPerson
objectclass: Person
objectclass: n
objectclass: Top
objectclass: cd
objectclass: D
objectclass: nshd shdghsf shgdhfjh jhghhghhghhjgfhgfghfhg
street: shgdhgf

dn: cnh
changetype: add
objectclass: inetOrgPerson
objectclass: hjgfhgfghfhg
street: shgdhgf kjsgdhgsjhg shdghsgjfhsfsfjgsdhsh
company: xyz

As you can see, in my output file user_modified.ldif the objectclass value in the first entry and the street value in the second entry wrap onto the next line. How can I keep each of them on a single line, as in the desired output?

Thanks in advance

j.doe
Radha
  • I don't understand what your goal is. Can you post your desired output for that input? – OSainz May 13 '19 at 06:13
  • I'm having difficulty in properly understanding your question. Can you please simply post `use.ldif` content and your `user_modified.ldif` (which you're getting now) and lastly the text which you actually want. This way we can directly understand what's the input, what's the required output and what your code is doing wrong. :) – xxbinxx May 13 '19 at 06:13
  • I have now added the desired output – Radha May 13 '19 at 06:18
  • You forgot to add the output your code is giving. Also, fix the first two lines of your question. I believe by "I have a like like in this format" you meant "I have a file with this content" – xxbinxx May 13 '19 at 06:19
  • I have edited your question. Please check, I have left space to add the output you're getting now. Add the text there and submit. – xxbinxx May 13 '19 at 06:25
  • It's giving the same output as the input file; nothing is being modified – Radha May 13 '19 at 06:29
  • Now remember how you ask questions in SO. This is how you explain. – xxbinxx May 13 '19 at 06:34
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/193235/discussion-between-xxbinxx-and-radha). – xxbinxx May 13 '19 at 06:36

2 Answers

with open("use.ldif", "r") as f:
    lines = f.read().split("\n")

merged = []
for line in lines:
    # A non-empty line without ":" continues the previous line.
    if line and ":" not in line and merged:
        merged[-1] = merged[-1].rstrip() + line.strip()
    else:
        merged.append(line)

with open("user_modified.ldif", "w") as f:
    f.write("\n".join(merged) + "\n")
customcommander
j.doe
  • It's working, thanks. How do you separate two entries with an empty line? They come out continuous – Radha May 13 '19 at 07:56
  • @Radha the code is clear and obvious. I just wrote and ran this code; I didn't do anything else – j.doe May 13 '19 at 08:08
  • No, I mean I want to leave a blank line after street and company, basically after each LDAP entry, but here I am getting them continuous, without separation – Radha May 13 '19 at 08:23

Okay, here is my approach:

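import re

# An attribute line contains "name:"; anything else is a continuation.
pattern = re.compile(r"(\w+):(.*)")

with open("use.ldif", "r") as f:
    new_lines = []

    for line in f:
        # Drop the trailing newline, if any.
        line = line.rstrip("\n")

        # Keep blank lines so that entries stay separated.
        if line == "":
            new_lines.append(line)
            continue

        if pattern.search(line) or not new_lines:
            new_lines.append(line)
        else:
            # Continuation: join it to the previous line without the leading space.
            new_lines[-1] += line.lstrip()

with open("user_modified.ldif", "wt") as f:
    f.write("\n".join(new_lines) + "\n")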

Looking at your code, I suggest reading up a bit on iterating over files. Maybe you are still a beginner with Python, but your code processes the whole file three times: once in read(), once in split('\n'), and once more in the for statement. When you open a file, what you get is a file object, and as you can see in my code, you can iterate over it directly, getting one line on each step. For larger files this becomes an important performance consideration.
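To illustrate the line-by-line iteration described above, here is a minimal sketch written as a generator. It rests on an assumption that differs from the code above: instead of looking for a colon, it treats any line that starts with a space as a continuation, which is how LDIF line folding is defined in RFC 2849. The name unfold_ldif is hypothetical, and the sample data is a shortened version of the input from the question.

```python
import io


def unfold_ldif(lines):
    """Yield logical LDIF lines, joining folded continuation lines.

    `lines` can be any iterable of strings, such as an open file object.
    """
    current = None  # logical line being assembled; None means nothing pending
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(" ") and current is not None:
            # Continuation line: drop the single folding space and append.
            current += line[1:]
        else:
            if current is not None:
                yield current
            current = line
    if current is not None:
        yield current


# Shortened sample from the question; io.StringIO stands in for an open file.
sample = io.StringIO(
    "dn: cnh\n"
    "objectclass: nshd shdghsf shgdhfjh jhghhghhgh\n"
    " hjgfhgfghfhg\n"
    "street: shgdhgf\n"
    "\n"
    "dn: cnh\n"
)
result = list(unfold_ldif(sample))
print("\n".join(result))
```

With a real file, the same function could be fed the file object directly, for example by writing each yielded line to the output file inside a with open("use.ldif") as f: block. Blank lines pass through unchanged, so entries stay separated.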

OSainz
  • How can I separate each entry with an empty line, as shown in the expected output? – Radha May 13 '19 at 08:13
  • @Radha just add a way to skip the empty lines of the input file. I edited my solution to cover that. – OSainz May 13 '19 at 09:20
  • @Radha don't forget to mark it as the solution, so that it is helpful to other users with the same kind of problem. – OSainz May 13 '19 at 10:03