Python replace 0d in txt

Question

I have a source txt file (exported from an accounting program) that contains a 0a byte inside lines where it isn't needed (it causes a line break), and 0d 0a in the places where a line break is actually needed. I need to open the file in Excel (I can also download the data as csv). When I download almost the same data as xml, I run into the same problem when reading it with Python, but there I solved it with:

    for i in range(1, 16):
        lstFile.append(str(file))
        lstAmount.append(str(amount))
        lstKey.append(str(keys[i-1]))
        if accPay.find(keys[i-1]) is None:
            lstValue.append("none")
        else:
            lstValue.append(accPay.find(keys[i-1]).text.replace(u'\u000d', ' '))

But I can't replace 0a separately.

When I write

    with open(file, 'r') as file:
        filedata = file.read()
    filedata = filedata.replace(u'\u000a', ' ')
    with open('Konten_last5.txt', 'w') as file:
        file.write(filedata)

I get all 0a and 0d 0a replaced by 20 (space).

When I write

    with open(file, 'r') as file:
        filedata = file.read()
    filedata = filedata.replace(u'\u000d', ' ')
    with open('Konten_last5.txt', 'w') as file:
        file.write(filedata)

I get 0d 0a everywhere, in both kinds of places. Please help!

I also tried to replace the pair on its own with replace(u'\u000d\u000a', 'any'), but that doesn't work: the combination is never found.

I tried the suggested solution, but it doesn't work. I couldn't attach a picture in a comment.

Apply _negative Lookbehind_. `import re; x='A\u000aB\u000d\u000aC\u000aD'; x; re.sub("(?<!\u000d)\u000a", ' ', x)` returns `'A\nB\r\nC\nD'` and `'A B\r\nC D'`. Please [edit] your question to share a [mcve] - how do you get your data (I'd guess that you read a `csv` file)? — JosefZ, Dec 28 '21 at 20:54
Sorry, I haven't fully understood you. How can I change my code? I download a txt file. When I download csv, I have the same problem in Excel. When I extract the data I need from the xml files, I have the same problem, but `for i in range(1,16): if accPay.find(keys[i-1]) is None: lstValue.append("none") else: lstValue.append(accPay.find(keys[i-1]).text.replace(u'\u000d',' '))` helps me there. – Elina, Dec 28 '21 at 23:30
I almost get your idea, but not how to implement it. I need NOT to replace 0a when it comes together with 0d...? – Elina, Dec 28 '21 at 23:44
Sounds like an issue of LF (Line Feed) vs. CRLF (Carriage Return + Line Feed). The former is normally used as the line break on *nix and the latter on Windows. The txt file somehow has them mixed together. For simplicity's sake, you can try an online converter: https://app.execeratics.com/LFandCRLFonline/?l=en Alternatively, you can do it yourself using JosefZ's solution. It is almost a one-liner. You just need to dip your toes into regex (a useful skill). – edd, Dec 28 '21 at 23:57
I see, Python treats them like one unified symbol. I will try the solution from another question and open the file in binary mode. – Elina, Dec 29 '21 at 00:11
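
A minimal sketch of the negative-lookbehind idea from the comments above, applied to a whole file in text mode. It assumes the file is UTF-8 encoded (the question does not state the encoding), and the helper names `replace_lone_lf_text` and `clean_text_file` are hypothetical. Passing `newline=''` to `open` stops Python from translating `\r\n` to `\n` on read and from expanding `\n` on write, which is most likely why the `'\u000d\u000a'` combination was never found with the default text mode.

```python
import re

# A line feed that is NOT immediately preceded by a carriage return.
LONE_LF = re.compile(r"(?<!\r)\n")


def replace_lone_lf_text(text: str, replacement: str = " ") -> str:
    """Replace stray LF characters and leave CRLF pairs untouched."""
    return LONE_LF.sub(replacement, text)


def clean_text_file(src: str, dst: str, encoding: str = "utf-8") -> None:
    # newline='' disables newline translation in both directions,
    # so '\r\n' stays '\r\n' when reading and when writing.
    # The encoding is an assumption; adjust it to match the real file.
    with open(src, "r", encoding=encoding, newline="") as f:
        text = f.read()
    with open(dst, "w", encoding=encoding, newline="") as f:
        f.write(replace_lone_lf_text(text))
```

For example, `clean_text_file('Konten.txt', 'Konten_last5.txt')` would write a copy in which only the unpaired line feeds are turned into spaces (the input file name here is made up for illustration).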

score 1 · Accepted Answer · answered Dec 29 '21 at 00:18

Open the file in binary mode (open(file,'rb')). Then you can read and work with byte strings directly, instead of mucking with Unicode and newline translations. That's especially important on Windows, where writing a '\x0a' to a file opened in text mode results in '\x0d\x0a' being written to disk.
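
As a rough sketch of what that could look like for the file in the question: read the raw bytes, split on the real record separator `b'\r\n'`, replace the stray `b'\n'` inside each record, and join the records back together. The function names and the choice of a space (`b' '`) as the replacement are assumptions for illustration, not part of the original answer.

```python
def replace_lone_lf_bytes(data: bytes, replacement: bytes = b" ") -> bytes:
    """Replace LF bytes that are not part of a CRLF pair."""
    # Splitting on CRLF first means any LF left inside a record is a stray one.
    records = data.split(b"\r\n")
    cleaned = [record.replace(b"\n", replacement) for record in records]
    return b"\r\n".join(cleaned)


def clean_binary_file(src: str, dst: str) -> None:
    # 'rb' / 'wb' read and write the bytes exactly as they are on disk,
    # with no newline translation and no decoding step.
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(replace_lone_lf_bytes(data))
```

Note that in binary mode both arguments to `replace` must be bytes literals such as `b'\n'`, not `str` values.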

Tim Roberts

Thank you! The method worked with xml. I hadn't realized that the way a file is opened makes any difference. Thanks! – Elina Dec 29 '21 at 00:23
`Traceback (most recent call last): File "C:\Program Files\Sublime Text 3\replace.py", line 13, in <module> filedata = filedata.replace('\x0a', '\x0a') TypeError: a bytes-like object is required, not 'str'` Can't I use replace? – Elina Dec 29 '21 at 00:28
It's OK: `filedata = filedata.replace(b'\x0a', b'\x00')` worked. – Elina Dec 29 '21 at 00:36
