
I'm trying to parse a text file, and am using almost the same line of code in two places. Both are essentially the following:

for line in scratch:
    line = line.lstrip(u' ¶.1234567890')

The version where this works is reading from a .txt file, and the version where it doesn't is reading from a block of unicode. Anyone have any idea why it doesn't work on the text?

EDIT: To clarify, here's what I mean.

Working:

text = u'¶3. Foo Bar\n¶4. Foo Bar'
for line in text:
    line = line.lstrip(u' ¶.1234567890')
print (text)
*Foo Bar
Foo Bar*

Not working:

text = u'¶3. Foo Bar\n¶4. Foo Bar'
for line in text:
    line = line.lstrip(u' ¶.1234567890')
print (text)
¶3. Foo Bar
¶4. Foo Bar
Ben Forde
  • make sure `scratch = block_of_unicode.splitlines()` – Joran Beasley Jul 29 '14 at 23:42
  • How do you define "not working"? Does it throw an error? If so, what error? Have you verified that each line contains what you assume it does? – Bryan Oakley Jul 29 '14 at 23:44
  • Without an [MCVE](http://stackoverflow.com/help/mcve), problems like this are hard to debug. Show us a complete program, the input, the expected output, and the actual output and we can probably find the problem; just telling us "it doesn't work" for data you've only vaguely described and code we've only got a fragment of isn't enough. – abarnert Jul 29 '14 at 23:49
  • Meanwhile, what encoding did you save this source file in? Do you have a valid encoding declaration so Python knows the right encoding? Because if not, it's possible that it only happens to work because the mojibake in your source file is canceled out by mojibake in the text file… – abarnert Jul 29 '14 at 23:49
  • @BenForde: You clearly haven't run that code, because it doesn't do what you claim it does. Which means it's not an MCVE. (Also, the two examples are identical, and yet you're claiming they produce different results?) – abarnert Jul 29 '14 at 23:50
  • Also, if `text` is a string, `for line in text:` means that `line` holds each character one by one, not each line. See @JoranBeasley's very first comment. – abarnert Jul 29 '14 at 23:52
  • Even more importantly, just reassigning a new value to `line` doesn't affect `text` in any way. You're just saying "forget about the string you got from the text and named `line`, remember this new string and name it `line` instead." – abarnert Jul 29 '14 at 23:53
  • @abarnert: See, now that's helpful. I realized that when it does work elsewhere in the program, it's because all the code that relies on it is actually manipulating `line` itself. So what I need now is a way to transfer the adjusted line back to the text block. I'd initially used `text = text.split(u'\n') for i in range(0, len(text): text[i] = text[i].lstrip(u'¶.1234567890') text = u'\n'.join(text)`, but that didn't work either. – Ben Forde Jul 30 '14 at 00:00
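
The two points raised in the comments above (iterating over a string yields characters, and rebinding `line` does not change `text`) can be checked with a short sketch. It assumes the source file is saved as UTF-8; the names `first_items`, `stripped`, and `cleaned` are illustrative and do not come from the original post.

```python
# -*- coding: utf-8 -*-
text = u'¶3. Foo Bar\n¶4. Foo Bar'

# Iterating over a string yields single characters, not lines.
first_items = list(text)[:3]

# Rebinding the loop variable only changes what the name `line` refers to;
# the string bound to `text` is left untouched.
for line in text.splitlines():
    line = line.lstrip(u' ¶.1234567890')

# To keep the result, collect the stripped lines and join them back together.
stripped = [line.lstrip(u' ¶.1234567890') for line in text.splitlines()]
cleaned = u'\n'.join(stripped)
print(cleaned)
```

Note that the strip set here includes the space character. Without it, the space after `3.` would be left at the start of each line.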

1 Answer

text = u'¶3. Foo Bar\n¶4. Foo Bar'
# Strip the leading marker characters from each line, then rejoin the lines.
text = u"\n".join(line.lstrip(u' ¶.1234567890') for line in text.splitlines())
print(text)

Maybe? You could probably do this more neatly with a regex, but I'm not sure.
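
For what it's worth, a regex version might look like the sketch below. It is only one possible approach, assuming a UTF-8 source file and that the prefix to remove is any leading run of spaces, pilcrows, dots, and ASCII digits; the names `prefix` and `cleaned` are made up for illustration.

```python
# -*- coding: utf-8 -*-
import re

text = u'¶3. Foo Bar\n¶4. Foo Bar'

# With re.MULTILINE, ^ matches at the start of every line, so the leading
# run of spaces, pilcrows, dots, and digits is removed from each line.
prefix = re.compile(u'^[ ¶.0-9]+', re.MULTILINE)
cleaned = prefix.sub(u'', text)
print(cleaned)
```

Unlike the `splitlines()` and `join` approach, this leaves the line endings in place, so a trailing newline in `text` would be preserved.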

Joran Beasley