Python's .strip() doesn't save outside of if statement?

Question

I have Scrapy pulling data from a web page. An issue I've run across is that it pulls a lot of whitespace, and I've elected to use .strip() as suggested by others. I've run into an issue though:

if a.strip():
    print a
if b.strip():
    print b

Returns:

a1
b1
.
.
.

But this:

if a.strip():
    aList.append(a)
if b.strip():
    bList.append(b)
print aList, bList

Returns this:

I'm trying to simulate the whitespace that I remove with .strip() here, but you get the point. For whatever reason it adds the whitespace to the list even though I told it not to. I can even print the list inside the if statement and it shows correctly, but when I print outside the if statements it doesn't work as I intended.

Here is my entire code:

# coding: utf-8
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from scrapy.contrib.exporter import CsvItemExporter
import re
import csv
import urlparse
from stockscrape.items import EPSItem
from itertools import izip

class epsScrape(BaseSpider):
        name = "eps"
        allowed_domains = ["investors.com"]
        ifile = open('test.txt', "r")
        reader = csv.reader(ifile)
        start_urls = []
        for row in ifile:
                url = row.replace("\n","")
                if url == "symbol":
                        continue
                else:
                        start_urls.append("http://research.investors.com/quotes/nyse-" + url + ".htm")
        ifile.close()

        def parse(self, response):
                f = open("eps.txt", "a+")
                sel = HtmlXPathSelector(response)
                sites = sel.select("//div")
#               items = []
                for site in sites:
                        symbolList = []
                        epsList = []
                        item = EPSItem()
                        item['symbol'] = site.select("h2/span[contains(@id, 'qteSymb')]/text()").extract()
                        item['eps']  = site.select("table/tbody/tr/td[contains(@class, 'rating')]/span/text()").extract()
                        strSymb = str(item['symbol'])
                        newSymb = strSymb.replace("[]","").replace("[u'","").replace("']","")
                        strEps = str(item['eps'])
                        newEps = strEps.replace("[]","").replace(" ","").replace("[u'\\r\\n","").replace("']","")
                        if newSymb.strip():
                                symbolList.append(newSymb)
#                               print symbolList
                        if newEps.strip():
                                epsList.append(newEps)
#                               print epsList
                        print symbolList, epsList
                for symb, eps in izip(symbolList, epsList):
                        f.write("%s\t%s\n" % (symb, eps))
                f.close()

What does the *documentation* say [`strip`](http://docs.python.org/2/library/string.html) does? – user2864740, Nov 19 '13 at 18:27
Strings are immutable; `.strip()` cannot alter the value, so it returns a *new* stripped string object. – Martijn Pieters, Nov 19 '13 at 18:28
@Martijn See, that is what I read, so I tried assigning it to another variable, but that didn't change anything. – Alcaeus, Nov 19 '13 at 18:45

score 8 · Answer 1 · bogatron · 2013-11-19T18:32:26.060

strip does not modify the string in-place. It returns a new string with the whitespace stripped.

>>> a = '    foo      '
>>> b = a.strip()
>>> a
'    foo      '
>>> b
'foo'
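
Applied to the lists in the question, a minimal sketch might look like the following. The function name `keep_non_blank` and the sample data are hypothetical and only for illustration; the point is that the value appended is the result of `strip()`, not the original string. The snippet avoids `print` so it should run on both Python 2 and Python 3.

```python
def keep_non_blank(values):
    """Return the stripped form of every value that is not only whitespace."""
    kept = []
    for value in values:
        stripped = value.strip()   # strip() returns a new string; value is unchanged
        if stripped:               # an empty string is falsy
            kept.append(stripped)  # append the stripped copy, not the original
    return kept


# Hypothetical sample data standing in for the scraped strings.
raw = ["  a1\r\n", "   ", "\tb1  "]
cleaned = keep_non_blank(raw)
```

Here `cleaned` holds only the stripped, non-blank entries, while the strings in `raw` keep their original whitespace.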

edited Nov 19 '13 at 18:32

answered Nov 19 '13 at 18:27

I believe I tried something similar to that, where I did this: `if a.strip(): b = a` and `if c.strip(): d = c`. But when I tried to print this outside of the if statements, it printed with all the whitespace. – Alcaeus Nov 19 '13 at 18:38
As a side note, several of Python's built-in types (numbers, booleans, strings, tuples, frozensets) are immutable. – crownedzero Nov 19 '13 at 18:41
@Resin As I said, `a.strip()` returns a new string - `a` is not modified. So when you write `if a.strip(): b = a`, it will set `b` to the original (unstripped) variable `a`. – bogatron Nov 19 '13 at 19:05

score 0 · Answer 2 · answered Nov 19 '13 at 20:22

I figured out what was causing the confusion: it was the location where I declared the variable/list. I was declaring it inside the for loop, so every iteration overwrote it with a new empty list, and an empty list or variable looks the same as the outcome when my if statement is false.
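
The difference can be sketched as follows. The names `rows`, `inside` and `outside` and the sample strings are made up for illustration and are not taken from the spider above.

```python
# Hypothetical sample input; the last row is whitespace only.
rows = ["  AAPL ", " MSFT\n", "   "]

# List created inside the loop: it is reset on every iteration,
# so after the loop it only reflects the final row.
for row in rows:
    inside = []
    if row.strip():
        inside.append(row.strip())

# List created once before the loop: results accumulate across iterations.
outside = []
for row in rows:
    if row.strip():
        outside.append(row.strip())
```

Because the final row is blank, `inside` ends up empty, whereas `outside` keeps every non-blank value seen during the loop.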
