-1

Right now it's set up to write to a file, but I want it to output the value to a variable instead. I'm not sure how.

from BeautifulSoup import BeautifulSoup
import sys, re, urllib2
import codecs


woof1 = urllib2.urlopen('someurl').read()
woof_1 = BeautifulSoup(woof1)
woof2 = urllib2.urlopen('someurl').read()
woof_2 = BeautifulSoup(woof2)

GE_DB = open('GE_DB.txt', 'a')

for row in woof_1.findAll("tr", { "class" : "row_b" }):
  for col in row.findAll(re.compile('td')):
    GE_DB.write(col.string if col.string else '')
GE_DB.write("   ")
GE_DB.write("\n")
for row in woof_2.findAll("tr", { "class" : "row_b" }):
  for col in row.findAll(re.compile('td')):
    GE_DB.write(col.string if col.string else '')
GE_DB.write("\n")
GE_DB.close()
Pevo
  • It would help enormously if you explained (1) what you understand "output the value to a variable" to mean and once that's accomplished (2) what your script is going to do with the "variable" -- just falling off the end of the script doesn't seem worth the effort of step 1. – John Machin Mar 04 '10 at 23:50
  • OK, so when you run the above script on a site with a table, it takes what's in between the td tags. I'd like it to store that value in a variable. – Pevo Mar 04 '10 at 23:53
  • What's up with the mass -1 votes? – Nick Presta Mar 05 '10 at 00:17

4 Answers

-1
values = []
for row in woof_1.findAll("tr", { "class" : "row_b" }):
  for col in row.findAll(re.compile('td')):
    if col.string:
      values.append(col.string)
result = ''.join(values)
Li0liQ
  • I'm getting an invalid syntax error for `if (col.string)`, on the `)`. Not sure why. =/ Something I did? – Pevo Mar 04 '10 at 23:49
  • @Pevo, sorry for that, I've missed a colon after if statement. Corrected it. – Li0liQ Mar 04 '10 at 23:54
  • Your correspondent omitted a necessary `:` but included redundant `(` and `)` ;-) – John Machin Mar 04 '10 at 23:56
  • Can the difference between how this one works and how Jonathan Feinberg's answer works be explained to me? – Pevo Mar 04 '10 at 23:59
  • @John Machin, it's a kind of habit while dealing with other languages. Corrected that also :). – Li0liQ Mar 05 '10 at 00:00
  • StringIO (http://docs.python.org/library/stringio.html) is a module that lets you read and write strings as if they were a file, so it's a natural replacement for file input/output operations. My solution is more suitable if you want to do some processing on the values (the values list in particular) retrieved from the table before merging them into one big string. – Li0liQ Mar 05 '10 at 00:04
  • When I use this code my output is [u' ']. Why are the [u' and '] present? – Pevo Mar 05 '10 at 00:06
  • Yes, I'm much more interested in this solution, as you read my mind. I'll update my code with what I'd like to do. – Pevo Mar 05 '10 at 00:07
  • @Pevo, the single quotes mark the beginning and the end of the string, and u means it is a unicode string. The brackets may indicate that you are looking at the list of strings (i.e. the values list), not at the resulting string (i.e. result). – Li0liQ Mar 05 '10 at 00:09
  • @li0liQ thanks for the solution. If I add a lot of URLs to parse tables from, will I need to worry about any problems? – Pevo Mar 05 '10 at 00:13
  • @Pevo, you are welcome. A good manner is to accept the solution that helped you. Well, just don't mix the data from different urls :). – Li0liQ Mar 05 '10 at 00:18
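
To make the list-versus-string distinction from the comments above concrete, here is a minimal sketch. It is written for Python 3 (so the `u` prefix will not show up), and the `pages` data and the `collect_cells` helper are made-up stand-ins for the `col.string` values BeautifulSoup would return, with `None` playing the role of a cell that has no string. Keeping one entry per page is one possible way to avoid mixing data from different URLs.

```python
def collect_cells(rows):
    """Return the non-empty cell strings from all rows, in document order."""
    return [cell for row in rows for cell in row if cell]


# Hypothetical data: each page maps to its table rows, each row to its
# cell strings. None mimics a <td> for which col.string is None.
pages = {
    "page_a": [["Rune axe", None, "7,204"]],
    "page_b": [["Yew logs", "412", None]],
}

# One list of values per page, so data from different URLs stays separate.
values_by_page = {name: collect_cells(rows) for name, rows in pages.items()}

# One joined string per page. A space separator is an arbitrary choice
# here; the answer above uses ''.join(values).
text_by_page = {name: " ".join(vals) for name, vals in values_by_page.items()}

print(values_by_page["page_a"])  # a list: brackets and quotes are part of its repr
print(text_by_page["page_a"])    # a plain string
```

Printing the list shows brackets and quotes because that is how Python displays a list of strings; printing the joined string shows only the text.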
-1

Maybe like this:

gedb = ""
for row in woof_1.findAll("tr", { "class" : "row_b" }):
  for col in row.findAll(re.compile('td')):
    if col.string:
      gedb += col.string

Phil Rykoff
  • String concatenation like that is generally frowned upon in Python. It's better (style- and efficiency-wise) to build up a list of strings and then `join` them (or, if the OP wanted to continue using file-like objects, use `StringIO`). See http://wiki.python.org/moin/PythonSpeed/PerformanceTips#StringConcatenation and http://www.skymind.com/~ocrow/python_string/ for more. – Will McCutchen Mar 05 '10 at 17:43
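
As a rough illustration of the two styles contrasted in the comment above, the sketch below builds the same string both ways. The `pieces` list is a hypothetical stand-in for the `col.string` values, and the code is Python 3. Both approaches give the same result; the difference is in style and, for large inputs, potentially in speed (CPython can optimize some `+=` cases, so the gap may be small in practice).

```python
# Hypothetical stand-ins for the col.string values pulled from the table.
pieces = ["alpha", "beta", "gamma"]

# Style 1: repeated concatenation. Each += may create a new string object.
concatenated = ""
for piece in pieces:
    concatenated += piece

# Style 2: collect the pieces in a list, then join them once at the end.
joined = "".join(pieces)

print(concatenated == joined)
```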
-1

Get rid of all mentions of GE_DB.

Add `outputtext = ""` near the beginning.

Replace `GE_DB.write(col.string if col.string else '')` with `outputtext += col.string if col.string else ''`.

prestomation
-2
import cStringIO as StringIO   # or import StringIO if on a fringe platform
buf = StringIO.StringIO()
for row in woof_1.findAll("tr", { "class" : "row_b" }):
  for col in row.findAll(re.compile('td')):
    buf.write(col.string if col.string else '')

result = buf.getvalue()
Jonathan Feinberg
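
For readers on Python 3, where the `cStringIO` and `StringIO` modules no longer exist, the same file-like approach can be sketched with `io.StringIO`. This is an adaptation under that assumption, not part of the original answer, and the `cells` list is made-up data standing in for the `col.string` values.

```python
import io

# Hypothetical stand-ins for the col.string values; None mimics an empty cell.
cells = ["Rune axe", None, "7,204"]

buf = io.StringIO()  # in-memory text buffer with a file-like write() method
for cell in cells:
    buf.write(cell if cell else "")  # same guard as the file-based version

result = buf.getvalue()  # everything written so far, as one str
buf.close()

print(result)
```

Because the buffer has the same `write` method as a file object, the loop body from the file-based script can stay as it is; only the object being written to changes.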