2
<table class="gridtable">
<tbody>
<tr>
<th>Store #</th><th>City Name</th><th>Orders</th></tr>
<tr><td>1</td><td style="text-align:left">Phoenix</td><td>70</td></tr>
<tr><td>2</td><td style="text-align:left">Columbus</td><td>74</td></tr>
<tr><td>3</td><td style="text-align:left">New York</td><td>112</td></tr>
<tr><td></td><td>TOTAL</td><td>256</td></tr></tbody>
</table>

I have played around with variations of the following but can't:

1) Show all the rows

2) Elegantly display the results as it would look if I looked at it on the actual page

import requests
from bs4 import BeautifulSoup

req = requests.get('Page.html')
soup = BeautifulSoup(req.content, 'html.parser')
tables = soup.find_all('table')
table = tables[0]
print(table.text)
John S
  • 85
  • 10

1 Answers1

4

Gather your text data into a flat array of individual rows and cells. Transpose this, so everything per column is gathered into a row. Create an array containing the length of the longest item per (originally) column. Use this data to space out each cell, while printing rows. In code:

from bs4 import BeautifulSoup

content = '''
<table class="gridtable">
<tbody>
<tr>
<th>Store #</th><th>City Name</th><th>Orders</th></tr>
<tr><td>1</td><td style="text-align:left">Phoenix</td><td>70</td></tr>
<tr><td>2</td><td style="text-align:left">Columbus</td><td>74</td></tr>
<tr><td>3</td><td style="text-align:left">New York</td><td>112</td></tr>
<tr><td></td><td>TOTAL</td><td>256</td></tr></tbody>
</table>
'''

def print_table_nice(table):
    cells = [[cell.text for cell in row.find_all(['td','th'])] for row in table.find_all('tr')]
    transposed = list(map(list, zip(*cells)))
    widths = [str(max([len(str(item)) for item in items])) for items in transposed]
    for row in cells:
        print (' '.join(("{:"+width+"s}").format(item) for width,item in zip(widths,row)))

soup = BeautifulSoup(content, 'html.parser')
tables = soup.find_all('table')
table = tables[0]
print_table_nice(table)

Result:

Store # City Name Orders
1       Phoenix   70    
2       Columbus  74    
3       New York  112   
        TOTAL     256

which seems about as elegant as you can do on a console. (To add vertical lines, just join the rows with a | instead of a space.)

I inlined the table data because I don't have access to your Page.html, but getting access to the table data does not seem to be the problem here.


Oh let's add lines all around. Just because I can:

def print_table_nice(table):
    header = [cell.text for cell in table.select('tr th')]
    cells = [[cell.text for cell in row.select('td')] for row in table.select('tr') if row.select('td')]
    table = [header]+cells
    transposed = list(map(list, zip(*table)))
    widths = [str(max([len(str(item)) for item in items])) for items in transposed]
    print ('+'+('-+-'.join('-'*int(width) for width in widths))+'+')
    print ('|'+(' | '.join(("{:"+width+"s}").format(item) for width,item in zip(widths,header)))+'|')
    print ('+'+('-+-'.join('-'*int(width) for width in widths))+'+')
    for row in cells:
        print ('|'+(' | '.join(("{:"+width+"s}").format(item) for width,item in zip(widths,row)))+'|')
    print ('+'+('-+-'.join('-'*int(width) for width in widths))+'+')

It turned out to be an interesting complication because this requires the th to be separated from the td rows. Won't work as-is for multi-line rows, though. Result, then, is:

+--------+-----------+-------+
|Store # | City Name | Orders|
+--------+-----------+-------+
|1       | Phoenix   | 70    |
|2       | Columbus  | 74    |
|3       | New York  | 112   |
|        | TOTAL     | 256   |
+--------+-----------+-------+
Jongware
  • 22,200
  • 8
  • 54
  • 100
  • @JohnS You can use [`pymysql`](https://pymysql.readthedocs.io/en/latest/). Just modify the `print_table_nice` function to return a nested list and insert each item. – t.m.adam Oct 26 '18 at 04:02
  • 1
    @JohnS Yes, here: [Inserting a list holding multiple values in MySQL using pymysql](https://stackoverflow.com/questions/42039794/inserting-a-list-holding-multiple-values-in-mysql-using-pymysql) you will find an example of SQL query with a nested list, and [here](https://pymysql.readthedocs.io/en/latest/modules/cursors.html#pymysql.cursors.Cursor.executemany) you can find the documantation. – t.m.adam Jul 30 '19 at 20:02