I have the code below for a Discord bot that displays active fire calls in Toronto. When a lot of fire trucks are dispatched to a call and the scraped data is put into a tabulate table, the dispatched units overflow into other columns and rows. I want them to be listed underneath each other within their own column.

import discord
import requests
from bs4 import BeautifulSoup
from tabulate import tabulate

client = discord.Client()

@client.event
async def on_ready():
    print('We have logged in as {0.user}'.format(client))

@client.event
async def on_message(message):
    if message.author == client.user:
        return

    if message.content.startswith('$hello'):
        await message.channel.send('Hello!')


if __name__ == '__main__':
    endpoint = "https://www.toronto.ca/data/fire/livecad.xml?i4sqso"
    header = [
        "Prime Street", "Cross Street", "Dispatch Time", "Incident Number",
        "Incident Type", "Alarm Level", "Area", "Dispatched Units"
    ]

    page = requests.get(endpoint).text
    events = BeautifulSoup(page, "lxml").find_all("event")

    event_table = []
    for event in events:
        row = event.getText(separator="|").split("|")
        if len(row) == 7:
            row.insert(1, "")
        event_table.append(row)
meowulf
  • please fix the code indentation – meowulf May 13 '21 at 00:03
  • Does this answer your question? [Beautiful Soup/Panada Table Parsing only parsing headers](https://stackoverflow.com/questions/67497488/beautiful-soup-panada-table-parsing-only-parsing-headers) – baduker May 13 '21 at 06:47

1 Answer

There is no need to split the text, since you are working with XML. Just iterate over each event and take the text from each of its child elements:

import requests
from bs4 import BeautifulSoup
import pandas as pd

if __name__ == '__main__':
    endpoint = "https://www.toronto.ca/data/fire/livecad.xml?i4sqso"
    header = [
        "Prime Street", "Cross Street", "Dispatch Time", "Incident Number",
        "Incident Type", "Alarm Level", "Area", "Dispatched Units"
    ]

    page = requests.get(endpoint).text
    events = BeautifulSoup(page, "lxml").find_all("event")
    event_table = []
    
    for event in events:
        event_table.append([e.get_text(strip=True) for e in event])

    df = pd.DataFrame(event_table, columns=header)
    print(df)

This would give you a table as follows:

        Prime Street             Cross Street        Dispatch Time Incident Number                 Incident Type Alarm Level Area                           Dispatched Units
0  LYNEDOCK CRES, NY  FENSIDE DR / CLIMANS RD  2021-05-12 10:40:37       F21044501            Fire - Residential           2  233  R115, P233, P245, P123, A244, C11, C24...
1    DUFFERIN ST, NY  SPARROW AVE / RANEE AVE  2021-05-13 03:26:13       F21044810  Fire - Commercial/Industrial           2  145  P145, R133, P132, P143, A341, C13, R34...
2                M1R                           2021-05-13 04:10:49       F21044814                       MEDICAL           0  233                                       P233
3                M2N                           2021-05-13 04:31:47       F21044816                       MEDICAL           0  114                                       P114
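
Note that pandas shortens the long Dispatched Units cells with "..." in this output. To list the units underneath each other instead, one option is to break that cell into several lines before the table is built. The following is only a minimal sketch: `stack_units` is a hypothetical helper (it is not part of the code above), and the sample row is a shortened copy of the second row shown above, used purely for illustration.

```python
def stack_units(units: str, per_line: int = 1) -> str:
    """Break a comma-separated unit list into lines of `per_line` units each."""
    # Split on commas and drop empty fragments and surrounding whitespace.
    names = [u.strip() for u in units.split(",") if u.strip()]
    # Group the units into chunks of `per_line` and join each chunk on one line.
    lines = [", ".join(names[i:i + per_line])
             for i in range(0, len(names), per_line)]
    return "\n".join(lines)


# Illustrative row in the same column order as `header` (unit list shortened).
row = [
    "DUFFERIN ST, NY", "SPARROW AVE / RANEE AVE", "2021-05-13 03:26:13",
    "F21044810", "Fire - Commercial/Industrial", "2", "145",
    "P145, R133, P132, P143, A341",
]

# Rewrite only the last column ("Dispatched Units") so the units are stacked.
row[-1] = stack_units(row[-1], per_line=2)
print(row[-1])
```

Applied to every row of `event_table`, this should fit with the `tabulate` approach from the question, because `tabulate` renders cells containing newlines as multi-line cells in formats such as `grid`, for example `tabulate(event_table, headers=header, tablefmt="grid")`. The value of `per_line` is a free choice; 1 puts every unit on its own line.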
Martin Evans