I'm running into an encoding problem when a response is passed to BeautifulSoup.
The readable output of the response is formatted correctly, e.g. Artikelstandort: Österreich, but after running BeautifulSoup it is transformed into a garbled version of Artikelstandort: Österreich. Here is the changed code:
    from bs4 import BeautifulSoup

    def formTest(browser, formUrl, cardName, edition):
        browser.open(formUrl)
        # Raw bytes of the response
        data = browser.response().read()
        with open('analyze.txt', 'wb') as textFile:
            print 'writing file'
            textFile.write(data)
        # BS4 -> needs from_encoding
        soup = BeautifulSoup(data, from_encoding='latin-1')
        # Attempted fix; note that soup is a plain string after this line
        soup = soup.encode('latin-1').decode('utf-8')
        table = soup.find('table', {"class": "MKMTable specimenTable"})
data contains the correct text, but soup has the wrong encoding. I tried various encode/decode combinations on soup, but none of them worked.
The page where I pull my data from is: https://www.magickartenmarkt.de/Mutilate_Magic_2013.c1p256992.prod
Edit: I changed the encoding with prettify as suggested, but now I'm facing the following error:
TypeError: slice indices must be integers or None or have an __index__ method
What did prettify change? I printed the new output and the table is still in the soup (<table class="MKMTable specimenTable">).
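One possible explanation, sketched here without BeautifulSoup: prettify() returns a plain string, and so does soup.encode(...).decode(...), so a later soup.find('table', {...}) ends up calling the string method find(sub, start) with the attrs dict in the position of the start index. The markup value below is a hypothetical stand-in for that returned string, not the real page content.

```python
# Hypothetical stand-in for the string returned by soup.prettify()
# (or by soup.encode(...).decode(...)); it is no longer a soup object.
markup = u'<table class="MKMTable specimenTable"></table>'

try:
    # On a string, find() means str.find(sub, start), so the attrs dict
    # is interpreted as the start index of a slice.
    markup.find('table', {"class": "MKMTable specimenTable"})
except TypeError as exc:
    error_name = type(exc).__name__
    error_text = str(exc)
else:
    error_name = None
    error_text = None

print(error_name)
print(error_text)
```

If this is what happens, keeping the BeautifulSoup object in soup and storing the prettified text under a different name would leave soup.find usable.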
Edit 2: The new error is raised at soup.encode('latin-1').decode('utf-8'):
UnicodeDecodeError: 'utf8' codec can't decode byte 0xfc in position 518: invalid start byte
If I try other combinations of encodings and decodings, I get the same kind of error for a different byte.
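This kind of error can be reproduced without the site. The sketch below is only an illustration under assumptions: the sample strings are made up (the word with ü is hypothetical, chosen because ü is byte 0xfc in latin-1), and it has not been checked against the actual page. It shows that the encode('latin-1').decode('utf-8') round trip repairs text only when UTF-8 bytes were wrongly decoded as latin-1. If the text is already correct, the round trip itself fails.

```python
# Sample text; \xd6 is Ö and \xfc is ü.
text = u'Artikelstandort: \xd6sterreich'
already_correct = u'f\xfcr'  # hypothetical sample containing ü

# Case 1: UTF-8 bytes wrongly decoded as latin-1 give garbled text,
# and the latin-1/UTF-8 round trip restores the original.
garbled = text.encode('utf-8').decode('latin-1')
repaired = garbled.encode('latin-1').decode('utf-8')

# Case 2: text that was decoded correctly. Encoding it as latin-1
# produces the single byte 0xfc, which is not valid UTF-8.
try:
    already_correct.encode('latin-1').decode('utf-8')
    round_trip_failed = False
except UnicodeDecodeError:
    round_trip_failed = True

print(garbled != text)
print(repaired == text)
print(round_trip_failed)
```

A round trip that fails on 0xfc would therefore suggest that at least part of the text in soup was already decoded correctly at that point.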