How to fix Hebrew strings that come out as gibberish in Python?

Question

I'm trying to automate an email service that sends each person their bus station.

To do so I need to pull some data from a Hebrew website, but all I get is a file with gibberish in it.

I have tried encoding to UTF-8, but all I get is more gibberish.

import requests
import pandas as pd

url = 'http://yit.maya-tour.co.il/yit-pass/Drop_Report.aspx?client_code=2660&coordinator_code=2669'
html = requests.get(url).content
df_list = pd.read_html(html)
df = df_list[-1]
print(df)
df.to_csv('my data.csv')

I expected the following:

רשימת פיזורים

שם הנהג סוג הרכב הערות תאור שעה

מוניות הקניון מונית A35 פיזור-שדרות 06:30

but got:

               ×©× ×× ×× ×¡×× ××¨××  ...               ×ª×××¨ ×©×¢×
0  ××× ×××ª ××§× ×××      ××× ××ª  ...  ×¤××××¨-×©××¨××ª  06:30

Alex · Accepted Answer · 2019-08-18T23:45:59.780

2

A response object's .content property gives you the raw bytes. Try .text instead, which decodes those bytes to a string using the encoding requests determines for the response:

html = requests.get(url).text
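
To see why the choice of decoding matters, here is a minimal offline sketch. It assumes the page is served as UTF-8, which matches the pattern in the garbled output above but is not confirmed for this site. The variable names are illustrative, and raw only stands in for what requests.get(url).content would return.

```python
# The text the page is expected to contain (taken from the question).
expected = "רשימת פיזורים"

# Stand-in for requests.get(url).content: the UTF-8 bytes of that text.
# Assumption: the server really sends UTF-8.
raw = expected.encode("utf-8")

# Decoding UTF-8 bytes with a single-byte codec yields mojibake:
# every Hebrew letter becomes two unrelated Latin-1 characters.
garbled = raw.decode("latin-1")

# Decoding with the matching codec recovers the Hebrew text.
fixed = raw.decode("utf-8")

# A string that was garbled this way can be repaired by undoing the wrong step.
repaired = garbled.encode("latin-1").decode("utf-8")

print(repr(garbled))
print(fixed)
```

If .text still comes back garbled, the server may not be declaring a charset. In that case requests falls back to ISO-8859-1 for text responses, and setting response.encoding = 'utf-8' before reading response.text is one way to override it, again assuming the page is UTF-8.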

More detail here: What is the difference between 'content' and 'text'

edited Aug 18 '19 at 23:45

answered Aug 18 '19 at 23:40

