I am trying to extract a table from a PDF with tabula. The table is identified correctly, but its data is extracted as Unicode codes rather than readable string data.
from tabula import read_pdf
df = read_pdf('https://watermark.silverchair.com/fsab153.pdf?token=AQECAHi208BE49Ooan9kkhW_Ercy7Dm3ZL_9Cf3qfKAc485ysgAAAs0wggLJBgkqhkiG9w0BBwagggK6MIICtgIBADCCAq8GCSqGSIb3DQEHATAeBglghkgBZQMEAS4wEQQMOXfntjWl9L87SyaXAgEQgIICgMSxXbyEzl4Y3sDeaGncgcE9V93d46LWUAnMiKz0KtHAKJA1HpPuefZZzrhJlD_hNUzK9C4uWwF1EfAbe0aWG3c_sFLetD5kqOWXzuGARvCRWOvmAEKpgtx0Desj5MY9lH7Zp7XxbfLBLScOIK6X_qEZ3Low6GkQfm1iBCbVHUg9ueKxLaYghX--uHPqmx43RZHk8bAjoDdMDT9lPsVXqlZJkmS2UT6T3uzC1jPTz3eON93C5CaEpW4lG_zvzMMltlZZm04Zz1vWd7WsXa_Gvc1gwO1AwUNcBxrRrr7Af5U02SPMaFF8dL0cOqrpw24LPzrg8ibtBq9yKidnCM-B2z74goz41kzv2KNZoPYQLj5XYlbyTknoE-MDo6cq_tGMw7igxbsrKUbGzSGILZ-bDQAVTyGKlU1QudNbZd4lDOe36kdr6dlhWHe7aK6vQgczTOYvQ0v1G5HwouxwTO0WPVpxawld76AZLhathmV4fMmNAYFpZDOytT4YAZEj-jjkPvzJ7HeA_-7ifmtwqLiOSILbLuJgEhLQ5frm9YXSn3crSInflJEsMm6Bs8pE_5H8vdex2tXzL6ZmHiDkDMdB_YM8iOhJGdMfZWsCJ0TcrtZyWZv5t-M1NzhLutplX-mYInE1sXZSTLHcOD0YDhEeMPNJhdGvISG_IbwDfH9OKuGQ0x8UCoe2DPVKOd53PYghKf2Bk8q7tILs3WeHgItnvRbkevjYS287gh_5052TKJJbC8dYxkVlHn-JCsbaMfn_SlYSaWjOfVxvSHKsVlFj5ry-cfScH8ai1bra8LASgwg4y_vpNeeDiA0CwZaPy2l_TF1O_yFsaKItyDkCMJXqhjI', pages=3)[0]
df['Unnamed: 0']
What is the correct way to extract the data in UTF-8 or ASCII?
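In case it helps narrow this down, here is a minimal sketch of how the extracted characters could be inspected with the standard-library `unicodedata` module. The helper names `describe_codepoints` and `normalize_cell` and the sample string are my own and purely illustrative; the sample is not taken from the actual PDF.

```python
import unicodedata


def describe_codepoints(text):
    """Return (code point, Unicode name, general category) for each character."""
    rows = []
    for ch in text:
        rows.append((
            f"U+{ord(ch):04X}",
            unicodedata.name(ch, "<no name>"),  # fallback for unnamed code points
            unicodedata.category(ch),           # e.g. "Lu", "Ll", "Co"
        ))
    return rows


def normalize_cell(value):
    """Apply NFKC normalization to strings; leave other values (e.g. NaN) untouched."""
    if isinstance(value, str):
        return unicodedata.normalize("NFKC", value)
    return value


# Hypothetical cell value for illustration only (a ligature plus a private-use character)
sample = "\ufb01sh \uf041"
for row in describe_codepoints(sample):
    print(row)
print(normalize_cell(sample))
```

On the DataFrame itself, something like `df['Unnamed: 0'].map(normalize_cell)` should apply the same normalization to every cell. If the category reported for the characters is `Co` (private use), normalization will not change them, which would suggest the PDF's font uses a custom glyph mapping rather than standard Unicode text.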
Edit: Something on my system (Debian) is able to interpret these codes, though (see below), so the question is: how do I get this information out?