Questions tagged [chardet]

chardet is a Python module for character encoding detection.

See pypi project page.

36 questions
2
votes
1 answer

Error while parsing a page with BeautifulSoup4, Chardet and Python 3.3 in Windows

I get the following error when I try to call BeautifulSoup(page): Traceback (most recent call last): File "error.py", line 10, in <module> soup = BeautifulSoup(page) File "C:\Python33\lib\site-packages\bs4\__init__.py", line 169, in __init__ …
Lazik
  • 2,480
  • 2
  • 25
  • 31
2
votes
0 answers

Python fix a broken encoding

I have a small icecast2 home server with Django playlist management. I also have a lot of MP3s with broken encodings. First, I tried to find an encoding repair tool for Python, but haven't found anything that works for me (python-ftfy, nltk - it…
night-crawler
  • 1,409
  • 1
  • 26
  • 39
2
votes
1 answer

chardet runs incorrectly in Python 3

I am using chardet 2.01 in Python 3.2. The source code is like the one on this site, http://getpython3.com/diveintopython3/case-study-porting-chardet-to-python-3.html, and can be downloaded here…
eternalblaze
  • 57
  • 1
  • 6
1
vote
1 answer

chardet.detect returns empty language

I'm using chardet.detect to detect the language of a string, as in one of the solutions suggested here. My code looks like this: import…
kaki gadol
  • 1,116
  • 1
  • 14
  • 34
1
vote
0 answers

RequestsDependencyWarning: urllib3 (1.23) or chardet (2.3.0) doesn't match a supported version

I get the error below when I try to run Cisco nxapi code: /usr/local/lib/python2.7/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.23) or chardet (2.3.0) doesn't match a supported…
1
vote
1 answer

Python chardet cannot detect UTF-8 correctly

#!/usr/bin/env python3 # -*- coding: utf-8 -*- import chardet s = '123'.encode('utf-8') print(s) print(chardet.detect(s)) ss ='编程'.encode('utf-8') print(chardet.detect(ss)) and results b'123' {'encoding': 'ascii', 'confidence': 1.0, 'language':…
alwayslz
  • 119
  • 1
  • 8
0
votes
1 answer

What is the difference between "urllib.request.urlopen" and simple "open" for CSV handling

I'm re-reading old code of mine and I wonder why I used, several times, file_2_test = urllib.request.urlopen('file://' + file).read() when (to my mind) open(file) would have been sufficient. I couldn't find an explanation anywhere. I suppose at this…
Abpostman1
  • 158
  • 1
  • 8
0
votes
0 answers

Identify in list what data is not in UTF-8 format

I am ingesting data from CSV files into my Postgres DB through AWS, and I am running into issues because some of the data is not in UTF-8 format. I want to identify which rows in my data are causing the issue so that I can address at…
user2772056
  • 135
  • 1
  • 2
  • 9
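
One way to approach the question above is to read the CSV in binary mode and try to decode each line as UTF-8, collecting the ones that fail. The following is a minimal sketch using only the standard library; the function name `find_non_utf8_rows` and the sample data are hypothetical, and it assumes one row per physical line (quoted fields containing newlines would need extra handling).

```python
from typing import Iterable, List, Tuple


def find_non_utf8_rows(lines: Iterable[bytes]) -> List[Tuple[int, bytes, str]]:
    """Return (row_number, raw_bytes, reason) for each line that is not valid UTF-8."""
    bad_rows = []
    for row_number, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.reason is a short description such as "invalid continuation byte"
            bad_rows.append((row_number, raw, exc.reason))
    return bad_rows


# Hypothetical sample standing in for a CSV file opened with open(path, "rb").
sample = [
    b"id,name\n",
    "1,caf\u00e9\n".encode("utf-8"),    # valid UTF-8
    "2,caf\u00e9\n".encode("latin-1"),  # same text, but Latin-1 bytes
]

for row_number, raw, reason in find_non_utf8_rows(sample):
    print(row_number, raw, reason)
```

With a real file, passing the open binary file object itself (it iterates line by line) avoids loading everything into memory.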
0
votes
1 answer

Python: chardet.detect with a big binary object

I get some big files from a web page. They're binary. I need to scan them to detect their encoding, but chardet.detect makes my script too slow. I thought of using readline, but I can't because I only have binary data. Is it possible to do something like…
Sam
  • 51
  • 6
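
For large inputs like the ones described above, a common approach is to feed the data to a detector in chunks and stop as soon as it is confident, instead of passing the whole payload to chardet.detect. The sketch below assumes a detector object with the `reset()`, `feed()`, `close()`, `done` and `result` members that chardet's `UniversalDetector` provides; the function name, chunk size, and byte cap are hypothetical choices, not values from the question.

```python
import io
from typing import BinaryIO


def detect_incrementally(stream: BinaryIO, detector,
                         chunk_size: int = 64 * 1024,
                         max_bytes: int = 1024 * 1024) -> dict:
    """Feed `stream` to `detector` chunk by chunk and return detector.result.

    Stops when the detector reports it is done, the stream is exhausted,
    or at least `max_bytes` bytes have been consumed.
    """
    detector.reset()
    consumed = 0
    while consumed < max_bytes and not detector.done:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        detector.feed(chunk)
        consumed += len(chunk)
    detector.close()
    return detector.result


# Usage with chardet (assumes the package is installed; the path is hypothetical):
#
#     from chardet.universaldetector import UniversalDetector
#     with open("big_file.bin", "rb") as fh:
#         print(detect_incrementally(fh, UniversalDetector()))
```

Capping the number of bytes trades accuracy for speed: the guess is based only on the beginning of the data, so it may be wrong for files whose encoding-specific bytes appear later.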
0
votes
1 answer

chardet on simple UTF-16-LE text file

I have tried detecting the encoding of a simple UTF-16-LE text file in Python 3 with the chardet package, using the following code: rawdata = open(filename, 'rb').read() result = chardet.detect(rawdata) print(result['encoding'], result['confidence']) The…
Larrax
  • 87
  • 1
  • 8
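
When a file is expected to be UTF-16 or UTF-32, checking for a byte order mark (BOM) before falling back to statistical detection can be a useful complement. The sketch below uses only the standard library's `codecs` constants; `sniff_bom` is a hypothetical helper name, and it returns None for files written without a BOM, in which case another method is still needed.

```python
import codecs
from typing import Optional

# Longer BOMs are checked first: the UTF-32-LE BOM begins with the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(raw: bytes) -> Optional[str]:
    """Return an encoding label if `raw` starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return None


# Hypothetical sample standing in for open(filename, "rb").read().
rawdata = codecs.BOM_UTF16_LE + "hello".encode("utf-16-le")
print(sniff_bom(rawdata))
```

Only the first four bytes matter here, so reading a short prefix of the file is enough. Note that decoding with the returned "utf-16-le" label keeps the BOM as a leading U+FEFF character, which the caller may want to strip.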
0
votes
1 answer

Decode unknown string

I have one source of data, that I don't control, and that sends strings with different encodings, and I have no way to know the encoding in advance! I would need to know the format to be able to correctly decode and store properly in a format that I…
0
votes
1 answer

How to detect encoding of a file format

I have files in an S3 bucket and I am reading them as a stream. I want to detect the encoding of the different files. I used the chardet library and I am getting this error: TypeError: Expected object of type bytes or bytearray, got:
Suhas Kashyap
  • 398
  • 3
  • 14
0
votes
1 answer

Package is installed but not recognized

I'm trying to use the chardet package in Python, on Visual Studio 2017 15.6.2. Even though I have the chardet package installed, it is not recognized. What could possibly be wrong? Here is a screen capture: https://i.stack.imgur.com/vtCB5.png If I try to…
jonuko
  • 39
  • 1
  • 5
0
votes
1 answer

UTF-8 encoded file is picked by chardetect as ASCII

I am writing a single file combining all the files present inside the folder. I want the text file to be UTF-8 encoded. My code is as follows: import os import codecs import re def file_concatenation(path): with…
Jayashree
  • 811
  • 3
  • 13
  • 28
0
votes
3 answers

Runtime Error when trying to launch Jupyter Notebook (Python)

I usually work with the Jupyter Notebook interface when programming in Python, but recently I installed bioservices through pip (Bioservices), and when I tried to open the Jupyter Notebook I got the following runtime error: ~$ jupyter…
mgrc
  • 1
  • 1
  • 2