ftfy is a Python library which attempts to resolve encoding issues with Unicode data. To attract a wider audience for your "ftfy" question it is helpful to also specify the "python" tag. Additional tags which may also be relevant for your question include "unicode", "mojibake", "utf-8" and "encoding".
Questions tagged [ftfy]
7 questions
2
votes
3 answers
Fix encoding errors in csv with mixed encodings
there are some other questions here regarding this problem, but none of them fixed my problem so far.
I have a large (40MB) CSV file. Most of the file is encoded in iso-8859-1 (latin1), but some entries (just entries!) are in utf-8.
If i try to open…

dibade89
- 33
- 4
1
vote
0 answers
Python ftfy not fixing mojibake characters if the string contains \xa0
I'm trying to use ftfy Python package to fix unicode errors in a csv file but it fails at lines that contains \xa0
I don't understand why this is happning and how should it be properly fixed!
Here is an example that is causing problem:
>>> txt =…

Ahmad Al-Shishtawy
- 11
- 2
0
votes
1 answer
Python fixing mojibake using Ftfy Issue
Few of the text files that I'm importing has mojibake, so I'm trying to fix them using the ftfy library prior to feeding them to Spacy (NLP). The code snippet relating to this issue:
import spacy
import classy_classification
import pandas as…

Sang
- 27
- 4
0
votes
1 answer
Python: UnicodeEncodeError: 'latin-1' codec can't encode characters in position 3-4: ordinal not in range(256)
I found a site which fixes my mojibake, here that uses the python package ftfy. I tried reproducing the steps given, although it seems to pre-convert the string before running the steps it gives me.
The string I am trying to fix is EvðŸ’👸ðŸ»,…

AAA
- 361
- 1
- 5
- 19
0
votes
1 answer
Recursively transform dict leaves in Python
I'm having trouble applying a function to all leaves of a dict (loaded from a JSON file) in Python. The text has been badly encoded and I want to use the ftfy module to fix it.
Here is my function:
def recursive_decode_dict(e):
try:
if…

Paul Selle
- 318
- 1
- 2
- 9
0
votes
1 answer
How to plug in a specific validator for all cases of a built-in type?
I recently noticed that some of my entries in a database coming from users contain incorrectly encoded strings, such as ó when ó was clearly meant. It's coming from copy-pasting of other websites that aren't properly encoded, which is beyond my…

d33tah
- 10,999
- 13
- 68
- 158
-1
votes
1 answer
Running simple script meant to fix Mojibake with Python and ftfy gives "*** Remote Interpreter Reinitialized ***"
When I run it nothing happens except "*** Remote Interpreter Reinitialized ***".
# https://junschoi.github.io/posts/ftfy_guide/
import ftfy
def main(): # Added by pyscripter.
pass
ftfy.fix_text('This text should be in “quotesâ€\x9d.') #…

David Walden
- 3
- 3