I am trying to translate a text column in Python that contains text in several different languages. Nothing fancy in my code yet.

import pandas as pd
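# read_excel has no 'head' keyword; by default the first row supplies the column names, so df['Text'] resolves
df = pd.read_excel('D:/path')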

I used the following code:

from googletrans import Translator
translator = Translator()
df['Text to English'] = df['Text'].apply(translator.translate, src='id', dest='en')

but it gave me an error:

AttributeError: 'NoneType' object has no attribute 'group'

I searched for other code and came up with:

from textblob import TextBlob
df['Text to English'] = df['Text'].str.encode('ascii', 'ignore').apply(lambda x: TextBlob(x.strip()).translate(to='en'))

but it gave me this error: TypeError: cannot use a string pattern on a bytes-like object
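The TypeError most likely comes from the `.str.encode('ascii', 'ignore')` step: it turns each string into `bytes`, and a library that then applies a string regex pattern to that value fails with exactly this message. The sketch below reproduces the mismatch with the standard `re` module only; it assumes (without having checked TextBlob's source) that something similar happens inside TextBlob, and the sample text is made up.

```python
import re

text = "Halo dunia"

# Encoding to ASCII yields bytes, the same per-element result Series.str.encode produces
as_bytes = text.encode("ascii", "ignore")
print(type(as_bytes).__name__)

# Applying a str pattern to bytes raises the TypeError from the question
try:
    re.search(r"\w+", as_bytes)
    message = ""
except TypeError as exc:
    message = str(exc)
print(message)

# Keeping the value as str avoids the mismatch
match = re.search(r"\w+", text)
print(match.group())
```

If that is the cause, dropping the `.str.encode(...)` call (or decoding back to `str` before passing the value on) should make this particular error go away.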

Is there any solution for this? Thanks in advance.

Mostafa Gafer

1 Answer

I think there are None or NaN values in the column, so it is possible to filter them out with notna:

mask = df['Text'].notna()
# translate() returns a Translated object; .text holds the translated string
df.loc[mask, 'Text to English'] = df.loc[mask, 'Text'].apply(
    lambda x: translator.translate(x, src='id', dest='en').text)
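As a minimal sketch of the masking idea, the example below also skips blank or whitespace-only cells, which `notna` alone would let through. `fake_translate` is a hypothetical stand-in for the real translation call so that the snippet runs without network access, and the sample data is invented.

```python
import pandas as pd


def fake_translate(text, dest="en"):
    """Hypothetical stand-in for a real translation call (no network access)."""
    return f"[{dest}] {text}"


df = pd.DataFrame({"Text": ["halo", None, "   ", "selamat pagi"]})

# Treat missing values as empty strings, then keep rows with non-blank text
mask = df["Text"].fillna("").str.strip().ne("")

# Only the selected rows are translated; the others stay NaN in the new column
df.loc[mask, "Text to English"] = df.loc[mask, "Text"].apply(fake_translate, dest="en")
print(df)
```

Extra keyword arguments given to `Series.apply` are forwarded to the function, which is why `dest="en"` can be passed this way.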
jezrael
  • Actually there is no None or NaN in the text, and it still gives me the error AttributeError: 'NoneType' object has no attribute 'group' – Mostafa Gafer Dec 13 '18 at 11:41
  • @MostafaGafer - It is a problem on Google's side :( - Check [this](https://stackoverflow.com/a/52456197) – jezrael Dec 13 '18 at 11:49
  • I tried all of the answers, but I couldn't find a solution for this :( Right now I am trying this code: import goslate; text = "Hello World"; gs = goslate.Goslate(); translatedText = gs.translate(text, 'it'); print(translatedText). The code works perfectly and translates the text into Italian, but can you help me apply it to a pandas column? I would be truly grateful. – Mostafa Gafer Dec 13 '18 at 13:26
  • @jezrael: Do you have an updated solution for this problem? I have a similar problem where I have a lot of blanks and then some entries that need to be translated. But if I use your solution above, it gives me the error "list index out of range" – Django0602 Jun 15 '20 at 15:14
  • @jezrael: Were you able to look into my comment? I really need to solve this and am totally out of ideas. I have also posted a separate question, if you would like to look at it. – Django0602 Jun 16 '20 at 11:26
  • @Django0602 - Hmm, does the solution also fail with the sample data from the question? – jezrael Jun 16 '20 at 11:28
  • @jezrael: https://stackoverflow.com/questions/62406660/jsondecodeerror-expecting-value-line-1-column-1-char-0-while-translating-tex is my question. Should we discuss it there? – Django0602 Jun 16 '20 at 11:29
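On the question in the comments of how to apply a single-string translation call such as `gs.translate(text, 'it')` to a pandas column, one possible approach is sketched below. It translates each distinct string once and maps the results back, which keeps the number of requests down. `translate_text` and `safe_translate` are hypothetical helpers introduced for illustration; `translate_text` is a placeholder so the snippet runs offline, and the real call would be swapped in there.

```python
import pandas as pd


def translate_text(text, target):
    """Hypothetical placeholder; swap in a real call such as gs.translate(text, target)."""
    return f"{text} -> {target}"


def safe_translate(text, target="it"):
    # Return None instead of raising, so one failing row does not abort the whole column
    try:
        return translate_text(text, target)
    except Exception:
        return None


df = pd.DataFrame({"Text": ["Hello World", "Good morning", "Hello World"]})

# Translate each distinct, non-missing string once
unique_texts = df["Text"].dropna().unique()
lookup = {text: safe_translate(text, "it") for text in unique_texts}

# Map the translations back onto every row; rows without a lookup entry become NaN
df["Translated"] = df["Text"].map(lookup)
print(df)
```

Catching every exception is a deliberate simplification here; with a real service it may be better to catch only the errors that library documents and to log the rows that failed.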