
Problem: I have this DataFrame in Hindi and I want to convert it to English script (a transliteration, not a translation).

0   मोहनलाल
1   श्री बजरंग बाडी उधान रर्सरी फार्म कारी
2   श्री बालाजी कन्‍ट्रक्‍शन
3   श्री राम ईट उदयोग
4   साहिल पैड स्‍टोर

What I have tried

import pandas as pd
import googletrans
from googletrans import Translator

translator = Translator()
translations = {}
for column in df.columns:
    unique_elements = df[column].unique()
    for element in unique_elements:
        translations[element] = translator.translate(element).text
translations

I got this result:

0   Mohan Lal
1   Shri Bajrang Body Farm Park Kari rarsari
2   Sri Balaji kantraksana
3   Rama briquette Udyog
4   Sahil pad Store

I need something like this:

0   MohanLal
1   Shri Bajrang Body Udhaan Rarsari Farm Park Kari
2   Sri Balaji Construction
3   Rama eit Udyog
4   Sahil pad Store
Martijn Pieters
ETWAS

1 Answer

Install the following libraries:

pip install -U git+https://github.com/aboSamoor/polyglot.git@master

pip install googletrans

%%bash

polyglot download embeddings2.hi

polyglot download transliteration2.hi

Then try running this code:


from polyglot.transliteration import Transliterator

# Hindi -> English (Latin script) transliterator.
# Requires the transliteration2.hi model downloaded above.
t_hi = Transliterator(source_lang='hi', target_lang='en')

hi = []
temp_content = ['मोहनलाल', 'श्री', 'बजरंग', 'बाडी', 'उधान', 'रर्सरी', 'फार्म', 'कारी', 'बालाजी', 'कन्‍ट्रक्‍शन', 'राम', 'ईट', 'उदयोग', 'साहिल', 'पैड', 'स्‍टोर']
for word in temp_content:
    try:
        # Transliterate the Hindi word directly; no translation step is needed.
        hi.append(t_hi.transliterate(word))
    except Exception as e:
        # Report the failure and move on to the next word.
        print(str(e))
        continue

print(hi)

# ['mohanlal', 'shree', 'bjrng', 'badi', 'udhan', 'rrsri', 'farm', 'kari', 'balaji', 'kntrkshn', 'ram', 'it', 'udyog', 'sahil', 'pad', 'store']

Use transliteration rather than translation.

Try following this link: Transliteration

It's not 100% precise, but it works.
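
To get back from single words to the full names in the question, one option is to transliterate each value word by word and rejoin the pieces. The following is only a sketch: `transliterate_phrase`, `transliterate_unique`, and `demo_backend` are hypothetical helper names, and `demo_backend` is a small stand-in lookup (using a few of the outputs listed above) so the example runs without the polyglot models. Keeping the original word when the backend returns an empty string is an assumption about how you would want failures handled.

```python
from typing import Callable, Dict, Iterable


def transliterate_phrase(phrase: str, transliterate_word: Callable[[str], str]) -> str:
    """Transliterate a phrase word by word and capitalize each word."""
    result = []
    for word in phrase.split():
        # Fall back to the original word if the backend returns nothing.
        converted = transliterate_word(word) or word
        result.append(converted.capitalize())
    return " ".join(result)


def transliterate_unique(values: Iterable[str],
                         transliterate_word: Callable[[str], str]) -> Dict[str, str]:
    """Build an {original: transliterated} mapping, converting each distinct value once."""
    return {v: transliterate_phrase(v, transliterate_word) for v in dict.fromkeys(values)}


# Stand-in backend for demonstration only; in practice pass t_hi.transliterate.
_DEMO = {"श्री": "shree", "राम": "ram", "ईट": "it", "उदयोग": "udyog"}


def demo_backend(word: str) -> str:
    return _DEMO.get(word, "")


names = ["श्री राम ईट उदयोग", "श्री राम", "श्री राम"]
mapping = transliterate_unique(names, demo_backend)
print(mapping)
```

With a real DataFrame, something like `df[column].map(lambda s: transliterate_phrase(s, t_hi.transliterate))` should apply the same idea to a whole column, assuming the column holds plain strings.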

SahilDesai