The Python program below checks whether a string contains any alphabetic character. If it contains none, the program translates the string to English using a custom API and writes the result to a file. I expected this to work because I assumed isalpha() only checks for the characters 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'.

I'm not sure why the program enters the `if (j.isalpha())` branch for this string: '龙海德信机电有限公司'. When I ran the debugger, it showed that isalpha() returns True for these characters, so the line is treated as English. Why does this happen?

def translate_function(file):
    filea = open(file,encoding = "utf8")
    fileb = open("lmao.txt", 'r+')
    count = 0
    for i in filea:
        state = 'false'
        count += 1
        for j in i :
            if (j.isalpha()):
                state = 'true'
                print(i, "This is English")
                break
        if (state == 'false'):
            trans = translate(i)
            fileb.write(trans)
            fileb.write('\n')
    return count
lmaololrofl
  • Why exactly `isalpha()` returns `True` is explained here: https://stackoverflow.com/a/49598734/541591 and https://stackoverflow.com/a/3308844/541591. Maybe try stripping away the Unicode characters? A regex could easily do it. – James Wong Jun 22 '18 at 01:48
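
To illustrate the point made in the comment, here is a minimal sketch. In Python 3, str.isalpha() returns True for any character that Unicode classifies as a letter, which includes Chinese ideographs, so it cannot tell English from Chinese. One way to test only for the letters A-Z and a-z is a small regex. The helper name has_ascii_letter is hypothetical and not part of the original program.

```python
import re

# Matches a single ASCII letter only (A-Z or a-z).
_ASCII_LETTER = re.compile(r"[A-Za-z]")

def has_ascii_letter(text):
    """Return True if text contains at least one ASCII letter (hypothetical helper)."""
    return _ASCII_LETTER.search(text) is not None

sample = "龙海德信机电有限公司"
print(sample.isalpha())                     # True: every character is a Unicode letter
print(has_ascii_letter(sample))             # False: no A-Z or a-z present
print(has_ascii_letter("Longhai 公司"))      # True: contains ASCII letters
```

With a check like this, the condition in the loop could be based on has_ascii_letter(i) rather than on j.isalpha().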

1 Answer

You can try this; I have modified your code a little:

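import string

def translate_function(file):
    count = 0
    # 'r+' (kept from the question) requires lmao.txt to exist already.
    with open(file, encoding="utf8") as filea, \
            open("lmao.txt", 'r+', encoding="utf8") as fileb:
        for line in filea:
            count += 1
            # str.isalpha() is True for any Unicode letter, including Chinese
            # characters, so check for ASCII letters explicitly instead.
            if not any(ch in string.ascii_letters for ch in line):
                # translate() is the custom API from the question, defined elsewhere.
                trans = translate(line)
                fileb.write(trans)
                fileb.write('\n')
    return count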
Anand Tripathi