I have UTF-8 encoded strings, mostly place names, in different languages, such as:

A="København"
B="北京市"
C="Skåne län"

I would like to convert them to:

A="Kobenhavn"
B="Beijing"
C="Skane-lan"

Basically, I want to do what unidecode does, but in Bash. (I'm on Windows with MSys.)

Can anyone point me in the right direction?

noway
  • 3
    `echo 'København' | perl -CS -MText::Unidecode -ne 'print unidecode($_)'` ? (`-CS` makes Perl decode stdin as UTF-8.) – John1024 Jun 21 '15 at 22:05
  • 2
    possible duplicate of [this post](http://stackoverflow.com/questions/1975057/bash-convert-non-ascii-characters-to-ascii) – Alp Jun 21 '15 at 23:00

2 Answers

If you have the Python unidecode package installed, it provides a unidecode command that you can pipe text into directly on the command line:

$ echo "København, 北京市, Skåne län" | unidecode

Output:

Kobenhavn, Bei Jing Shi , Skane lan
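To get closer to the hyphenated form asked for in the question (Skane-lan), the pipe above could be wrapped in a small function. This is only a sketch: the name to_ascii_slug is made up for illustration, and it assumes the unidecode command is on your PATH and that replacing every space with a hyphen is what you want.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (the name is an assumption, not part of unidecode):
# transliterate one string with the unidecode command, then turn spaces
# into hyphens.
to_ascii_slug() {
    # Fail early if the unidecode command is not available.
    if ! command -v unidecode >/dev/null 2>&1; then
        echo "unidecode not found in PATH" >&2
        return 1
    fi
    printf '%s\n' "$1" | unidecode | tr ' ' '-'
}

# Example call:
#   to_ascii_slug "Skåne län"
```

Note that, going by the output shown above, Chinese names come back as separate syllables (Bei Jing Shi), so this sketch will not produce "Beijing" on its own.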

GLNB

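There is a script that handles Russian letters only. Be careful: the full script also does something with files. It assumes a UTF-8 locale, because sed's y command must see each Cyrillic letter as a single character. The working part is: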

en=$(printf '%s\n' "$name" | sed 'y/абвгджзийклмнопрстуфхыэе/abvgdjzijklmnoprstufhyee/' | sed 's/[ьъ]//g; s/ё/yo/g; s/ц/ts/g; s/ч/ch/g; s/ш/sh/g; s/щ/sh/g; s/ю/yu/g; s/я/ya/g')
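The same table-driven idea can be sketched in pure Bash for the Nordic examples in the question, with no external tools. The function name translit and the character table are assumptions chosen for illustration; the table covers only the listed letters and would have to be extended by hand, so this is not a substitute for unidecode (it cannot handle 北京市, for instance).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original script): transliterate a small,
# hand-picked set of characters using only Bash parameter expansion.
translit() {
    local s=$1 i
    # Parallel arrays: each from[i] is replaced by to[i]. Extend as needed.
    local -a from=( 'ø' 'Ø' 'å' 'Å' 'ä' 'Ä' 'ö' 'Ö' 'æ'  'Æ'  )
    local -a to=(   'o' 'O' 'a' 'A' 'a' 'A' 'o' 'O' 'ae' 'AE' )
    for i in "${!from[@]}"; do
        s=${s//"${from[i]}"/${to[i]}}
    done
    # Replace spaces with hyphens, as in the "Skane-lan" example.
    printf '%s\n' "${s// /-}"
}

translit 'København'   # prints Kobenhavn
translit 'Skåne län'   # prints Skane-lan
```

Because the patterns are matched as literal strings, this does not depend on sed or on the current locale.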

NickKolok