Questions tagged [transliteration]

Transliteration refers to the process of mapping letters or glyphs from one writing system or character set to another.

Transliteration is the conversion of letters from one alphabet to another, for example from Greek to Latin. It can also be a simplification within a single alphabet, such as omitting the diacritics found in that alphabet or replacing special characters with a sequence of characters without diacritics.
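The "omitting diacritics" case described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a full transliterator: it assumes that decomposing the text with Unicode normalization (NFD) and dropping combining marks is acceptable for the data at hand. The function name strip_diacritics is made up for this example.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD).

    Hypothetical helper for illustration. Letters that have no canonical
    decomposition (for example 'Ł' or 'ß') are left unchanged.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


print(strip_diacritics("Ǐńťęř"))  # Inter
print(strip_diacritics("Łódź"))   # Łodz (the 'Ł' survives)
```

Characters such as Ł, ß or ı are not base letters plus combining marks, so this approach leaves them as they are; handling them needs an explicit mapping table or a transliteration library such as ICU.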

257 questions
6
votes
1 answer

How to transliterate Unicode to ASCII in JavaScript?

How do I transliterate Unicode characters to ASCII in pure JavaScript? Input: 'Ǐńťęř', output: 'Inter'. I need an effect similar to the shell's iconv -f UTF-8 -t 'ASCII//TRANSLIT', but in vanilla JavaScript.
Michał Šrajer
6
votes
1 answer

Removing accent marks (diacritics) from Latin characters for comparison

I need to compare the names of European places that are written using the Latin alphabet with accent marks (diacritics) on some characters. There are lots of Central and Eastern European names that are written with accent marks like Latin characters…
Oliver
6
votes
3 answers

Persian transliteration Java API

I need to transliterate English to Persian and want to do it in offline mode. Is there any free API that I can use offline?
stacker
5
votes
3 answers

Hindi text to English transliteration for C#

I am searching for C# code that can be used to transliterate Hindi text to English. Also, tell me if there is any API available for C#.
Deepak
5
votes
3 answers

Cyrillic transliteration in R

Are there packages for Cyrillic text transliteration to Latin in R? I need to convert data frames to Latin to use factors. It is somewhat messy to use Cyrillic factors in R.
Viktor
5
votes
2 answers

Best way to translate UTF-8 to ISO8859-1 in Go

I'm trying to map UTF-8 characters to their "similar" ISO8859-1 representation. Removing diacritics, but also replacing characters like Ł with L or ı with i. Example: José Kakışır should become Jose Kakisir. I'm aware that removing diacritics can…
derFunk
5
votes
1 answer

Google JS Input Tools (transliteration) not working

I am developing a website with the Google Input Tools API. The transliteration API is not working on Android devices. It is working fine on iOS devices and desktop devices (I did not test on Windows Mobile). Google input tools link even the above link…
Bhanuka Yd
5
votes
0 answers

Can MeCab be configured / enhanced to give me the reading of English words too?

If I begin with a wholly Japanese sentence and run it through MeCab, I get something like this: $ echo "吾輩は猫である" | mecab 吾輩 名詞,代名詞,一般,*,*,*,吾輩,ワガハイ,ワガハイ は 助詞,係助詞,*,*,*,*,は,ハ,ワ 猫 名詞,一般,*,*,*,*,猫,ネコ,ネコ で 助動詞,*,*,*,特殊・ダ,連用形,だ,デ,デ ある…
Hakanai
5
votes
2 answers

Removing diacritical marks from a Greek text in an automatic way

I have a decompiled StarDict dictionary in the form of a tab file: κακός <tab> bad, where <tab> signifies a tabulation. Unfortunately, the way the words are defined requires the query to include all diacritical marks. So if I want to search for ζῷον,…
marmistrz
5
votes
6 answers

Transliterate Cyrillic characters in a string into Latin in Ruby?

How do I transliterate Cyrillic characters in a string into Latin in Ruby? I can't find any docs on that. I thought there would be some standard function for it.
Gherman
5
votes
5 answers

UTF-8 to ASCII using ICU Library

I have a std::string with UTF-8 characters in it. I want to convert the string to its closest equivalent with ASCII characters. For example: Łódź => Lodz, Assunção => Assuncao, Schloß => Schloss. Unfortunately, the ICU library is really unintuitive and…
GetFree
4
votes
1 answer

Google transliteration API in Java Swing project

I'm working on a transliteration project in Java Swing. It can access the internet, but I'm unable to set it up. How do we do that? Do we need to download the Google API libraries, or will just a function call do? Here are some useful Transliteration…
Jaison Varghese
4
votes
2 answers

Solr, Special Chars, and Latin to Cyrillic char conversion

I am trying to set up a search engine using Solr (or Lucene) which could have text in either Latin with special chars (for example Ö or Ç) or Cyrillic chars (for example Б or б and Ж ж). Anyway, I am trying to find a…
TiernanO
4
votes
1 answer

Convert Cyrillic to Latin - Latin intruders/exception

I am using a simple dictionary to replace Cyrillic letters with Latin ones, and most of the time it's working just fine, but I am having issues when there are some Latin letters in the input. Most of the time it's company names. A few examples: PROCRED is…
sosNiLa
4
votes
2 answers

Convert Unicode digits from English to Devanagari

I was looking for a regex or any other method to convert the digits 0-9 into ०-९ (Devanagari script). I am using ASP.NET but could not find any method in the globalization namespace which does this. Any help is greatly appreciated.