Questions tagged [transliteration]

Transliteration refers to the process of mapping letters or glyphs from one writing system to another.

Transliteration is the conversion of letters from one alphabet to another, for example from Greek to Latin. It can also be a simplification within a single alphabet, such as omitting the diacritics found in that alphabet or replacing special characters with a sequence of characters without diacritics.
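As a rough illustration of the two cases described above, here is a minimal Python sketch using only the standard library. The names `GREEK_TO_LATIN`, `EXPANSIONS`, `transliterate_greek` and `simplify` are hypothetical and exist only for this example, and the mapping tables are deliberately tiny; real transliteration schemes are much larger and depend on the chosen standard.

```python
import unicodedata

# Hypothetical, deliberately incomplete Greek-to-Latin table (illustration only).
GREEK_TO_LATIN = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e",
    "λ": "l", "μ": "m", "π": "p", "φ": "ph",
}

# Hypothetical table of special characters replaced by a sequence of characters.
EXPANSIONS = {"ß": "ss", "æ": "ae"}


def transliterate_greek(text: str) -> str:
    """Map each Greek letter found in the table to Latin; keep everything else."""
    return "".join(GREEK_TO_LATIN.get(ch, ch) for ch in text)


def simplify(text: str) -> str:
    """Simplify within one alphabet: expand special characters, drop diacritics."""
    expanded = "".join(EXPANSIONS.get(ch, ch) for ch in text)
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", expanded)
    # unicodedata.combining() is non-zero for combining marks, which are dropped.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


print(transliterate_greek("αλφα"))
print(simplify("Gévry"))
print(simplify("Straße"))
```

This approach only handles characters that decompose into a base letter plus combining marks, or that appear in one of the tables; anything else passes through unchanged.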

257 questions
12
votes
3 answers

Slugify and Character Transliteration in C#

I'm trying to translate the following slugify method from PHP to C#: http://snipplr.com/view/22741/slugify-a-string-in-php/ Edit: For the sake of convenience, here is the code from above: /** * Modifies a string to remove all non-ASCII characters and…
Trav L
12
votes
3 answers

Transliterate/transpose the characters in the NSString

I want to transliterate a Cyrillic string to its closest Latin equivalent. E.g. "матрешка" => "matreshka", "водка" => "vodka". So ideally I want some ready-to-use method on the NSString or somewhere else that already knows everything about the…
Dmitry Sokurenko
11
votes
5 answers

How to convert (transliterate) a string from utf8 to ASCII (single byte) in c#?

I have a string object "with multiple characters and even special characters" I am trying to use UTF8Encoding utf8 = new UTF8Encoding(); ASCIIEncoding ascii = new ASCIIEncoding(); objects in order to convert that string to ascii. May I ask…
Geo
11
votes
1 answer

Hindi to English Transliteration

Is there a Python library for transliterating Hindi to English? E.g. "खाया" should be converted to "khaya".
Palash Kumar
11
votes
3 answers

Transliteration in ruby

What is the simplest way to transliterate non-English characters in Ruby? That is, a conversion such as: translit "Gévry" #=> "Gevry"
Selva
8
votes
3 answers

Transliteration with Iconv in Ruby

When I'm trying to transliterate a Cyrillic utf-8 string with Iconv.iconv('ascii//ignore//translit', 'utf-8', string).to_s (see questions/1726404/transliteration-in-ruby) I'm getting everything but those symbols that have to be transliterated. For…
jibiel
8
votes
1 answer

Does a .NET transliteration library exist?

Does a .NET transliteration library exist? Note that this is not translation; I mean something like this Perl lib: http://www.lingua-systems.com/transliteration/Lingua-Translit-Perl-module/ So far I have only found: http://transliterator.codeplex.com/
Antonio Bakula
7
votes
2 answers

Efficient data structure/algorithm for transliteration based word lookup

I'm looking for an efficient data structure/algorithm for storing and searching words in a transliteration-based lookup (like Google does: http://www.google.com/transliterate/ but I'm not trying to use the Google transliteration API). Unfortunately, the natural…
Mehdi
7
votes
1 answer

How to use Google transliteration API in my java web application?

How can I use the Google Transliteration API in my Java application? If I give a String (either in English or Arabic) as input to the Google Transliterator API, it should translate it into the corresponding other language and give the transliterated…
IamIronMAN
7
votes
1 answer

ICU: What do NFD and NFC mean?

I found a snippet which reads Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC; [:Punctuation:] Remove; Lower(); ...and is supposed to make an arbitrary string fit well into a URL. So, I guess the things between the semicolons are something like…
Adrian Föder
7
votes
1 answer

ICU: Transliterate and then remove all non-alphanumeric characters

Can it be done with ICU without falling back to regex? Currently I normalize filenames like this: protected function normalizeFilename($filename) { $transliterator = Transliterator::createFromRules( 'Any-Latin; Latin-ASCII;…
Vladislav Rastrusny
7
votes
3 answers

Convert accented text into ASCII characters?

I would like to convert accented letters and various encodings into plain English ASCII in JavaScript and wonder what the possible options are. What I need is: éclair ~becomes~ eclair bär ~becomes~ bar привет ~becomes~ privet こんにちは…
Aerodynamika
6
votes
6 answers

Convert filenames to use only latin characters

I have a ton of filenames in Russian (and some Slovenian & Greek). To play them in my car the song titles must use only Western European characters. Is there a program that can do this file renaming? If not, is there a list of what letter(s) to use…
David Thielen
6
votes
1 answer

Standard Algorithms to implement Transliteration and Transliteration Suggestion

I have constructed an algorithm to transliterate from English to multiple languages. Since we should show users appropriate suggestions for the words they have entered, I have written logic to search in that language's dictionary. Logic I have implemented…
6
votes
1 answer

Which romanization standard should be used to improve ICU4j transliteration for Arabic-Latin?

We have a requirement to transliterate Arabic text to Latin characters (without diacritical marks) and display them to users. We are currently using IBM ICU4j for this. The API doesn't transliterate the Arabic text well into proper readable Latin…
Kamlesh Sharma