Questions tagged [cjk]

CJK stands for Chinese, Japanese and Korean and is used to label issues common to these East Asian languages and their large character repertoires.

These languages are covered by various character sets and encodings, including:

  • Big5
  • EUC-JP
  • EUC-KR
  • Shift-JIS
  • GB2312
  • GB18030
  • ISO-2022-JP
  • Unicode
1096 questions
17
votes
4 answers

Android default charset when sending http post/put - Problems with special characters

I have configured the apache httpClient like so: HttpProtocolParams.setContentCharset(httpParameters, "UTF-8"); HttpProtocolParams.setHttpElementCharset(httpParameters, "UTF-8"); I also include the http header "Content-Type: application/json;…
avendael
  • 2,459
  • 5
  • 26
  • 30
17
votes
2 answers

How can I determine Levenshtein distance for Mandarin Chinese characters?

We are developing a system to do fuzzy matching on over 50 international languages using the UTF-8, UTF-16, and UTF-32 Unicode encodings. So far, we have been able to use Levenshtein distance to detect misspellings of German Unicode…
Frank
  • 1,406
  • 2
  • 16
  • 42
16
votes
3 answers

Should my Android app default to simplified or traditional Chinese?

I'm writing a localized Android app; I'm confused about how to handle the differences between simplified and traditional Chinese. Thanks to this excellent answer I know that I should put simplified Chinese in values-zh-rCN and traditional Chinese in…
Dan Fabulich
  • 37,506
  • 41
  • 139
  • 175
16
votes
2 answers

Chinese language codes

We are updating an old .NET 1.1 website to 2.0. The site currently supports Chinese (Traditional) & Chinese (Simplified). I'm getting a runtime error when trying to detect the language & culture using the codes: zh-CHS (simplified) & zh-CHT…
Dave K
  • 1,674
  • 4
  • 16
  • 22
16
votes
3 answers

How to use Chinese and Japanese characters as strings in Java?

Hi, I am using Java. I need to use some Chinese and Japanese characters in a string and print them using System.out.println(). How can I do that? Thanks
sjain
  • 1,635
  • 9
  • 25
  • 35
15
votes
5 answers

How to extract a stroke from a Chinese character

I've been trying many times to create an algorithm to extract stroke information from Chinese characters. I've tried various methods but none was very satisfying, probably because of my limited knowledge of graphics algorithms in…
laurent
  • 88,262
  • 77
  • 290
  • 428
15
votes
5 answers

How to classify Japanese characters as either kanji or kana?

Given the text below, how can I classify each character as kana or kanji? 誰か確認上記これらのフ To get something like this: 誰 - kanji か - kana 確 - kanji 認 - kanji 上 - kanji 記 - kanji こ - kana れ - kana ら - kana の - kana フ - kana (Sorry if I did it…
alex2k8
  • 42,496
  • 57
  • 170
  • 221
14
votes
3 answers

How do I implement full text search in Chinese on PostgreSQL?

This question has been asked before: Postgresql full text search in postgresql - japanese, chinese, arabic but there are no answers for Chinese as far as I can see. I took a look at the OpenOffice wiki, and it doesn't have a dictionary for…
Mike Chamberlain
  • 39,692
  • 27
  • 110
  • 158
14
votes
2 answers

Remove space below the text baseline with CSS

Lately I've been working with Japanese text, and I've found a rather annoying property. In Japanese, unlike English, glyphs do not extend below the text baseline. This example should show what I mean: div { font-size: 72pt; display:…
Rose Kunkel
  • 3,102
  • 2
  • 27
  • 53
14
votes
2 answers

How do I insert Chinese characters into a SQLExpress text field?

How do I insert Chinese characters into a SQLExpress text field? I'm using SQL Express from VS 2008. When I add Chinese characters, either via an import app I wrote or by pasting them in from the data view inside Visual Studio, they end up as…
Martin Shoemaker
14
votes
11 answers

Split a sentence into separate words

I need to split a Chinese sentence into separate words. The problem with Chinese is that there are no spaces. For example, the sentence may look like: 主楼怎么走 (with spaces it would be: 主楼 怎么 走). At the moment I can think of one solution. I have a…
Peterim
  • 1,029
  • 4
  • 16
  • 25
14
votes
2 answers

How to make Haskell or ghci able to show Chinese characters and run Chinese characters named scripts?

I want to make a Haskell script to read files in my /home folder. However, many files are named with Chinese characters, and Haskell and GHCi cannot handle them. It seems Haskell and GHCi aren't good at displaying UTF-8 characters. Here is what I…
TorosFanny
  • 1,702
  • 1
  • 16
  • 25
13
votes
1 answer

Built-in iOS fonts with support for Chinese characters?

What fonts come bundled with iOS that have a unique set of Traditional Chinese characters? It seems that a list of fonts included in iOS 5 resides at iosfonts.com; however, it seems that most fonts (e.g., "GillSans-Bold") will use a common typeface…
codeperson
  • 8,050
  • 5
  • 32
  • 51
13
votes
3 answers

Using SAPI is there a way to enter pinyin for Chinese pronunciation?

The goal is to be able to pronounce something like wo3. System.Speech can handle Chinese characters, but is there a way to input pinyin directly? It seems from http://msdn.microsoft.com/en-us/library/ms720566(v=vs.85).aspx that I should be able to…
tofutim
  • 22,664
  • 20
  • 87
  • 148
13
votes
3 answers

Detect if character is simplified or traditional Chinese character

I found this question, which gives me the ability to check if a string contains a Chinese character. I'm not sure if the Unicode ranges are correct, but they seem to return false for Japanese and Korean and true for Chinese. What it doesn't do is tell…
thenengah
  • 42,557
  • 33
  • 113
  • 157