8

I have a touch screen keyboard in my WPF application and I would like to allow users to write in Chinese.

I saw that Windows has an IME that allows writing Chinese with Pinyin. It works great, but I'd like to customize it for my WPF application (especially the candidate list). I couldn't find any documentation for this.

The idea is that the user types Pinyin on the virtual keyboard, and a list of candidate Chinese characters appears next to the text box.

Do you have any advice on how to achieve this? Maybe there is a library (not from Microsoft) that can do it, in which case I wouldn't need to use the Microsoft IME?

Makoto
Rodrigue Rens
  • If it is touch-based input, wouldn't it be better to let them enter characters by "writing" on the screen instead of using pinyin conversion? – Szabolcs Mar 02 '12 at 15:04
  • The touchscreen is not very responsive, and it won't be easy to write in small text boxes. It's for a medical application, so the physicians can create new patients and later find them by last name, first name, etc. That's why I can't implement your solution. – Rodrigue Rens Mar 02 '12 at 15:18
  • So the reason you want to customize the candidate list is to auto-complete the patient name? The way most programs I know implement this is by letting the user type pinyin (directly, not through the system IME) and autocompleting based on that. This problem is a lot easier than implementing a general and effective IME: a general IME must handle all characters and must suggest the most likely matches. The newest MS pinyin IME even auto-updates from the internet with the latest statistics to improve predictions, and it learns from the user as well. – Szabolcs Mar 02 '12 at 15:24
  • So as a simple workaround you could let them type *either* in roman letters (direct pinyin), suggesting matches from the names in characters, *or* through an IME, just like on a usual desktop OS. This is not as good as offering the auto-completion right in the IME of course, but it should be pretty usable. – Szabolcs Mar 02 '12 at 15:26
  • Thanks for your advice, Szabolcs. The reason I want to customize the candidate list of the MS IME is to give it the same theme as the application. (The list will appear next to the keyboard.) It's not just the first name and the last name but also comments about the patients. I think it would be better if the user could insert Chinese characters instead of pinyin. I can get the list with the function ImmGetCandidateList, but can I modify the list style with a function of imm32.dll? – Rodrigue Rens Mar 02 '12 at 16:00
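
The workaround suggested in the comments above (type pinyin directly and autocomplete against the stored patient names) could look roughly like the sketch below. It is only an illustration: the `PatientName` type, the `PatientNameSuggester` class, and the idea of storing a toneless pinyin key alongside each name are all assumptions, not something taken from the application or from any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record: the pinyin key is assumed to be stored
// (without tones or spaces) when the patient is created.
public sealed class PatientName
{
    public PatientName(string characters, string pinyin)
    {
        Characters = characters;
        Pinyin = pinyin;
    }

    public string Characters { get; }
    public string Pinyin { get; }
}

public static class PatientNameSuggester
{
    // Returns the names (in characters) whose pinyin key starts with the typed text.
    public static List<string> Suggest(IEnumerable<PatientName> patients, string typed, int max = 10)
    {
        if (string.IsNullOrWhiteSpace(typed))
        {
            return new List<string>();
        }

        string prefix = typed.Trim();
        return patients
            .Where(p => p.Pinyin.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            .Select(p => p.Characters)
            .Take(max)
            .ToList();
    }
}
```

The returned list could be bound to whatever themed control sits next to the virtual keyboard, which sidesteps restyling the IME candidate window for the name-search case. It does not help with free-text comments.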

2 Answers

4

http://www.microsoft.com/downloads/zh-cn/details.aspx?FamilyID=7D1DF9CE-4AEE-467F-996E-BEC826C5DAA2

http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=15251

Microsoft in fact has good components/libraries for that, but they are hidden here in the Visual Studio International Feature Pack.

Note that you need 1.0 SR1, which provides the basic libraries, while 2.0 adds many WinForms and WPF controls.

(Updated on Oct 26, 2017. Several people have published NuGet packages on NuGet.org based on Microsoft's code, so you might also check those packages out.)

Lex Li
  • Thank you Lex Li, I've downloaded the package before but I didn't see anything about how to convert a pinyin string into Chinese ideograms. Which function shall I use? – Rodrigue Rens Mar 13 '12 at 10:46
  • I should have made it clearer that you need both 1.0 SR 1 and 2.0. CHSPinyinConv.msi in 1.0 SR1 is the one for Pinyin conversion. ChineseChar.GetChars(string pinyin) can be your starting point. – Lex Li Mar 14 '12 at 09:42
  • I think it's better to mention the whole namespace, which is `Microsoft.International.Converters.PinYinConverter`. Also, easier to get via NuGet today: https://www.nuget.org/packages/Microsoft.International.Converters.PinYinConverter/ – Yoav Feuerstein Oct 26 '17 at 06:55
  • Also, any idea what could possibly cause `ChineseChar.GetChars(string pinyin)` to return null, no matter what the input is? (I've tried different values that should be ok in pinyin) – Yoav Feuerstein Oct 26 '17 at 08:11
  • @YoavFeuerstein I hope you know Chinese and Pinyin, as well as the format of "pinyin" accepted by this library. Maybe you can start from the following code, `var han = new ChineseChar('获'); for (int i = 0; i < han.PinyinCount; i++) { var pinyin = han.Pinyins[i]; Console.WriteLine(pinyin); }` – Lex Li Oct 26 '17 at 16:50
  • @LexLi Thanks, but I was trying to do it the other way around: I want to input a pinyin string written in English letters (such as "feng") and get the Chinese characters for it. Isn't that supposed to work with the GetChars() method? – Yoav Feuerstein Oct 26 '17 at 17:04
  • @YoavFeuerstein Run the snippet I pasted to learn the exact format of "pinyin" this library uses. Then you will see that `feng` is indeed invalid, so `null` is expected. You should use `FENG1` or other valid input. – Lex Li Oct 26 '17 at 17:05
  • @LexLi Thanks! But still, given that the user has typed only "feng" and I want to show all possible Chinese characters that could match it, should I just append a number (1, 2, 3 or 4) and see what I get from the GetChars() method for each of those numbers? (I assume there's no more than 4 possible numbers, according to this discussion: https://www.quora.com/Why-are-there-numbers-in-the-Chinese-Pinyin) – Yoav Feuerstein Oct 30 '17 at 12:25
  • @YoavFeuerstein I don't have the answer. You might review the source code of this assembly (and its resource files) to gain more insights. – Lex Li Oct 30 '17 at 13:07
  • @LexLi Thanks :) I assumed there should be a standard way of input for such scenarios, but couldn't find any relevant documentation. – Yoav Feuerstein Oct 30 '17 at 13:12
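
The idea raised in the comments above (append each tone digit to the toneless syllable and merge the results) can be sketched as follows. This is not a documented usage of the library: the `ToneExpansion` class is hypothetical, the lookup is passed in as a delegate so the sketch compiles without the package, and trying digit 5 for the neutral tone is an assumption. A `null` result for an invalid key is simply skipped, which matches the behaviour reported in the comments.

```csharp
using System;
using System.Collections.Generic;

public static class ToneExpansion
{
    // syllable: toneless pinyin typed by the user, e.g. "feng".
    // lookup:   function mapping a key such as "FENG1" to characters, or null if the key is invalid.
    // maxTone:  highest tone digit to try (5 is an assumption for the neutral tone).
    public static List<char> CandidatesFor(string syllable, Func<string, char[]> lookup, int maxTone = 5)
    {
        var result = new List<char>();
        var seen = new HashSet<char>();
        string key = syllable.Trim().ToUpperInvariant();

        for (int tone = 1; tone <= maxTone; tone++)
        {
            char[] chars = lookup(key + tone);
            if (chars == null)
            {
                continue; // no characters for this syllable/tone combination
            }

            foreach (char c in chars)
            {
                if (seen.Add(c)) // keep first occurrence only
                {
                    result.Add(c);
                }
            }
        }

        return result;
    }
}
```

With the package referenced, the call would presumably be `ToneExpansion.CandidatesFor("feng", ChineseChar.GetChars)`, assuming `GetChars` takes a string and returns a `char[]` as the comments suggest. The candidates come back grouped by tone, not ranked by frequency, so the list may need its own ordering before it is shown to the user.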
3

I'm not sure whether any open-source packages are available. However, in theory, it is not too hard to build this kind of library. In Chinese, there are about 1300 distinct syllable sounds: initial + final + tone. Each sound maps to a group of Chinese characters, ranging from 1 to about 130 characters.

You may define an array of all Pinyin sounds:

string[] pinyins = new string[] {
  "a:c1c2c3...",      // pinyin 1 a: character1 character2...
  ...
  "zuo:z1z2z3z4z5..." // last pinyin (1300) zuo: character character...
};

The above array is the basis for mapping Pinyin to Chinese characters (both the Chinese characters and the Pinyin tone marks are Unicode strings). Then, for each Pinyin sound entered, a list of characters is obtained by a function like this:

// Returns the characters mapped to the given Pinyin sound, or null if none match.
string getCharacters(string aPinyin) {
   string characters = null;
   foreach (string item in pinyins) {
      // Each entry has the form "pinyin:characters".
      string[] temp = item.Split(':');
      if (temp[0].Equals(aPinyin)) {
          characters = temp[1];
          break;
      }
   }
   return characters;
}
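
If the table is queried on every keystroke, the linear scan could be replaced by a dictionary built once at startup. The following is a minimal sketch under the same "pinyin:characters" entry format; the `PinyinTable` class and its method names are made up for illustration, and the case-insensitive comparison is an assumption about how the keyboard delivers input.

```csharp
using System;
using System.Collections.Generic;

public static class PinyinTable
{
    // Builds a lookup from entries of the form "pinyin:characters".
    public static Dictionary<string, string> BuildIndex(IEnumerable<string> entries)
    {
        var index = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string entry in entries)
        {
            int colon = entry.IndexOf(':');
            if (colon <= 0)
            {
                continue; // skip malformed entries (no key or no separator)
            }

            index[entry.Substring(0, colon)] = entry.Substring(colon + 1);
        }

        return index;
    }

    // Returns the characters for a pinyin key, or null when the key is unknown.
    public static string GetCharacters(Dictionary<string, string> index, string pinyin)
    {
        return index.TryGetValue(pinyin, out string chars) ? chars : null;
    }
}
```

The behaviour matches the function above (null when nothing matches), but each lookup no longer splits every entry.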

I wrote a JavaScript script a long time ago in which I defined the relationship between Pinyin and Chinese characters. On my blog, in the post Get Pinyin From Chinese Characters, the script can be found by viewing the page source or by using Inspect Element from the context menu. There the script is used to convert Chinese to Pinyin, but the mapping can be used as a reference.

To add a smart Pinyin feature (displaying a list of words for a Pinyin input), define all the commonly used words in a similar pattern: pinyin:words.
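
A word-level table in that pattern could be queried by prefix, so that candidates appear while the user is still typing. This sketch assumes that several words sharing one Pinyin key are separated by commas, which the answer does not specify; the `WordSuggester` name and the sample entries in the usage note are illustrative only.

```csharp
using System;
using System.Collections.Generic;

public static class WordSuggester
{
    // entries: strings of the form "pinyin:word1,word2,..." (comma separator is an assumption).
    // typed:   the Pinyin typed so far; every entry whose key starts with it contributes its words.
    public static List<string> Suggest(IEnumerable<string> entries, string typed)
    {
        var result = new List<string>();
        if (string.IsNullOrEmpty(typed))
        {
            return result;
        }

        foreach (string entry in entries)
        {
            string[] parts = entry.Split(new[] { ':' }, 2);
            if (parts.Length != 2)
            {
                continue; // skip malformed entries
            }

            if (!parts[0].StartsWith(typed, StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            result.AddRange(parts[1].Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries));
        }

        return result;
    }
}
```

For example, with the entries "yisheng:医生,一生" and "yiyuan:医院", typing "yi" would return all three words in table order. Ranking by how common each word is would have to come from the table itself.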

David.Chu.ca
  • Hi David, thank you for your answer. Do you think users will be able to write everything they want with this dictionary? I mean basic things like their last name, first name, and comments about a medical exam. – Rodrigue Rens Mar 13 '12 at 10:50