
How can I copy Unicode text to the clipboard in HTML format?

English works, but when I copy text in another language to the clipboard, it comes out like this: ���

Here is my code (the copyStringEnd function does the same thing as strcat):

char *html = "가나";//This is Korean
char *buf = (char*)malloc(400 + strlen(html));

strcpy_s(buf, 400,
    "Version:0.9\r\n"
    "StartHTML:00000000\r\n"
    "EndHTML:00000000\r\n"
    "StartFragment:00000000\r\n"
    "EndFragment:00000000\r\n"
    "<html><body>\r\n"
    "<!--StartFragment -->\r\n");

copyStringEnd(buf, html);
copyStringEnd(buf, "\r\n");

copyStringEnd(buf,
    "<!--EndFragment-->\r\n"
    "</body>\r\n"
    "</html>");

// Patch each 8-digit placeholder with the real byte offset. wsprintf writes a
// terminating NUL after the digits, so the '\r' it overwrote is restored each time.
char *ptr = strstr(buf, "StartHTML");
wsprintf(ptr + 10, "%08u", (unsigned)(strstr(buf, "<html>") - buf));
*(ptr + 10 + 8) = '\r';

ptr = strstr(buf, "EndHTML");
wsprintf(ptr + 8, "%08u", (unsigned)strlen(buf));
*(ptr + 8 + 8) = '\r';

ptr = strstr(buf, "StartFragment");
wsprintf(ptr + 14, "%08u", (unsigned)(strstr(buf, "<!--StartFrag") - buf));
*(ptr + 14 + 8) = '\r';

ptr = strstr(buf, "EndFragment");
wsprintf(ptr + 12, "%08u", (unsigned)(strstr(buf, "<!--EndFrag") - buf));
*(ptr + 12 + 8) = '\r';



if (OpenClipboard(NULL)) {
    EmptyClipboard();

    HGLOBAL hText = GlobalAlloc(GMEM_MOVEABLE | GMEM_DDESHARE, strlen(buf) + 4);

    char *ptrs = (char *)GlobalLock(hText);
    strcpy_s(ptrs, strlen(buf) + 1, buf);
    GlobalUnlock(hText);


    // After a successful SetClipboardData the system owns hText, so free it only on failure.
    if (!SetClipboardData(RegisterClipboardFormat("HTML Format"), hText))
        GlobalFree(hText);
    CloseClipboard();

}

free(buf);

When I paste it, it turns into �

Marcus Bitzl
asdf
  • What is the encoding of your source code file? If it's not UTF-8, then the content of your `html` string (those Korean characters) will not be UTF-8, and so when you push that directly into the clipboard it will be misinterpreted by anything trying to read the data. – TheUndeadFish Jan 18 '14 at 23:45
  • [MSDN](http://msdn.microsoft.com/en-us/library/windows/desktop/ms649015%28v=vs.85%29.aspx) says "The only character set supported by the clipboard is Unicode in its **UTF-8** encoding". So convert your string to UTF-8 before writing it into the buffer. – Denis Anisimov Jan 19 '14 at 02:21
  • @DenisAnisimov How can I convert it? – asdf Jan 19 '14 at 02:44
  • @wdeveloper You could try the [WideCharToMultiByte](http://msdn.microsoft.com/en-us/library/dd374130%28v=vs.85%29.aspx) function. – Denis Anisimov Jan 19 '14 at 04:16
  • And change `char *html = "가나"` to `wchar_t *html = L"가나"` instead. The original will not work unless the source file itself is also Korean encoded. – Remy Lebeau Jan 19 '14 at 17:15
  • @RemyLebeau Thanks, but can you check this question? [link](http://stackoverflow.com/questions/21224548/cant-copy-unicodeused-wchar-t-in-html-format-to-clipboard) I still have a problem. Thanks! – asdf Jan 20 '14 at 00:39
  • @wdeveloper: There seems to be a recurring theme: people tell you to use `WideCharToMultiByte` but you ignore it. – MSalters Feb 19 '14 at 14:51
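
To make the advice in the comments concrete, here is a minimal sketch in portable C of building the "HTML Format" payload from a fragment that is already UTF-8. It is not the asker's code: `build_cf_html` and `CF_HTML_HEADER_FMT` are hypothetical names, and it assumes the fragment starts right after the `<!--StartFragment-->` comment, which differs slightly from the offsets in the question. The Korean text is written as byte escapes so the result does not depend on the source file's encoding. On Windows, a `wchar_t` string would first be converted with `WideCharToMultiByte` and `CP_UTF8`, as suggested above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Header template; every offset is zero-padded to 8 digits. */
#define CF_HTML_HEADER_FMT \
    "Version:0.9\r\n" \
    "StartHTML:%08u\r\n" \
    "EndHTML:%08u\r\n" \
    "StartFragment:%08u\r\n" \
    "EndFragment:%08u\r\n"

/* Hypothetical helper: wraps a UTF-8 HTML fragment in a CF_HTML header.
 * All offsets are BYTE counts from the start of the buffer, so a 3-byte
 * UTF-8 character advances them by 3. Returns a malloc'd NUL-terminated
 * string (caller frees it), or NULL on failure. */
char *build_cf_html(const char *fragment_utf8)
{
    static const char prefix[] = "<html><body>\r\n<!--StartFragment-->";
    static const char suffix[] = "<!--EndFragment-->\r\n</body>\r\n</html>";

    assert(fragment_utf8 != NULL);

    /* The header length is fixed because of the 8-digit padding. */
    int header_len = snprintf(NULL, 0, CF_HTML_HEADER_FMT, 0u, 0u, 0u, 0u);
    if (header_len < 0)
        return NULL;

    /* strlen counts bytes, which is what the offsets need. */
    unsigned start_html = (unsigned)header_len;
    unsigned start_frag = start_html + (unsigned)strlen(prefix);
    unsigned end_frag   = start_frag + (unsigned)strlen(fragment_utf8);
    unsigned end_html   = end_frag + (unsigned)strlen(suffix);

    char *buf = (char *)malloc((size_t)end_html + 1);
    if (buf == NULL)
        return NULL;

    /* Adjacent string literals concatenate, so this is one format string. */
    snprintf(buf, (size_t)end_html + 1, CF_HTML_HEADER_FMT "%s%s%s",
             start_html, end_html, start_frag, end_frag,
             prefix, fragment_utf8, suffix);
    return buf;
}
```

Called as `build_cf_html("\xEA\xB0\x80\xEB\x82\x98")` (the UTF-8 bytes of 가나), the returned buffer could then be copied, `strlen(buf) + 1` bytes, into the `GlobalAlloc` block in place of the hand-patched header.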
