In Solaris 8, the iconv*() family of functions appears to be broken and to support only conversions between single-byte charsets and UTF-8, which can be verified using this code example:

#include <stdio.h>
#include <errno.h>
#include <iconv.h>

#if defined(__sun) && defined(__SVR4)
#define CP1251 "ansi-1251"
#define ISO_8859_5 "ISO8859-5"
#else
#define CP1251 "CP1251"
#define ISO_8859_5 "ISO-8859-5"
#endif

void iconv_open_debug(const char *, const char *);

int main() {
    iconv_open_debug(CP1251, CP1251);
    iconv_open_debug(CP1251, ISO_8859_5);
    iconv_open_debug(CP1251, "KOI8-R");
    iconv_open_debug(CP1251, "UTF-8");
    iconv_open_debug(CP1251, "WCHAR_T");

    iconv_open_debug(ISO_8859_5, CP1251);
    iconv_open_debug(ISO_8859_5, ISO_8859_5);
    iconv_open_debug(ISO_8859_5, "KOI8-R");
    iconv_open_debug(ISO_8859_5, "UTF-8");
    iconv_open_debug(ISO_8859_5, "WCHAR_T");

    iconv_open_debug("KOI8-R", CP1251);
    iconv_open_debug("KOI8-R", ISO_8859_5);
    iconv_open_debug("KOI8-R", "KOI8-R");
    iconv_open_debug("KOI8-R", "UTF-8");
    iconv_open_debug("KOI8-R", "WCHAR_T");

    iconv_open_debug("UTF-8", CP1251);
    iconv_open_debug("UTF-8", ISO_8859_5);
    iconv_open_debug("UTF-8", "KOI8-R");
    iconv_open_debug("UTF-8", "UTF-8");
    iconv_open_debug("UTF-8", "WCHAR_T");

    iconv_open_debug("WCHAR_T", CP1251);
    iconv_open_debug("WCHAR_T", ISO_8859_5);
    iconv_open_debug("WCHAR_T", "KOI8-R");
    iconv_open_debug("WCHAR_T", "UTF-8");
    iconv_open_debug("WCHAR_T", "WCHAR_T");

    return 0;
}

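/* Try to open a conversion descriptor from -> to and report the outcome. */
void iconv_open_debug(const char *from, const char *to) {
    iconv_t cd;

    errno = 0;
    cd = iconv_open(to, from);
    if (cd == (iconv_t) -1) {
        fprintf(stderr, "iconv_open(\"%s\", \"%s\") FAIL: errno = %d\n", to, from, errno);
        perror("iconv_open()");
    } else {
        fprintf(stdout, "iconv_open(\"%s\", \"%s\") PASS\n", to, from);
        iconv_close(cd); /* release the descriptor so it does not leak */
    }
}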

which only prints

iconv_open("UTF-8", "ansi-1251") PASS
iconv_open("UTF-8", "ISO8859-5") PASS
iconv_open("UTF-8", "KOI8-R") PASS
iconv_open("ansi-1251", "UTF-8") PASS
iconv_open("ISO8859-5", "UTF-8") PASS
iconv_open("KOI8-R", "UTF-8") PASS

to stdout, while iconv_open() fails with EINVAL for all other pairs. Note that even conversion to the same charset (e.g. UTF-8 -> UTF-8) is not supported.

Questions

  1. Can anyone point to a document describing the limitations of the Solaris version of iconv.h?
  2. How can I convert a wchar_t* to a single-byte or multibyte string without relying on GNU libiconv? wcstombs() would be fine except that it relies on the current locale's charset, while I want a wide string converted to a regular string using a particular charset, possibly different from the default one.
Bass
  • Two Solaris 8 documents that should help are here: http://docs.oracle.com/cd/E19455-01/816-3328/6m9k8pg1o/index.html and here: http://docs.oracle.com/cd/E19455-01/816-3321/6m9k23sb4/index.html – Andrew Henle Nov 18 '15 at 11:32
  • Your application can call [`setlocale`](http://docs.oracle.com/cd/E23824_01/html/821-1465/setlocale-3c.html) to change its locale. – Thomas Dickey Nov 18 '15 at 11:46
  • @ThomasDickey: Thanks, but `setlocale` is not thread-safe and is not suitable for multiple conversions from/to multiple (>2) charsets. – Bass Nov 18 '15 at 11:58
  • Your sample program does not use threading, and this detail was not mentioned in the question. – Thomas Dickey Nov 19 '15 at 01:14

1 Answer

Running sdtconvtool shows most legacy codepages are supported.

After re-running the same utility with truss -u libc::iconv_open, I learnt that conversion from one single-byte encoding to another single-byte one is done in two steps, with intermediate conversion to UTF-8.
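The same two-step approach can be reproduced by hand. The following is only a sketch: the helper names convert_step() and convert_via_utf8() and the fixed 1024-byte intermediate buffer are assumptions made for illustration, not something taken from sdtconvtool. Note also that some systems declare the second argument of iconv() as const char ** rather than char **, in which case the cast needs adjusting.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: one iconv() pass from `from` to `to`.
 * Returns the number of bytes written to dst, or (size_t) -1 on error. */
static size_t convert_step(const char *to, const char *from,
                           const char *src, size_t src_len,
                           char *dst, size_t dst_size)
{
    iconv_t cd = iconv_open(to, from);
    char *in = (char *) src;   /* iconv() does not modify the input bytes */
    char *out = dst;
    size_t in_left = src_len;
    size_t out_left = dst_size;
    size_t rc;
    int saved_errno;

    if (cd == (iconv_t) -1)
        return (size_t) -1;

    rc = iconv(cd, &in, &in_left, &out, &out_left);
    saved_errno = errno;
    iconv_close(cd);

    if (rc == (size_t) -1) {
        errno = saved_errno;   /* report the iconv() error, not iconv_close()'s */
        return (size_t) -1;
    }
    return dst_size - out_left;
}

/* Hypothetical helper: convert between two legacy charsets with UTF-8 as
 * the intermediate step. Assumes the UTF-8 form fits into 1024 bytes. */
size_t convert_via_utf8(const char *to, const char *from,
                        const char *src, size_t src_len,
                        char *dst, size_t dst_size)
{
    char utf8[1024];
    size_t n = convert_step("UTF-8", from, src, src_len, utf8, sizeof utf8);

    if (n == (size_t) -1)
        return (size_t) -1;
    return convert_step(to, "UTF-8", utf8, n, dst, dst_size);
}
```

With the charset names from the question, a call such as convert_via_utf8("KOI8-R", "ansi-1251", src, len, dst, sizeof dst) would then use only the charset pairs that iconv_open() accepted in the test output.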

As for conversion from "WCHAR_T", iconv(3) does support it, but "UCS-4" should be used as the source charset name, since sizeof(wchar_t) is 4 on Solaris (for both x86 and SPARC).
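A minimal sketch of such a conversion follows. The helper name wide_to_charset() and the WIDE_CHARSET macro are hypothetical. Using "WCHAR_T" as the source name on non-Solaris systems is an assumption (glibc and GNU libiconv accept it), added so that the same code can be tried elsewhere. As above, some systems declare the second argument of iconv() as const char **.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <wchar.h>

#if defined(__sun) && defined(__SVR4)
#define WIDE_CHARSET "UCS-4"    /* Solaris: 4-byte wchar_t, as described above */
#else
#define WIDE_CHARSET "WCHAR_T"  /* assumption: name understood by glibc / GNU libiconv */
#endif

/* Hypothetical helper: convert a wide string to `to_charset` without
 * touching the process locale. Writes a NUL-terminated result to dst and
 * returns its length in bytes, or (size_t) -1 on error (errno is set). */
size_t wide_to_charset(const char *to_charset, const wchar_t *src,
                       char *dst, size_t dst_size)
{
    iconv_t cd;
    char *in = (char *) src;    /* iconv() works on raw bytes */
    char *out = dst;
    size_t in_left = wcslen(src) * sizeof(wchar_t);
    size_t out_left;
    size_t rc;
    int saved_errno;

    if (dst_size == 0) {
        errno = E2BIG;
        return (size_t) -1;
    }
    out_left = dst_size - 1;    /* keep one byte for the terminating NUL */

    cd = iconv_open(to_charset, WIDE_CHARSET);
    if (cd == (iconv_t) -1)
        return (size_t) -1;

    rc = iconv(cd, &in, &in_left, &out, &out_left);
    if (rc != (size_t) -1)
        rc = iconv(cd, NULL, NULL, &out, &out_left);  /* flush any shift state */
    saved_errno = errno;
    iconv_close(cd);

    if (rc == (size_t) -1) {
        errno = saved_errno;
        return (size_t) -1;
    }
    *out = '\0';
    return (size_t) (out - dst);
}
```

Because the conversion descriptor is local to each call, this avoids the setlocale() thread-safety concern raised in the comments: each thread can convert to a different target charset independently.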

Bass