
I'm testing some code that writes its output to a file with a Chinese name.

The file is created successfully with the right content, but the file name is wrong.

I took a look at the function writeFile, which represents the file name as a String, so I suspect this might be the root cause.

    file :: FilePath
    file = "上海万达影城.html"

    content :: String
    content = "<h1>hello</h1>"

    write2File :: IO ()
    write2File = writeFile file content

Thanks for your help!

-Simon

--------------------- Updated

  1. My GHC version is 7.0.2.
  2. I found a workaround to use until I upgrade. See the details below; the code change looks like this:

    import qualified Codec.Binary.UTF8.String as UTF8
    file = UTF8.encodeString "上海万达影城.html"
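
A minimal sketch of why this workaround helps, under the assumption (consistent with the bug discussed in the answers, but not verified here) that older GHC versions keep only the low 8 bits of each Char in a file name. The names `utf8Bytes`, `encodeAsByteChars`, and `truncateToBytes` are hypothetical helpers written for this illustration; `encodeAsByteChars` is only a stand-in for what `UTF8.encodeString` does, not its actual implementation.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- Encode one code point as its UTF-8 byte values.
-- (Surrogates and other invalid code points are not checked.)
utf8Bytes :: Char -> [Int]
utf8Bytes c
  | n < 0x80    = [n]
  | n < 0x800   = [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  | otherwise   = [ 0xF0 .|. (n `shiftR` 18), cont (n `shiftR` 12)
                  , cont (n `shiftR` 6), cont n ]
  where
    n = ord c
    -- Continuation byte: 10xxxxxx carrying the low six bits of x.
    cont x = 0x80 .|. (x .&. 0x3F)

-- Hypothetical stand-in for UTF8.encodeString: a String in which
-- every Char holds exactly one UTF-8 byte (all values below 256).
encodeAsByteChars :: String -> String
encodeAsByteChars = map chr . concatMap utf8Bytes

-- Assumed model of the old behaviour: each Char is cut down to one byte.
truncateToBytes :: String -> [Int]
truncateToBytes = map ((.&. 0xFF) . ord)

main :: IO ()
main = do
  -- U+4E0A cut down to a single byte: the character is lost.
  print (truncateToBytes "上")                      -- [10]
  -- Pre-encoded, the same truncation leaves the UTF-8 bytes intact.
  print (truncateToBytes (encodeAsByteChars "上"))  -- [228,184,138]
```

On a GHC that already encodes file names according to the locale, applying this kind of pre-encoding would encode the name twice, so the workaround should probably be dropped after upgrading.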
    
Makoto
Simon

2 Answers


A String in Haskell is a list of Unicode code points. How that list of code points is turned into a file name is system dependent. (You also need a GHC that is not too old for this to be supported.)

Generally, though, once your locale is set correctly, things just work.
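
One way to check what "set correctly" means on a given machine is to ask the runtime which encodings it picked up from the locale. This is only a diagnostic sketch: as far as I know, `getFileSystemEncoding` is available from base 4.5 (GHC 7.4) onwards, so it will not compile on GHC 7.0.2. `looksLikeUtf8` is a hypothetical helper, and its name matching is a rough heuristic rather than an official check.

```haskell
import Data.Char (toUpper)
import Data.List (isPrefixOf)
import GHC.IO.Encoding (getFileSystemEncoding, getLocaleEncoding)

-- Hypothetical helper: does an encoding name look like UTF-8?
looksLikeUtf8 :: String -> Bool
looksLikeUtf8 name = any (`isPrefixOf` map toUpper name) ["UTF-8", "UTF8"]

main :: IO ()
main = do
  localeEnc <- getLocaleEncoding      -- default encoding for text Handles
  fsEnc     <- getFileSystemEncoding  -- encoding applied to FilePath values
  putStrLn ("locale encoding:      " ++ show localeEnc)
  putStrLn ("file system encoding: " ++ show fsEnc)
  if looksLikeUtf8 (show fsEnc)
    then putStrLn "Non-ASCII file names should be written as UTF-8."
    else putStrLn "Consider running under a UTF-8 locale, e.g. LANG=en_US.UTF-8."
```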


N.B. there have been caveats in the past, e.g. the old bug "System.Directory.getDirectoryContents unicode support", which might require workarounds.

Don Stewart

In the meantime, you can use the System.Posix.IO.ByteString module. This will let you specify file paths as byte strings, so you can do the encoding/decoding yourself.
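
A rough sketch of that approach, assuming a POSIX system with the `unix`, `bytestring`, and `text` packages available and a `unix` version that ships `System.Posix.IO.ByteString`. `utf8Path` and `writeFileUtf8Name` are hypothetical helper names, and choosing UTF-8 for the file name is itself an assumption about what the file system expects.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import System.IO (hClose, hPutStr)
import System.Posix.Files (stdFileMode)
import System.Posix.IO.ByteString (createFile, fdToHandle)

-- Encode the file name ourselves instead of relying on the runtime.
utf8Path :: String -> B.ByteString
utf8Path = TE.encodeUtf8 . T.pack

-- Create (or truncate) the file under the explicitly encoded name,
-- then write the content through an ordinary Handle.
writeFileUtf8Name :: String -> String -> IO ()
writeFileUtf8Name name content = do
  fd <- createFile (utf8Path name) stdFileMode
  h  <- fdToHandle fd
  hPutStr h content
  hClose h

main :: IO ()
main = writeFileUtf8Name "上海万达影城.html" "<h1>hello</h1>"
```

The name is converted through `Data.Text` on purpose: building the ByteString with `Data.ByteString.Char8.pack` (or an overloaded string literal) would truncate each Char to 8 bits and reproduce the original problem.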

This is a known GHC bug, and it was fixed in 7.2.1.

http://hackage.haskell.org/trac/ghc/ticket/3307

Dietrich Epp
  • It seems the unix-2.5.* package has `System.Posix.IO.ByteString`, but I'm working with unix-2.4. I'll verify this again once I get a chance to upgrade GHC. Thanks, Dietrich. – Simon Apr 17 '12 at 13:03