I am trying to process a file which contains Russian characters. Both when reading the file and after writing some text to it, I get something like:

\160\192\231\229\240\225\224\233\228\230\224\237

How can I get normal symbols?

Erik Kaplun
Anton

3 Answers

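If you are getting strings with backslashes and numbers in them, it sounds like you are calling `print` (or `show`) where you want `putStr`. `print` applies `show` to its argument first, and `show` escapes every non-ASCII character as a backslash followed by its decimal code.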
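The difference can be seen in a minimal sketch. It assumes a terminal that understands UTF-8 and uses an arbitrary sample string; neither comes from the question.

```haskell
import System.IO (hSetEncoding, stdout, utf8)

main :: IO ()
main = do
  -- Assumption: the terminal can display UTF-8 output.
  hSetEncoding stdout utf8
  let s = "Вася"
  -- print is putStrLn . show, and show escapes non-ASCII characters.
  print s      -- prints "\1042\1072\1089\1103" (including the quotes)
  -- putStrLn writes the characters themselves.
  putStrLn s   -- prints Вася
```

The same applies to files: `appendFile "out" (show x)` writes the escaped form, while `appendFile "out" x` writes the text itself when `x` is already a `String`.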

psmears

If you are dealing with Unicode, you might try the utf8-string package:

import System.IO hiding (hPutStr, hPutStrLn, hGetLine, hGetContents, putStrLn)
import System.IO.UTF8
import Codec.Binary.UTF8.String (utf8Encode)
main = System.IO.UTF8.putStrLn "Вася Пупкин"

However, it didn't work well in my Windows CLI: the output was garbled because of the console code page. I expect it to work fine on Unix-like systems if your locale is set correctly. Writing to a file, however, should be successful on all systems.
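As a sketch of the file-writing case, the handle's encoding can also be set explicitly with `hSetEncoding` from `System.IO` in base, so the result does not depend on the locale or code page. This assumes a GHC recent enough to provide `hSetEncoding`; the helper name `appendLineUtf8` and the file name `out` are illustrative.

```haskell
import System.IO

-- Hypothetical helper: append one line to a file, always encoded as UTF-8,
-- regardless of the system locale or console code page.
appendLineUtf8 :: FilePath -> String -> IO ()
appendLineUtf8 path s =
  withFile path AppendMode $ \h -> do
    hSetEncoding h utf8
    hPutStrLn h s

main :: IO ()
main = appendLineUtf8 "out" "Вася Пупкин"
```

This only controls how the file is written. Whether a console shows the text correctly still depends on the console's own settings.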

UPDATE:

An example of encoding package usage.

YasirA
  • He's not dealing with Unicode. According to Firefox, the page he linked is encoded in Windows-1251. – sepp2k May 15 '10 at 13:21
  • Then the [encoding package](http://hackage.haskell.org/package/encoding) may be useful; it has [Data.Encoding.CP1251](http://hackage.haskell.org/packages/archive/encoding/0.6.3/doc/html/Data-Encoding-CP1251.html). – YasirA May 15 '10 at 13:28
  • I am having problems installing this package on Windows; it cannot find a library. I tried this: `cd c:\Users\test_8\Desktop\encoding-0.6.3` and then `runhaskell Setup.hs configure --extra-include-dirs="c:\Users\test_8\Desktop\encoding-0.6.3" --extra-lib-dirs="c:\Users\test_8\Desktop\encoding-0.6.3"`, but I get this: `Setup.hs: Missing dependency on a foreign library: * Missing header file: system_encoding.h` – Anton May 15 '10 at 14:21
  • It requires localinfo.h, and I cannot find it. – Anton May 16 '10 at 06:57
  • @Anton: Please, paste your sources somewhere, for example [here](http://freebsd.pastebin.com/) if they aren't so huge. – YasirA May 16 '10 at 07:15
  • `import Text.HTML.TagSoup import Text.HTML.Download main :: IO () main = do tags <- fmap parseTags $ openURL "http://www.trade.su/search?ext=1" let r = partitions (~== "") tags !! 1 appendFile "out" (show r)` – Anton May 16 '10 at 10:53
  • This does not work: `{-# LANGUAGE ImplicitParams #-} import Text.HTML.TagSoup import Text.HTML.Download import Prelude hiding (appendFile) import System.IO.Encoding import Data.Encoding.CP1251 main :: IO () main = do tags <- fmap parseTags $ openURL "http://www.trade.su/search?ext=1" let r = partitions (~== "") tags !! 1 let ?enc = CP1251 appendFile "out" (show r)` – Anton May 16 '10 at 12:46
  • I've just given up on installing the encoding package on Windows, and I don't have GHC for Unix on hand. I'm curious how you managed to install it. – YasirA May 16 '10 at 13:24
  • I could not install it on Windows either. – Anton May 16 '10 at 13:28
  • @Anton: Mate, I can no longer help you, since I don't have GHC for *nix, sorry. I hope Google will help you. And it would be nice if you answered your own question here. ;) – YasirA May 16 '10 at 13:37
  • Thanks. I cannot understand the reason for my problem, so I do not know what to search for on Google. – Anton May 16 '10 at 13:37
  • @Anton: Also make sure that the `appendFile` you use is imported from `System.IO.Encoding` rather than `System.IO`. Try calling it as `System.IO.Encoding.appendFile`. – YasirA May 16 '10 at 13:45

I got it working:

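{-# LANGUAGE ImplicitParams #-}

import Network.HTTP
import Text.HTML.TagSoup
import Data.Encoding
import Data.Encoding.CP1251
import Data.Encoding.UTF8

-- Fetch a page and decode its body from Windows-1251 into a proper String.
openURL :: String -> IO String
openURL url = do
        response <- simpleHTTP (getRequest url)
        fmap (decodeString CP1251) (getResponseBody response)

main :: IO ()
main = do
    tags <- fmap parseTags $ openURL "http://www.trade.su/search?ext=1"
    -- Take the fifth tag of the second checkbox section; it is expected to be a text tag.
    let TagText r  = partitions (~== "<input type=checkbox>") tags !! 1 !! 4
    -- Write the text itself rather than its `show` representation.
    appendFile "out" r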
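
To illustrate what the decoding step does, here is a minimal sketch that does not use the encoding package. It relies on the fact that Windows-1251 bytes 0xC0..0xFF map to the contiguous range U+0410..U+044F (А..я). The function `decodeCp1251Char` is a hypothetical, deliberately partial decoder: the other non-ASCII bytes (for example 0xA8, Ё) are left unchanged, which is not a correct decoding for them. The sample bytes are the escaped string from the question.

```haskell
import Data.Char (chr, ord)
import System.IO (hSetEncoding, stdout, utf8)

-- Partial Windows-1251 decoder, for illustration only.
-- Bytes 0xC0..0xFF become U+0410..U+044F; everything else passes through.
decodeCp1251Char :: Char -> Char
decodeCp1251Char c
  | b >= 0xC0 && b <= 0xFF = chr (b - 0xC0 + 0x410)
  | otherwise              = c
  where
    b = ord c

main :: IO ()
main = do
  -- Assumption: the terminal can display UTF-8 output.
  hSetEncoding stdout utf8
  -- The undecoded bytes shown in the question.
  let raw = "\160\192\231\229\240\225\224\233\228\230\224\237"
  -- Prints a no-break space followed by: Азербайджан
  putStrLn (map decodeCp1251Char raw)
```

For real input, a complete table such as the one behind `decodeString CP1251` is the safer choice.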
Anton