2

I have a serious problem with Unicode and UTF-8. I saved a paragraph of Arabic/Persian text in Notepad, and now the text looks like this:

Êæ Çíä ÓæÑÓ ÈÑäÇãå ÚÏÏ ÏáÎæÇåí Ñæ ÇÒ æÑæÏí ãííÑå æ Èå Øæá åãæä ÚÏÏ ãËáËí Ñæ ÑÓã ãí ˜äå 

My question is how to get my data back. It is important for me to recover it. Thanks in advance.

Anthony Faull
  • The `open` box in Notepad has a dropdown called `Encoding` - just set it to `UTF-8`. P.S. If this question is actually about *writing a program* to read the UTF-8 data, edit the question and make that more clear. – Mark Ransom Oct 27 '13 at 18:09

2 Answers

1

The paragraph was scrambled by saving as code page 1256 (Arabic/Persian), then interpreted as code page 1252 (Western Europe), and finally saved as Unicode text. You can use C# to reverse this procedure:

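using System;
using System.Text;
// On .NET Core/.NET 5+, first call Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).
string scrambled = "Êæ Çíä ÓæÑÓ ÈÑäÇãå ÚÏÏ ÏáÎæÇåí Ñæ ÇÒ æÑæÏí ãííÑå æ " + 
                   "Èå Øæá åãæä ÚÏÏ ãËáËí Ñæ ÑÓã ãí ˜äå";
// Re-encode as windows-1252 to recover the original bytes, then decode them as windows-1256.
byte[] bytes = Encoding.GetEncoding("windows-1252").GetBytes(scrambled);
string plainText = Encoding.GetEncoding("windows-1256").GetString(bytes);
Console.WriteLine(plainText);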

The plain text output is: "تو اين سورس برنامه عدد دلخواهي رو از ورودي ميگيره و به طول همون عدد مثلثي رو رسم مي کنه"
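
If you do not have a C# toolchain at hand, the same round trip can be sketched in Python. This is only a minimal sketch: it assumes every character of the scrambled text exists in Windows-1252 (otherwise `encode` raises an error), and the function names `unscramble` and `scramble` are made up for illustration.

```python
def unscramble(scrambled: str) -> str:
    """Undo the damage: mojibake -> Windows-1252 bytes -> Windows-1256 text."""
    raw = scrambled.encode("cp1252")  # recover the original byte values
    return raw.decode("cp1256")       # read those bytes as Arabic/Persian


def scramble(original: str) -> str:
    """Reproduce the damage: Windows-1256 bytes misread as Windows-1252."""
    return original.encode("cp1256").decode("cp1252")


if __name__ == "__main__":
    # First three words of the scrambled paragraph from the question.
    print(unscramble("Êæ Çíä ÓæÑÓ"))
```

Running `scramble` on the recovered text should give back the garbled form, which is a quick way to check that the guessed pair of code pages is the right one.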

Anthony Faull
  • Perfect! Can you tell me why you concatenate the two parts in: string scrambled = "Êæ Çíä ÓæÑÓ ÈÑäÇãå ÚÏÏ ÏáÎæÇåí Ñæ ÇÒ æÑæÏí ãííÑå æ " + "Èå Øæá åãæä ÚÏÏ ãËáËí Ñæ ÑÓã ãí ˜äå"; – ferdows shahryar Nov 02 '13 at 19:03
  • It is merely cosmetic. I split the string into two parts to prevent scrollbars from appearing in the answer. – Anthony Faull Nov 06 '13 at 07:47
1

On Linux you can use Gedit to open it as a Windows-1256 encoded file:

gedit shahnameh.txt --encoding WINDOWS-1256

You can do the same through the GUI: select the correct encoding in the "Open" dialog box when opening the file. The option should be at the bottom of the dialog.
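
If you would rather convert the file once than pick the encoding every time you open it, something like the following Python sketch could work. It assumes the file on disk still holds the raw Windows-1256 bytes (the same assumption the Gedit command makes); the function name `convert_to_utf8` and its parameters are hypothetical.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, source_encoding: str = "cp1256") -> None:
    """Decode src with a legacy code page and write the text to dst as UTF-8."""
    text = Path(src).read_bytes().decode(source_encoding)
    # Write bytes directly so line endings are kept exactly as they were.
    Path(dst).write_bytes(text.encode("utf-8"))
```

For example, `convert_to_utf8("shahnameh.txt", "shahnameh-utf8.txt")` would leave the original file untouched and write a UTF-8 copy next to it; the output file name here is just a placeholder.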

Padma Kumar
MortezaE