
I'm looking for a tool that can detect which encoding a string is in, so that I can convert it to proper Unicode.

I have data stored in the database in a form like "Ø³ÙˆØ±Ø© Ø§Ù„Ù†Ø§Ø³", which is equivalent to the Arabic text "سورة الناس".

I'm trying to do this because I'm working with an Arabic framework that uses a special kind of Unicode conversion that I don't recognize.

Mustafa ELnagar
  • You have data stored in that format... I don't think you can "reconvert" it once it has been mangled, because that translation should have happened before the insert. It's like asking someone to unhash something. – PlantTheIdea Mar 16 '14 at 15:13
  • It's in a huge data backup, about 11,000 records, and it seems no one knows how to convert it! – Mustafa ELnagar Mar 16 '14 at 15:17
  • The problem shows up when I insert new data: we end up with two formats, the Unicode data, e.g. "Ø³ÙˆØ±Ø© Ø§Ù„Ù†Ø§Ø³", and the new Arabic data, e.g. "أ ب". – Mustafa ELnagar Mar 16 '14 at 15:18
  • It's not a matter of size, man... it's a matter of not being able to retrieve it. The original encoding couldn't read what it was, so it invented characters; there is no "translation" back. – PlantTheIdea Mar 16 '14 at 15:19
  • That is another issue, so I'm asking how to detect the encoding of that text! Otherwise I'll have to go by trial and error, which would be very hard! – Mustafa ELnagar Mar 16 '14 at 15:21

1 Answer


From the example, it seems that the data is simply UTF-8 encoded. The string “Ø³ÙˆØ±Ø© Ø§Ù„Ù†Ø§Ø³” is what you get if the text “سورة الناس” is UTF-8 encoded and those bytes are misinterpreted as windows-1252 encoded.
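The misreading described above can be reproduced in a few lines. This is a minimal sketch in Python, chosen only for illustration and not part of the original answer; `cp1252` is Python's name for the windows-1252 encoding, and the variable names are made up for the example.

```python
# Python calls the windows-1252 encoding "cp1252".
original = "سورة الناس"

# Encode correctly as UTF-8, then decode the same bytes with the wrong encoding.
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)

# Undo the misreading: recover the original bytes, then decode them as UTF-8.
restored = garbled.encode("cp1252").decode("utf-8")
print(restored == original)  # True
```

Because the bytes themselves are never altered, the garbled form carries the same information as the Arabic text; only the interpretation of the bytes differs.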

So if all data is like that, you don’t need any conversions. You should simply do all the character processing on the basis of the UTF-8 encoding.
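If the table turns out to hold a mix of misread and correctly stored rows, as the comments on the question suggest, a round-trip check is one possible way to tell them apart. The sketch below is a heuristic in Python, not a general encoding detector; the helper names `mojibake_to_bytes` and `repair_if_mojibake` are hypothetical. It also assumes that the five byte values windows-1252 leaves undefined reached the text as the matching control characters (U+0081, U+008D, U+008F, U+0090, U+009D), which may not hold for every database setup.

```python
def mojibake_to_bytes(text: str) -> bytes:
    """Recover the raw bytes of text that was decoded as windows-1252."""
    raw = bytearray()
    for ch in text:
        try:
            raw += ch.encode("cp1252")
        except UnicodeEncodeError:
            # Assumption: bytes that windows-1252 leaves undefined were
            # passed through as the control character with the same value.
            if ord(ch) > 0xFF:
                raise  # e.g. real Arabic letters: not a misreading at all
            raw.append(ord(ch))
    return bytes(raw)


def repair_if_mojibake(text: str) -> str:
    """Return the UTF-8 reading of misread text, or the text unchanged."""
    try:
        return mojibake_to_bytes(text).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text holds characters outside windows-1252, or its
        # bytes are not valid UTF-8; in both cases leave it as it is.
        return text
```

Text that is already proper Arabic fails the first step and is returned untouched, while genuine Western European text such as "café" fails the second step. Pure ASCII passes through unchanged either way.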

When working with PHP, the answers to the question "UTF-8 all the way through" are probably very useful.

Jukka K. Korpela
  • Thanks a lot, it works for me now. I used mysql_query("SET NAMES 'windows-1252'"); mysql_query("SET CHARACTER SET windows-1252"); mysql_query("SET COLLATION_CONNECTION = 'windows-1252'"); before the SQL query and it executes the same. Thanks a million, you helped me a lot with this Unicode issue :) – Mustafa ELnagar Mar 17 '14 at 18:24