8

I know there are already many questions about using Arabic in PHP, but none of them solved my problem, so I am posting this one:

I have a PHP site that runs perfectly in English. Now I want it to support multiple languages, including French, Spanish, and Arabic, but I am unable to handle them all with one codebase. The problem is that I have used substr() in many places, and the translated characters do not work as intended with substr(). I also tried mb_substr(), but to no avail :(

The field's collation in the DB is "utf8_general_ci", and I have already placed header('Content-type: text/html; charset=UTF-8'); in my code so the page renders as UTF-8.

The problem is that I either get "?????" in place of the actual words, or I get incorrect characters from substr().

Please help!!

Nitesh

3 Answers

25

For an Arabic word, I want to get only the character at position 2.

$Name = "MYNAME";
$Char = substr($Name, 2, 1);
echo $Char; // prints "N", as expected

But for an Arabic word like كامل, it returns a question mark, because substr() counts bytes and each of these letters takes two bytes in UTF-8:

case 1: $Char = substr($Name,0,1);     // does not work; returns a question mark
case 2: $Char = substr($Name,1,1);     // does not work; returns a question mark
case 3: $Char = substr($Name,0*2,1*2); // works; returns "ك"
case 4: $Char = substr($Name,1*2,1*2); // works; returns "ا"

So I found the solution:

$Char = mb_substr($Name,1,1,'utf-8');  // works; returns "ا"
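
To make the byte-versus-character difference concrete, here is a minimal sketch. It assumes the mbstring extension is enabled and that the script file itself is saved as UTF-8; the variable names are only illustrative.

```php
<?php
// Assumption: mbstring is enabled and this file is saved as UTF-8.
$name = "كامل";

// strlen() and substr() count bytes; each of these Arabic letters is 2 bytes in UTF-8.
$byteLength = strlen($name);               // 8
$charLength = mb_strlen($name, 'UTF-8');   // 4

// Taking a single byte cuts a character in half, which is no longer valid UTF-8.
$brokenSlice = substr($name, 0, 1);
$isValid = mb_check_encoding($brokenSlice, 'UTF-8'); // false

// mb_substr() counts characters, so offset 1 is the second letter.
$secondChar = mb_substr($name, 1, 1, 'UTF-8'); // "ا"

echo $byteLength, " bytes, ", $charLength, " characters\n";
echo $secondChar, "\n";
```

The invalid half-character is what a browser typically renders as a question mark or replacement symbol.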
Musa
Moe
5

First of all, forget substr. If you are going to encode your strings in UTF-8 and split them up, then mb_substr is the only working solution.

You also need to make sure that the MySQL connection encoding is UTF-8. Do this by calling

mysql_set_charset('utf8');

just after mysql_connect. There are equivalent ways to do this if you are using a data access layer other than the mysql extension (such as PDO).
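
For PDO, one way to do this is to request the character set in the DSN. The sketch below only builds the DSN string; `buildUtf8Dsn`, the host, the database name, and the credentials are hypothetical placeholders, and the actual connection is commented out because it needs a running server. Note that the old mysql extension was removed in PHP 7, so newer code would need PDO or mysqli anyway.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that requests a UTF-8 connection.
function buildUtf8Dsn(string $host, string $dbName): string
{
    // utf8mb4 is MySQL's full UTF-8 character set; the older "utf8" stores at most 3 bytes per character.
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbName);
}

$dsn = buildUtf8Dsn('localhost', 'example_db'); // placeholder values

// Connecting requires a real server, so it is left commented out here:
// $pdo = new PDO($dsn, 'example_user', 'example_password', [
//     PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
// ]);

echo $dsn, "\n";
```

With mysqli, the equivalent step would be calling `$mysqli->set_charset('utf8mb4')` right after connecting.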

Jon
  • Don't forget `mb_internal_encoding("UTF-8");` and for PDO you can use this: `$dbh = new PDO($dsn, $user, $password, array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));` – maaudet Jun 25 '11 at 18:18
  • @tech-programmer: Maybe you have tried using it the wrong way? – Jon Jun 25 '11 at 18:19
  • @tech-programmer: It is if you want to get UTF-8 data instead of question marks. – Jon Jun 25 '11 at 18:20
  • @tech-programmer: You are not telling the function what encoding the arabic string is in; you need to do that, or better follow Manhim's advice and do `mb_internal_encoding("UTF-8");` once at the beginning of the script. Also, we don't know if "arabic string" is really encoded as UTF-8. If you just `echo` the whole string, does it display correctly? – Jon Jun 25 '11 at 18:22
  • @jon..I will try manhim's advice now and let you know. I checked the Database and I find that the way I translated from google, the same way it is stored in the database. For ex - نتائج الاستخدام المنتظم في الجلد أملس ناعم مطواع. – Nitesh Jun 25 '11 at 18:24
  • Hi @jon & @Manhim--Just tried using `mb_internal_encoding("UTF-8");` but no luck. I tried it just after connecting to the DB as well as just before querying, but none of them worked. Please advise.. – Nitesh Jun 25 '11 at 18:41
  • @tech-programmer: You should review what we suggest and try to understand why we did it and what our suggestions do. Remote debugging of third party code via StackOverflow is not something people like doing. – Jon Jun 25 '11 at 18:56
  • @Jon--Sorry to bother you with so many questions..will try to find a solution...Btw, Thanks for your answers.. – Nitesh Jun 25 '11 at 19:09
0
  1. Check if you're using the multibyte (mb_*) functions everywhere you work with your data.
  2. Check if your scripts are saved as UTF-8 text files.
  3. Check if you send a SET NAMES utf8 query to the database right after you connect to it.
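
The script-side part of this checklist can be sketched roughly as follows. It assumes the mbstring extension is enabled; `describeText` is a hypothetical helper, and `$fromDatabase` is a stand-in for a value actually read from the database.

```php
<?php
// Assumption: mbstring is enabled.
mb_internal_encoding('UTF-8'); // mb_* calls now default to UTF-8

// Hypothetical helper: reports what kind of text a string appears to contain.
function describeText(string $text): string
{
    if (!mb_check_encoding($text, 'UTF-8')) {
        return 'invalid UTF-8'; // e.g. a character cut in half by substr()
    }
    if (preg_match('/^[?\s]+$/', $text) === 1) {
        return 'literal question marks'; // the text was already lost before it reached PHP
    }
    return 'valid UTF-8';
}

$fromDatabase = "نتائج الاستخدام"; // stand-in for a value read from the database
$report = describeText($fromDatabase);
$lostReport = describeText('?????');
$firstWord = mb_substr($fromDatabase, 0, 5); // no encoding argument needed now

echo $report, "\n", $lostReport, "\n", $firstWord, "\n";
```

If the value arrives as literal question marks, the likely culprit is the connection or storage encoding rather than substr(), since MySQL substitutes "?" for characters it cannot represent in the connection's character set.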
Ondřej Mirtes
  • Did not get you regarding item 2..pls explain – Nitesh Jun 25 '11 at 18:19
  • Well, it doesn't matter for working with data from database, but it saves you a lot of headaches in case you write some of these special characters directly into the source code. Just go all-UTF-8-in and you don't have to deal with any of these problems :) – Ondřej Mirtes Jun 25 '11 at 18:33
  • I am already all set to utf8-encode and working on same, but still getting the errors. – Nitesh Jun 25 '11 at 18:44