
I work on a large website that runs PHP on a Linux server with a SQL Server database. The database uses the collation SQL_Latin1_General_CP1_CI_AS.

I have a table (call it table1) that stores a lot of addresses, which a PHP script (script1) fetches from Nominatim and stores in table1. The addresses appear to be in HTML format. The Danish letters (æøå) look fine on the website when selected from table1.

However, I have written another PHP script (script2) that selects these addresses and dumps them into another table (table2) on the same SQL Server (still collation SQL_Latin1_General_CP1_CI_AS). The Danish letters look weird on the website when selected from table2. When script1 gets the addresses from Nominatim, they are in JSON format, which is then decoded:

    $addressdetails = json_decode($addressdetails, true);

No other encoding or decoding is done here.

The following may also help. When I run phpinfo(), I can see that the server has these settings:

PHP version 5.3.3
content type: text/html; charset=UTF-8

What is the best way to handle characters in PHP and SQL Server so that Danish and other special letters are displayed correctly on any platform?

marc_s
  • If you are stuck with a Latin1 database character set, use HTML entities? – halfer Feb 07 '15 at 08:58
  • Ah, the problem might be that the PHP `json_` functions require UTF-8, so either you'll need to convert between charsets, or avoid these functions. – halfer Feb 07 '15 at 09:02

1 Answer


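It sounds like table1 stores the address data in an NVARCHAR column, whereas table2 stores it in a VARCHAR column. NVARCHAR stores Unicode and will happily handle æøå, but a VARCHAR column with the SQL_Latin1_General_CP1_CI_AS collation is limited to code page 1252 and can mangle those letters if the bytes arrive in another encoding such as UTF-8.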

If both tables use NVARCHAR to store the data, check that it is not being converted to VARCHAR at any point between table1 and table2, for example by a VARCHAR stored procedure parameter.
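One way to narrow this down from the PHP side is to compare the raw bytes that each table returns for the same address. The following is a minimal sketch, not a definitive diagnostic: `describeBytes` is a hypothetical helper, and the two sample strings are stand-ins for values that would really come from table1 and table2.

```php
<?php
// Hypothetical helper: report the hex bytes of a string and whether they
// form valid UTF-8. With the /u modifier, preg_match() fails on invalid UTF-8.
function describeBytes($value)
{
    return array(
        'hex'        => bin2hex($value),
        'valid_utf8' => preg_match('//u', $value) === 1,
    );
}

// Stand-ins (assumptions) for the letter "å" as fetched from two columns.
$asUtf8   = "\xC3\xA5"; // å encoded as UTF-8 (two bytes)
$asCp1252 = "\xE5";     // å encoded in code page 1252 (one byte)

$a = describeBytes($asUtf8);
$b = describeBytes($asCp1252);

print_r($a);
print_r($b);
```

If the same address comes back as different byte sequences from the two tables, the conversion is happening somewhere between them rather than in the page output.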

Rhys Jones
  • Given that the O.P. only says that "the Danish letters look weird" and does not specifically say they are coming through as "?", it is not certain that the issue is a `VARCHAR` in the mix. It _could_ have something to do with SQL Server using UTF-16 Little Endian and PHP using UTF-8. Most likely it is a `VARCHAR` issue; we just can't rule out something else without more info from the O.P. Still, +1 since everything here is accurate. – Solomon Rutzky Sep 26 '15 at 20:56
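To illustrate the kind of mismatch the comments describe, here is a small sketch of what can happen when UTF-8 output from `json_decode` meets a single-byte code page. It assumes the PHP `iconv` extension is available. The sample address and the helper names `toCp1252` and `fromCp1252` are hypothetical, and whether this is what actually happens in script2 cannot be confirmed from the question.

```php
<?php
// json_decode() always returns UTF-8 strings. Hypothetical sample field.
$row  = json_decode('{"road":"R\u00e5dhuspladsen"}', true);
$utf8 = $row['road']; // "Rådhuspladsen" as UTF-8 bytes

// Hypothetical helpers wrapping iconv() for the two directions.
function toCp1252($utf8Text)
{
    return iconv('UTF-8', 'CP1252', $utf8Text);
}

function fromCp1252($cp1252Bytes)
{
    return iconv('CP1252', 'UTF-8', $cp1252Bytes);
}

// Explicit conversion before storing in a code page 1252 column, and back
// again before sending to a UTF-8 page: the text survives the round trip.
$stored    = toCp1252($utf8);
$roundTrip = fromCp1252($stored);

// Skipping the first conversion: the UTF-8 bytes are treated as if they
// were already code page 1252, so each byte of "å" becomes its own character.
$mojibake = fromCp1252($utf8);

echo $roundTrip, "\n"; // Rådhuspladsen
echo $mojibake, "\n";  // RÃ¥dhuspladsen
```

Output like `Ã¥` in place of `å` is a typical sign that UTF-8 bytes were interpreted as a single-byte code page somewhere along the way. A `?` usually points to a character that could not be represented in the target code page at all.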