
I have a Microsoft Access database that contains columns in the Punjabi language, written in the Gurmukhi (ਗੁਰਮੁਖੀ) script. When I read the database in MATLAB, the Punjabi words are displayed as ?????.

How can I read the data in correctly? Here is the code:

    slCharacterEncoding('UTF-8');

    setdbprefs('DataReturnFormat', 'cellarray');
    setdbprefs('NullNumberRead', 'NaN');
    setdbprefs('NullStringRead', 'null');

    % Make connection to database. Note that the password has been omitted.
    conn = database('Punjabi', '', '');

    % Read data from database.
    curs = exec(conn, ['SELECT DICWEB2.ID' ...
        ' , DICWEB2.gur' ...
        ' , DICWEB2.Meaning' ...
        ' , DICWEB2.Shah' ...
        ' , DICWEB2.Type' ...
        ' , DICWEB2.sFile' ...
        ' FROM  DICWEB2 ']);
    curs = fetch(curs);
    close(curs);

    % Assign data to output variable.
    Pun1 = curs.Data;

    % Close database connection.
    close(conn);

    % Clear variables.
    clear curs conn

Here is a snapshot of the problem:

das-g
  • What if you run this command before loading the database: `feature('DefaultCharacterSet', 'UTF8')`? – Benoit_11 May 05 '15 at 19:20
  • Do you know the encoding used in the data? Is it one of the Unicode encodings, possibly `UTF8` or `UTF16`? Or is it 8-bit ASCII with a Punjabi-specific code page? – das-g May 05 '15 at 19:25
  • I don't know whether it matters for the solution. (It might if code pages are involved.) ... Is the data in the [Gurmukhi (ਗੁਰਮੁਖੀ) alphabet or the Shahmukhi (شاہ مکھی) alphabet](http://www.omniglot.com/writing/punjabi.htm)? (Sorry, my browser seems to drop the RTL control characters when pasting.) – das-g May 05 '15 at 19:30
  • @das-g The data is in Gurmukhi format. – Harsimran Singh Dhanju May 05 '15 at 20:49
  • Uh, **edit** your question and put the code there, please. In a comment it's not really legible, as all line breaks are lost. – das-g May 05 '15 at 20:56
  • Thanks. :-) I've marked it up as a code block for you. (Review pending.) – das-g May 06 '15 at 12:21
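
To answer the encoding question raised in the comments, it may help to look at the code points MATLAB actually received instead of how they are displayed. The sketch below is only an illustration: `isGurmukhi` is a hypothetical helper name, and the range checked is the Unicode Gurmukhi block (U+0A00 to U+0A7F). The sample strings are built with `char` so the snippet does not depend on the encoding of the script file.

```matlab
% Hypothetical helper: true if any character lies in the Unicode
% Gurmukhi block (U+0A00..U+0A7F).
isGurmukhi = @(s) any(double(s) >= hex2dec('0A00') & double(s) <= hex2dec('0A7F'));

% The word "Gurmukhi" in Gurmukhi script, built from its code points.
goodSample = char([2583 2625 2608 2606 2625 2582 2624]);

% What the question reports seeing: literal question marks (code point 63).
badSample = '?????';

readOk   = isGurmukhi(goodSample);   % logical true
readLost = isGurmukhi(badSample);    % logical false

% Literal question marks mean the text was already replaced during the read,
% so changing the display font alone would not bring it back.
allQuestionMarks = all(double(badSample) == double('?'));   % logical true
```

With the data from the question, something like `isGurmukhi(Pun1{1, 2})` (assuming column 2 holds `DICWEB2.gur`) would show whether the text survived the read.
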

1 Answer


There is a similar problem with the Persian language. The fix is simple:

  1. Go to Control Panel > Region > Administrative and, under "Language for non-Unicode programs", click "Change system locale..."
  2. Select "Beta: Use Unicode UTF-8 for worldwide language support"
  3. Run `feature('DefaultCharacterSet', 'UTF-8');` in MATLAB
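
As a rough sketch, step 3 could be placed at the top of the script from the question, before the connection is opened. Note that `feature` is an undocumented MATLAB function, so its behavior may vary between releases. Whether this is enough also depends on the ODBC driver, which is an assumption here. The data source name `'Punjabi'` and the table and column names are taken from the question.

```matlab
% Set MATLAB's default character set before touching the database.
% feature() is undocumented; treat this as a best-effort setting.
feature('DefaultCharacterSet', 'UTF-8');
disp(feature('DefaultCharacterSet'));    % show the character set now in effect

setdbprefs('DataReturnFormat', 'cellarray');

% ODBC data source from the question (password omitted there as well).
conn = database('Punjabi', '', '');
curs = exec(conn, 'SELECT DICWEB2.ID, DICWEB2.gur FROM DICWEB2');
curs = fetch(curs);
Pun1 = curs.Data;
close(curs);
close(conn);

% Inspect the code points of the first Gurmukhi value: values in the range
% 2560..2687 (U+0A00..U+0A7F) indicate Gurmukhi text, while 63 is a literal '?'.
disp(double(Pun1{1, 2}));
```

If the code points are still all 63, the characters are being lost before they reach MATLAB, and the driver or system locale settings from steps 1 and 2 are the next place to look.
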
Arash Hatami