Replacing Turkish characters with English characters

Question

I have a table which has 120 columns and some of them is including Turkish characters (for example "ç","ğ","ı","ö"). So i want to replace this Turkish characters with English characters (for example "c","g","i","o"). When i use "TRANWRD Function" it could be really hard because i should write the function 120 times and sometimes hte column names could be change so always i have to check the code one by one because of that.

Is there a simple macro which replaces this characters in all columns .

almost 100K but sometimes it could be more than a million rows — Doganay Yildiz, Jun 24 '16 at 08:31

Dominic Comtois · Accepted Answer · 2019-11-14T21:24:45.447

EDIT

In retrospect, this is an overly complicated solution... The translate() function should be used, as pointed by another user. It could be integrated in a SAS function defined with PROC FCMP when used repeatedly.

A combination of regular expressions and a DO loop can achieve that.

Step 1: Build a conversion table in the following manner

Accentuated letters that resolve to the same replacement character are put on a single line, separated by the | symbol.

data conversions;
  infile datalines dsd;
  input orig $ repl $;
  datalines;
ç,c
ğ,g
ı,l
ö|ò|ó,o
ë|è,e
;

Step 2: Store original and replacement strings in macro variables

proc sql noprint;
  select orig, repl, count(*)
    into :orig separated by ";",
         :repl separated by ";",
         :nrepl
    from conversions;
quit;

Step 3: Do the actual conversion

Just to show how it works, let's deal with just one column.

data convert(drop=i re);
  myString = "ç ğı òö ë, è";
  do i = 1 to &nrepl;
    re = prxparse("s/" || scan("&orig",i,";") || "/" || scan("&repl",i,";") || "/");
    myString = prxchange(re,-1,myString);
  end;
run;

Resulting myString: "c gl oo e, e"

To process all character columns, we use an array

Say your table is named mySource and you want all character variables to be processed; we'll create a vector called cols for that.

data convert(drop=i re);
  set mySource;
  array cols(*) _character_;
  do c = 1 to dim(cols);
    do i = 1 to &nrepl;
      re = prxparse("s/" || scan("&orig",i,";") || "/" || scan("&repl",i,";") || "/");
      cols(c) = prxchange(re,-1,cols(c));
    end;
  end;
run;

score 3 · Answer 2 · answered Jun 24 '16 at 10:37

3

When changing single characters TRANSLATE is the proper function, it will be one line of code.

translated = translate(string,"cgio","çğıö");

answered Jun 24 '16 at 10:37

data _null_

8,534
12
14

kl78 · Answer 3 · 2016-06-24T09:20:17.997

2

First get all your columns from dictionary, and then replace the values of all of them in a macro do loop.

You can try a program like this (Replace MYTABLE with your table name):

proc sql; 
select name , count(*) into :columns separated by ' ', :count
from dictionary.columns 
where memname = 'MYTABLE'; 
quit; 

%macro m;
data mytable;
set mytable;
%do i=1 %to &count;
%scan(&columns ,&i) = tranwrd(%scan(&columns ,&i),"ç","c");
%scan(&columns ,&i) = tranwrd(%scan(&columns ,&i),"ğ","g");
...
%end;

%mend;

%m;

edited Jun 24 '16 at 09:20

answered Jun 24 '16 at 09:14

kl78

1,628
1
16
26

i haven't tried yet but i will. I'm going to give a feedback – Doganay Yildiz Jun 24 '16 at 12:22

Replacing Turkish characters with English characters

3 Answers3