2

I am importing a dataset from excel using the built in import data wizard. However, when viewing the data in SAS, cells with newlines have all line feeds (alt+Enter) replaced with a period (.)

For example, in excel:
"Example text
with new line"

will be read in by SAS as:
"Example text.with new line"

Usually line feeds or carriage returns are replaced by spaces, where the hex code (if you format the text as hex) is 0A. When I convert the text in excel to hex in excel using a formula, the new line feeds also show up as 0A.

However, the hex code for the period in my text (what used to be a line return in excel) is 2E, rather than the expected 0A. This prevents me from differentiating them from normal full stops, which means there's no possible workaround. Has anyone else come across this issue? Is there an option to change/set the default line feed replacement character in SAS?

My import code (variables replaced with 'text' for simplicity) for reference:

data work.table;
    length
        text $ 50;
    label
        text = "Text"
    format
        text $CHAR50;
    informat
        text $CHAR50;
    infile 'path/to/file'
        lrecl=1000
        encoding='LATIN9'
        termstr=CRLF   
        dlm='7F'x
        missover
        dsd;
    input
        text $CHAR50;
run;
Pmmoks
  • 35
  • 3
  • 1
    I would not recommend using the Import Wizard; there are far better tools nowadays. It's possible it is corrupting the file, but if it is there's nothing we can do about it. Are you able to use `PROC IMPORT`? – Joe Jan 15 '21 at 05:36
  • And - just to clarify: this is a `.xls` or `.xlsx` file, and you're using the Base SAS Import Wizard, which generates code like the above? Or is it a delimited text file which you wrote the above code to read it in? – Joe Jan 15 '21 at 05:37
  • We used a csv file to import & export - never had an issue. – Solar Mike Jan 15 '21 at 06:24
  • Find out if the dot is realy a dot, i.e. `'2E'x` by printing it in `$HEX` format. – Dirk Horsten Jan 15 '21 at 06:28
  • Your import code is not reading an Excel file. Are you reading an export file made from Excel ? The wizard lets you save the import code that the wizard creates. Have you posted that code ? – Richard Jan 15 '21 at 18:11
  • What Import Wizard is it that you used? Are you using Enterprise Guide? SAS/Studio? Some other SAS tool? And what version of that interface tool are you running? – Tom Jan 15 '21 at 18:53
  • @DirkHorsten - Unfortunately, it really is a dot, printing in HEX format gives '2E'x – Pmmoks Jan 17 '21 at 21:49
  • @Joe I am using the SAS Enterprise Guide (v7.1) Import Wizard (to import a .xlsx file) which has generated code like the above. The way it has been set up at work means I can't use proc import (at least that's what I've been told before) – Pmmoks Jan 17 '21 at 22:16
  • 2
    That’s what I thought from the 7F delimiters. I think you’re just out of luck here - find a different option. Your work may need to license Access to PC Files. EG import wizard isn’t meant for serious work. – Joe Jan 18 '21 at 02:04
  • Can you upload the problem data file to https://pastebin.com/ ? – Richard Jan 18 '21 at 03:15
  • 1
    If you can't use `proc import`, that is usually because you run your code on a server and you don't have access to the folders on that server or visible from it. If you only need this once, they might be willing to put your file on the server for once. – Dirk Horsten Jan 18 '21 at 11:21
  • @DirkHorsten Certainly possible, but `PROC IMPORT` has to be licensed as well (while the import wizard doesn't). – Joe Jan 18 '21 at 17:47
  • Also - the Import Wizard will automatically upload the data to the server, even if you ask it to use the PC Files engine; I just verified that on my end (file saved to c:\, ran import, it correctly used PROC IMPORT, and had a fileref for a temporary file in my SASWORK folder). – Joe Jan 18 '21 at 17:49

2 Answers2

0

SAS Viewer will not render so called non-printables (characters <= '1F'x) and does not display carriage return characters as a line break.

Example:

Excel cell with two line breaks in the data value

enter image description here

Imported with

proc import datafile='sample.xlsx' out=work.have;
run;

and viewed in standard SAS data set viewer (viewtable) appear to have lost the new lines.

enter image description here

Rest assured they are still there.

proc print data=have;
  var text / style = [fontsize=14pt];
  format text $hex48.;
run;

enter image description here

Richard
  • 25,390
  • 3
  • 25
  • 38
0

I would not recommend using the Import Wizard; there are far better tools nowadays. EG's import wizard is unique in SAS tools in how it works, and really was meant only to supply a way for data analysts who were not programmers to quickly bring in data; it's not robust enough for production work.

In this case, what's happening is that SAS's method for reading the data in is very rudimentary. What it does is convert it to a delimited file, and it doesn't handle LF characters very cleanly there. Instead of keeping them, which would be possible but is riskier (remember, this has to work for any incoming file), what it does is convert those to periods.

You'll see that in the notes in the program it generates:

Some characters embedded within the spreadsheet data were translated to alternative characters so as to avoid transmission errors.

It's referring to the LF character in that case.

The only way to get around this that I'm aware of is to either:

  • Convert the file to CSV from Excel yourself, and then read it in
  • Use ACCESS to PC FILES (via PROC IMPORT, or the checkbox in the import wizard)

Either of those will allow you to read in your line feed characters.

Joe
  • 62,789
  • 6
  • 49
  • 67