
I cannot understand how to work with the following files:

  • a .sas file containing the location (column) and length of each variable (I have 500+); and
  • a text file containing values of these variables (I have more than 120k observations).

I already posted a question about this, and it seems that the only way to match variable names to their values is to do it manually. I have been trying to do that for days now and keep making mistakes.

Does anyone have an idea of how to solve this problem and automate the import with Python pandas?

The .sas file (as opened in WPS Workbench) appears as follows:

DATA YOUR_DATA;
INFILE 'C:\Users\...\file.txt';

 ***Column   VarName   Varlength      VarLabel***
    @18      PROFAM    6.             /*  progressivo famiglia univoco a livello indagine */
    @24      PROIND    2.             /*  progressivo individuo nell'ambito della famiglia */
    @29      NCOMP     2.             /*  n° dei componenti la famiglia attuale */

The line ***Column VarName Varlength VarLabel*** is not in the real code; I added it here to explain what each field means.
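One pandas-only route, offered as a sketch rather than a tested recipe: each layout line gives a 1-based start column (`@18`) and a width (`6.`), which is the information `pandas.read_fwf` takes as `colspecs`. The helper name `parse_sas_layout`, the regular expression, and the two sample records below are made up for illustration. The optional `$` (character variables) is also an assumption, since the excerpt only shows numeric widths.

```python
import io
import re

import pandas as pd

# Matches layout lines such as "    @18   PROFAM   6.   /* label */".
# Assumption: every variable line starts with "@<column> <name> <width>.".
LAYOUT_RE = re.compile(r"\s*@(\d+)\s+(\w+)\s+\$?(\d+)\.")


def parse_sas_layout(sas_text):
    """Return (names, colspecs) extracted from the INPUT lines of a SAS program."""
    names, colspecs = [], []
    for line in sas_text.splitlines():
        m = LAYOUT_RE.match(line)
        if m is None:
            continue  # skip DATA, INFILE, INPUT, blank lines, etc.
        start = int(m.group(1)) - 1  # SAS columns are 1-based, pandas is 0-based
        width = int(m.group(3))
        names.append(m.group(2))
        colspecs.append((start, start + width))  # half-open interval [start, end)
    return names, colspecs


# Layout lines copied from the question (labels shortened).
sas_text = """
    @18      PROFAM    6.   /* progressivo famiglia */
    @24      PROIND    2.   /* progressivo individuo */
    @29      NCOMP     2.   /* n. componenti */
"""

# Two invented fixed-width records: 17 filler characters, then the fields.
data = (
    "X" * 17 + "000123" + "01" + "   " + "04" + "\n"
    + "X" * 17 + "000456" + "02" + "   " + "02" + "\n"
)

names, colspecs = parse_sas_layout(sas_text)
df = pd.read_fwf(io.StringIO(data), colspecs=colspecs, names=names, header=None)
print(df)
```

With the real files, the text of the .sas program would go into `sas_text` and the path of the .txt file would replace the `StringIO` object. Passing `dtype=str` to `read_fwf` would keep leading zeros if the codes are identifiers rather than numbers.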

LuizZ
Luca
  • Can you share an example of what you need? – Divyaansh Bajpai Nov 06 '20 at 18:32
  • Put an example of the SAS code in the question, and I'd suggest showing an example of the code you want to write (the stuff you wrote by hand) to show what you'd want to generate. – Joe Nov 06 '20 at 20:47
  • @Joe I edited the question to add the beginning of the SAS file. I have no idea how to code in SAS, so I cannot share anything I wrote. The only thing I want to generate is a dataset (easily accessible in pandas) with values stored in columns and the variable names at the top. – Luca Nov 08 '20 at 12:44

1 Answer

One way you can do it is:

  1. Download the WPS Workbench Community edition. WPS Workbench is software that can open and run .sas files. If you are enrolled at a college or university, you can also download the academic edition (which is free of ads and has extra features). After you install and set it up, open it.

  2. In WPS, open the .sas program by clicking File -> Open File and selecting your .sas file. Then make the first three lines look like this:

DATA YOUR_DATA;
INFILE 'C:\Users\...\path_to_your_file\YOURDATA.TXT' LRECL=978 MISSOVER;
INPUT
 @18  PROFAM  6.  /*  progressivo famiglia univoco a livello indagine */
 @24  PROIND  2.  /*  progressivo individuo nell'ambito della famiglia */

Just choose the name you want for your data in the first line and put the correct path to your .txt file in the second line.

2.1 Correct the code by erasing the line ***Column VarName Varlength VarLabel***, if it appears between the INPUT statement and the first variable line.

2.2 Check the two lines after the last variable. The program should end with something like this:

@235 NCOMPL  8. /*  n° rendimento la famiglia attuale */
;
run;

If it does not, add the last two lines above.

  3. Write the syntax to export your data to a .csv file. Add the following code at the end of the .sas program file:

proc export data=YOUR_DATA
     outfile="c:\myfiles\YourData.csv"
     dbms=csv 
     replace;
run;

Make sure to use the data set name you chose (YOUR_DATA here) in the first line, and the path where you want the .csv file to be saved in the second line.

Press Ctrl + R to run the code, or click the run button. Click OK in the Save and run WPS dialog box.

  4. Finally, open the .csv in pandas.
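The last step might look like the sketch below. It assumes the export wrote the variable names as the first row of the file. The in-memory text and its values are invented so the snippet runs on its own; they stand in for the real exported file.

```python
import io

import pandas as pd

# Stand-in for the exported file so the sketch is self-contained.
# In practice, pass the path used in outfile= instead, for example:
#     df = pd.read_csv(r"c:\myfiles\YourData.csv")
csv_text = "PROFAM,PROIND,NCOMP\n123,1,4\n456,2,2\n"

df = pd.read_csv(io.StringIO(csv_text))

print(df.head())    # first rows, with variable names as column headers
print(df.dtypes)    # check that each column was read with a sensible type
```

With 500+ variables it may be worth checking `df.shape` against the expected number of variables and observations before going further.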
LuizZ
  • Thanks @LuizZ! I still have some problems though. I did everything you said but ran into this error: `ERROR: Found "@" when expecting ; ERROR: Found "@" when expecting a statement ERROR: Found "18" when expecting a statement ERROR: The statement "PROFAM" is unknown in this context` I edited the question to add the beginning of the SAS file so that you can have a better view of my issue. – Luca Nov 08 '20 at 12:33
  • @Luca, it is probably just a missing semicolon `;`. Check whether the first and second lines each end with a semicolon. If one does not, add it. – LuizZ Nov 08 '20 at 18:02
  • Thanks again. I did what you said: I added a semicolon and updated the first two lines as follows: `DATA YOUR_DATA; INFILE 'C:\Users\file path\file_name.txt' LRECL=978 MISSOVER;` The error changed but is still there: `ERROR: Found "@" when expecting a statement ERROR: Found "18" when expecting a statement ERROR: The statement "PROFAM" is unknown in this context`. I am sorry, but I really have no idea how SAS works. – Luca Nov 08 '20 at 18:07
  • OK, just saw your edit. You are missing an INPUT statement too; I'm going to add it to my answer. – LuizZ Nov 08 '20 at 20:50
  • In Italian we say Grazie mille (a thousand thanks)! You did solve my problem, cheers! – Luca Nov 08 '20 at 21:34