I have 1000 folders in a directory, named folder1, folder10, folder25, ..., folder10200. Each of these 1000 folders contains 100 CSV files named file1, file2, ..., file100. I need to create a single dataset with derived variables computed from the values in file1, file2, etc. of every folder, ending up with one observation per file per folder, so 100 * 1000 = 100,000 rows in total. Please suggest an approach.

Joe
  • Are your folder names in some kind of arithmetic progression - 1, 10, 25 ... 10200? – in_user Dec 26 '14 at 08:58
  • You haven't provided enough information to answer your question. Do all files have the same structure? What is your OS, what is your folder naming convention, and what single observation per file do you need? – Reeza Dec 26 '14 at 10:44

1 Answer

I think the code is self-explanatory; if anything is unclear, let me know.

Assumptions:

  • Folder names follow the pattern Folder1, Folder2, Folder3, etc. (you mention otherwise, but the actual pattern is not clear; it looks like an arithmetic progression).
  • File names follow the pattern file1, file2, file3, etc.
  • All files have exactly the same structure.

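%macro read_all_files();

%do i=1 %to 1000;
   %do j=1 %to 100;
      /* Read one CSV file; dsd treats commas as delimiters */
      data temp;
         infile "\path\folder&i.\file&j..csv" dsd truncover;
         input var1 var2;
      run;

      /* Stack the rows just read onto the accumulated dataset */
      proc append base=final data=temp force;
      run;

   %end;
%end;
%mend read_all_files;

%read_all_files;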
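
The macro above assumes consecutive folder numbers and appends every row of every file. If the folder numbers have gaps and only one observation per file is wanted, a variation along the following lines might work. This is only a sketch under stated assumptions: the macro name summarize_all_files, its parameters, the output variables (folder_name, file_name, n_rows, sum_var1) and the choice of a row count and a sum as the "derived variables" are all hypothetical, since the question does not say what should be derived. It also assumes the files have no header row (otherwise firstobs=2 would be needed on the infile statement) and that the path contains no commas or parentheses.

```sas
/* Sketch only: names, parameters and summary statistics are assumptions */
%macro summarize_all_files(root=\path, max_folder=10200, n_files=100);
   %local i j dir csv;
   %do i = 1 %to &max_folder;
      %let dir = &root.\folder&i;
      /* Skip folder numbers that do not exist on disk */
      %if %sysfunc(fileexist(&dir)) %then %do;
         %do j = 1 %to &n_files;
            %let csv = &dir.\file&j..csv;
            %if %sysfunc(fileexist(&csv)) %then %do;
               data one_row;
                  length folder_name file_name $ 32;
                  infile "&csv" dsd truncover end=last;
                  input var1 var2;
                  n_rows + 1;        /* running row count for this file */
                  sum_var1 + var1;   /* running total of var1 for this file */
                  if last then do;
                     folder_name = "folder&i";
                     file_name   = "file&j";
                     output;         /* write a single observation per file */
                  end;
                  keep folder_name file_name n_rows sum_var1;
               run;

               /* Add this file's summary row to the combined dataset */
               proc append base=summary data=one_row force;
               run;
            %end;
         %end;
      %end;
   %end;
%mend summarize_all_files;

%summarize_all_files(root=\path);
```

Checking fileexist on the folder first avoids testing 100 file paths for every folder number that is not present; the summary dataset then holds one row per file that was actually found.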
in_user