I have 1000 folders in a directory, named folder1, folder10, folder25, ..., folder10200. Each of these 1000 folders contains 100 CSV files named file1, file2, ..., file100. I need to create a single dataset with derived variables computed from the values in file1, file2, etc. of every folder, ending up with one observation per file per folder, for a total of 100 * 1000 = 100,000 rows. Please suggest an approach.
-3
- Are your folder names in some kind of arithmetic progression (1, 10, 25, ..., 10200)? – in_user Dec 26 '14 at 08:58
- You haven't provided enough information to answer your question. Do all files have the same structure? What is your OS, what is your folder naming convention, and what one observation per file do you need? – Reeza Dec 26 '14 at 10:44
1 Answer
2
I think the code is self-explanatory; if you don't get it, let me know.
- Assumptions
- Folder names are like Folder1, Folder2, Folder3, etc. (you mentioned otherwise, but the actual pattern is not clear; it looks like an arithmetic progression, though)
- File names are like file1, file2, file3, etc.
- All the files have exactly the same structure
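%macro read_all_files();
    %do i=1 %to 1000;
        %do j=1 %to 100;
            /* Read one CSV file; dsd and dlm=',' parse comma-separated values */
            data temp;
                infile "\path\folder&i.\file&j..csv" dsd dlm=',' truncover;
                input var1 var2;
            run;
            /* Stack the rows from this file onto the combined dataset */
            proc append base=final data=temp force;
            run;
        %end;
    %end;
%mend read_all_files;
%read_all_files();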
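If the folder numbers are not consecutive (folder1, folder10, folder25, ...), one option is to loop over every candidate number and skip the files that are not on disk. The following is only a rough sketch under my own assumptions: the macro name `read_existing_files`, its parameters `root`, `max_folder` and `n_files`, the ID variables `folder_id` and `file_id`, and the derived variable `sum_var1` are all made up for illustration, and `\path` is still a placeholder. It also collapses each file to a single row, which is what the question asks for. If the files have a header row, `firstobs=2` on the `infile` statement would probably be needed as well.

```sas
%macro read_existing_files(root=\path, max_folder=10200, n_files=100);
    %local i j csv;
    %do i = 1 %to &max_folder;
        %do j = 1 %to &n_files;
            %let csv = &root.\folder&i.\file&j..csv;
            /* fileexist returns 1 only when the file is on disk, so
               folder numbers that do not exist are skipped */
            %if %sysfunc(fileexist(&csv)) %then %do;
                data temp;
                    infile "&csv" dsd dlm=',' truncover end=last;
                    input var1 var2;
                    folder_id = &i;   /* hypothetical ID variables */
                    file_id   = &j;
                    sum_var1 + var1;  /* hypothetical derived variable: running sum of var1 */
                    if last then output;  /* write one row per file */
                    keep folder_id file_id sum_var1;
                run;
                proc append base=final data=temp force;
                run;
            %end;
        %end;
    %end;
%mend read_existing_files;

%read_existing_files(root=\path, max_folder=10200, n_files=100);
```

Replace the `sum_var1` line with whatever derived variables you actually need. Note that `final` keeps growing across runs, so delete it before re-running the macro.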

in_user