
I'm currently using the following snippet to go through millions of files in a very large directory and copy the ones I need into another working directory. sNos is an int[] that holds some integers. I check whether the filename contains one of these integers; if it does, the file is copied into my local directory.

string[] allFiles = System.IO.Directory.GetFiles(@"C:\ExampleFolder");
foreach (string file in allFiles)
{
    for (int i = 0; i < sNos.Count(); i++)
    {
        if (file.Contains(sNos[i].ToString()))
        {
            File.Copy(file, "C:\\newFolder\\" + file.Substring(file.Length - 25), true);
        }
    }
}

To be specific, the filenames are in the format XXXXXX_XX_XX_XX_XXX_XX, where each X denotes a digit. The first six digits of the filename are what I try to match against the values in my int array. The problem is that there can be files with names like:

123456_33_42_56_234_44 (Size: 1 MB)
123456_33_46_34_992_23 (Size: 2 MB)

Since both files match "123456" in my int array, both will be copied. However, I only want the largest file to be copied whenever several files match the same value. There can be two matching files, maybe three, or even more. How can I go about doing this? Any help will be appreciated!

sparta93
  • Create a dictionary with the int as the key and the file name as the value. If you find a match again, compare the file sizes and replace the entry accordingly. Then go through the dictionary and copy the files. – R Quijano Jul 24 '15 at 13:05
  • Are sNos always six digits long? Or might there be shorter ints in there that are zero-padded in the filenames? – Jerry Federspiel Jul 24 '15 at 13:28
  • @JerryFederspiel They are always 6 digits long and not zero padded – sparta93 Jul 24 '15 at 13:32

2 Answers

// Requires: using System.IO; using System.Linq;
var stringIds = sNos.Select(id => id.ToString()).ToList();

// Get the matching files and group them by id.
// GetFiles returns full paths, so match against the file name only.
var filesById = (from file in allFiles
                 let name = Path.GetFileName(file)
                 let id = stringIds.FirstOrDefault(n => name.StartsWith(n))
                 where id != null
                 select new { file, id }).ToLookup(anon => anon.id, anon => anon.file);

// Within each group, get the biggest file.
var onlyBiggestById = filesById.ToDictionary(
                          fileGroup => fileGroup.Key,
                          fileGroup => fileGroup.Select(file => new { file, length = new FileInfo(file).Length })
                                                .OrderByDescending(anon => anon.length)
                                                .Select(anon => anon.file)
                                                .First());

// Actually copy the files.
onlyBiggestById.Values.ToList()
    .ForEach(file => File.Copy(file, "C:\\newFolder\\" + file.Substring(file.Length - 25), true));
Jerry Federspiel

Use FileInfo to get the size of each file, then apply whatever conditions your requirements call for.

// Create a new FileInfo object and read its Length (the size in bytes).
FileInfo f = new FileInfo(file);
long s1 = f.Length;
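
As a rough illustration of how the FileInfo size check could be combined with the dictionary idea from the comments, here is a minimal single-pass sketch. The class and method names (LargestFilePicker, PickLargest) are made up for this example, and it assumes, as stated in the comments on the question, that every id is exactly six digits and sits at the start of the file name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of any library.
static class LargestFilePicker
{
    // For each wanted id, returns the largest file whose name starts with that id.
    // Assumption: ids are always six digits long and are not zero-padded.
    public static Dictionary<string, FileInfo> PickLargest(string sourceDir, IEnumerable<int> ids)
    {
        var wanted = new HashSet<string>();
        foreach (int id in ids)
        {
            wanted.Add(id.ToString());
        }

        var largest = new Dictionary<string, FileInfo>();

        // EnumerateFiles streams entries instead of building one huge array up front.
        foreach (FileInfo info in new DirectoryInfo(sourceDir).EnumerateFiles())
        {
            if (info.Name.Length < 6)
            {
                continue;
            }

            string prefix = info.Name.Substring(0, 6);
            if (!wanted.Contains(prefix))
            {
                continue;
            }

            // Keep this file if it is the first match for the id or bigger than the current best.
            if (!largest.TryGetValue(prefix, out FileInfo best) || info.Length > best.Length)
            {
                largest[prefix] = info;
            }
        }

        return largest;
    }
}
```

Each value in the returned dictionary could then be copied with something like info.CopyTo(Path.Combine(@"C:\newFolder", info.Name), true). Looking up the six-character prefix in a HashSet avoids looping over sNos for every file, which may matter with millions of files.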