
I have one directory that contains all files along with their different versions, for example:

ABC.pdf ABC_1.pdf .......

XYZ.tif ..... XYZ_25.tif

MNO.tiff

I want to make n batches of m files each, as per the user's requirement.

Suppose the folder contains ABC.pdf to ABC_24.pdf and XYZ.tif to XYZ_24.tif, 50 files in total. I want to create two batches of 25 files each. So first I need to make sure that all files in my list are sorted (do I need to, and how?), and then I can apply some logic to divide the list into two proper batches:

1) ABC.pdf to ABC_24.pdf

2) XYZ.tif to XYZ_24.tif

But if I have 26 files of each name (ABC.pdf to ABC_25.pdf and XYZ.tif to XYZ_25.tif, 52 files in total), then it would be:

1) ABC.pdf to ABC_24.pdf

2) XYZ.tif to XYZ_24.tif

3) ABC_25.pdf and XYZ_25.tif

So I want a proper, meaningful allocation of files to batches. I would prefer to do this in as few lines as possible, so I tried a lambda expression as below:

    List<string> strIPFiles = Directory.GetFiles(folderPath, "*.*")
        .Where(file => file.ToLower().EndsWith("tiff") || file.ToLower().EndsWith("tif") || file.ToLower().EndsWith("pdf"))
        .ToList();

    int batches = 2, filesPerBatch = 25; // for example

Do I need to call strIPFiles.Sort()? Will it be useful in any way, or will I always get a sorted list of files?

How can I create batches from the list using a lambda expression like the one above?

Thanks for your help.

sapatelbaps
  • You won't always get the files in sorted order. You can use the `OrderBy` extension method to sort, or you can call `List.Sort`. – Jim Mischel Feb 16 '14 at 23:53
  • Could you define what batch allocation you consider "proper" and "meaningful"? – Andrew Savinykh Feb 17 '14 at 03:50
  • @zespri, please check the two batch creation examples. The first is 25-25 and the second is 25-25-2. In the first example there are 50 files in total; note that ABC and ABC_n (1<=n<=24) are in the first batch, and likewise XYZ and XYZ_n (1<=n<=24) are in the second batch. In the second example, where the number of files is 52, ABC_25 and XYZ_25 form the third batch. "Meaningful" means that, whenever possible, the different versions of a file go into the same batch and are grouped together as much as possible, as in the second example. Hope it makes sense now. Thanks. – sapatelbaps Feb 17 '14 at 03:57

1 Answer


I'm not sure I entirely understand your question. I assume you are looking for a way to divide files into batches of a specified size (number of files), and that you also want them grouped by file name.

Let me know if this is helpful:

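    // Requires: using System.Collections.Generic; using System.IO;
    //           using System.Linq; using System.Text.RegularExpressions;
    public static List<List<string>> CreateBatch(int batchSize)
    {
        string sourcePath = @"C:\Users\hari\Desktop\test";

        var pdfs = Directory.EnumerateFiles(sourcePath, "*.pdf", SearchOption.TopDirectoryOnly);
        var tifs = Directory.EnumerateFiles(sourcePath, "*.tif", SearchOption.TopDirectoryOnly);
        var tiffs = Directory.EnumerateFiles(sourcePath, "*.tiff", SearchOption.TopDirectoryOnly);

        // Union drops duplicate paths if two patterns match the same file.
        var images = pdfs.Union(tifs).Union(tiffs);

        // Group by base name: "ABC" and "ABC_7" both land in group "ABC".
        var imageGroups = from image in images
                          group image by Regex.Replace(Path.GetFileNameWithoutExtension(image), @"_\d+$", "") into g
                          select new { GroupName = g.Key, Files = g.OrderBy(s => s) };

        var batches = new List<List<string>>();
        var batch = new List<string>();

        foreach (var group in imageGroups)
        {
            // Keep every version of a file in the same batch.
            batch.AddRange(group.Files);

            // A batch is closed once it holds at least batchSize files,
            // so it can end up larger than batchSize.
            if (batch.Count >= batchSize)
            {
                batches.Add(batch);
                batch = new List<string>();
            }
        }

        // Keep the final, partially filled batch.
        if (batch.Count > 0)
        {
            batches.Add(batch);
        }

        return batches;
    }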
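
If you need the stricter 25-25-2 split from your comment (no batch larger than the requested size), something along these lines might work. It is only a sketch under a few assumptions of mine: `FileBatcher` and `CreateBatches` are names I made up, a file without a `_n` suffix is treated as version 0, versions are sorted numerically (so `ABC_2` comes before `ABC_10`), and whatever is left over from each group after its full batches are cut is simply pooled and chunked in order, so a small group can still straddle two leftover batches.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class FileBatcher
{
    // Splits "ABC_12" into base name "ABC" and version 12.
    // A name without a "_<digits>" suffix is treated as version 0 (an assumption).
    private static readonly Regex VersionedName =
        new Regex(@"^(?<name>.+?)(?:_(?<ver>\d+))?$", RegexOptions.Compiled);

    public static List<List<string>> CreateBatches(IEnumerable<string> files, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        // Parse each path once, then group by base name (case-insensitive).
        var groups = files
            .Select(path =>
            {
                Match m = VersionedName.Match(Path.GetFileNameWithoutExtension(path));
                int version = m.Groups["ver"].Success ? int.Parse(m.Groups["ver"].Value) : 0;
                return new { FullPath = path, Name = m.Groups["name"].Value, Version = version };
            })
            .GroupBy(f => f.Name, StringComparer.OrdinalIgnoreCase)
            .OrderBy(g => g.Key, StringComparer.OrdinalIgnoreCase);

        var batches = new List<List<string>>();
        var leftovers = new List<string>();

        foreach (var group in groups)
        {
            // Numeric version order, with the path as a tie-breaker.
            List<string> ordered = group
                .OrderBy(f => f.Version)
                .ThenBy(f => f.FullPath, StringComparer.OrdinalIgnoreCase)
                .Select(f => f.FullPath)
                .ToList();

            // Cut as many full batches as this group can fill on its own.
            int fullCount = ordered.Count / batchSize * batchSize;
            for (int i = 0; i < fullCount; i += batchSize)
            {
                batches.Add(ordered.GetRange(i, batchSize));
            }

            // The remainder is pooled with the remainders of the other groups.
            leftovers.AddRange(ordered.Skip(fullCount));
        }

        // Chunk the pooled remainders; only the last chunk may be short.
        for (int i = 0; i < leftovers.Count; i += batchSize)
        {
            batches.Add(leftovers.GetRange(i, Math.Min(batchSize, leftovers.Count - i)));
        }

        return batches;
    }
}
```

You would pass it your already filtered list, e.g. `FileBatcher.CreateBatches(strIPFiles, filesPerBatch)`. Because it sorts internally, it does not matter in which order `Directory.GetFiles` returned the files.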
Ravi M Patel