
Given a directory tree whose root folder is named “rootuploaded”, I need to combine the files in this tree into groups, using the rules below:

  • Files in different subfolders cannot be grouped together.
  • Files in a group may have the same or different extensions.
  • Each group must have a minimum of 2 files and a maximum of 5 files.
  • Files are grouped according to 6 naming conventions (in top-down priority order):

    1. FileName.ext, FileName_anything.ext, FileName_anythingelse.ext, ...
    2. FileName.ext, FileName-anything.ext, FileName-anythingelse.ext, ...
    3. FileName_1.ext, FileName_2.ext, ..., FileName_N.ext (numbers need not be consecutive)
    4. FileName-1.ext, FileName-2.ext, ..., FileName-N.ext (numbers need not be consecutive)
    5. FileName 1.ext, FileName 2.ext, ..., FileName N.ext (numbers need not be consecutive)
    6. FileName1.ext, FileName2.ext, ..., FileNameN.ext (numbers need not be consecutive)

Here is a simple example of the program's input and output:

The input is a directory tree that looks like this:

  • rootuploaded\samplefolder\1232_2342.jpg
  • rootuploaded\samplefolder\1232_234234_1.jpg
  • rootuploaded\samplefolder\1232_234234_2.bmp
  • rootuploaded\file-5.txt
  • rootuploaded\file-67.txt
  • rootuploaded\file-a.txt
  • rootuploaded\file1.txt
  • rootuploaded\file2.txt
  • rootuploaded\file5.txt
  • rootuploaded\filea.txt
  • rootuploaded\file_sample.txt
  • rootuploaded\file_sample_a.txt

Output:

Group 1

rootuploaded\samplefolder\1232_234234_1.jpg
rootuploaded\samplefolder\1232_234234_2.bmp

Group 2

rootuploaded\file1.txt
rootuploaded\file2.txt
rootuploaded\file5.txt

Group 3

rootuploaded\file-5.txt
rootuploaded\file-67.txt

Group 4

rootuploaded\file_sample.txt
rootuploaded\file_sample_a.txt

Cannot be grouped

rootuploaded\samplefolder\1232_2342.jpg
rootuploaded\file-a.txt
rootuploaded\filea.txt
Alexei Levenkov
Tien Nguyen

2 Answers


Use regular expressions to derive a grouping key for each file. LINQ's GroupBy and Take methods may help with the rest.
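
As a rough illustration of that suggestion, here is a minimal sketch that covers only the numbered conventions (3 to 6). The class name `NumberedGrouping`, the single combined pattern, and the choice to leave a leftover chunk smaller than two files ungrouped are all assumptions made for this example; they are not part of the question. The folder is split off by hand so that the Windows-style paths from the question behave the same on any platform.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumberedGrouping
{
    public const int MinSize = 2;
    public const int MaxSize = 5;

    // Hypothetical pattern for conventions 3-6 only: a stem, an optional
    // separator ('_', '-' or space), then a trailing number.
    private static readonly Regex Numbered =
        new Regex(@"^(?<stem>.+?)(?<sep>[_\- ]?)\d+$");

    public static List<List<string>> Group(IEnumerable<string> paths)
    {
        var buckets = paths
            .Select(p =>
            {
                // Split on either separator so Windows-style sample paths
                // are handled identically on every platform.
                int cut = p.LastIndexOfAny(new[] { '\\', '/' });
                string folder = cut < 0 ? "" : p.Substring(0, cut);
                string name = Path.GetFileNameWithoutExtension(p.Substring(cut + 1));
                return new { FullPath = p, Folder = folder, Match = Numbered.Match(name) };
            })
            .Where(x => x.Match.Success)
            // Same folder + same stem + same separator => same bucket.
            .GroupBy(x => new
            {
                x.Folder,
                Stem = x.Match.Groups["stem"].Value,
                Sep = x.Match.Groups["sep"].Value
            });

        var groups = new List<List<string>>();
        foreach (var bucket in buckets)
        {
            List<string> files = bucket.Select(x => x.FullPath).ToList();

            // Cut each bucket into chunks of at most MaxSize files; a leftover
            // chunk smaller than MinSize stays ungrouped (an assumption).
            for (int i = 0; i < files.Count; i += MaxSize)
            {
                List<string> chunk = files.Skip(i).Take(MaxSize).ToList();
                if (chunk.Count >= MinSize)
                    groups.Add(chunk);
            }
        }
        return groups;
    }
}
```

On the sample input from the question, `NumberedGrouping.Group` should return three groups (the two `1232_234234_*` files, `file-5`/`file-67`, and `file1`/`file2`/`file5`). Names that follow conventions 1 and 2, such as `file_sample.txt`, are not handled by this sketch.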

Robert S.
  • Thanks for the reply! I've found a similar topic: http://stackoverflow.com/questions/16051029/find-files-with-same-names-but-different-extensions-in-a-folder but I don't know how to apply regex to my case (naming conventions with top-down priority). Could you help by providing some example code? – Tien Nguyen May 25 '15 at 18:13

Here is some example code that does not use LINQ. CreateGroups returns a list of lists. Each inner list corresponds to one regular expression, and its items are the inputs that matched that expression.

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public class Test
{
    public string[] TestInput = new string[]
    {
        @"rootuploaded\samplefolder\1232_234234_1.jpg",
        @"rootuploaded\samplefolder\1232_234234_2.bmp",
        @"rootuploaded\file-5.txt",
        @"rootuploaded\file-67.txt",
        @"rootuploaded\file-a.txt",
        @"rootuploaded\file1.txt",
        @"rootuploaded\file2.txt",
        @"rootuploaded\file5.txt",
        @"rootuploaded\filea.txt",
        @"rootuploaded\file_sample.txt",
        @"rootuploaded\file_sample_a.txt"
    };

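    // One pattern per naming convention, in priority order (highest first).
    // Only letter-based names are covered: a name such as "1232_234234_1"
    // matches none of these patterns and is left out of the output.
    public string[] RegularExpressions = new string[]
    {
        "[A-Za-z_]+",
        "[A-Za-z-]+",
        "[A-Za-z]+_[0-9]+",
        "[A-Za-z]+-[0-9]+",
        "[A-Za-z]+ [0-9]+",
        "[A-Za-z]+[0-9]+"
    };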

    public List<List<string>> CreateGroups(string[] inputs)
    {
        List<List<string>> output = new List<List<string>>(RegularExpressions.Length);

        for (int i = 0; i < RegularExpressions.Length; ++i)
            output.Add(new List<string>());

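        foreach (string input in inputs)
        {
            // Match against the base name only: no directory, no extension.
            string filename = Path.GetFileNameWithoutExtension(input);

            for (int i = 0; i < RegularExpressions.Length; ++i)
            {
                Match match = Regex.Match(filename, RegularExpressions[i]);

                // Accept only a match that spans the whole name, then stop at
                // the first (highest-priority) pattern that fits.
                if (match.Success && match.Length == filename.Length)
                {
                    output[i].Add(input);
                    break;
                }
            }
        }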

        return output;
    }
}

Example for displaying the results:

Test test = new Test();
var output = test.CreateGroups(test.TestInput);

foreach (List<string> list in output)
{
    string group = "Group:\r\n";

    foreach (string item in list)
        group += "\t" + item + "\r\n";

    Console.WriteLine(group);
}
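
The patterns above classify names by shape only. The expected output in the question suggests that conventions 1 and 2 also require the bare FileName.ext to exist in the same folder (file-a.txt stays ungrouped because there is no file.txt). A minimal sketch of that check follows, under several assumptions that are not in the original post: the names `BaseNameGrouping`, `FolderOf` and `NameOf` are hypothetical, the shortest matching base name wins, comparison is case-sensitive, and at most four companions are taken per base file to respect the five-file limit.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class BaseNameGrouping
{
    public const int MaxSize = 5;

    // Conventions 1 and 2 only: FileName.ext plus FileName_anything.ext
    // (tried first) or FileName-anything.ext.
    public static List<List<string>> Group(IEnumerable<string> paths)
    {
        var groups = new List<List<string>>();
        var used = new HashSet<string>();

        // Files in different folders are never grouped together.
        foreach (var folder in paths.GroupBy(p => FolderOf(p)))
        {
            List<string> files = folder.ToList();

            foreach (char sep in new[] { '_', '-' })   // priority order
            {
                // Shortest names first, so the shortest base claims its files.
                foreach (string anchor in files.OrderBy(f => NameOf(f).Length))
                {
                    if (used.Contains(anchor)) continue;

                    string prefix = NameOf(anchor) + sep;
                    List<string> members = files
                        .Where(f => !used.Contains(f) && f != anchor &&
                                    NameOf(f).StartsWith(prefix, StringComparison.Ordinal))
                        .Take(MaxSize - 1)             // anchor + 4 = 5 files
                        .ToList();

                    if (members.Count == 0) continue;  // a group needs 2+ files

                    members.Insert(0, anchor);
                    groups.Add(members);
                    used.UnionWith(members);
                }
            }
        }
        return groups;
    }

    // Folder part of a path, accepting '\' or '/' on any platform.
    private static string FolderOf(string p)
    {
        int cut = p.LastIndexOfAny(new[] { '\\', '/' });
        return cut < 0 ? "" : p.Substring(0, cut);
    }

    // File name without folder and without extension.
    private static string NameOf(string p)
    {
        int cut = p.LastIndexOfAny(new[] { '\\', '/' });
        return Path.GetFileNameWithoutExtension(p.Substring(cut + 1));
    }
}
```

On the sample input, `BaseNameGrouping.Group` should return a single group containing file_sample.txt and file_sample_a.txt. Files that share a base name but differ only in extension are not handled by this sketch.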
Robert S.