I'm working on a problem related to nested groups. I need to determine all the groups a group is a member of and all the groups that are its members: not just immediate parents and children, but everyone in the hierarchy, up and down.

What I have so far is the top-down traversal logic, i.e. a DFS that uses a stack to store the nodes not yet visited and a HashSet to store the visited nodes. This is so that, if there are any cycles in the resulting graph, we won't loop forever.

static HashSet<string> visited = new HashSet<string>();
static Stack<string> notVisited = new Stack<string>();

static Dictionary<string, HashSet<string>> groupMembers = new Dictionary<string, HashSet<string>>
    {
        {  "G4", new HashSet<string> { "G5","G6","U1","U2"} },
        {  "G1", new HashSet<string> { "G2","G3","G6"} },
        {  "G3", new HashSet<string> { "G4"} },
        {  "G2", new HashSet<string> { "G4","G1"} },
        {  "G5", new HashSet<string> { "G2"} },
        {  "G6", new HashSet<string> { "U2","G5"} },
    };

static void Init(string start)
{
    notVisited.Push(start);
    while (notVisited.Count > 0)
    {
        string next = notVisited.Pop();
        HashSet<string> found;
        if (visited.Add(next) && groupMembers.TryGetValue(next, out found))
        {
            foreach (string member in found)
            {
                notVisited.Push(member);
            }
        }
    }
}

This part works. What I'm having trouble with is figuring out how to store the parents and children for each group during traversal. Remember that a group can have other groups or users as members, and we don't want to store duplicate information.

The output should be a list of groups, where a group looks as follows:

private class MyGroup
{
    public string Identity { get; set; }

    public HashSet<string> MemberOf { get; set; }

    public HashSet<string> Members { get; set; }

    public HashSet<string> Users { get; set; }
}
user330612

1 Answer

Not sure I'm understanding correctly because it seems like you've already solved this with your visited HashSet. Just make it a local variable in Init and return that as the result of the operation:

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{

    static Dictionary<string, HashSet<string>> groupMembers = new Dictionary<string, HashSet<string>>
    {
        {  "G4", new HashSet<string> { "G5","G6","U1","U2"} },
        {  "G1", new HashSet<string> { "G2","G3","G6"} },
        {  "G3", new HashSet<string> { "G4"} },
        {  "G2", new HashSet<string> { "G4","G1"} },
        {  "G5", new HashSet<string> { "G2"} },
        {  "G6", new HashSet<string> { "U2","G5"} },
    };

    static void Main()
    {
        foreach(string start in groupMembers.Keys)
        {
            HashSet<string> result = Init(start);
            Console.WriteLine("Start @ " + start + ": " + String.Join(", ", result.ToArray()));
        }

        Console.Write("Press Enter to Quit");
        Console.ReadLine();
    }

    static HashSet<string> Init(string start)
    {
        HashSet<string> visited = new HashSet<string>();
        Stack<string> notVisited = new Stack<string>();

        notVisited.Push(start);
        while (notVisited.Count > 0)
        {
            string next = notVisited.Pop();
            HashSet<string> children;
            if (visited.Add(next) && groupMembers.TryGetValue(next, out children))
            {
                foreach (string member in children)
                {
                    notVisited.Push(member);
                }
            }
        }
        visited.Remove(start); // optionally remove "start" from the result?

        return visited;
    }

}

Output:

Start @ G4: U2, U1, G6, G5, G2, G1, G3
Start @ G1: G6, G5, G2, G4, U2, U1, G3
Start @ G3: G4, U2, U1, G6, G5, G2, G1
Start @ G2: G1, G6, G5, U2, G3, G4, U1
Start @ G5: G2, G1, G6, U2, G3, G4, U1
Start @ G6: G5, G2, G1, G3, G4, U2, U1
Press Enter to Quit

----- EDIT -----

Based on the new requirements, I ~think~ this is what you want:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

class Program
{

    private class MyGroup
    {
        public string Identity { get; set; }

        public HashSet<string> MemberOf { get; set; }

        public HashSet<string> Members { get; set; }

        public HashSet<string> Users { get; set; }

        public override string ToString()
        {
            StringBuilder sb = new StringBuilder();
            sb.AppendLine("Identity: " + Identity);
            sb.AppendLine("MemberOf: " + String.Join(", ", MemberOf.ToArray()));
            sb.AppendLine("Members: " + String.Join(", ", Members.ToArray()));
            sb.AppendLine("Users: " + String.Join(", ", Users.ToArray()));
            return sb.ToString();
        }
    }

    static Dictionary<string, HashSet<string>> groupMembers = new Dictionary<string, HashSet<string>>
    {
        {  "G4", new HashSet<string> { "G5","G6","U1","U2"} },
        {  "G1", new HashSet<string> { "G2","G3","G6"} },
        {  "G3", new HashSet<string> { "G4"} },
        {  "G2", new HashSet<string> { "G4","G1"} },
        {  "G5", new HashSet<string> { "G2"} },
        {  "G6", new HashSet<string> { "U2","G5"} },
    };

    static void Main()
    {
        Dictionary<string, MyGroup> output = new Dictionary<string, MyGroup>();

        // First Pass: Figure out Children and Users
        foreach(string start in groupMembers.Keys)
        {
            MyGroup group = new MyGroup();
            group.Identity = start;
            HashSet<string> Users = new HashSet<string>();
            group.Members = GetChildrenAndUsers(start, ref Users);
            group.Users = Users;
            output.Add(start, group);
        }

        // Second Pass: Figure out the Parents:
        List<string> outer = output.Keys.ToList();
        List<string> inner = output.Keys.ToList();
        foreach (string outerKey in outer)
        {
            MyGroup group = output[outerKey];
            group.MemberOf = new HashSet<string>();
            foreach (string innerKey in inner)
            {
                MyGroup group2 = output[innerKey];
                if (group2.Identity != group.Identity)
                {
                    if(group2.Members.Contains(group.Identity))
                    {
                        group.MemberOf.Add(group2.Identity);
                    }
                }
            }
        }

        // Display the results:
        foreach(MyGroup group in output.Values)
        {
            Console.Write(group.ToString());
            Console.WriteLine("--------------------------------------------------");
        }
        Console.Write("Press Enter to Quit");
        Console.ReadLine();
    }

    static HashSet<string> GetChildrenAndUsers(string start, ref HashSet<string> Users)
    {
        HashSet<string> visited = new HashSet<string>();
        Stack<string> notVisited = new Stack<string>();

        notVisited.Push(start);
        while (notVisited.Count > 0)
        {
            string next = notVisited.Pop();
            HashSet<string> children;
            if (!groupMembers.ContainsKey(next))
            {
                Users.Add(next);
            }
            else
            {
                if (visited.Add(next) && groupMembers.TryGetValue(next, out children))
                {
                    foreach (string member in children)
                    {
                        notVisited.Push(member);
                    }
                }
            }
        }
        visited.Remove(start); // optionally remove "start" from the result?

        return visited;
    }

}

Output:

Identity: G4
MemberOf: G1, G3, G2, G5, G6
Members: G6, G5, G2, G1, G3
Users: U2, U1
--------------------------------------------------
Identity: G1
MemberOf: G4, G3, G2, G5, G6
Members: G6, G5, G2, G4, G3
Users: U2, U1
--------------------------------------------------
Identity: G3
MemberOf: G4, G1, G2, G5, G6
Members: G4, G6, G5, G2, G1
Users: U2, U1
--------------------------------------------------
Identity: G2
MemberOf: G4, G1, G3, G5, G6
Members: G1, G6, G5, G3, G4
Users: U2, U1
--------------------------------------------------
Identity: G5
MemberOf: G4, G1, G3, G2, G6
Members: G2, G1, G6, G3, G4
Users: U2, U1
--------------------------------------------------
Identity: G6
MemberOf: G4, G1, G3, G2, G5
Members: G5, G2, G1, G3, G4
Users: U2, U1
--------------------------------------------------
Press Enter to Quit
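
The second pass above tests every pair of groups with Contains. A possible alternative, which is not part of the original answer, is to invert the finished Members sets directly: if a child appears in a parent's Members, the parent belongs in that child's MemberOf. The sketch below assumes the input maps each group to its full set of descendant groups; the names MembershipInverter and InvertMembers are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class MembershipInverter
{
    // Hypothetical helper: given each group's full set of descendant groups,
    // build each group's set of ancestor groups.
    public static Dictionary<string, HashSet<string>> InvertMembers(
        Dictionary<string, HashSet<string>> members)
    {
        var memberOf = new Dictionary<string, HashSet<string>>();

        // Give every group an entry, even if it turns out to have no parents.
        foreach (string group in members.Keys)
        {
            memberOf[group] = new HashSet<string>();
        }

        // If child is in parent's Members, then parent is in child's MemberOf.
        foreach (KeyValuePair<string, HashSet<string>> entry in members)
        {
            foreach (string child in entry.Value)
            {
                HashSet<string> parents;
                // Skip self-references and ids that are not known groups.
                if (child != entry.Key && memberOf.TryGetValue(child, out parents))
                {
                    parents.Add(entry.Key);
                }
            }
        }
        return memberOf;
    }
}
```

With the output dictionary from Main, the input could be built as output.ToDictionary(p => p.Key, p => p.Value.Members). The work is proportional to the total number of stored memberships, so on a heavily cyclic graph like the sample data the saving over the nested loops may be small.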
Idle_Mind
  • @idle-mind thanks for the reply. Is the output the members of each group? I also want to determine the parents of each group, not just the children, so the output should be something like List<MyGroup>, where MyGroup has the following info. Makes sense? public class MyGroup { public string Identity { get; set; } public HashSet<string> MemberOf { get; set; } public HashSet<string> Members { get; set; } public HashSet<string> Users { get; set; } } – user330612 Jul 23 '15 at 18:07
  • So what would you expect for a start of `G4`? When we traverse we determine that `G5` is a child of `G4`. Then we move to `G5`, we determine that `G2` is a child of `G5`, thus `G2` is also a child (grandchild?) of `G4`. **But if we started at `G2`, then `G4` would be a child of `G2`.** So which is `G2`, a **child** of `G4`, a **parent** of `G4`, or **both**? – Idle_Mind Jul 23 '15 at 18:23
  • It should be both. I have updated the question with the expected output format. If a group is both a child and a parent of another group because of a loop, then it should be present in both lists. – user330612 Jul 23 '15 at 18:25
  • This works for a small dataset but throws an OutOfMemoryException when run against a sample dataset that has 20K groups. I have shared the sample dataset here: https://filetea.me/t1sj4HzIYXGSrOV5FGN4R2xxQ You can read the file into your program as below to test it: string json = File.ReadAllText(fileName); var groupMembers = JsonConvert.DeserializeObject<Dictionary<string, HashSet<string>>>(json); – user330612 Jul 23 '15 at 20:46
  • I'd consider writing the data out to a database so you aren't storing all of it in memory. The processing time will surely increase, however... – Idle_Mind Jul 23 '15 at 20:57
  • I don't have JsonConvert installed on my system either. – Idle_Mind Jul 23 '15 at 20:58
  • In the code I posted above in the question, I don't hit any out-of-memory exceptions because I keep the visited and notVisited data structures as globals instead of as variables inside a function. My original approach makes sure that we don't iterate over the same group again and again when there are loops. Is it possible to modify your code to make these global variables and not look at the same node/group multiple times when traversing the graph? I don't think we really need to store this in a database :) – user330612 Jul 23 '15 at 21:01
  • We are using waaaay more memory because **EACH** MyGroup instance has its own set of parents and children. You weren't compiling that information, at all, in your original code. Your original code simply determined what nodes had been touched, but didn't store any **relational** information about those touched nodes. Additionally, you have to traverse starting from each node to determine what its children are because this is a **directed** graph. You might be able to reach "B" from "A", but not go from "B" back to "A". We may have already seen "B", but we won't know that additional info. – Idle_Mind Jul 23 '15 at 21:16
  • I hope that makes sense. There may be a more efficient way to do this based on the actual data you're using. I have no idea what this data represents, and whether there are additional constraints and inherent relationships between things that aren't conveyed thru the small sample set and the context of your question here. – Idle_Mind Jul 23 '15 at 21:19
  • I see, it does make sense. The expected output format of List<MyGroup> is definitely not a strong requirement. The only requirement is to determine the parent groups, member groups and users of each group. I'm open to suggestions on how else we could store this info. We could even write the information for each group out to a file once we know all its parents and children, so we don't need to keep it in memory until we have traversed the whole graph. But again, as you said, until we traverse the graph fully we don't know what each group's parents and children are. There are no other relations. – user330612 Jul 23 '15 at 21:23
  • BTW, you can use the built-in JavaScriptSerializer class to deserialize the JSON file if you don't have JSON.net installed: var fileName = @"D:\\SampleData.txt"; JavaScriptSerializer serializer = new JavaScriptSerializer(); serializer.MaxJsonLength = Int32.MaxValue; var groupMembers = serializer.Deserialize>>(File.ReadAllText(fileName)).ToDictionary(x => x.Key, x => x.Value.ToHashSet()); – user330612 Jul 23 '15 at 21:45
  • I might be able to play with this one some more this evening with the actual data. – Idle_Mind Jul 23 '15 at 22:10
  • cool..here's a link to the file on google drive https://drive.google.com/file/d/0BxZ_QVIqJzwxZmUtTG9wenRIdEE/view?usp=sharing – user330612 Jul 23 '15 at 22:17
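
Following up on the idea raised in the comments of writing each group's result out as soon as it is known, here is a minimal sketch of how that might look. It is not from the original thread. It builds a reverse map of direct parents once, then reuses the question's iterative DFS in both directions for each group and writes one line per group, so only one group's sets are held at a time. The names GroupReport, BuildDirectParents, Reachable and WriteAll are hypothetical, and, as in the answer, any id that is not a key of groupMembers is assumed to be a user.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class GroupReport
{
    // Reverse of the membership map: member -> groups that directly contain it.
    public static Dictionary<string, HashSet<string>> BuildDirectParents(
        Dictionary<string, HashSet<string>> groupMembers)
    {
        var parents = new Dictionary<string, HashSet<string>>();
        foreach (KeyValuePair<string, HashSet<string>> entry in groupMembers)
        {
            foreach (string member in entry.Value)
            {
                HashSet<string> set;
                if (!parents.TryGetValue(member, out set))
                {
                    set = new HashSet<string>();
                    parents[member] = set;
                }
                set.Add(entry.Key);
            }
        }
        return parents;
    }

    // Same iterative DFS as in the question: everything reachable from start.
    public static HashSet<string> Reachable(
        string start, Dictionary<string, HashSet<string>> edges)
    {
        var visited = new HashSet<string>();
        var notVisited = new Stack<string>();
        notVisited.Push(start);
        while (notVisited.Count > 0)
        {
            string next = notVisited.Pop();
            HashSet<string> neighbours;
            if (visited.Add(next) && edges.TryGetValue(next, out neighbours))
            {
                foreach (string n in neighbours) notVisited.Push(n);
            }
        }
        visited.Remove(start); // the start node is not reported as its own relative
        return visited;
    }

    // Writes one line per group; the per-group sets can be collected afterwards.
    public static void WriteAll(
        Dictionary<string, HashSet<string>> groupMembers, TextWriter writer)
    {
        Dictionary<string, HashSet<string>> directParents = BuildDirectParents(groupMembers);
        foreach (string group in groupMembers.Keys)
        {
            HashSet<string> below = Reachable(group, groupMembers);  // groups and users
            HashSet<string> above = Reachable(group, directParents); // groups only
            var members = new List<string>();
            var users = new List<string>();
            foreach (string id in below)
            {
                // Assumption: an id with no entry in groupMembers is a user.
                if (groupMembers.ContainsKey(id)) members.Add(id); else users.Add(id);
            }
            writer.WriteLine("{0}\tMemberOf: {1}\tMembers: {2}\tUsers: {3}", group,
                string.Join(",", above), string.Join(",", members), string.Join(",", users));
        }
    }
}
```

Calling GroupReport.WriteAll(groupMembers, Console.Out), or passing a StreamWriter, would emit the report. This only limits what is held in memory at once; it still runs one traversal per group in each direction, so the running time is not reduced.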