0

I'm creating a report builder-like application with the intent on making an extremely novice friendly front end.

The back end of the application will be managed by developers who can build a 'report model' which will specify which tables, fields and joins to include for the end user to use.

I am also looking to add functionality that will not require a report model. My application will scan the target SQL database, and created a virtual model with all of the joins and fields mapped.

After this i will need to be able to generate the most 'logical' or efficient a path between tables e.g with the minimal amount of joins. Kind of similar to the traveling salesman scenario.

I have decided to go about this by using tree's to map all joins from a specific table which will be the start node, and all other tables it could possibly connect to. This way i can do a breadth-first tree traversal to in theory find the most 'logical' path to join.

My issue with this is that not all databases will be set up in a particularly machine logic friendly manner. Meaning that a human may see a logical join because of specific table or field names, that my algorithm may not. (Below is a simple iteration of my algorithm in c# that does not yet record the path between tables)

 public Node<T> findClosestObjToNode(Node<T> node, T obj)
    {
        Boolean matchFound = false;
        List<Node<T>> toSearch = new List<Node<T>>();
        List<Node<T>> toSearchNext = new List<Node<T>>();
        toSearchNext.Add(node); //add proimary node to search list

        while(!matchFound){
            toSearch.AddRange(toSearchNext); //copy the searchnext list to search
            toSearchNext.Clear();

            foreach(Node<T> _node in toSearch){
                if (node.contains(obj)) //check for existance of object in the nodes children
                    return node.getChild(obj); //return the child node that contains the object
                else
                    foreach (Node<T> cNode in node.getChildren()) //no matching object found in child nodes
                        toSearchNext.Add(cNode); //so add each child node to the list of nodes to search
            }
            if(toSearchNext.Count == 0) //no match found
                return null;
        }
        return null;
    }

My question is really. Does the way i have planned above seem like a decent solution to the whole problem, or is there a better way of doing the above to get more accurate table joins.

Lex Webb
  • 2,772
  • 2
  • 21
  • 36

1 Answers1

1

If I have understood your requirements properly, then I question your approach to this problem. Normally, there are not many ways to get a certain piece of data from a database - there is usually one and only one way to get that specific piece of data. With TSP type problems there are multiple possible solutions and the ideal solution is based on some constraint on the system. I don't think you are going to get much gain from your solution as you will most often find that their is only one combination of table joins that will provide you with the data that you need.

  • That may be the case, if so, then my solution would still enable me to specifically find that join dynamically would it not? rather than having to create a static adjacency list. – Lex Webb Apr 03 '14 at 14:59
  • Yes theoretically it would. But have you considered the downsides to your approach? What if the database schema changes, you would have to know when to refresh your virtual model. If you intend to create the model every time then what happens in the long term when the number of tables and fields grow? You may end up with performance issues with your solution. –  Apr 03 '14 at 15:43
  • Yes i have thought of this. My solution would be caching the structure of that database, and re-scanning the target database on application load. I have created a query which is able to scan a substantial database in a short time. i can then compare this to the existing model. and choose i need to reload. However i would have to measure the performance of this compared to generating and traversing an adjacency list. – Lex Webb Apr 03 '14 at 16:07