2

A bit stumped here. The task in front of me is to take a SQL query stored in a string (lets assume that it is a valid query for now); store the "select list" in a string array with each array element correlated to a select list item; and finally to store the "from clause" in a string variable of its own. I feel like the exercise is write a SQL query parser of my own, but I don't know how SQL Server goes about parsing a query. Any direction to resources that could help would be very much appreciated.

I've been able to solve the problem using a brute force approach where I do the following:

1 - find the from clause by starting a loop which finds subsequent instances of the word " from" until everything that follows the index location can work with a "select *" select list

private string GetFromClause(string selectstatement,string connectionString)
{
    string teststatement = selectstatement;
    int startindex = GetFirstFromIndex(teststatement);
    while (startindex != -1)
    { 
        teststatement=teststatement.Substring(startindex);
        if (DoesSelectWork(string.Format("select * {0}", teststatement)
                                                    , connectionString))
            return teststatement;
        else
            startindex = GetNextFromIndex(teststatement);

     }
    throw new ReportException("Could not find From Clause");
}

2- remove the now found "from clause" from the passed query; place the remaining string in an array using .split(','). Now iterate through the array and test each array element using the found "from clause". If the test passes we have a valid select list item; if not we want to combine this element with the next array element, and continue to do so till the test passes (to deal with casting statements etc. that would introduce commas into the select list syntax)

private string[] GetSelectFields(string query, 
                                string fromclause,
                                string connectionString)
{
    int index = query.IndexOf(fromclause);
    string fromless = (index < 0)
        ? query
        : query.Remove(index, fromclause.Length);
    fromless = fromless.Substring(fromless.IndexOf("SELECT") + "SELECT".Length);
    List<string> finalselect = new List<string>();
    string[] selectFields = fromless.Split(',');
    index = 0;            
    string currentString = string.Empty;
    while (index<selectFields.Length)
    {
        currentString += selectFields[index];
        if (DoesSelectWork(string.Format("select {0} {1}"
                , currentString, fromclause), connectionString))
        {
            finalselect.Add(currentString);
            currentString = string.Empty;
        }
        else {
            currentString += ",";
        }
        index++;
    }
    return finalselect.ToArray();
}
JoanieBrar
  • 151
  • 4
  • 14

2 Answers2

3

As a general rule, you cannot parse SQL with a simple search method, the rules are way too complex for that. You need a full lexer and grammar. But since SQL Server 2012 you have the Transact-SQL Language Service option, which is what tools like Visual Studio use to parse T-SQL w/o a back-end server. You can leverage it from C# via Microsoft.SqlServer.Management.SqlParser.Parser.Parse().

Remus Rusanu
  • 288,378
  • 40
  • 442
  • 569
  • I think the [`TSqlParser`](http://msdn.microsoft.com/en-us/library/microsoft.data.schema.scriptdom.sql.tsqlparser(v=vs.100).aspx) and the `ScriptDom` linked by @JhonAlx are actually a better option. Better documented. – Remus Rusanu Oct 03 '14 at 16:40
  • its actually `Microsoft.SqlServer.Management.SqlParser.Parser.Parser.Parse()`, where `Microsoft.SqlServer.Management.SqlParser.Parser` is the namespace and `Parser` is the (static) class – Cee McSharpface Oct 24 '19 at 09:31
3

If you want to validate SQL syntax without the use of a database, the TSql100Parser class will do well for this situation.

Disclaimer, code borrowed from this post here Code to validate SQL Scripts

Pretty straightforward to use though. If it returns null, then there were no errors in parsing it.

using Microsoft.Data.Schema.ScriptDom;
using Microsoft.Data.Schema.ScriptDom.Sql;

public class SqlParser
{
        public List<string> Parse(string sql)
        {
            TSql100Parser parser = new TSql100Parser(false);
            IScriptFragment fragment;
            IList<ParseError> errors;
            fragment = parser.Parse(new StringReader(sql), out errors);
            if (errors != null && errors.Count > 0)
            {
                List<string> errorList = new List<string>();
                foreach (var error in errors)
                {
                    errorList.Add(error.Message);
                }
                return errorList;
            }
            return null;
        }
}
Community
  • 1
  • 1
Jhon
  • 582
  • 7
  • 28