I'm trying (as an experiment) to develop a simple application that returns data in response to a voice command (using iOS Siri, Android voice recognition, etc.). I would like to 'parse' the recognized text.

I'm using this approach, but I'm pretty sure it isn't correct. Suppose the user says "Give me 2015 Revenue of customer Duck":

if InStr(mystring, "Revenue") > 0 then
 sql = sql & " SELECT revenue FROM mytable "
end if
if InStr(mystring, "Give me Revenue") > 0 then
 sql = sql & " SELECT revenue FROM mytable "
end if
... 
...
if InStr(mystring, "April") > 0 then
 sql = sql & " Where month=4 "
end if
if InStr(mystring, "from April") > 0 then
 sql = sql & " Where month>=4 "
end if
... 
...

Is this a good approach? Second question: how do I parse customer names? There may be 1 million customers... What is the best approach?

Thanks

stighy
  • Time to learn about regular expressions (http://en.wikipedia.org/wiki/Regular_expression) in C# (https://msdn.microsoft.com/en-us/library/hs600312(v=vs.110).aspx) – Nikolay Shmyrev Mar 03 '15 at 16:50
  • I'm slightly confused. What are you asking about the best approach for? In your code you just seem to be searching a string for keywords, but then you talk about how to recognise customer names, which sounds more like a problem with the voice recognition side of things. Is this `mystring` variable the speech that has already been converted to a string? – Chris Mar 03 '15 at 16:50
  • Yes mystring is the string 'correctly' recognized. – stighy Mar 03 '15 at 17:02
  • How do you expect iOS/Android to work with VB? – Sam Axe Mar 10 '15 at 08:40
  • What does voice recognition have to do with string parsing? Regarding parsing customer names, it depends on what they are and what you want from them. Are they strings that you want to parse to ints? – default Mar 10 '15 at 08:45
  • Do you have rules set that specify the format these voice commands must be in? I mean the command "Give me 2015 Revenue of customer Duck" could also be said "Show me Revenue of customer Duck for April 2015" Do you need to cover multiple ways of asking the same thing? – Fred Mar 10 '15 at 08:46
  • show some full examples of your input string. – teo van kot Mar 10 '15 at 09:27
  • @Fred That's a good question, since this makes or breaks the system (there shouldn't be only OneTrueWay of saying things) and that's what I assumed in my answer. – samy Mar 12 '15 at 07:49

3 Answers

First, let's assume some things. According to the comment "mystring is the string correctly recognized", you are not asking about voice recognition for the commands, only about the customer names.

For the customer name you may need to compute a distance between what was understood by the VR mechanism and the names in your customer database. The VR system "hears" Duck, computes the distance from Duck to every customer name and takes the one with the smallest distance. "Chuck" would be a much more likely contender than "Buckingham". I think you may have to tweak the mechanism to make it work according to your business rules, but you can already create a proof of concept with the Levenshtein distance between names.
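As a minimal sketch of that proof of concept, here is the Levenshtein distance in Python together with a lookup that returns the closest name. The customer list and the function names `levenshtein` and `closest_customer` are made up for illustration, and the comparison is case-insensitive by assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b (insert, delete, substitute)."""
    # previous[j] is the distance between the prefix of a handled so far
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_customer(heard: str, customers: list[str]) -> str:
    """Return the customer name with the smallest distance to the heard word."""
    heard = heard.lower()
    return min(customers, key=lambda name: levenshtein(heard, name.lower()))


# Hypothetical customer list, for illustration only.
customers = ["Chuck", "Buckingham", "Donald"]
best = closest_customer("Duck", customers)
```

With this list, `best` is "Chuck". Scanning a million names this way for every request would probably be too slow, so a real system would likely need some kind of index (a BK-tree is one option) or a fuzzy match on the database side.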

I'm going to sweep the name recognition under the rug. You need to nail down your requirements, and your question is not precise enough to do just that yet.

Now, regarding the parsing of the language, I think the simplest approach would be to create a Parsing Expression Grammar (PEG). It lets you define a grammar that a generator then turns into code, and that code can parse any text that obeys the grammar. There is even a PEG library for C#, peg-sharp, that you can use to generate the parsing code.

PEGs are easy to build and quite readable. You can practice online with PEG.js to build your first grammar. First, you need to look at how a sentence would be translated into code. Let's say that

  • Give me 2015 revenue for Duck
  • Fetch Duck's revenue in 2015

should be equivalent commands. Your system should then translate either one into an Operation (get) with parameters customer (Duck), dataFields (revenue) and timeWindow (year 2015). This could be translated into the following grammar:

start
  = command

command
= "Give me " year:year " " dataFields:dataFields " for " customer:customer 
{
    return {
        "operation" : "get"
        , "customer": customer
        , "year": year
        , "dataFields": dataFields
    };
}
/ 
"Fetch " customer:customer "'s " dataFields:dataFields " in " year:year
{
    return {
        "operation" : "get"
        , "customer": customer
        , "year": year
        , "dataFields": dataFields
    };
}

dataFields "dataFields"
 = "revenue"

year "year"
 = digits:[0-9]+ { return parseInt(digits.join(""), 10);}

customer "customer"
 = letters:[a-zA-Z]+ {return letters.join("");}

Of course this sample grammar is very naïve: the sentence structure cannot handle variations. Making it robust is a matter of finding general rules for parsing your sentences:

  • ignoring spaces and noise (such as the possessive 's in this example, or expletives if the user is frustrated :) )
  • creating rules where elements don't depend on order but can be picked up from anywhere in the sentence
  • extending possible dataFields and commands
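The first two points in the list can be sketched without a parser generator. The Python below is not a PEG: it is a plain word scan using regular expressions, shown only to illustrate picking elements out of a sentence regardless of order. The vocabulary sets, the function name `parse_command`, and the rule that the first unrecognized word is the customer are all assumptions made for this example.

```python
import re

# Hypothetical vocabulary; extend it as the requirements become clearer.
DATA_FIELDS = {"revenue"}
NOISE_WORDS = {"give", "me", "fetch", "for", "in", "of", "customer", "please"}


def parse_command(sentence: str) -> dict:
    """Pick year, data field and customer out of a sentence, in any order."""
    # Drop the possessive 's, then split the sentence into alphanumeric words.
    words = re.findall(r"[A-Za-z0-9]+", re.sub(r"'s\b", "", sentence))
    result = {"operation": "get", "customer": None, "year": None, "dataFields": None}
    for word in words:
        lower = word.lower()
        if re.fullmatch(r"\d{4}", word):
            result["year"] = int(word)
        elif lower in DATA_FIELDS:
            result["dataFields"] = lower
        elif lower not in NOISE_WORDS and result["customer"] is None:
            # Assumption: the first word that is nothing else is the customer.
            result["customer"] = word
    return result


first = parse_command("Give me 2015 revenue for Duck")
second = parse_command("Fetch Duck's revenue in 2015")
```

Both sentences produce the same dictionary, with customer "Duck", year 2015 and dataFields "revenue". The trade-off is that a scan like this accepts almost anything, so it gives much weaker guarantees than a grammar about what the user actually meant.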

If you try the grammar above on PEG.js, you can see that it is able to translate the sentence into a JSON object (with peg-sharp that would be a C# object whose behavior you control through the code).

With this and some planning of the variations you can have in your sentences, you should be able to create a first version of what you need.

samy

If I've understood your question correctly, what you want is something like this (this is just pseudo-code, I can't test code right now):

using System.Collections;

Hashtable keywords = new Hashtable();
keywords.Add("revenue", 1);
keywords.Add("from", 1);
//Add more keywords
...
string[] words = mystring.ToLower().Split(' ');
foreach (string word in words)
{
    //ContainsKey avoids comparing the boxed Hashtable value with an int
    if (keywords.ContainsKey(word))
    {
        //Do something based on keywords found
        //E.g. assigning delegates to invoke functions
        //or preparing an SQL string that you can then execute
        //(BEWARE OF SQL-INJECTION IF YOU USE STRINGS)
        //or simply making a big switch-case
    }
}
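
To make the SQL-injection warning concrete, here is a small runnable sketch in Python using the standard `sqlite3` module. It keeps recognized values out of the SQL text by passing them as parameters. The table layout, the sample rows and the function name `revenue_query` are hypothetical, and the customer name is assumed to have been resolved already.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (customer TEXT, year INTEGER, month INTEGER, revenue REAL)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [("Duck", 2015, 4, 100.0), ("Duck", 2015, 5, 50.0),
     ("Duck", 2014, 4, 70.0), ("Chuck", 2015, 4, 30.0)],
)


def revenue_query(sentence: str, customer: str):
    """Build a parameterized query from the keywords found in the sentence."""
    words = sentence.lower().split()
    if "revenue" not in words:
        return None  # no known keyword, nothing to run
    sql = "SELECT SUM(revenue) FROM mytable WHERE customer = ?"
    params = [customer]
    for word in words:
        # Assumption: a four-digit word is the year the user asked for.
        if word.isdigit() and len(word) == 4:
            sql += " AND year = ?"
            params.append(int(word))
            break
    return sql, params


sql, params = revenue_query("Give me 2015 Revenue of customer Duck", "Duck")
total = conn.execute(sql, params).fetchone()[0]
```

Here `total` is 150.0, and the spoken text never becomes part of the SQL string itself: only the `?` placeholders do.
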
ChatterOne

What you are describing is a rather complex natural language processing problem. If you are OK with using an external API, check out wit.ai (recently purchased by Facebook), which does exactly what you want. You give it several examples of the natural language commands your app expects, then you send it a voice recording, and it returns JSON describing what it believes the intent of the voice command is.

Ishamael