1

In a project of mine, I am trying to identify file names in a given sentence. For example, "Could you please open abc.txt", so I need to fetch the keywords "open" in order to know the kind of action that is expected and I also need to identify the file name, for obvious reasons. A simple AIML tag for this is:

<aiml>
<category>
    <pattern>* OPEN *</pattern>
    <template>open <star index="2"/></template>
<category>
</aiml>

Here, in the template tag, I am just giving an information about the operation to be performed and the file name. My python code on the other hand takes care of performing the required action. Now the problem is the '.' character. Using that character divides the sentence into 2 parts, (in case of the example I mentioned above, the 2 sentences would be "Could you please open abc" and "txt") which are individually mapped to any of the aiml tags defined. But, in my case I don't want the '.' character to act as a delimiter. Basically, I want to identify file names that may or may not include an extension. Could anyone please help me out with this?

Thanks in advance!

1 Answers1

0

By default AIML allows multi sentence input. This means full stops, exclamation marks and question marks are treated as separators between sentences. For example if you asked:

Good morning. My name is George. How are you today?

this is interpreted as 3 separate inputs. Normally this is a good thing as it means the AIML interpreter can re-use existing patterns for GOOD MORNING, MY NAME IS *, HOW ARE YOU *.

But in your case that's not helping as the full-stop before the extension is causing unwanted splitting. Depending on your AIML interpreter, sentence splitting is done in a pre-processing stage before sending the input to the interpreter. Some AIML interpreters have a configuration file that lets you define the sentence splitting characters, so you may simply be able to remove the full stop from the list of separators.

A better approach may be to pre-process the filenames and replace the full stop with the word DOT, you can then detect this in your pattern * OPEN *

As a final comment, * OPEN * is a very wide ranging pattern, it will also be invoked if someone says WHAT TIME IS THE SHOP OPEN TODAY, or any other input with the word OPEN in it surrounded by text.

Ubercoder
  • 711
  • 8
  • 24