Is there a way to programmatically match plural words using Treetop? The Linguistics gem will pluralize a word, but how can the result be fed back into the parser?

Here's an example of what I'm trying to do:

#!/usr/bin/env ruby
require 'treetop'
require 'linguistics'
include Linguistics::EN
Treetop.load_from_string DATA.read

parser = RecipeParser.new

p parser.parse('cans')

__END__
grammar Recipe
   rule units
      unit &{|s| plural(s[0].text_value) }  
   end
   rule unit
      'can'
   end
end
Josh Voigts
  • It'd be helpful to see your full grammar, and what you're doing with it. Also: do you absolutely have to use Treetop? Could something as simple as a regexp accomplish your goals? – pje Oct 12 '12 at 00:08

1 Answer

In general, the Linguistics gem can't pluralize arbitrary Treetop rule definitions, because they're not strings.

Using semantic predicates, your recipe.treetop file could define all of your valid singular unit strings in an array, pluralize them, and then create a rule that checks whether the matched token is one of those pluralized units:

require "linguistics"

grammar Recipe
  rule units
    [a-zA-Z\-]+ &{ |u|
      Linguistics.use(:en)
      singular_units = [ "can" ]

      singular_units.
        map(&:en).
        map(&:plural).
        include?(u[0].text_value)
    }
  end
end
pje
    That's exactly what I was looking for. I was going to see if I could parse the word first and then examine its ending, but I guess that doesn't make sense for irregular plurals such as goose and geese. The only thing is, I would probably want to memoize `singular_units`, since it would be rebuilt every time the parser hit that node. – Josh Voigts Oct 12 '12 at 17:02
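
The memoization raised in the comment above could be sketched roughly like this. It is only an illustration, not something from the original answer: the `UnitPlurals` module, its `pluralize` helper, and the `IRREGULAR_PLURALS` table are all hypothetical names, and `pluralize` is a stand-in for the Linguistics call (`word.en.plural`) so that the snippet runs without any gems.

```ruby
require "set"

# Hypothetical helper module; none of these names come from Treetop or Linguistics.
module UnitPlurals
  SINGULAR_UNITS = %w[can cup loaf].freeze

  # Stand-in for the Linguistics gem's pluralization (an assumption made so
  # the sketch is self-contained). Replace with word.en.plural in real code.
  IRREGULAR_PLURALS = { "loaf" => "loaves" }.freeze

  def self.pluralize(word)
    IRREGULAR_PLURALS.fetch(word) { "#{word}s" }
  end

  # Built once on first use, then reused on every later call.
  def self.all
    @all ||= SINGULAR_UNITS.map { |unit| pluralize(unit) }.to_set.freeze
  end

  # True if the given token is one of the pluralized units.
  def self.include?(word)
    all.include?(word)
  end
end
```

Assuming the module is loaded before the grammar, the body of the semantic predicate could then shrink to a single lookup such as `UnitPlurals.include?(u[0].text_value)`, so the list is not rebuilt each time the parser reaches that node.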