2

How would I go about doing a multiple pattern search in Lua? (I have Lpeg set up).

For example, say I'm receiving strings one at a time, capitalizing each one and calling it msg. Now I want to check whether msg contains any of the following patterns: MUFFIN MOOPHIN MUPHEN M0FF1N, for a start. How can I check if msg has any of those (it doesn't matter if it has more than one) without having to write a huge if (... or ... or ... or ...)?

Bernardo Meurer

2 Answers

3

One thing you could do is make a table of words you want to look for, then use gmatch to iterate each word in the string and check if it's in that table.

#!/usr/bin/env lua

function matchAny(str, pats)
    for w in str:gmatch('%S+') do
        if pats[w] then
            return true
        end
    end
    return false
end

pats = {
    ['MUFFIN']  = true,
    ['MOOPHIN'] = true,
    ['MUPHEN']  = true,
    ['M0FF1N']  = true,
}

print(matchAny("I want a MUFFIN", pats)) -- true
print(matchAny("I want more MUFFINs", pats)) -- false
Rena
  • Hmm, interesting solution. However, is it possible to make it so that, e.g., with `pats = {['UFFIN'] = true}`, `print(matchAny("I want a MUFFIN", pats))` gives true? That is, to check for the word in any part of the string, like `string.match()` does? – Bernardo Meurer May 17 '15 at 02:00
  • Nothing really clever comes to mind; I think you'd have to iterate `pats` (using `pairs`, since its keys are strings) and compare each entry using `string.match`. There's probably a more efficient way using LPEG, but I wouldn't know... – Rena May 17 '15 at 02:03
  • Exactly, that's why I got LPEG in the first place, but god, it's confusing! I could hardly understand any of it, and regular expressions give me nightmares. I wonder how slow a string.match iterator would be, though. – Bernardo Meurer May 17 '15 at 02:05
  • @Rena - Another variant: `function matchAny(str,pats) return #str:gsub('%S+',pats) < #str end`, initialization: `pats = {['MUFFIN'] = '', ...}` – Egor Skriptunoff May 17 '15 at 08:43
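
The substring check discussed in the comments above can be sketched in plain Lua, without LPeg. This is only a minimal sketch: `containsAny` is a hypothetical name, and it assumes `pats` keeps the same set-like shape as in the answer (fragments as keys). It uses `string.find` with the `plain` flag so the fragments are treated as literal text rather than Lua patterns.

```lua
-- Hypothetical helper (not from the original answer): returns a fragment
-- from pats that occurs anywhere in str, or nil if none does.
local function containsAny(str, pats)
    -- pats is keyed by string, so pairs (not ipairs) is needed here
    for pat in pairs(pats) do
        -- init = 1, plain = true: literal substring search, no pattern magic
        if str:find(pat, 1, true) then
            return pat
        end
    end
    return nil
end

local pats = { ['UFFIN'] = true, ['M0FF1N'] = true }
print(containsAny("I WANT A MUFFIN", pats)) -- UFFIN
print(containsAny("I WANT A BAGEL", pats))  -- nil
```

This scans the string once per fragment, which is probably fine for a handful of short fragments; the iteration order of `pairs` is unspecified, so when several fragments occur, which one is returned is not defined.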
0

A late answer, but with LPeg you can build a single pattern that matches any of the words case-insensitively (only when the word is not followed by an alphanumeric character) and captures both the match position in the subject and the index of the matched word, with something like this:

local lpeg = require("lpeg")

local function find_words(subj, words)
    local patt
    for idx, word in ipairs(words) do
        word = lpeg.P(word:upper()) * lpeg.Cc(idx)
        patt = patt and (patt + word) or word
    end
    local locale = lpeg.locale()
    -- -locale.alnum is a lookahead: it also succeeds at the end of the subject
    patt = lpeg.P{ lpeg.Cp() * patt * -locale.alnum + 1 * lpeg.V(1) }
    return patt:match(subj:upper())
end

local words = { "MUFFIN", "MOOPHIN", "MUPHEN", "M0FF1N" }
local pos, idx = find_words("aaaaa  bbb ccc muPHEN ddd", words)
print(pos, idx)

-- output: 16, 3
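
For the variant asked about in the comments on the other answer (matching a fragment such as UFFIN anywhere inside a word), a minimal LPeg sketch might simply drop the boundary check. `find_any` is a hypothetical name, the sketch assumes `words` is a non-empty array of non-empty strings, and it assumes the subject has already been uppercased as described in the question.

```lua
local lpeg = require("lpeg")

-- Hypothetical helper: returns the position of the first occurrence of any
-- word in subj plus the index of that word in words, or nil if none occurs.
local function find_any(subj, words)
    local alternatives
    for idx, word in ipairs(words) do
        -- literal match of the word, followed by a constant capture of its index
        local p = lpeg.P(word) * lpeg.Cc(idx)
        alternatives = alternatives and (alternatives + p) or p
    end
    -- grammar: try the alternatives here, otherwise skip one character and retry
    local anywhere = lpeg.P{ lpeg.Cp() * alternatives + 1 * lpeg.V(1) }
    return anywhere:match(subj)
end

local pos, idx = find_any("I WANT A MUFFIN", { "UFFIN", "OOPHIN" })
print(pos, idx) -- 11  1
```

Because the choice is ordered, when several words match at the same position the one listed first in `words` is reported.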
wqw