I am implementing a few string replacers, with these conversions in mind:

'thou sittest' → 'you sit'
'thou walkest' → 'you walk'
'thou liest' → 'you lie'
'thou risest' → 'you rise'

If I keep it naive, a regex such as thou [a-z]+est is enough to find and replace these.

The trouble comes with English verbs that end in e: depending on the verb, I need to trim the whole est in some cases and only the st in the rest.
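
To make the problem concrete, here is a minimal sketch of the naive approach. The pattern and the helper name naive_replace are only illustrative, not taken from any library:

```python
import re

# Naive rule: after "thou", always strip a trailing "est" from the verb.
NAIVE = re.compile(r"\bthou ([a-z]+)est\b")

def naive_replace(text):
    """Replace 'thou <verb>est' with 'you <verb>' by blindly cutting 'est'."""
    return NAIVE.sub(r"you \1", text)

for phrase in ["thou walkest", "thou sittest", "thou liest", "thou risest"]:
    print(phrase, "->", naive_replace(phrase))
```

Only walkest comes out right here. liest and risest lose the e they should keep, and sittest keeps a doubled t it should lose.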

What is a quick-and-dirty solution to achieve this?

nehem

1 Answer

Probably the quickest and dirtiest approach:

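import nltk

# Requires the word list: run nltk.download('words') once beforehand.
words = set(nltk.corpus.words.words())
for old in 'sittest walkest liest risest'.split():
    new = old[:-2]  # drop the trailing 'st'
    # Trim one letter at a time until a known English word remains.
    while new and new not in words:
        new = new[:-1]
    print(old, new)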

Output:

sittest sit
walkest walk
liest lie
risest rise

UPDATE. A slightly less quick-and-dirty version, which only accepts verbs (so rotest → the verb rot, not the noun rote):

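from nltk.corpus import wordnet as wn

# Requires the WordNet data: run nltk.download('wordnet') once beforehand.
for old in 'sittest walkest liest risest rotest'.split():
    new = old[:-2]  # drop the trailing 'st'
    # Trim one letter at a time until WordNet knows the stem as a verb.
    while new and not wn.synsets(new, pos='v'):
        new = new[:-1]
    print(old, new)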

Output:

sittest sit
walkest walk
liest lie
risest rise
rotest rot
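
To apply this to whole phrases, the trimming loop can be combined with the regex from the question. The following is only a sketch: the names strip_est, modernise and DEMO_VERBS are hypothetical, and the tiny verb set is a stand-in so the example runs without any NLTK data. With NLTK installed, the is_verb argument could instead be something like lambda w: bool(wn.synsets(w, pos='v')).

```python
import re

# Hypothetical stand-in for a real verb lookup (e.g. WordNet), so the
# sketch runs without downloading any corpus.
DEMO_VERBS = {"sit", "walk", "lie", "rise", "rot"}

THOU_VERB = re.compile(r"\bthou ([a-z]+est)\b")

def strip_est(word, is_verb):
    """Drop 'st', then trim one letter at a time until is_verb accepts the stem.

    Returns an empty string if no stem is accepted.
    """
    stem = word[:-2]
    while stem and not is_verb(stem):
        stem = stem[:-1]
    return stem

def modernise(text, is_verb=DEMO_VERBS.__contains__):
    """Rewrite 'thou <verb>est' as 'you <verb>'; leave unknown verbs untouched."""
    def repl(match):
        stem = strip_est(match.group(1), is_verb)
        return "you " + stem if stem else match.group(0)
    return THOU_VERB.sub(repl, text)

print(modernise("thou sittest and thou risest"))
```

Passing the lookup in as a function keeps the replacement logic independent of where the verb list comes from, and phrases whose verb cannot be resolved are left as they were rather than mangled.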
Kirill Bulygin
  • Note that it also correctly removes the double consonant from "sittest"! – Leon Feb 07 '17 at 12:21
  • That is _really_ quick and dirty... I like it. – Chuck Feb 07 '17 at 12:24
  • Awesome so far, I was indeed hunting down if there is a method like word.is_verb(). This works the best. Accepting. – nehem Feb 07 '17 at 12:35
  • @Kirill Here is my work in https://github.com/nehemiahjacob/CKJV I am building a modernised bible translation based on KJV. You deserve kudos a lot. – nehem Feb 07 '17 at 12:41
  • @itsneo Thanks. Who knows, maybe I will use your project one day, as I'm doing a somewhat similar thing for word-to-word translation. – Kirill Bulygin Feb 07 '17 at 12:46