
Let's say I have a database that contains 200,000 lines of poetry, and I want to randomly combine those lines in ways that generate grammatically correct and legible 3-line poems.

Is there a way to do that? I'm currently experimenting with language_tool_python but it's only helping with general spelling and a few formatting suggestions. I'm looking for something that will ensure that every 3-line poem assembled reads properly and makes grammatical sense.

For example, here's a three-line array containing a poem generated by my current code base: ['OVER THE SEA, THE SEA HIS ROD OBEYS', 'IS DEAD', 'BUT LET MY LAST DAYS BE MY BEST']. I'd like a solution that can detect that this poem isn't acceptable, because it doesn't make grammatical sense for "IS DEAD" to come after "OVER THE SEA, THE SEA HIS ROD OBEYS".

Another example of a poem I'd like to automatically detect as invalid: ['THERE NEVER LACKS A BONE OF THE BEST', 'BUT CRUEL IS SHE', 'THEN MAY YOUR QUEEN']

I need to solve this problem client-side; I can't use any online API-as-a-service.

Thanks in advance for any suggestions!

RobB

1 Answer


This is a very difficult task, as any system you employ would need some understanding of text cohesion in order to produce coherent output. Things are further complicated because you want to apply such a system to poetry, which is distinguished from traditional prose (almost by definition) by its lack of cohesion. To quote the Wikipedia page on poetry:

Poetry ... is a form of literature that uses aesthetic and often rhythmic qualities of language − such as phonaesthetics, sound symbolism, and metre − to evoke meanings in addition to, or in place of, a prosaic ostensible meaning.

To get you started on what is fundamentally a sentence ordering problem, I'd recommend reading the papers by Chowdhury et al. [1] and Ghosal et al. [2], who have released open source implementations of their ReBART and STaCK models, respectively, which you could attempt to use. You will probably have to fine-tune the models to handle poetry better, but you may be pleasantly surprised.
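As a rough sketch of how a sentence ordering model could act as a filter: score every permutation of the three lines and keep the poem only if the order you generated scores at least as well as every other order. Everything named here (`Scorer`, `best_order`, `is_acceptable`, `margin`) is hypothetical and not part of either paper's code; `score` is a placeholder for whatever locally run model you end up adapting, assumed to return a higher number for a more coherent ordering.

```python
from itertools import permutations
from typing import Callable, Sequence, Tuple

# Assumption: a scorer takes the lines in a proposed order and returns a
# number where higher means "more coherent". This is a stand-in for a
# locally run ordering model, not an API from any real library.
Scorer = Callable[[Sequence[str]], float]


def best_order(lines: Sequence[str], score: Scorer) -> Tuple[str, ...]:
    """Return the permutation of `lines` that the scorer rates highest."""
    return max(permutations(lines), key=score)


def is_acceptable(poem: Sequence[str], score: Scorer, margin: float = 0.0) -> bool:
    """Accept a poem only if its given order beats every other ordering.

    `margin` (hypothetical knob) demands that the given order win by at
    least that much, which makes the filter stricter.
    """
    given = score(tuple(poem))
    # Skip permutations identical to the given order (also covers repeated lines).
    others = [score(p) for p in permutations(poem) if list(p) != list(poem)]
    return all(given >= other + margin for other in others)
```

With three lines there are only six orderings to score per poem, so the cost is dominated by the model itself. Note that this only checks whether the given order is the most plausible one; a set of lines that reads badly in every order would still need a separate threshold on the absolute score.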

References

[1] Chowdhury, S.B.R., Brahman, F. and Chaturvedi, S., 2021. Is Everything in Order? A Simple Way to Order Sentences. arXiv preprint arXiv:2104.07064.

[2] Ghosal, D., Majumder, N., Mihalcea, R. and Poria, S., 2021. STaCK: Sentence Ordering with Temporal Commonsense Knowledge. arXiv preprint arXiv:2109.02247.

Kyle F Hartzenberg

Thank you for suggesting these two open source implementations. I'll dig into them and see if I can adapt them to my purpose. – RobB Jan 23 '23 at 21:22