To make my question clearer, I will start by describing the real-life case I am facing.
I am building a physical panel with many words on it that can be selectively lit, in order to compose sentences. This is my situation:
- I know all the sentences that I want to display
- I want to find [one of] the shortest ORDERED sequences of words that allows me to display all the sentences
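Put differently, a panel is valid when every sentence can be read off it left to right, skipping the words that stay unlit. Here is a minimal sketch of that check; `can_display` is just a hypothetical helper name, and the sentences are illustrative:

```python
def can_display(panel, sentence):
    """Return True if the words of `sentence` appear in `panel` in the same order."""
    remaining = iter(panel.split())
    # `word in remaining` consumes the iterator up to the match,
    # so each word must be found AFTER the previous one.
    return all(word in remaining for word in sentence.split())


sentences = ["A dog is on the table", "A cat is on the table"]
panel = "A dog cat is on the table"
print(all(can_display(panel, s) for s in sentences))  # True
```

The check only validates a candidate panel; it says nothing about whether that panel is the shortest one.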
Example:
SENTENCES:
"A dog is on the table"
"A cat is on the table"
SOLUTIONS:
"A dog cat is on the table"
"A cat dog is on the table"
I tried to approach this problem with "positional rules": for each UNIQUE word in the set of ALL the words used in ALL the sentences, I work out which words should be to its left and which to its right. In the example above, the ruleset for the word "on" would be "left(A, dog, cat, is) + right(the, table)".
This approach worked for trivial cases, but my real-life situation has two additional difficulties that got me stuck, both of which come from the need to repeat words:
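To make the rules concrete, this is roughly how they can be collected. It is only a sketch of the idea, not my actual code, and `positional_rules` is a name made up for illustration:

```python
from collections import defaultdict


def positional_rules(sentences):
    """For each unique word, collect the words seen to its left and to its right."""
    left = defaultdict(set)
    right = defaultdict(set)
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            left[word].update(words[:i])       # everything before this occurrence
            right[word].update(words[i + 1:])  # everything after this occurrence
    return left, right


left, right = positional_rules(["A dog is on the table", "A cat is on the table"])
print(sorted(left["on"]))   # ['A', 'cat', 'dog', 'is']
print(sorted(right["on"]))  # ['table', 'the']
```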
- In-sentence repetitions: "the cat is on the table" has two "the".
- Circular references: In a set of three sentences "A red cat" + "My cat is on the table" + "That table is red", the rules would state that RED should be to the left of CAT, CAT should be to the left of TABLE and TABLE should be to the left of RED.
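The circular case can be reproduced by treating the rules as a "must come before" graph and trying to sort it topologically. The sketch below assumes Python 3.9+ for the standard `graphlib` module; `precedence_graph` and `find_order` are hypothetical names. It only shows where the rule-based approach breaks down and does not solve the problem:

```python
from graphlib import CycleError, TopologicalSorter


def precedence_graph(sentences):
    """Map each word to the set of words that must appear before it."""
    graph = {}
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            graph.setdefault(word, set()).update(words[:i])
    return graph


def find_order(sentences):
    """Return one word order satisfying every rule, or None if the rules are circular."""
    try:
        return list(TopologicalSorter(precedence_graph(sentences)).static_order())
    except CycleError:
        return None


print(find_order(["A red cat", "My cat is on the table", "That table is red"]))  # None
```

When no word has to be repeated, the returned order is itself a valid panel; `None` is the signal that at least one word must appear more than once.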
MY QUESTION THEREFORE IS:
What is the class of algorithms (or even better: what is the specific algorithm) that studies and solves this kind of problem? Could you post a reference or a code example?
EDIT: Level of complexity
From the first round of answers it appears that the actual level of complexity (i.e. how different the sentences are from one another) is an important factor. So, here is some info on that:
- I have about 1500 sentences I want to represent.
- All of the sentences are essentially modifications of a restricted pool of ~10 sentences where only a few words change. Building on the previous example, it is as if all my sentences were about either "somebody's pet's position relative to a piece of furniture" or "a physical description of somebody's furniture".
- The number of unique words used to build all the sentences is <100.
- Sentences are 8 words long at most.
For this project I am using Python, but any reasonably readable language (e.g. NOT obfuscated Perl!) will be fine.
Thank you in advance for your time!