This sounds like a "please do my homework for me" question. It is hard to conceive of a real-world problem this is applicable to, and there is no standard solution. It is OK to ask for help with your homework here, but you should state that this is the case.
"more than 40 MB for each 9 chars words"
Your maths is a bit wonky, but indeed the storage does not scale well. OTOH, leaving aside the amount of storage, the solution does scale well in terms of processing workload.
You could simply brute-force a dynamic query:
function mkqry($word)
{
    // Build one LIKE clause per character of the search word.
    $qry = "SELECT * FROM yourtable WHERE 1 ";
    $last = strlen($word);
    for ($x = 0; $x < $last; $x++) {
        $qry .= " AND word LIKE '%" . substr($word, $x, 1) . "%'";
    }
    return $qry;
}
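For example (using "cat" as an arbitrary search word, output wrapped for readability), the function generates:

echo mkqry("cat");
// SELECT * FROM yourtable WHERE 1  AND word LIKE '%c%'
//   AND word LIKE '%a%' AND word LIKE '%t%'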
However, this will always result in a full table scan (slow) and won't correctly handle cases where a letter occurs twice in a word.
The solution is to use an indexing function which is independent of the order in which the characters appear - a non-cryptographic hash. An obvious candidate would be to XOR the characters together, although this only yields a one-character identifier, which is not very selective. So I would suggest simply adding the character codes:
function pos_ind_hash($word)
{
    // Position-independent hash: the sum of the character codes, so the
    // order of the letters does not affect the result.
    $sum = 0;
    $last = strlen($word);
    for ($x = 0; $x < $last; $x++) {
        $sum += ord(substr($word, $x, 1));
    }
    return $sum;
}
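A quick sanity check of the order-independence (using two arbitrary anagrams as input):

echo pos_ind_hash("listen"); // 655
echo pos_ind_hash("silent"); // 655 - same letters, same hash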
function mkqry($word)
{
    // As before, one LIKE clause per character of the search word...
    $qry = "SELECT * FROM yourtable WHERE 1 ";
    $last = strlen($word);
    for ($x = 0; $x < $last; $x++) {
        $qry .= " AND word LIKE '%" . substr($word, $x, 1) . "%'";
    }
    // ...plus an equality test on the indexed hash column.
    $qry .= " AND yourtable.hash=" . pos_ind_hash($word);
    return $qry;
}
Note that the hash mechanism here does not uniquely identify a single word, but is specific enough to reduce the volume to the point where an index (on the hash) would be effective.
Multiplying rather than adding would create fewer collisions but at a greater risk of overflowing (which would create ambiguity between implementations).
But both the hash and the single-character LIKEs only reduce the number of potential matches. To get the query to behave definitively, you need to go further. You could add an attribute to the table (and to the index with the hash) containing the string length - this would be more selective (i.e. improve the effectiveness of the index) but still not definitive.
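As a sketch (assuming a hypothetical word_len column stored in the table and included in the same index as the hash), that is one more clause on the generated query:

// word_len is a hypothetical extra column holding the stored word's length.
$qry .= " AND yourtable.word_len=" . strlen($word);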
For a definitive method, you would need to specify in your query that the data does NOT contain any characters which are NOT in the word you are looking for.
The wrong way to do that would be to add a loop generating "AND word NOT LIKE ..." clauses, since you would need one clause for every character that does not appear in the search word.
A valid way of doing it is to add a test to the query which strips out of the stored value every letter that appears in the word you are searching for, and checks that the result is a zero-length string.
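One way to express that test (a sketch only, assuming MySQL's REPLACE() function, plain single-byte letters, and ignoring how many times each letter occurs) is to nest one REPLACE() per distinct letter of the search word and require an empty result:

function mk_definitive_qry($word)
{
    // Start from the hash/LIKE query above, then strip every letter of the
    // search word out of the stored value and insist that nothing is left over.
    $qry = mkqry($word);
    $expr = "word";
    foreach (array_unique(str_split($word)) as $ch) {
        $expr = "REPLACE(" . $expr . ", '" . $ch . "', '')";
    }
    $qry .= " AND " . $expr . " = ''";
    return $qry;
}

// mk_definitive_qry("cat") appends:
//   AND REPLACE(REPLACE(REPLACE(word, 'c', ''), 'a', ''), 't', '') = ''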