I have a paragraph of text that has been scrambled by shuffling its columns, each of which is two characters wide. My assignment is to unscramble it:
|de| | f|Cl|nf|ed|au| i|ti| |ma|ha|or|nn|ou| S|on|nd|on|
|ry| |is|th|is| b|eo|as| | |f |wh| o|ic| t|, | |he|h |
|ab| |la|pr|od|ge|ob| m|an| |s |is|el|ti|ng|il|d |ua|c |
|he| |ea|of|ho| m| t|et|ha| | t|od|ds|e |ki| c|t |ng|br|
|wo|m,|to|yo|hi|ve|u | t|ob| |pr|d |s |us| s|ul|le|ol|e |
| t|ca| t|wi| M|d |th|"A|ma|l |he| p|at|ap|it|he|ti|le|er|
|ry|d |un|Th|" |io|eo|n,|is| |bl|f |pu|Co|ic| o|he|at|mm|
|hi| | |in| | | t| | | | |ye| |ar| |s | | |. |
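For reference, here is a minimal Python sketch of how rows like the ones above could be loaded and re-read in a given column order. The names `parse_row` and `join_columns` are just illustrative, and padding every cell to two characters is an assumption on my part (some cells above show a single space, which I take to be two spaces that got collapsed).

```python
def parse_row(line, width=2):
    """Split '|ab|cd|' into ['ab', 'cd'], keeping spaces inside cells.

    Assumption: every cell is `width` characters wide, so shorter cells
    are padded on the right with spaces.
    """
    cells = line.strip().strip("|").split("|")
    return [cell.ljust(width) for cell in cells]


def join_columns(rows, order):
    """Read every row with its columns taken in the given order."""
    return "\n".join("".join(row[i] for i in order) for row in rows)
```

With this, a candidate ordering is just a list of column indices, and `join_columns(rows, order)` gives the text that ordering would produce.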
My current approach to finding the right column order is to recursively move each column to its best position according to a word-occurrence count criterion.
The core of the algorithm I have in mind, in pseudocode:
function unscramble(scrambledMatrix, indexOfColumnIveJustMoved)
    for each column on scrambledMatrix as currentIndex => currentColumn
        if (currentIndex != indexOfColumnIveJustMoved)
            // start from the current arrangement, so a column only moves on a strict improvement
            maxRepeatedWords = countRepWords(scrambledMatrix);
            maxIndex = currentIndex;
            for (i = 0; i < numberOfColumnsOfScrambledMatrix; i++)
                repWordsCount = countRepWords(moveFromToOn(currentIndex, i, scrambledMatrix))
                if (maxRepeatedWords < repWordsCount)
                    maxRepeatedWords = repWordsCount;
                    maxIndex = i;
                endif
            endfor
            if (maxIndex != currentIndex)
                return unscramble(moveFromToOn(currentIndex, maxIndex, scrambledMatrix), maxIndex); // recursive call
            endif
        endif
    endfor
    return(scrambledMatrix); // returns the unscrambled matrix
endfunction
The algorithm stops when a full pass over the columns moves none of them. I'm guessing it should work for any language (though I'm only interested in a solution for English), as long as the writing is based on words formed by letters and the sample is big enough.
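To make the idea concrete, here is a rough, non-recursive Python sketch of the same greedy loop. It is only an illustration under assumptions: `rows` is a list of rows, each a list of two-character cells; the scoring swaps my repeated-word count for a lookup in a tiny, made-up `COMMON_WORDS` set (a real word list would be needed); and, being a greedy search, it can get stuck in a local optimum rather than finding the true order.

```python
# Hypothetical, deliberately tiny word list; a real one would be much larger.
COMMON_WORDS = {"the", "of", "and", "to", "is", "in", "that", "it"}


def score(rows, order):
    """Count tokens found in COMMON_WORDS when columns are read in `order`."""
    total = 0
    for row in rows:
        text = "".join(row[i] for i in order).lower()
        total += sum(1 for w in text.split() if w.strip('.,"') in COMMON_WORDS)
    return total


def unscramble(rows):
    """Greedy hill climbing: keep any single-column move that raises the score."""
    order = list(range(len(rows[0])))
    best = score(rows, order)
    improved = True
    while improved:  # stops after a full pass with no accepted move
        improved = False
        for src in range(len(order)):
            for dst in range(len(order)):
                if src == dst:
                    continue
                candidate = order[:]
                candidate.insert(dst, candidate.pop(src))
                s = score(rows, candidate)
                if s > best:  # strict improvement guarantees termination
                    order, best, improved = candidate, s, True
    return order
```

The loop replaces the recursive call with a `while` that repeats full passes, and because a move is only accepted when the score strictly increases, it cannot cycle forever.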
Any suggestions on other approaches or improvements? I would like to know the best solution for this problem. Would a dictionary-based approach that looks for occurrences of common words work better? And would rewriting the algorithm to avoid recursion make it much faster?