An exact maximum likelihood estimate for the independent letter model
If we assume (unrealistically) that each letter in a given language appears with some known probability independently of nearby letters, then the exact maximum likelihood estimate can be found fairly efficiently. This is about the best you could hope for with this model.
Let's suppose there are n letters in the message and k letters in the alphabet.
The basic idea is that we somehow come up with a numeric score for each of the k^2 possible letter pairs, and then look for a maximum-weight bipartite perfect matching on the two alphabets -- that is, we choose pairs of letters in such a way that every letter in one alphabet is paired with exactly one letter in the other, and the sum of the scores of the chosen pairs is maximised. There are 2 main issues: (1) How to define the score of a given pair; (2) How to solve the general problem of finding the highest-sum subset of pairs once we have decided on the individual scores.
Scoring pairs
The score of pairing the encoded letter x with the source letter y should be log(p(y)^freq(x)), where p(y) is the known probability of letter y in the language, and freq(x) is the number of times letter x appeared in the encoded input. Why? Because under this simple model of the source language that ignores the effect of nearby characters, the probability of generating any given n-character source-language string is equal to the product of the probabilities of the letters that appear in it, which you can also compute as the product of p(y)^freq(y) over all the letters y in the source alphabet. So to compute the probability of generating that string given that each x is "really" a y (and nothing else is a y), we change freq(y) to freq(x). If we then take logs, what we have is the sum of log(p(y)^freq(x)) over all letters y in the source alphabet -- and this sum is exactly what the maximum bipartite perfect matching algorithm will maximise.
Finding the optimal perfect matching
There are algorithms that solve the closely related maximum-weight bipartite matching problem in O(V^2E) time, which amounts to O(k^4) time here. This version of the problem looks to maximise the total sum of all chosen pairs while obeying the constraint that no item is matched with two or more other items in the other set, but without the requirement that every item is matched with some other item. Fortunately, we can turn this algorithm into an algorithm that solves the "perfect" variation we want to solve very simply, by adding a large number M to each score: Intuitively, this causes the algorithm to try to add as many edges as possible (since omitting even a single edge has a large cost, making such a solution much worse than even a "stupid" solution that contains a full set of k edges), while still forcing the algorithm to choose the best among all the k-edge solutions (since all of these solutions include the same kM contribution from the added weight).