After reading a dit or dah, you have two options: terminate the letter or continue the current letter. This will lead to a lot of bifurcations in your code and a recursive approach might be a good way to implement this.
Keep a buffer of the possible string so far and print (or store) the result when you hit the string end and it coincides with the end of a letter.
Here's an implementation in C:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
static const char *letter = "**ETIANMSURWDKGOHVF*L*PJBXCYZQ**";
void morse_r(char *buf, int len, const char *str, int code)
{
if (*str == '\0') {
// end of string; print if staring new letter
if (code == 1) printf("%.*s\n", len, buf);
} else if (code < 16) {
if (*str == '.') code = 2 * code;
if (*str == '-') code = 2 * code + 1;
// continue letter
morse_r(buf, len, str + 1, code);
// start new letter
buf[len++] = letter[code];
morse_r(buf, len, str + 1, 1);
}
}
void morse(const char *str)
{
char buf[strlen(str)];
morse_r(buf, 0, str, 1);
}
int main()
{
morse("......-...-..---");
return 0;
}
This implementation is very simple. It uses a simplistic lookup mechanism and it doesn't check whether a letter actually exists. (The asterisks in the letter
array are valid Morse codes, but they are not Latin letters.)
This approach is also rather brute force: It recalculates the tails over and over. Memoization of the tails will save a lot of extra work for the processor for loner strings.
And, as you are aware, there will be a ot of nonsense results. The above code yields 20569 strings (some of them with asterisks, i.e. invalid). You can prevent many recursions when you do a plausibility or dictionary check on your way. For example, many dots in a row will yield a lot of nonsense words with repeated Es.
Edit: As Jim Mischel points out, an explanation of how the Morse code lookup works is in order. Yves Daoust mentions a trie in his answer. A trie is a tree structure for storing words; each node can have as many children as there are letters in the alphabet. Morse code has only two letters: dit (.
) and dah (-
). A Morse code trie is therefore a binary tree.
Tries are usually sparse; words are rather long and many letter combinations don't exist. Morse tries are dense: Morse letter encodings are short and nearly every cobmination is used. The tree can be stored as linear, "flat" array, similar to a heap. A node is represented by its index i
in the array. The left child is then 2*i
and the right child 2*i + 1
.
A better and more detailed explanation can be found in an answer I posted to another Morse-related question, from where I've taken the lookup code that I used in the example above.