See the EDIT below.
I am coding a word ladder algorithm. The user supplies a start word, an end word, and a hash set of all valid words (the lexicon). The algorithm returns all the shortest paths (several, if they exist) from the start word to the end word. For example, with start_word = 'cold' and end_word = 'warm':
output = [[cold -> cord -> card -> ward -> warm], [/* another path, if one exists */]]
Each word differs from the previous one by exactly one character. I am using BFS to solve this problem. My strategy was to return all the paths and then select the shortest ones from the returned list. This is my code to return all the paths:
    auto word_ladder::generate(std::string const& from,
                               std::string const& to,
                               absl::flat_hash_set<std::string> const& lexicon)
        -> std::vector<std::vector<std::string>> {
        absl::flat_hash_set<std::string> visited = {};
        std::queue<std::vector<std::string>> search_queue = {};
        std::vector<std::vector<std::string>> paths = {};
        search_queue.push(std::vector<std::string>{from});
        while (!search_queue.empty()) {
            auto word = search_queue.front();
            search_queue.pop();
            auto neighbours = generate_neighbours(word.back(), lexicon);
            for (auto &i: neighbours) {
                auto new_word = word;
                new_word.push_back(i);
                if (i == to) {
                    paths.push_back(new_word);
                    continue;
                }
                if (visited.find(i) != visited.end()) {
                    continue;
                }
                search_queue.push(new_word);
                visited.insert(i);
            }
        }
        return paths;
    }
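The function above calls generate_neighbours, which is not shown. For readers who want to reproduce the behaviour, here is a minimal sketch of what such a helper might look like. It is an assumption, not the actual implementation: it uses std::unordered_set instead of absl::flat_hash_set so that it compiles with the standard library alone, and it assumes the lexicon contains only lowercase a-z words.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the generate_neighbours helper used in the question.
// Returns every lexicon word that differs from `word` in exactly one position.
// Assumption: words consist of lowercase letters 'a'..'z' only.
auto generate_neighbours(std::string const& word,
                         std::unordered_set<std::string> const& lexicon)
    -> std::vector<std::string> {
    std::vector<std::string> neighbours;
    std::string candidate = word;
    for (std::size_t pos = 0; pos < candidate.size(); ++pos) {
        char const original = candidate[pos];
        for (char c = 'a'; c <= 'z'; ++c) {
            if (c == original) {
                continue;  // same letter would reproduce the word itself
            }
            candidate[pos] = c;
            if (lexicon.count(candidate) != 0) {
                neighbours.push_back(candidate);
            }
        }
        candidate[pos] = original;  // restore before moving to the next position
    }
    return neighbours;
}
```

With a lexicon of word length n this tries 25 * n candidates per word, which is usually cheaper than comparing against every word in the lexicon.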
generate does return multiple paths, but not all of them. One of the paths it returns is:
1) awake, aware, sware, share, shire, shirr, shier, sheer, sheep, sleep
However, it doesn't return this path:
2) awake, aware, sware, share, sharn, shawn, shewn, sheen, sheep, sleep
I am pretty sure the reason is that, the way I have coded it, a word such as "share" is marked as visited the first time it is encountered (in path 1), so the search never goes through the second path (path 2).
To solve this, I changed my for loop a bit:
    for (auto &i: neighbours) {
        auto new_word = word;
        new_word.push_back(i);
        if (i == to) {
            paths.push_back(new_word);
            continue;
        }
        for (auto &j: word) {
            if (j == i) {
                continue;
            }
        }
        search_queue.push(new_word);
    }
The idea was to check whether the word has already been visited in the path being tracked in the queue, rather than globally. However, this version gets stuck in a loop somewhere and doesn't terminate (I assume because of the large dataset?).
Is there something wrong with the second version of my code, or does it just take too long because of the large dataset? How can I achieve this in a better way?
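One thing worth noting about the per-path check: in C++, a `continue` inside the inner `for (auto &j: word)` loop only skips to the next iteration of that inner loop, so it cannot skip the `search_queue.push(new_word)` in the outer loop. A minimal sketch of a per-path check written as a separate helper is shown below. The name path_crossed is borrowed from the call in the EDIT section; its body here is an assumption.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns true when `candidate` already appears somewhere in `path`.
// Hypothetical body for the path_crossed call that appears in the question.
auto path_crossed(std::vector<std::string> const& path,
                  std::string const& candidate) -> bool {
    return std::find(path.begin(), path.end(), candidate) != path.end();
}
```

In the outer loop this would be used as `if (path_crossed(word, i)) { continue; }`, where the `continue` now applies to the loop over neighbours. Even with a working per-path check, the search still enumerates every simple path, which can be very slow on a large lexicon.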
EDIT!!!
Instead of finding all the paths, I now find the length of the shortest path and then perform BFS up to that depth to get all the paths at that depth.
    auto word_ladder::generate(std::string const& from,
                               std::string const& to,
                               absl::flat_hash_set<std::string> const& lexicon)
        -> std::vector<std::vector<std::string>> {
        absl::flat_hash_set<std::string> visited = {};
        visited.insert(from);
        std::queue<std::vector<std::string>> search_queue = {};
        std::vector<std::vector<std::string>> paths = {};
        search_queue.push(std::vector<std::string>{from});
        auto length = find_shortest_path_length(from, to, lexicon);
        std::cout << "length is: " << length << "\n";
        // auto level = 0;
        std::unordered_map<std::string, int> level_track = {};
        level_track[from] = 0;
        while (!search_queue.empty() ) {
            auto word = search_queue.front();
            search_queue.pop();
            // **
            if (level_track[word.back()] <= length) {
                auto neighbours = generate_neighbours(word.back(), lexicon);
                const auto &parent = word.back();
                for (auto &i: neighbours) {
                    auto new_word = word;
                    new_word.push_back(i);
                    if (i == to) {
                        paths.push_back(new_word);
                        std::cout << "The level at the path was " << level_track[parent] << "\n";
                        continue;
                    }
                    if (path_crossed(word, i)) {
                        continue;
                    }
                    search_queue.push(new_word);
                    level_track[i] = level_track[parent] + 1;
                }
            }
        }
        return paths;
    }
The solution now terminates, so the earlier problem was definitely the large number of searches. However, my algorithm still does not give me the correct answer, because the way I keep track of the depth of my nodes (words) is somehow not correct.
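On the depth tracking: the depth of a queued path is simply its size minus one, whereas level_track[i] is overwritten whenever word i is reached again through a longer path, which may be one source of the wrong levels. Below is a sketch of one common alternative, a level-by-level BFS that marks words as visited only after a whole level has been processed, so that several equally short paths can pass through the same word. It is not the code from the question: the name shortest_ladders, the std::unordered_set lexicon, and the lowercase a-z assumption in the neighbour helper are all illustrative choices.

```cpp
#include <queue>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

using Path = std::vector<std::string>;
using Lexicon = std::unordered_set<std::string>;  // stand-in for absl::flat_hash_set

// Hypothetical neighbour helper: lexicon words one letter away (lowercase a-z assumed).
auto generate_neighbours(std::string const& word, Lexicon const& lexicon)
    -> std::vector<std::string> {
    std::vector<std::string> result;
    std::string candidate = word;
    for (std::size_t pos = 0; pos < candidate.size(); ++pos) {
        char const original = candidate[pos];
        for (char c = 'a'; c <= 'z'; ++c) {
            candidate[pos] = c;
            if (c != original && lexicon.count(candidate) != 0) {
                result.push_back(candidate);
            }
        }
        candidate[pos] = original;
    }
    return result;
}

// Sketch: returns every shortest ladder from `from` to `to` (empty if none exists).
auto shortest_ladders(std::string const& from, std::string const& to,
                      Lexicon const& lexicon) -> std::vector<Path> {
    std::vector<Path> found;
    std::unordered_set<std::string> visited{from};  // words reached on earlier levels
    std::queue<Path> frontier;
    frontier.push(Path{from});
    // Stop after the first level that reaches `to`: deeper levels are longer paths.
    while (!frontier.empty() && found.empty()) {
        std::unordered_set<std::string> seen_this_level;
        for (auto remaining = frontier.size(); remaining > 0; --remaining) {
            Path path = std::move(frontier.front());
            frontier.pop();
            for (auto const& next : generate_neighbours(path.back(), lexicon)) {
                if (visited.count(next) != 0) {
                    continue;  // already reachable by a strictly shorter path
                }
                Path extended = path;
                extended.push_back(next);
                if (next == to) {
                    found.push_back(std::move(extended));
                } else {
                    frontier.push(std::move(extended));
                }
                seen_this_level.insert(next);
            }
        }
        // Deferred marking lets equal-length paths share a word on this level.
        visited.insert(seen_this_level.begin(), seen_this_level.end());
    }
    return found;
}
```

Because the loop ends at the first level where `to` is reached, no separate find_shortest_path_length pass is needed in this variant. Storing whole paths in the queue keeps the sketch close to the original code, though it can use a lot of memory on a large lexicon.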