I'm writing an algorithm that combines the traveling salesman problem (TSP) with maze solving. There are points inside the maze, and we need to find the shortest path that visits all of them and then exits the maze.
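To make the problem concrete, here is a minimal sketch of what we are trying to optimize. It is not our ACO code. It assumes a grid maze with a made-up encoding (`#` wall, `.` open, `S` start, `E` exit, `P` point to visit), and the names `MAZE`, `bfs_distances`, `find_cells` and `best_tour` are only for illustration. It computes maze distances between the interesting cells with BFS and then brute-forces the visiting order, which is only feasible for a handful of points.

```python
from collections import deque
from itertools import permutations

# Hypothetical maze encoding (an assumption for this sketch):
# '#' wall, '.' open cell, 'S' start, 'E' exit, 'P' point that must be visited.
MAZE = [
    "#######",
    "#S.P..#",
    "#.###.#",
    "#P...E#",
    "#######",
]


def find_cells(maze, symbol):
    """Return the (row, col) positions of every cell holding `symbol`."""
    return [(r, c) for r, row in enumerate(maze)
            for c, ch in enumerate(row) if ch == symbol]


def bfs_distances(maze, source):
    """Shortest step counts from `source` to every reachable non-wall cell."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < len(maze) and 0 <= nc < len(maze[nr])
            if inside and maze[nr][nc] != "#" and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def best_tour(maze):
    """Brute-force the visiting order: start -> all points -> exit."""
    start = find_cells(maze, "S")[0]
    exit_cell = find_cells(maze, "E")[0]
    points = find_cells(maze, "P")

    # One BFS per departure cell gives the pairwise maze distances.
    dist = {cell: bfs_distances(maze, cell) for cell in [start, *points]}

    best_cost, best_order = float("inf"), []
    for order in permutations(points):
        stops = [start, *order, exit_cell]
        # Unreachable pairs count as infinitely far apart.
        cost = sum(dist[a].get(b, float("inf"))
                   for a, b in zip(stops, stops[1:]))
        if cost < best_cost:
            best_cost, best_order = cost, list(order)
    return best_cost, best_order


if __name__ == "__main__":
    cost, order = best_tour(MAZE)
    print("tour length:", cost)
    print("visit order:", order)
```

The permutation loop is the TSP part. With more than roughly ten points it becomes too slow, and that is the step we would like a heuristic such as ACO to take over.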
We started with an ant colony optimization (ACO) algorithm to find the exit of the maze, which works fine. But how would one integrate the TSP into it?
Our first guess would be reinforcement learning. Any ideas?