I am working on a problem that is similar to the Traveling Salesman Problem (TSP), but with added elements of optimization and constraint satisfaction. The algorithm takes the following inputs:
- A set of points (2,000 to 20,000 instances), where each point object has the following:
  - Coordinates (e.g. latitude and longitude)
  - Score/reward of that point (strictly positive)
  - Time cost: the time spent if you go to this point, e.g. "visit time" (strictly positive)
- Travel time between any two given points (an n×n matrix, where n is the number of points). This is not derived directly from the distance between the two points; it takes into account mean traffic and the road network through which you can go from one point to another. (Non-negative)
- A set of restrictions, which are of these types:
  - Must visit point "X"
  - You can only travel up to a time "T"
Additionally, I have restrictions of the form: if point "X" is visited, the visit has to happen between hour "A" and hour "B". Let's not consider those for the moment.
I have to implement an algorithm that maximizes the sum of the scores (optimization) while strictly respecting the restrictions defined above (CSP), and that outputs a route (TSP), i.e. the set of points I have to visit in a given order. Due to the characteristics of the inputs and restrictions, the output will be a route that contains at most 15 points, usually around 10.
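To make the objective and the two restrictions concrete, here is a minimal sketch of how a candidate route could be scored and checked. The names `Point`, `route_time`, `route_score` and `is_feasible` are hypothetical. It also assumes that "T" bounds the sum of visit times and travel times, and that the route is an open path with no fixed start or end point; neither assumption is settled above.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class Point:
    lat: float
    lon: float
    score: float       # strictly positive reward
    visit_time: float  # strictly positive time spent at the point


def route_time(route: Sequence[int], points: Sequence[Point],
               travel: Sequence[Sequence[float]]) -> float:
    """Visit times plus travel legs for the points visited in `route` order.

    Assumption: the limit T applies to this combined total.
    """
    visiting = sum(points[i].visit_time for i in route)
    driving = sum(travel[a][b] for a, b in zip(route, route[1:]))
    return visiting + driving


def route_score(route: Sequence[int], points: Sequence[Point]) -> float:
    """Objective to maximize: sum of the scores of the visited points."""
    return sum(points[i].score for i in route)


def is_feasible(route: Sequence[int], points: Sequence[Point],
                travel: Sequence[Sequence[float]],
                must_visit: Iterable[int], t_max: float) -> bool:
    """Check both restriction types: required points and the time limit."""
    no_repeats = len(set(route)) == len(route)
    covers_required = set(must_visit) <= set(route)
    within_time = route_time(route, points, travel) <= t_max
    return no_repeats and covers_required and within_time
```

Any of the approaches below can then be compared by calling `route_score` on the routes that pass `is_feasible`.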
These are the approaches I have considered and explored:
Clustering - Compute a high-score "epicenter" by taking the mean of the scores of the surrounding points. Starting from that "epicenter", apply a traveling salesman algorithm to the "K" surrounding points. Those "K" points are obtained with a k-NN algorithm, increasing "K" by one (or more) until the max-travel-time limit is reached.
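A rough sketch of this clustering idea, under several assumptions of mine: "surrounding" means within a travel-time `radius` of a point, the neighbours are ranked by travel time from the epicenter, and a simple nearest-neighbour ordering stands in for a real traveling salesman solver. All function names are hypothetical.

```python
def total_time(route, visit, travel):
    """Visit times plus travel legs along the route."""
    legs = sum(travel[a][b] for a, b in zip(route, route[1:]))
    return sum(visit[i] for i in route) + legs


def epicenter(scores, travel, radius):
    """Index whose neighbourhood (travel time <= radius) has the best mean score."""
    n = len(scores)

    def mean_nearby(i):
        nearby = [scores[j] for j in range(n) if j == i or travel[i][j] <= radius]
        return sum(nearby) / len(nearby)

    return max(range(n), key=mean_nearby)


def nearest_neighbour_route(start, candidates, travel):
    """Order the candidates by repeatedly moving to the closest unvisited one."""
    route = [start]
    remaining = set(candidates) - {start}
    while remaining:
        nxt = min(remaining, key=lambda j: (travel[route[-1]][j], j))
        route.append(nxt)
        remaining.remove(nxt)
    return route


def grow_cluster_route(scores, visit, travel, radius, t_max):
    """Increase K one neighbour at a time until the time limit would be exceeded."""
    centre = epicenter(scores, travel, radius)
    if visit[centre] > t_max:
        return []  # even the epicenter alone does not fit
    by_closeness = sorted((j for j in range(len(scores)) if j != centre),
                          key=lambda j: (travel[centre][j], j))
    best = [centre]
    for k in range(1, len(by_closeness) + 1):
        route = nearest_neighbour_route(centre, by_closeness[:k], travel)
        if total_time(route, visit, travel) > t_max:
            break
        best = route
    return best
```

Note that `epicenter` scans every pair of points, which is quadratic in the number of points, and that this sketch ignores the "must visit" restriction entirely.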
Knapsack Problem - As I have a max-travel-time limit and a set of points that I can include in the solution, the problem is very similar to the knapsack problem. The only problem here is that I don't know the cost (in travel time) of a solution until I compute the whole route (i.e. solve the TSP), and computing the route at each iteration of the knapsack algorithm may be too expensive computationally. I need to know this cost at each iteration because it is what counts against the knapsack capacity.
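One possible way around recomputing the whole route at every step is to track only the incremental cost of inserting one more point into the current route. The sketch below is a greedy insertion heuristic, not an exact knapsack algorithm, and the names `insertion_cost` and `greedy_insertion` are hypothetical. It assumes the time limit covers visit plus travel time and that the route is an open path.

```python
def insertion_cost(route, pos, p, visit, travel):
    """Extra time incurred by inserting point p at index pos of route."""
    extra = visit[p]
    if pos > 0:
        extra += travel[route[pos - 1]][p]
    if pos < len(route):
        extra += travel[p][route[pos]]
    if 0 < pos < len(route):
        extra -= travel[route[pos - 1]][route[pos]]  # leg replaced by the detour
    return extra


def greedy_insertion(scores, visit, travel, t_max, must_visit=()):
    """Build a route by repeatedly making the best score-per-extra-time insertion."""
    route, used = [], 0.0

    # Place the required points first, each at its cheapest position.
    for p in must_visit:
        pos = min(range(len(route) + 1),
                  key=lambda q: insertion_cost(route, q, p, visit, travel))
        used += insertion_cost(route, pos, p, visit, travel)
        route.insert(pos, p)
    if used > t_max:
        raise ValueError("required points do not fit in t_max with this ordering")

    remaining = set(range(len(scores))) - set(route)
    while remaining:
        best = None  # (ratio, point, position, cost)
        for p in sorted(remaining):
            for pos in range(len(route) + 1):
                cost = insertion_cost(route, pos, p, visit, travel)
                if used + cost > t_max:
                    continue
                # Guard against non-positive cost if the matrix breaks the
                # triangle inequality.
                ratio = scores[p] / max(cost, 1e-9)
                if best is None or ratio > best[0]:
                    best = (ratio, p, pos, cost)
        if best is None:
            break  # nothing else fits in the remaining time
        _, p, pos, cost = best
        route.insert(pos, p)
        used += cost
        remaining.remove(p)
    return route, used
```

With routes of at most about 15 points, each round evaluates roughly n × 16 insertions, so the running cost of the route is always known without solving a TSP per iteration. The result is only a heuristic solution and may be far from optimal.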
Linear programming - Somehow model the problem as a linear program and get a solution within my lifetime. That is probably impossible even with the best computer.
I have also thought of putting the input into a weighted-graph-like structure, where each node is a point and the edges carry the travel-time information, but that wasn't useful for me.
Do you have any other approaches that I've missed?
What do you think of the approaches I've proposed?
Any help would be appreciated, thanks in advance!