For an online algorithms course I am attempting to write a program which calculates the travelling salesman distance of cities using an approxomation algorithm:
- Start the tour at the first city.
- Repeatedly visit the closest city that the tour hasn't visited yet. In case of a tie, go to the closest city with the lowest index. For example, if both the third and fifth cities have the same distance from the first city (and are closer than any other city), then the tour should begin by going from the first city to the third city.
- Once every city has been visited exactly once, return to the first city to complete the tour.
I am trying to write a solution in Haskell, and I have it working on small data sets, but it runs out of memory on a large input (the course has an input of ~33000 cities)
-- Fold data: cities map, distances map, visited map, list of visited cities and each distance,
-- and current city
data TS = TS (M.Map Int City) (M.Map (Int,Int) Double) (M.Map Int Bool) ([(Int,Double)]) (Int)
run :: String -> String
run input = let cm = parseInput input -- cityMap contains cities (index,xPos,yPos)
n = length $ M.keys cm
dm = buildDistMap cm -- distanceMap :: M.Map (Int,Int) Double
-- which is the distance between cities a and b
ts = TS cm dm (M.fromList [(1,True)]) [(1,0.0)] 1
(TS _ _ _ beforeLast _) = foldl' (\ts _ -> exec ts n) ts [2..n]
completed = end beforeLast dm
in show $ floor $ sum $ map (\(_,d) -> d) $ completed
exec :: TS -> Int -> TS
exec (TS cm dm visited ordered curr) n =
let candidateIndexes = [(i)|i<-[1..n],M.member i visited == False]
candidates = map (\i -> let (Just x) = M.lookup (curr,i) dm in (x,i)) candidateIndexes
(dist,best) = head $ sortBy bestCity candidates
visited' = M.insert best True visited
ordered' = (best,dist) : ordered
in TS cm dm visited' ordered' best
end :: [(Int,Double)] -> M.Map (Int,Int) Double -> [(Int,Double)]
end ordering dm = let (latest,_) = head ordering
(Just dist) = M.lookup (latest,1) dm
in (1,dist) : ordering
bestCity :: (Double,Int) -> (Double,Int) -> Ordering
bestCity (d1,i1) (d2,i2) =
if compare d1 d2 == EQ
then compare i1 i2
else compare d1 d2
At first I wrote the function exec
as a recursive function instead of calling it via the foldl'
. I thought changing it to use foldl'
would solve my issue as foldl'
is strict. However it appears to have made no difference in memory usage. I have tried compiling my program using no optimisations and -O2
optimisations.
I know that somehow it must be keeping multiple loops in memory as I can sort 34000 numbers without issue using
> sort $ [34000,33999..1]
What exactly am I doing wrong here?
Just in case it is any use here is the parseInput
and buildDistMap
functions, but they are not the source of my issue
data City = City Int Double Double deriving (Show, Eq)
-- Init
parseInput :: String -> M.Map Int City
parseInput input =
M.fromList
$ zip [1..]
$ map ((\(i:x:y:_) -> City (read i) (read x) (read y)) . words)
$ tail
$ lines input
buildDistMap :: M.Map Int City -> M.Map (Int,Int) Double
buildDistMap cm =
let n = length $ M.keys cm
bm = M.fromList $ zip [(i,i)|i<-[1..n]] (repeat 0) :: M.Map (Int,Int) Double
perms = [(x,y)|x<-[1..n],y<-[1..n],x/=y]
in foldl' (\dm (x,y) -> M.insert (x,y) (getDist cm dm (x,y)) dm) bm perms
getDist :: M.Map Int City -> M.Map (Int,Int) Double -> (Int,Int) -> Double
getDist cm dm (x,y) =
case M.lookup (y,x) dm
of (Just v) -> v
Nothing -> let (Just (City _ x1 y1)) = M.lookup x cm
(Just (City _ x2 y2)) = M.lookup y cm
in eDist (x1,y1) (x2,y2)
eDist :: (Double,Double) -> (Double,Double) -> Double
eDist (x1,y1) (x2,y2) = sqrt $ p2 (x2 - x1) + p2 (y2 - y1)
where p2 x = x ^ 2
And some test input
tc1 = "6\n\
\1 2 1\n\
\2 4 0\n\
\3 2 0\n\
\4 0 0\n\
\5 4 3\n\
\6 0 3"