Solve TSP using Pilot Method

This post shows a simple implementation using R to solve a given TSP (Traveling Salesperson Problem) instance using the Pilot Method. The whole R script can be accessed here on my gist.

How does the Pilot Method work?

The Pilot Method is a construction heuristic to build (i.e. initial) solutions. The method tries to avoid the short-sightedness of greedy methods by estimating the total costs for each individual expansion step after developing an overall solution. In the case of a TSP problem, in each iteration an pilot construction heuristic would

  • consider every not yet visited point as a potential next stop and
  • for each potential next stop get an estimate for the total cost of based on a completed TSP tour gathered applying an appropriate heuristic
  • among all potential next stops choose the best next stop with the cheapest total cost

We’ll use the simple nearest neighbor construction heuristic here to complete a given tour. This is a very simple heuristic, which in every stop drives to the stop which can be reached with minimal distance, measured from the current location. This is a greedy heuristic, which does usually not lead to optimal tours but can be used to generate viable starting solutions.

Sample Implementation of the Pilot Method

First, let’s define a method, which takes a square distance matrix, providing the distances from location li to location lj in the i-th row and j-th column.

dima <- matrix(c(0,5,3,19,7,
                 13,0,1,18,6,
                 12,4,0,14,6,
                 11,9,8,0,10,
                 23,11,7,21,0), 
byrow = T, nrow = 5)

Now let’s generate the outmost script, which generates the tour. We keep track of the stops in the tour-vector which contains the integer numbers of the stops in the sequence they are visited. We also need a function to get the next best pilot stop:

get_nn_based_pilot_tsp <- function(dima) {
  # start at 1
  tour <- c(1)
  while (length(tour) < nrow(dima)) {
    next_stop <- get_best_pilot_stop(tour, dima)
    tour <- c(tour, next_stop)
  }
  # return to start
  tour <- c(tour, tour[1])
  return(tour)
}

The tour deliberately starts a location 1. In order to get the next best pilot stop, we need a function to complete a given tour applying the nearest neighbor heuristic, let’s call that complete_nn, and another one to calculate the total tour cost, let’s call it get_tour_cost:

get_best_pilot_stop <- function(tour, dima) {
  stops <- 1:nrow(dima)
  potential_stops <- stops[-tour]
  best_cost <- Inf
  best_stop <- NA
  for (stop in potential_stops) {
    temp_tour <- c(tour, stop)
    nn_tour <- complete_nn(temp_tour, dima)
    cost <- get_tour_cost(nn_tour, dima)
    if (cost < best_cost) {
      best_cost <- cost
      best_stop <- stop
    }
  }
  return (best_stop)
}

As you can see, we just determine the potential stops by removing all already visited stops from the stops and loop through all potential stops, completing each tour after adding the potential stop, applying nearest neighbor. We then calculate the resulting total cost and keep track of the best stop found so far, according the total cost.

Now let’s implement the function to complete a given tour applying nearest neighbor, based on the tour vector and the distance matrix provided:

complete_nn <- function(temp_tour, dima) {
  while (length(temp_tour) < nrow(dima)) {
    next_stop <- get_best_nn_stop(temp_tour, dima)
    temp_tour <- c(temp_tour, next_stop)
  }
  # return to start
  temp_tour <- c(temp_tour, temp_tour[1])
  return (temp_tour)
}

The function appends new nearest neighbor stops to ghe tour provided until each stop has been added exactly once. Then, to make the tour a valid TSP tour, at the end the tour returns to the start location.

Of course, to get the best nearest neighbor stop, we have only to consider the row corresponding to the current location. In R we can get the last entry of a vector using tail(temp_tour, 1). Then we only consider the stops not yet visited and pick the location with the smallest distance in the distance matrix. We can use which.min() for this purpose. If multiple minimal entries exist, the first one would be returned:

get_best_nn_stop <- function(temp_tour, dima) {
  stops <- 1:nrow(dima)
  potential_stops <- stops[-temp_tour]
  cur_pos <- tail(temp_tour, 1)
  best_id <- which.min(dima[cur_pos, potential_stops])
  best_next <- potential_stops[best_id]
  return(best_next)
}

Now the last function we need is the one to calculate the total tour cost, given a tour as well as the distance matrix:

get_tour_cost <- function(tour, dima) {
  cur_pos <- tour[1];
  tot_dist <- 0;
  for (i in 2:length(tour)) {
    tot_dist <- tot_dist + dima[cur_pos, tour[i]]
    cur_pos <- tour[i]
  }
  return(tot_dist)
}

The function just loops through all stops and adds the distance between them.

Now we have all we need. The full R script can be found here. For the given distance matrix

Example Distance Matrix

we will get the following solution, when executing our complete R script and calling get_nn_based_pilot_tsp(dima) providing the given above:

starting at:  1 
next best pilot stop is:  2 
next best pilot stop is:  5 
next best pilot stop is:  3 
next best pilot stop is:  4 
tour is:  1 2 5 3 4 1