Solve TSP Instance using Google OR-Tools

Google OR-Tools

If you happen to have to solve TSPs or some special type of VRPs – i.e. capacitated or with timewindows and other constraints – you might want to have a look at Google OR-Tools. Of course the library can be of use for a much broader set of problems, i.e. to solve Constriant Programming tasks or plain MIPs. In this post I’ll show you how easy it is, to solve some given TSP instances as provided by the tsplib files.

What you need

I’ll demonstrate the usage of google or-tools in python here, however, you can easily port the script provided here to any other platform you like – i.e. to Java following this link. Then make sure, you have:

  • A running Python Environment
  • A Python IDE of your choice, i.e. PyCharm from Jetbrains
  • Installed the ortools (binary) package for python, for installation details see here
  • Installed the packages pandas, scipy
  • Created an empty python project in your IDE, if working with an IDE and venvs (Virtual environments, the concept PyCharm follows to keep track of project dependencies)
  • Fetched some TSP-Instances from my github-account and placed them in a folder named TSP_Instances in your project root

Start Solving TSP Instances

In the next step, download the complete source from here or copy/paste it from this listing beneath:

# Source: https://developers.google.com/optimization/routing/tsp

# Modified to read instance from tsplib format

from __future__ import print_function
from ortools.constraint_solver import routing_enums_pb2
from ortools.constraint_solver import pywrapcp

import pandas as pd
import scipy.spatial as sp
import numpy as np


def create_data_model(dima):
    """Stores the data for the problem."""
    data = {}
    data['distance_matrix'] = dima
    data['num_vehicles'] = 1
    data['depot'] = 0
    return data


def print_solution(manager, routing, solution, dima):
    """Prints solution on console."""
    print('Objective: {} miles'.format(solution.ObjectiveValue()))
    index = routing.Start(0)
    plan_output = 'Route for vehicle 0:\n'
    route_distance = 0
    while not routing.IsEnd(index):
        plan_output += ' {} ->'.format(manager.IndexToNode(index))
        previous_index = index
        index = solution.Value(routing.NextVar(index))
        route_distance += routing.GetArcCostForVehicle(previous_index, index, 0)
    plan_output += ' {}\n'.format(manager.IndexToNode(index))
    plan_output += 'Route distance: {} miles\n'.format(route_distance)
    print(plan_output)


def main():
    # load tsp instance
    tsp = pd.read_csv("./TSP_Instances/bier127.tsp", sep='\t', skiprows=2, names=['nodeId', 'lat', 'lng'])
    tsp = tsp.sort_values(by='nodeId', inplace=False)

    A = tsp[['lat', 'lng']].to_numpy()
    dima = sp.distance_matrix(A, A)

    """Entry point of the program."""
    # Instantiate the data problem.
    data = create_data_model(dima)

    # Create the routing index manager.
    manager = pywrapcp.RoutingIndexManager(len(data['distance_matrix']),
                                           data['num_vehicles'], data['depot'])

    # Create Routing Model.
    routing = pywrapcp.RoutingModel(manager)

    def distance_callback(from_index, to_index):
        """Returns the distance between the two nodes."""
        # Convert from routing variable Index to distance matrix NodeIndex.
        from_node = manager.IndexToNode(from_index)
        to_node = manager.IndexToNode(to_index)
        return data['distance_matrix'][from_node][to_node]

    transit_callback_index = routing.RegisterTransitCallback(distance_callback)

    # Define cost of each arc.
    routing.SetArcCostEvaluatorOfAllVehicles(transit_callback_index)

    # Setting first solution heuristic.
    search_parameters = pywrapcp.DefaultRoutingSearchParameters()

    search_parameters.first_solution_strategy = (
        routing_enums_pb2.FirstSolutionStrategy.AUTOMATIC)

    search_parameters.local_search_metaheuristic = (
           routing_enums_pb2.LocalSearchMetaheuristic.GUIDED_LOCAL_SEARCH)

    search_parameters.time_limit.seconds = 30

    # Solve the problem.
    solution = routing.SolveWithParameters(search_parameters)

    # Print solution on console.
    if solution:
        print_solution(manager, routing, solution, dima)

if __name__ == '__main__':
    main()

In the main-method, the TSP instance is loaded and the distance matrix is created. Then the problem is handed over to the ortools routing solver as a RoutingModel, together with the default search parameter set.

To tune the optimization, you can provide several configuration settings in order to get better results. Consider this reference here, to have a full overview on the Routing Options.

To configure the way, the routing problem shall be tackled, you can configure a construction heuristic as well as a local search strategy. In the above example I configured

  1. Construction Heuristic:
    search_parameters.first_solution_strategy = (routing_enums_pb2.FirstSolutionStrategy.AUTOMATIC)
  2. Local Search Strategy:
    search_parameters.local_search_metaheuristic = (routing_enums_pb2.LocalSearchMetaheuristic.GUIDED_LOCAL_SEARCH)
  3. Calculation Timeout:
    search_parameters.time_limit.seconds = 30

This way we get relatively close to the optimum of 7544:

Objective: 7526 miles
Route for vehicle 0:
 0 -> 48 -> 31 -> 44 -> 18 -> 40 -> 7 -> 8 -> 9 -> 42 -> 32 -> 50 -> 10 -> 51 -> 12 -> 13 -> 46 -> 25 -> 26 -> 27 -> 11 -> 24 -> 3 -> 5 -> 14 -> 4 -> 23 -> 47 -> 37 -> 36 -> 39 -> 38 -> 35 -> 34 -> 33 -> 43 -> 45 -> 15 -> 28 -> 49 -> 19 -> 22 -> 29 -> 1 -> 6 -> 41 -> 20 -> 16 -> 2 -> 17 -> 30 -> 21 -> 0

After a closer look, we realize, that the solution is even better than the current known optimum. This might look a bit suspicious to me, so let’s verify that.

For this purpose let’s calculate distances between all stops manually and sum them up:

The complete Excel spreadsheet can be downloaded here. The validation check shows, that the total distance does not exactly match the sum of the ArcCostForVehicle. The difference is due to the cost on the arcs which only takes integral values into account.

In order to have more exact results, we would have to scale up the distances in our distance matrix or calculate the total cost based on the exact dima values rather than the costs on the arcs of the solution provided by the model.

After scaling up the lat/long values by a factor of 100 we find the ‘same’ optimal solution as provided in the publication mentioned above. For this reason we add a factor of 100 to every coordinate value:

    A = tsp[['lat', 'lng']].to_numpy()*100

Now we get the following output within 30 seconds:

Objective: 754417 miles
Route for vehicle 0:
0 -> 21 -> 30 -> 17 -> 2 -> 16 -> 20 -> 41 -> 6 -> 1 -> 29 -> 22 -> 19 -> 49 -> 28 -> 15 -> 45 -> 43 -> 33 -> 34 -> 35 -> 38 -> 39 -> 36 -> 37 -> 47 -> 23 -> 4 -> 14 -> 5 -> 3 -> 24 -> 11 -> 27 -> 26 -> 25 -> 46 -> 12 -> 13 -> 51 -> 10 -> 50 -> 32 -> 42 -> 9 -> 8 -> 7 -> 40 -> 18 -> 44 -> 31 -> 48 -> 0
Route distance: 754417 miles

After transforming back the objective function value by division by 100 we get the “real” total distance: 7544.17 units.