In the world of Operations Research and Optimization, solving complex problems efficiently is a constant pursuit. One such problem that has intrigued researchers and practitioners alike is the Knapsack Problem. In this blog post, we’ll delve into the **fascinating realm of metaheuristics** and guide you through the step-by-step process of applying these technique to tackle the Knapsack Problem.

## What is the Knapsack Proboem about?

The **Knapsack Problem** involves selecting a combination of items, each with a specific weight and value, to maximize the total value while adhering to a predefined weight capacity constraint. The challenge is to optimize the contents of a “knapsack” to achieve the highest possible value within the given weight limit.

The **Knapsack Problem is known to be NP-hard**, meaning that finding an optimal solution becomes difficult for large instances and no algorithm exists, that solves the problem to optimality in reasonable (deterministic polynomial) time. **Metaheuristics offer a powerful approach** by providing **good-quality solutions in a reasonable time.**

## The Way Metaheuristics Work

In this post the metaheuristic starts search with a given initial solution to the knapsack problem. Alternatively, the initial solution could also be provided by a construction heuristic. Refer to this previous post, to learn how to develop a pretty good construction heuristic for the Knapsack Problem. The following figure shows the components involved in solving an OR problem with a metaheuristic approach:

The **meta heuristic solution process **is as follows:

- The
**Meta Heuristic**initializes the search:- create an
**initial solution** - initialize iteration counter

- create an
- The
**Local Search**- determines a set of neighborhood solutions
- return the best solution

- The
**Meta Heuristic**- stops, if , otherwise
- , go to step 2

As can be seen in step 2, the whole neigborhood is explored and the best candidate is evaluated. This ‘flavour’ of local search is called **best improvement** or steepest descent, compared to the **first improvement** or first descent approach, where only the first improving neighbor is explored (faster).

If the meta heuristic cannot improve the solution anylonger, we say to have identified a **local optimum**. Usually, metaheuristics do not find the global optimum, but often get stuck in non-optimal local optima. There are techniques like **Randomized Local Search** or **Simulated Annealing **to overcome this drawback.

Let’s now start the implementation phase, crafting the metaheuristic with all its involved components to bring our solution to life:

## Let’s get started!

Now let’s plan our coding approach. We need to consider the following steps:

- define the solution-datamodel
- clarify the neigbhorhood function
- define the objective function
- design the local search process
- define the meta heuristic’s logic

We will now take a closer look at these steps in turn and consider how they can be translated into python code.

### 1) Define the Solution-Datamodel

Initial considerations involve shaping the data model for our solution. Ensuring our solution can effectively determine the inclusion or exclusion of items in the knapsack, a binary vector proves to be the ideal choice.

`S = [0, 1, 1, 0, 0] # our initial solution`

The starting solution above represents a knapsack in which items 2 and 3 have been packed. We will introduce the other information on the Knapsack problem, such as the rucksack capacity, weights and values, as soon as we need it. For the sake of clarity, we will not use OO design here.

### 2) Clarify the Neigbhorhood Function

The design of the neighborhood function is very crucial for the quality and performance of the local search. If the neighborhood is too large, performance suffers and large problem instances can hardly be handled. If the neighborhood is too small, the search often gets stuck in poor local optima. The following points should be taken into account when designing neighborhoods.

- The computing time required to calculate the objective function values of all elements in the neighborhood should be kept within limits (typically in the (milli-) second range). Proven guideline: The number of solutions in the neighborhood grows linearly or quadratically with the problem size.
- Ideally, calculate the objective function value by means of a few additions/subtractions. A delta calculation with reference to the previous solution is often applicable.
- It should theoretically be possible to achieve any solution from any starting solution through repeated improvement steps.

A good neighborhood for our Knapsack problem should include the following “operations”:

**removal of an item:**

Remove an item currently packed**adding an item:**Add an item currently not packed**swapping an object:**

Remove an item currently packed and replace it by an item currently not packed

Let’s translate these operations stepwise into code. Assume, we start having the initial Solution provided in and collect our neighbor solution in a new List . The first two steps are done quickly. We loop through the individual bits in the initial solution and create a new candidate in each iteration. We just copy the initial solution and invert the bit in question.

```
N = []
for i in range(len(S)):
candidate = S.copy()
# a) Generate all neighbors obtained by adding an object
candidate[i] = int(not candidate[i])
N.append(candidate)
```

Now we have to address the swapping part of our neighborhood definition. To do this, we need to identify the zeros and ones in the initial solution and then run through all possible pairwise pairs. For each pair, we’ll then create a new candidate, removing the contained item and adding the other:

```
# c) Generate all neighbors obtained by swapping an object
# first determine 0/1 positions in start solution
zero_positions = [i for i in range(len(S)) if S[i] == 0]
one_positions = [i for i in range(len(S)) if S[i] == 1]
# then loop through all pairwise combinations of 0s/1s
# and swap 0s/1s
for zero_idx in zero_positions:
for one_idx in one_positions:
candidate = S.copy()
candidate[zero_idx] = 1
candidate[one_idx] = 0
N.append(candidate)
```

This gives us everything we need to create the neighborhood of a start-up solution. We pack these operations into the function nb_knapsack():

```
def nb_knapsack(S):
"""
Calculate the neighborhood of the starting solution S.
NB: All solutions that arise from S by either
a) adding an object
b) removing an object
c) removing a contained object AND adding a non-contained object
:param S: Starting solution (list of n binary elements)
:return: List of lists containing neighboring solutions
"""
# Variable to collect the neighbor solutions
N = []
for i in range(len(S)):
candidate = S.copy()
# a),b) Generate all neighbors obtained by adding an object
candidate[i] = int(not candidate[i])
N.append(candidate)
# c) Generate all neighbors obtained by swapping an object
# first determine 0/1 positions in start solution
zero_positions = [i for i in range(len(S)) if S[i] == 0]
one_positions = [i for i in range(len(S)) if S[i] == 1]
# then loop through all pairwise combinations of 0s/1s
# and swap 0s/1s
for zero_idx in zero_positions:
for one_idx in one_positions:
# jetzt einen Nachbarn erzeugen, wo 0 in 1 und 1 in 0 umgewandelt wird
candidate = S.copy()
candidate[zero_idx] = 1
candidate[one_idx] = 0
N.append(candidate)
N = remove_duplicate_candidates(N)
return N
```

### 3) Define the Objective Function

In addition to the binary solution vector S, the objective function only requires the value vector, which assigns each item its individual value. The target function value then corresponds to the scalar product of the solution vector and the value vector. The following function does the job:

```
def eval_func(S, values):
'''
calculate the scalar product of S and values
:param S: a given solution as binary vector
:param values: the values-vector to be multiplied with the bin-vector
:return: the scalar product
'''
value = 0
for i in range(len(S)):
value = value + S[i] * values[i]
return value
```

Since we can also use the function to calculate the capacity required by a given solution, we use a somewhat generic function name here instead of using the term objective function.

### 4) Design the Local Search Process

In contrast to the code in the meta heuristic, the logic of the local search process is problem-specific. Although most of the components will occur analogously when other problems are addressed, the check of the neighboring solutions for validity is definitively problem-specific. Before trying to calculate the value of a given candidate solution, we have to check, whether it does not exceed the knapsack’s capacity limit.

The logic of the (best improvement) local search checks all valid candidates in turn and evaluates them using the objective function. we remember the best solution and its corresponding objective function value. at the beginning, we initialize these by the initial solution and its objective function value. this ensures that a solution can be returned in every case:

```
# initialize best solution/value
best_solution = S
best_value = eval_func(S, values)
# determine the neighborhood of S
N = nb_knapsack(S)
print("Neighborhood (", S, ") is: ", N)
# loop through all candidate solutions
for candidate in N:
weight = eval_func(candidate, weights)
if weight <= capacity:
# if knapsack capacity not exceeded (valid candidate)
# evaluate obj. function
cand_value = eval_func(candidate, values)
# compare current solution with current best solution
if cand_value > best_value:
# found a better candidate: Save candidate in best_solution
best_solution = candidate
# update best value
best_value = cand_value
```

This way at the end of these loops, best_solution contains the best solution found in the neighborhood of the current local search iteration.

### 5) Define The Meta Heuristic’s Logic

Since the local search never returns a worse candidate than the one that was passed to it, the meta-heuristic can abort the search if the previous solution could not be outperformed in an iteration. This logic is generic and can be formulated as follows:

```
def meta_heuristic(S, weights, values, capacity):
'''
Apply a metaheuristic based on the starting solution S.
Calls local_search_knapsack multiple times, selecting
the best solution as the next starting solution, and
returns the local optimum in the end.
:param S: Starting solution
:param weights: Weights of the objects in the knapsack problem
:param values: Values assigned to the knapsack items
:param capacity: Capacity of the knapsack
:return: Tuple containing the local optimum and its objective value
'''
old_sol = S
stop_loop = False
while not stop_loop:
new_sol = local_search_knapsack(old_sol, weights, values, capacity)
if new_sol == old_sol:
stop_loop = True
old_sol = new_sol
obj_value = eval_func(new_sol, values)
return new_sol, obj_value
```

That’s it! With these elements in place, we have everything needed for programming the metaheuristic with all its components. Now, let’s unveil the complete program.

## Performance Test

The following figure visualizes how the meta heuristic performs when optimization of a given problem instance is triggered with different initial solutions. The problem instance is as follows:

**Knapsack Capacity:**15**Objects:**- Weights: 4, 12, 1, 2, 1
- Values: 10, 4, 2, 2, 1

We trigger optimization with four different initial solutions and visualize the objective function value for each iteration:

The value of the objective function gradually improves with each iteration until finally the local search reaches a local optimum from which no further improvement is possible under the neighborhood under consideration.

An important thing to keep in mind: No matter how good our neighborhood may be, there are always problem instances that do not lead to good/optimal results with a given neighborhood. Nevertheless, the neighborhood should not be chosen too large, otherwise the computational effort will quickly become immeasurable.

It is precisely for this purpose that randomized local search or simulated annealing are used in practice, so as not to allow only greedy improvements of the objective function value in every iteration and thus increase the chance of getting out of bad local optima. I’ll demonstrate the use of randomized local search in another post.

## The Complete Source

Below, you’ll find the source code for the entire Python program, also available on GitHub for your convenience.

```
def remove_duplicate_candidates(N):
'''
Accept a list of candidate solutions (being binary lists) and make sure, each (element)-list is contained only once.
:param N:
:return: revised list containing no duplicate candidates
'''
unique_set = set(tuple(lst) for lst in N)
unique_list_of_lists = [list(tpl) for tpl in unique_set]
return unique_list_of_lists
def nb_knapsack(S):
"""
Calculate the neighborhood of the starting solution S.
NB: All solutions that arise from S by either
a) adding an object
b) removing an object
c) removing a contained object AND adding a non-contained object
:param S: Starting solution (list of n binary elements)
:return: List of lists containing neighboring solutions
"""
# Variable to collect the neighbor solutions
N = []
for i in range(len(S)):
candidate = S.copy()
# a),b) Generate all neighbors obtained by adding an object
candidate[i] = int(not candidate[i])
N.append(candidate)
# c) Generate all neighbors obtained by swapping an object
# first determine 0/1 positions in start solution
zero_positions = [i for i in range(len(S)) if S[i] == 0]
one_positions = [i for i in range(len(S)) if S[i] == 1]
# then loop through all pairwise combinations of 0s/1s
# and swap 0s/1s
for zero_idx in zero_positions:
for one_idx in one_positions:
# jetzt einen Nachbarn erzeugen, wo 0 in 1 und 1 in 0 umgewandelt wird
candidate = S.copy()
candidate[zero_idx] = 1
candidate[one_idx] = 0
N.append(candidate)
N = remove_duplicate_candidates(N)
return N
def eval_func(S, values):
'''
calculate the scalar product of S and values
:param S: a given solution as binary vector
:param values: the values-vector to be multiplied with the bin-vector
:return: the scalar product
'''
value = 0
for i in range(len(S)):
value = value + S[i] * values[i]
return value
def local_search_knapsack(S, weights, values, capacity):
'''
perform one local search iteration in the neighborhood of the
solution S
:param S: Start solution
:param weights: Weights of the objects of the knapsack problem
:param values: Values assigned to the knapsack items
:param capacity: Capacity of the knapsack
:return: the best solution candidate found in this local search iteration
'''
best_solution = S
best_value = eval_func(S, values)
# determine the neighborhood of S
N = nb_knapsack(S)
print("Neighborhood (", S, ") is: ", N)
# loop through all candidate solutions
for candidate in N:
weight = eval_func(candidate, weights)
if weight <= capacity:
# if knapsack capacity not exceeded (valid candidate)
# evaluate obj. function
cand_value = eval_func(candidate, values)
# compare current solution with current best solution
if cand_value > best_value:
# found a better candidate: Save candidate in best_solution (for later..)
best_solution = candidate
# update best value
best_value = cand_value
# return best candidate
print("Best solution:", best_solution, ", val: ", best_value)
return best_solution
def meta_heuristic(S, weights, values, capacity):
'''
Apply a metaheuristic based on the starting solution S.
Calls local_search_knapsack multiple times, selecting the
best solution as the next starting solution, and returns
the local optimum in the end.
:param S: Starting solution
:param weights: Weights of the objects in the knapsack problem
:param values: Values assigned to the knapsack items
:param capacity: Capacity of the knapsack
:return: Tuple containing the local optimum and its objective value
'''
old_sol = S
stop_loop = False
while not stop_loop:
new_sol = local_search_knapsack(old_sol, weights, values, capacity)
if new_sol == old_sol:
stop_loop = True
old_sol = new_sol
obj_value = eval_func(new_sol, values)
return new_sol, obj_value
# Main program
# Knapsack problem definition
S = [0, 1, 1, 0, 0]
weights = [4, 12, 1, 2, 1]
values = [10, 4, 2, 2, 1]
capacity = 15
# init_sol = construct_knapsack_solution(weights, values, capacity)
# print(init_sol)
l_opt, l_val = meta_heuristic(S, weights, values, capacity)
print("Local Optimum: ", l_opt, ", Obj. val:", l_val)
```

The script to generate the plots can be found here.