Pages

Wednesday, July 11, 2018

0-1 Knapsack Problem: Introduction to the Problem and Inelegant Solution

Imagine, a new friend invites you to stay with her at her parents' home, her parents being conveniently out of town. You originally had other plans, but knowing that your friend's father has a nice collection of single malt Scotch whiskies, you decide to take her up on her offer. And, of course, you bring your extra-large knapsack along with you to make sure you get a nice haul of souvenirs.

While she's sleeping, you go to the basement to get what you came for. Having previously gotten information about the entire collection from here, the souvenir shortlist you made looks like this:
  • Macallan 18 Year -- 2 bottles for a total value of $360
  • Lagavulin 12 Year -- 3 bottles for a total value of $360
  • Springbank 21 Year -- 1 bottle for a total value of $310
  • Ballantine's 21 Year -- 4 bottles for a total value of $320
  • Johnnie Walker Blue Label -- 2 bottles for a total value of $300
In fact, you decided to challenge yourself by including a rule that if you take a certain expression of whisky, you have to take all available bottles. So, if you take the Lagavulin 12, you're taking all three bottles. And, in fact, you've already determined what you're going to take by using a program you made to optimize your haul.

In designing a smartphone app the day before the visit, you first considered using the power set function we created in a previous blog post. However, running this list of whiskies through the program gave you a set of all possible combinations like this:


Doesn't look very good, does it? With five whisky items there are $2^{5} = 32$ possible combinations. Of course, some of them aren't physically possible, due to your knapsack's limited capacity. In fact, you can carry 10 bottles, and no more. So, clearly you can't take all of the bottles on your list. You need to maximize the total whisky value (meaning profit you could make from selling the whiskies, if you decided to do so) in your whisky take, limited to 10 bottles.

In designing your app, you diagrammed the situation out. First, you made a table:

n
0
1
2
3
4
5
Whisky

Macallan 18
Lagavulin 12
Springbank 21
Ballantine’s 21
Johnnie Walker Blue
Value

360
360
310
320
300
Weight

2
3
1
4
2

(Don't worry about the '0' column; it just has to do with the way you set up your 'weights' and 'values' arrays, which is coming up soon.)

The first part of your decision tree looks like this:




(You only considered the n = 5 and n = 4 cases because you knew that for a recursive function, this would be enough to get you going.)

A = available carrying capacity after update (insertion of whisky)
P = profit expected from selling contents of knapsack; total value of inserted whisky
N = item number

Extra information: weights are given in terms of numbers of bottles for each whisky type. For this problem, if you take a type of whisky, you are taking all of the bottles of it. For example, if you take Ballantine’s 21-Year-Old, you are taking 4 bottles. (Yes, this is one big knapsack we’re talking about here.) Values are considered based on standard prices mid-year 2018.

Hopefully, this is fairly intuitive, but just to refresh your memory,... The first decision to be made is for the n = 5 case. n = 5 corresponds to the Johnnie Walker Blue Label whisky in the above table. We have two options, either to put it in our knapsack (take it), or to leave it out (not take it). Then, for each of these possible actions (take or not take), we have another decision to make, which takes us down to the n = 4 case, which corresponds to the Ballantine's 21 Year whisky. In the event that we took the Johnnie Walker Blue bottles for the first decision (n = 5), and we decide to take the Ballantine's 21 bottles for the next decision (n = 4), we have added 6 bottles to our knapsack, and the combined value (total profit expected if we sell) is P = $620. The available remaining space in our knapsack is A = 4 bottles. This is represented in the first box on the left in the second row (n = 4 decision level).

Now, the whole decision tree comes back to you, and you move on to the code...

n
0
1
2
3
4
5
Whisky
0
Macallan 18
Lagavulin 12
Springbank 21
Ballantine’s 21
Johnnie Walker Blue
values
0
360
360
310
320
300
weights
0
2
3
1
4
2

values = [0,  360, 360, 310, 320, 300]   #define two arrays as lists
weights = [0, 2, 3, 1, 4, 2]

    if n == 0 or C == 0:    #case with no bottles or no knapsack
        return 0
    elif weights[n] > C:
            result = kR(n-1, C)     #item weight exceeds capacity of knapsack (C)
    else:
        p1 = kR(n-1, C)    #case where we don't take nth whisky
        p2 = values[n] + kR(n-1, C - weights[n])    #case where we take nth whisky
        result = max(p1, p2)    #considers which case gives max profit
    return result

If we define a function called kR(n, C), we can use the above algorithm to generate the maximum total value we would get for that magic combination of whiskies. This value, of course, translates to how much we could hope to make selling all the bottles. Notice that the function is recursive. The base case of the algorithm considers the case where there are either no bottles, or there is no knapsack. (n == 0 or C == 0) Maybe, your friend's father hid the whiskies before going on vacation, or maybe you forgot your knapsack. The next case considers the case where your knapsack is too small to carry any of the bottles. (weights[n] > C) The last case contains within it the decisions we're faced with at each level of the decision tree. p1 is the value you get if you don't take the whisky under consideration at that level, and p2 is the value you get for taking it. Again, these values are based on further calls of the function, so the recursivity ensures that we will get an accumulation of values from decisions made at other decision levels. These are numerical values, so the returned quantity will be a number, which represents, as intended, the maximum possible value for a certain combination of whiskies we could fit into our knapsack.

This value isn't very meaningful if we don't know the combination of whiskies it corresponds to, though. So, we need a way to get that.  A simple, intuitive way to do this (though computationally inefficient) is to generate a power set of all of the whiskies in the list, and go through the subsets looking for ones which yield the max value we got from the above algorithm. The thinking goes like this:
  1. Generate the power set of the whisky set.
  2. Go through the elements of each subset (whiskies) and get their individual values.
  3. Sum those individual values in the subset, and if the sum equals or exceeds (hopefully not!) the max value we obtained earlier,
  4. Return the subset (combination of whiskies).
Again, if you're unsure of how to generate a power set, see this previous blog post.

Python code which will give us the optimal combination of whiskies with the corresponding maximum possible value looks like this:

  

However, we can do better in terms of computational complexity, since here we have Big-O complexity $O(2^{n})$. We'll tackle that problem in another post; this post was just to generate a solution to the 0-1 knapsack problem intuitively.

No comments:

Post a Comment