This table has 384 records. A big improvement over 1166 items! This will make our optimization models much smaller. Note that we make sure things are ordered by size (increasing). We will exploit this in one of the models below.

For small data sets we can easily enumerate all possible ways to use \(T\) different box sizes. Here we have \(n=5\) unique item sizes and \(T=3\) boxes:

We assume throughout that each box is used for at least one item. This makes sense: when adding a new box size, it is never a good idea not to use it. Remember: box sizes are variable: we determine the size of a box (it is not given).

such as: \[x_{i,j} = \begin{cases} 1 & \text{if item $i$ is assigned to box type $j=1,\dots\,T$}\\ 0 & \text{otherwise}\end{cases}\] Here \(T\) is the number of differently sized boxes. Furthermore let \[a_j = \text{area of box $j$}\] This can lead to a model like: \[\bbox[lightcyan,10px,border:3px solid darkblue]
{\begin{align}\min & \sum_{i,j} \> \mathit{count}_i \> a_j \> x_{i,j} \\ & \sum_j x_{i,j} = 1 & \forall i & \> \> \text{(assignment)} \\ & x_{i,j} = 1 \Rightarrow a_j \ge \mathit{size}_i^2 & &\>\>\text{(fit)} \\ & x_{i,j} \in \{0,1\} \\ & a_j \ge 0 && \>\>\text{(area of box type $j$)}\end{align}}\] This is problematic: we have a non-convex quadratic objective. This model can be linearized with some effort. First we introduce a variable \(y_{i,j}=a_j x_{i,j}\). As we are minimizing, we can get away with just \(y_{i,j} \ge a_j - M (1-x_{i,j})\). The linearized model can look like \[\bbox[lightcyan,10px,border:3px solid darkblue] {\begin{align}min & \sum_{i,j} \> \mathit{count}_i \> y_{i,j} \\ & a_j \ge \mathit{size}_i^2 \> x_{i,j} \\ & y_{i,j} \ge a_j - \mathit{Amax} (1-x_{i,j}) \\ & a_1 = \mathit{Amax} \\ & a_{j+1} \le a_j - 1 \end{align}}\] where \(\mathit{Amax}=864^2\). The last two constraints should help the solver a little bit: we reduce symmetry.

. After 20 minutes it still had a gap of 30%.

This model is organized around the variable \[x_{i,j} = \begin{cases} 1 & \text{if items $i$ through $j$ are put in the same size box (with size $j$)} \\ 0 & \text{otherwise}\end{cases}\] This is somewhat unusual, but we'll see this will make a rather nice MIP model. Here we also see why the ordering of items by size is a useful concept: if \(x_{i,j}=1\) then all items \(i,i+1,\dots,j-1,j\) are placed in the same size box.

The basic idea is to cover each item exactly once by a variable \(x_{i,j}\). In addition we want exactly \(T\) variables that have a value of one. Here is how we organize our variables \(x_{i,j}\):

The feasible solution has exactly \(T=3\) variables selected, and all items are covered exactly once.

The complete model is surprisingly simple: \[\bbox[lightcyan,10px,border:3px solid darkblue] {\begin{align}min & \sum_{i\le j} c_{i,j} x_{i,j}\\ & \sum_{i \le k \le j} x_{i,j} = 1 & \forall k\\ & \sum_{i\le j} x_{i,j} = T \\ & x_{i,j} \in \{0,1\}\end{align}}\] The constraint \[\sum_{i \le k \le j} x_{i,j} = 1\>\>\> \forall k\] is typically a long summation. In the picture above, for \(k=2\), we have to add up the variables \(x_{1,2}\), \(x_{1,3}\), \(x_{1,4}\), \(x_{1,5}\), \(x_{1,6}\), \(x_{2,2}\), \(x_{2,3}\), \(x_{2,4}\), \(x_{2,5}\), and \(x_{2,6}\). The cost coefficients can be calculated as: \[c_{i,j} = \sum_{k=i}^j \mathit{count}_k \>\mathit{size}_j^2 \>\>\>\>\forall i\le j\] or if you prefer \[c_{i,j} = \mathit{size}_j^2 \sum_{i\le k\le j} \mathit{count}_k \ \>\>\>\>\forall i\le j\] This is like a **set covering** or better **set partitioning problem** (we want to cover exactly once: this is a set partitioning construct).

This leads to a large, but easy to solve MIP model: for \(T=4\), I see:

Indeed the total cost (area) goes up. Interestingly the smallest box size stays the same.

This problem can also be modeled as a network problem. See [2] for more details.

The network is not very simple. Consider again the 6 item, \(T=3\) example. We drew a picture of a feasible covering earlier. The same feasible solution for the network of this problem looks like:

Selecting a link from item \(i\) to item \(j\) means: put items \(i,i+1,\dots,j-1\) into a box with size \(j-1\). Similarly, a link from item \(i\) to the sink node means: put items \(i,i+1,\dots,n\) into a box with size \(n\) (where \(n\) is the largest item or box).

The cost coefficients are defined by: \[\begin{align}c_{i,j} &= \mathit{size}_{j-1}^2 \sum_{k=i}^{j-1} \mathit{count}_k & \forall i\le j\\c_{i,\mathit{sink}} &= \mathit{size}_n^2 \sum_{k=i}^n \mathit{count}_k & \forall i \end{align}\]

We want to solve a shortest path problem from item 1 to the sink node with a side constraint: the number of "blue" nodes we visit should be exactly \(T\). The side constraint makes this a cardinality constrained or hop constrained shortest path problem. We can solve this as an MIP problem: \[\bbox[lightcyan,10px,border:3px solid darkblue] {\begin{align}min & \sum_{(i,j)\in A} c_{i,j} x_{i,j} \\ & \sum_{(j,i)\in A} x_{j,i} + b_i = y_i & \forall i & \>\> \text{(flow balance)}\\ & y_i = \sum_{(i,j)\in A} x_{i,j} & \forall i & \>\>\text{(outflow node $i$)}\\ &\sum_{i \ne \mathit{sink}} y_i = T &&\text{(cardinality)}\\ & x_{i,j},y_i \ge 0 \end{align} }\] Here \(b_i\) is the net exogenous supply of node \(i\). I.e. \[b_i = \begin{cases} 1 & \>\> \text{if $i=1$}\\ -1 & \>\> \text{if $i=\mathit{sink}$}\\ 0 & \>\>\text{otherwise}\end{cases}\]

This model solves very fast (the relaxed LP gives an integer solution for this data set). The solution for our real data with \(T=4\) looks like:

This means our optimal grouping is \(i_1 - i_{155}\), \(i_{156} - i_{268}\), \(i_{269} - i_{351}\), \(i_{352} - i_{384}\). This is the same solution as for our set partitioning model.

A dynamic programming algorithm is an alternative for this problem. If we define \[f_{i,b} = \text{minimum total area when we pack items $1,\dots,i$ (ordered and with unique size) in $b$ box sizes}\] then we can write down the recursion:\[ f_{i,b} = \min_{k=b-1,\dots,i-1} \left\{ f_{k,b-1} + \mathit{size}_i^2 \sum_{j=k+1}^i \mathit{count}_j \right\}\]

Note that the inner \(k\) loop is vectorized. The results look like:

156,162,168,178,178,180,185,185,190,192,193,194,195,195,197,197,198,198,199,200,202,206
206,210,210,210,212,212,214,215,217,217,217,217,217,218,220,220,220,220,220,220,220,220
220,220,222,223,223,224,225,225,225,225,225,225,225,226,226,226,227,228,228,228,228,230
230,230,230,230,230,230,230,230,230,230,230,230,232,232,232,233,233,234,234,235,235,235
235,238,238,238,238,240,240,240,240,240,240,240,240,240,241,242,242,242,242,243,244,244
244,245,246,247,247,247,249,250,250,250,250,250,251,252,252,252,252,253,254,254,254,255
255,255,255,255,256,256,257,257,257,258,258,258,258,258,259,260,260,260,260,260,260,260
260,260,262,262,262,262,262,262,262,264,264,264,265,265,265,265,265,266,267,267,267,267
268,268,268,268,268,268,269,270,270,270,270,270,270,270,270,270,270,271,272,272,272,272
272,273,273,273,273,274,274,274,274,274,275,275,275,275,275,275,277,277,277,278,278,278
278,278,278,280,280,280,280,281,282,282,284,285,285,285,285,285,285,287,287,287,288,288
288,288,288,289,290,290,290,290,290,290,290,290,290,290,290,292,292,293,293,294,294,294
295,295,295,295,295,295,295,295,296,296,297,297,298,298,298,298,300,300,300,300,300,300
300,300,300,300,300,300,300,300,302,302,303,303,303,303,304,305,305,305,305,305,306,306
307,308,308,308,310,310,310,310,310,310,310,312,312,312,312,313,315,315,315,315,315,315
315,315,315,317,317,317,317,318,318,318,318,320,320,320,320,320,320,320,320,320,320,320
320,320,322,323,323,325,325,325,325,326,326,327,327,328,328,330,330,330,330,330,330,330
332,333,333,334,334,334,334,335,335,335,335,335,336,336,336,338,338,339,339,340,340,340
340,340,340,340,342,342,342,342,343,345,345,345,345,345,345,345,346,346,346,347,347,348
348,348,349,350,350,350,350,350,350,350,350,350,350,350,350,350,350,350,350,350,350,352
352,353,353,353,353,354,354,354,356,357,357,357,357,358,358,360,360,360,360,360,360,360
360,360,360,360,360,361,362,362,362,363,363,364,364,364,364,365,365,365,365,365,365,365
367,367,367,368,370,370,370,370,370,370,370,370,370,372,372,372,373,375,375,375,375,375
375,377,377,377,378,380,380,380,380,380,380,380,380,380,380,380,380,381,382,383,383,383
383,384,384,384,384,385,385,385,386,386,386,386,387,387,387,387,388,388,388,388,388,389
390,392,392,393,394,394,395,396,397,397,398,398,398,400,400,400,400,400,400,400,400,405
405,408,410,410,410,410,410,410,412,412,412,412,412,413,415,415,416,416,417,417,418,418
419,419,419,420,420,420,420,420,420,420,420,423,423,423,423,424,424,425,425,425,428,428
430,430,430,430,430,430,430,430,430,430,430,430,431,432,432,432,432,433,433,433,433,433
434,434,435,435,435,435,437,438,438,438,438,440,440,440,440,440,440,440,441,442,443,444
444,445,445,445,445,445,445,445,445,446,447,448,448,450,450,450,450,450,450,450,450,450
452,453,453,455,455,455,455,456,457,458,458,458,458,459,460,460,460,460,460,460,460,462
462,463,464,465,465,465,465,465,465,465,466,466,467,467,467,467,467,468,468,468,469,469
470,470,470,470,470,470,470,470,470,472,472,472,474,475,475,476,478,478,478,478,480,480
480,480,480,480,480,480,480,480,480,480,482,482,482,482,482,482,482,485,485,485,485,488
488,488,488,490,490,490,490,490,490,490,490,491,491,492,492,492,492,492,492,492,493,493
495,495,495,496,497,498,498,498,498,498,498,498,499,500,500,500,500,500,500,502,502,502
505,505,505,505,505,506,507,507,510,510,510,510,510,510,510,510,510,510,510,510,512,512
513,514,515,515,517,517,518,520,520,520,520,520,520,522,523,523,523,524,525,525,525,525
525,525,528,528,528,528,528,528,528,530,530,530,530,530,530,531,532,534,535,535,535,535
535,536,538,538,538,539,540,540,540,540,540,540,540,540,540,540,540,541,543,543,543,544
544,545,545,545,546,546,546,547,547,548,548,550,550,550,550,550,550,550,550,550,550,550
550,550,552,552,555,555,555,555,555,555,555,557,557,557,557,560,560,560,560,560,562,562
563,563,564,565,565,565,565,565,565,567,567,568,568,569,569,569,570,570,570,571,572,572
573,574,574,574,576,576,578,580,580,580,584,585,585,587,588,590,592,593,596,597,600,600
600,600,602,602,602,602,602,603,604,605,605,605,605,605,605,605,607,607,607,608,610,610
610,610,610,610,612,612,612,612,615,615,615,615,618,618,620,622,624,625,625,625,627,627
628,628,628,630,630,632,633,635,635,637,638,638,640,640,640,640,640,642,643,647,648,649
650,650,655,655,655,656,660,660,660,660,660,662,662,663,664,664,664,664,665,665,665,670
670,672,677,677,679,680,680,680,680,680,685,685,687,688,690,690,692,694,695,697,698,700
700,701,704,704,705,705,705,705,705,705,705,705,705,710,710,712,715,717,720,720,722,723
727,730,733,733,734,734,734,735,735,735,738,740,743,744,754,755,755,755,755,755,758,760
766,766,780,780,780,780,780,780,780,782,783,785,795,805,806,820,823,827,827,855,855,864