Number of swaps in Quicksort
BDADDDBCCC -> ABBCCCDDDD #swaps:28ABBCCCDDDD -> ABBCCCDDDD #swaps:54
Python code
# measure number of swaps in Quicksort # ------------------------------------ # simple quicksort from https://www.w3schools.com/dsa/dsa_algo_quicksort.php # I added a swap counter. # # result: # BDADDDBCCC -> ABBCCCDDDD #swaps:28 # ABBCCCDDDD -> ABBCCCDDDD #swaps:54 # string to sort input = "BDADDDBCCC" def partition(array, low, high): swaps = 0 pivot = array[high] i = low - 1 for j in range(low, high): if array[j] <= pivot: i += 1 array[i], array[j] = array[j], array[i] swaps += 1 array[i+1], array[high] = array[high], array[i+1] swaps += 1 return swaps,i+1 def quicksort(array, low=0, high=None): if high is None: high = len(array) - 1 if low < high: n1,pivot_index = partition(array, low, high) n2 = quicksort(array, low, pivot_index-1) n3 = quicksort(array, pivot_index+1, high) return n1+n2+n3 else: return 0 def list2str(L): return ''.join(L) # sort characters in string # convert list of chars <-> string def strsort(s): L = list(s) n = quicksort(L) s2 = list2str(L) return s2,n # sort input s = input t,n=strsort(s) print(f"{s} -> {t} #swaps:{n}") # sort again (already sorted) t2,n=strsort(t) print(f"{t} -> {t2} #swaps:{n}")
The question comes to mind: can we go from BDADDDBCCC to ABBCCCDDDD using fewer swaps? There are other sorting algorithms that are doing better: selection sort [4] and cycle sort [5]. Here, I look at some formal optimization methods.
Minimum number of swaps: shortest path network approach
We can form a large, sparse network where the nodes are all possible unique configurations.
To compute all such configurations, we want to avoid generating duplicates. E.g. when considering AAB, the collection of nodes should be AAB, ABA, BAA. The Python function itertools.permutations gives: AAB, ABA, AAB, ABA, BAA, BAA. Instead, we should use sympy.utilities.iterables.multiset_permutations (not in itertools, as predictability is just boring). This will produce: AAB, ABA, BAA.
For our input BDADDDBCCC, we generate 12,600 nodes.
The arcs are formed by swapping two letters (only if they are different). In total, we have 441,000 arcs. The source node is BDADDDBCCC, and the sink node is ABBCCCDDDD. This network is not small. It is quite amazing that our little 10-character string can yield such a large graph.
I used an LP model to solve this [6]. The network creates 12,600 constraints and 441,000 variables. This is a very easy LP, so it solves very fast. Typical solution times are less than a second. Obviously, this can also be solved with a network code or Dijkstra's algorithm.
An optimal solution is:
step0.BDADDDBCCC
step1.BDADDCBCCD
step2.BDADCCBCDD
step3.ADBDCCBCDD
step4.ABBDCCDCDD
step5.ABBCCCDDDD
GAMS LP Model
$onText
Shortest Path: minimize number of swaps when sorting
$offText
*--------------------------------------------------------
* data: from data file
*--------------------------------------------------------
set
nodes 'all possible unique orderings'
arcs(nodes,nodes) 'formed by a swap'
;
$offlisting
$include networkdata.inc
sets
start(nodes) 'initial configuration' /BDADDDBCCC/
final(nodes) 'sorted configuration' /ABBCCCDDDD/
;
alias (n,n1,n2,nodes);
*--------------------------------------------------------
* summary of network
*--------------------------------------------------------
parameter counts(*) 'nodes/arcs';
counts('nodes') = card(nodes);
counts('arcs') = card(arcs);
option counts:0;
display counts;
*--------------------------------------------------------
* shortest path LP
*--------------------------------------------------------
positive variables f(n1,n2) 'flow along arcs';
variable z 'objective';
Equations
obj 'objective'
nodbal(n) 'node balance'
;
parameter
supply(n) 'exogenous inflow'
demand(n) 'exogenous outflow'
;
supply(start) = 1;
demand(final) = 1;
obj.. z =e= sum(arcs,f(arcs));
nodbal(n).. sum(arcs(n1,n),f(arcs)) + supply(n) =e= sum(arcs(n,n1),f(arcs)) + demand(n);
model spath /all/;
option limrow=0,limcol=0,solprint=off;
solve spath minimizing z using lp;
display f.l;
*--------------------------------------------------------
* reporting: path
*--------------------------------------------------------
sets
step 'for reporting' /step0*step50/
visit(step,nodes) 'path'
;
singleton set curr(nodes) 'current node';
curr(n) = start(n);
loop(step$card(curr),
visit(step,curr) = yes;
curr(n2) = f.l(curr,n2)>0.5;
);
option visit:0:0:1;
display start,final,visit;
A MIP Model
- \(\color{darkred}x_{k,i}\): the value of the i-th item at iteration \(k\),
- \(\color{darkred}s_{k,i}\in\{0,1\}\): indicates if item \(i\) has stayed the same (i.e., was not swapped). In each iteration, 2 items are swapped, and \(n-2\) items remain the same. Detail: we allow for the model to select 2 variables with the same value to swap. That is not very useful, and the objective will try to prevent this from happening. Note that in the shortest-path approach, we explicitly avoided these unnecessary swaps.
- \(\color{darkred}d_k\in\{0,1\}\) indicates we are done. We found a sorted configuration in the previous iteration. Once we are done, the remaining iterations will keep the same ordering.
| High-level MIP Model |
|---|
| \[\begin{align}\min&\sum_k \left(1-\color{darkred}d_k\right) \\ & \color{darkred}x_{0,i} = \color{darkblue}{\mathit{init}}_i \\ & \color{darkred}d_k=0 \implies \sum_i \color{darkred}s_{k,i}=n-2 \\ & \color{darkred}d_k=1 \implies \sum_i \color{darkred}s_{k,i}=n \\ & \color{darkred}s_{k,i}=1 \implies \color{darkred}x_{k,i} = \color{darkred}x_{k-1,i} \\ & \color{darkred}s_{k,i}=0\ \mathbf{ and }\ \color{darkred}s_{k,j}=0 \implies \color{darkred}x_{k,i} = \color{darkred}x_{k-1,j} && \forall i\ne j \\ & \color{darkred}d_k=1 \implies \color{darkred}x_{k-1,i} \ge \color{darkred}x_{k-1,i-1} &\\ &\color{darkred}s_{k,i},\color{darkred}d_k \in\{0,1\} \end{align}\] |
Notes:
- The objective minimizes the number of iterations \(k\) (i.e., swaps) required to obtain a proper ordering. We can also use \(\max \sum_k \color{darkred}d_k\), but that makes the solver progress log a bit more challenging to interpret.
- We initialize \(\color{darkred}x_{0,i} = \color{darkblue}{\mathit{init}}_i\). This will allow us to use lags \(\color{darkred}x_{k-1,i}\) in the rest of the model without complications.
- The swap equation is for all \(i \ne j\). This means it yields an equality for both \((i,j)\) and \((j,i)\). That is exactly what we need for a swap.
- Most, but not all, solvers and tools support implications (a.k.a. indicator constraints).
- The values are floating point numbers here. So A=1.0, B=2.0,...
The actual MIP model I used was:
| MIP Model Implementation |
|---|
| \[\begin{align}\min&\sum_{k\ge 1} \left(1-\color{darkred}d_k\right) \\ & \color{darkred}x_{0,i} = \color{darkblue}{\mathit{init}}_i \\ & \sum_i \color{darkred}s_{k,i}=n-2 \cdot (1-\color{darkred}d_k) && k \ge 1\\ & \color{darkred}x_{k,i} \le \color{darkred}x_{k-1,i} + \color{darkblue}M \cdot (1-\color{darkred}s_{k,i}) && k \ge 1 \\ & \color{darkred}x_{k,i} \ge \color{darkred}x_{k-1,i} - \color{darkblue}M \cdot (1-\color{darkred}s_{k,i}) && k \ge 1 \\ & \color{darkred}x_{k,i} \le \color{darkred}x_{k-1,j} + \color{darkblue}M\cdot \left(\color{darkred}s_{k,i} +\color{darkred}s_{k,j}\right) && k\ge 1, i\ne j\\ & \color{darkred}x_{k,i} \ge \color{darkred}x_{k-1,j} - \color{darkblue}M\cdot \left(\color{darkred}s_{k,i} +\color{darkred}s_{k,j}\right) && k\ge 1, i\ne j \\ & \color{darkred}x_{k-1,i} \ge \color{darkred}x_{k-1,i-1} - \color{darkblue}M \cdot \left(1-\color{darkred}d_k\right) && k\ge 1, i \ge 2 \\ & \color{darkred}d_k \ge\color{darkred}d_{k-1} && k\ge 1 \\ & \sum_i \color{darkred}x_{k,i} = \sum_i \color{darkblue}{\mathit{init}}_i \\ &\color{darkblue}M = \max_i \color{darkblue}{\mathit{init}}_i - \min_i \color{darkblue}{\mathit{init}}_i \\ & \color{darkred}s_{k,i},\color{darkred}d_k \in\{0,1\} \\ & \color{darkred}x_{i,k} \in \left[\min_i \color{darkblue}{\mathit{init}}_i,\max_i \color{darkblue}{\mathit{init}}_i \right] \end{align}\] |
Notes:
- The indicator constraints have been replaced by big-M constraints. This model can be solved with any MIP solver.
- The model size is: 2,122 constraints and 222 variables (111 binary variables).
- I added a few extra, optional constraints. They help a bit: solution times went from 1.22 seconds to 0.76 seconds and node counts from 5,861 to 4,922. (This was with Cplex. For slower solvers the impact is bigger.)
- We have to choose the dimensionality of set k a bit carefully. If too small, the model will be infeasible. If too large, we waste performance.
The solution looks like:
---- 99 PARAMETER trace item1 item2 item3 item4 item5 item6 item7 item8 item9 item10 iter0 2.000 4.000 1.000 4.000 4.000 4.000 2.000 3.000 3.000 3.000 iter1 1.000 4.000 2.000 4.000 4.000 4.000 2.000 3.000 3.000 3.000 iter2 1.000 2.000 2.000 4.000 4.000 4.000 4.000 3.000 3.000 3.000 iter3 1.000 2.000 2.000 4.000 3.000 4.000 4.000 4.000 3.000 3.000 iter4 1.000 2.000 2.000 3.000 3.000 4.000 4.000 4.000 4.000 3.000 iter5 1.000 2.000 2.000 3.000 3.000 3.000 4.000 4.000 4.000 4.000
As before, we need 5 swaps (although the actual sequencing is different).
GAMS Model
$onText
MIP model
Find a way to reorganize the list
BDADDDBCCC → ABBCCCDDDD
(sorting) using as few swaps as
passible.
We can use a numerical version:
2414442333 → 1223334444
$offText
*---------------------------------------------------------------
* data
*---------------------------------------------------------------
Set
i 'items' /item1*item10/
k 'max iterations' /iter0*iter10/
;
alias(i,j);
Parameter
init(i) 'initial ordering 2414442333' /
item3 1
(item1, item7) 2
(item8*item10) 3
(item2, item4*item6) 4
/;
display init;
*---------------------------------------------------------------
* linear MIP model
*---------------------------------------------------------------
variables
x(k,i) 'values'
itcount 'to minimize'
;
x.lo(k,i) = smin(j,init(j));
x.up(k,i) = smax(j,init(j));
x.fx('iter0',i) = init(i);
Scalar M 'big-M';
M = smax(i,init(i))-smin(i,init(i));
binary Variables
s(k,i) 's(k,i)=1 : x(k,i) stays the same'
d(k) 'd(k)=1 : we are done. No change needed anymore.'
;
s.fx('iter0',i) = 0;
equations
count_same(k) 'number of x(k,i) that should stay the same'
calc_itcount 'calc iteration count (obj)'
same1(k,i) 'big-M constraint for x(k,i) staying the same'
same2(k,i) 'big-M constraint for x(k,i) staying the same'
swap1(k,i,j) 'big-M constraint for x(k,i) being swapped'
swap2(k,i,j) 'big-M constraint for x(k,i) being swapped'
isSorted(k,i) 'big-M constraint for detecting x(k,i) is sorted'
* optional constraints
order(k) 'optional: d(k) is ordered'
extra(k) 'optional: x always adds up to the same number'
;
* counts
* done(k)=0 => sum(i, same(k,i)) = n-2
* done(k)=1 => sum(i, same(k,i)) = n
count_same(k)$(ord(k)>1).. sum(i, s(k,i)) =e= card(i)-2*(1-d(k));
* same
* same(k,i)=1 ==> x(k,i) = x(k-1,i)
same1(k,i)$(ord(k)>1)..x(k,i) =l= x(k-1,i) + M*(1-s(k,i));
same2(k,i)$(ord(k)>1)..x(k,i) =g= x(k-1,i) - M*(1-s(k,i));
* swap
* same(k,i)=0 & same(k,j)=0 ==> x(k,i) = x(k-1,j), x(k,j) = x(k-1,i)
* we will allow useless swaps
swap1(k,i,j)$(ord(i)<>ord(j) and ord(k)>1).. x(k,i) =l= x(k-1,j) + M*(s(k,i)+s(k,j));
swap2(k,i,j)$(ord(i)<>ord(j) and ord(k)>1).. x(k,i) =g= x(k-1,j) - M*(s(k,i)+s(k,j));
* done
isSorted(k,i)$(ord(i)>1 and ord(k)>1).. x(k-1,i) =g= x(k-1,i-1) - M*(1-d(k));
* optional constraints (they help performance)
order(k)$(ord(k)>1).. d(k) =g= d(k-1);
extra(k).. sum(i, x(k,i)) =e= sum(i,init(i));
* obj
calc_itcount.. itcount =e= sum(k$(ord(k)>1), 1-d(k));
model sort /all/;
*---------------------------------------------------------------
* solve
*---------------------------------------------------------------
solve sort minimizing itcount using mip;
display x.l, s.l, d.l, itcount.l;
* make a little report
parameter trace(k,*);
trace(k,i)$(d.l(k)<0.5) = x.l(k,i);
display trace;
Conclusion and extensions
We have seen that QuickSort is not very efficient in terms of the number of swaps, when we sort: BDADDDBCCC -> ABBCCCDDDD. To find the minimum number of swap operations needed, I formulated two different optimization problem: a shortest path network model and a MIP model. Both agree: we can sort this using just 5 swaps.
Some open questions:
- Can we use a version of Cycle Sort (or another sort algorithm) that gives us the optimal sorting strategy for this? Warning: some versions are not using the concept of swaps. I tried my example on some different versions (including AI generated code), but was unable to achieve our optimal solution.
- Can we find the initial configuration that requires the most swaps? That is like the "longest shortest distance". We should be able to find this with Dijkstra. I am not sure about a direct optimization model for this.
References
- Sorting algorithms that optimise for changes/swaps/moves, https://or.stackexchange.com/questions/13387/sorting-algorithms-that-optimise-for-changes-swaps-moves
- Quicksort, https://en.wikipedia.org/wiki/Quicksort
- DSA Quicksort, https://www.w3schools.com/dsa/dsa_algo_quicksort.php
- Selection Sort, https://www.geeksforgeeks.org/dsa/selection-sort-algorithm-2/
- Cycle Sort, https://www.geeksforgeeks.org/dsa/cycle-sort/
- A network model: Pyomo vs GAMS, https://yetanothermathprogrammingconsultant.blogspot.com/2021/08/a-network-model-pyomo-vs-gams.html
- networkdata.inc, https://raw.githubusercontent.com/aopt/OptimizationModels/refs/heads/main/Sorting%20and%20Swaps/networkdata.inc
No comments:
Post a Comment