This problem is described in http://comments.gmane.org/gmane.comp.lang.r.general/177553. An obvious MIQP formulation in GAMS could look like:
The results look like:$ontext
Given vector of numbers x, I wish to select an n-subset with sum closest to
fixed value s. Can anyone advise me how to approach this, in R?
I have considered Rcplex package, which handles integer/binary
linear/quadratic optimization problems, but have difficulty setting up the
quadratic form for [sum(x) - s]^2.
(Dynamic programming over [0, sum(x)]? A genetic algorithm? Can anyone
contribute a binary GA optimization sample?)
--------------------------------------------------------------------------
I need to pick 2-40 elements out of a 50-200-element-long vector.
$offtext
set i /i1*i200/;
scalars
k /40/
target /3.1415926535/
;
parameter p(i);
p(i) = uniform(-100,100);
binary variable x(i);
variable z,s;
equation e1,e2,obj;
e1.. s =e= sum(i, p(i)*x(i));
e2.. sum(i,x(i)) =e= k;
obj.. z =e= sqr(s-target);
model m /all/;
option optcr=0,miqcp=cplex;
solve m minimizing z using miqcp;
option x:1;
display p,x.l;
scalar value,sqdeviation,num;
value = sum(i, p(i)*x.l(i));
sqdeviation = sqr(value - target);
num = sum(i, x.l(i));
display target,value,sqdeviation,num;
---- 54 PARAMETER p i1 -65.651, i2 68.653, i3 10.075, i4 -39.772, i5 -41.558, i6 -55.189, i7 -30.034 ---- 54 VARIABLE x.L i1 1.0, i8 1.0, i11 1.0, i13 1.0, i31 1.0 ---- 60 PARAMETER target = 3.142 |
The model solves very quickly: 2 seconds, but that is very much dependent on the data. The results show one binary variable with a small value. The obvious repair in a MIP model would be to set the integer feasibility tolerance (option epint) to zero. This causes the solution time to increase to 11 seconds, but unfortunately does not really solve the problem for this MIQP problem:
---- 55 VARIABLE x.L i1 1.0, i8 1.0, i11 1.0, i13 1.0, i14 -7.57698E-7 |
We still have a tolerance issue. The log indicates there is some issue with scaling:
Warning: integer solution contains unscaled infeasibilities. |
I tried to disable scaling by scaind = -1, but that gave the same message. It looks like we have some numerical issues here (although the data looks quite benign).
This model can actually be solved as a MIP. The quadratic objective can be replaced by a linear one by minimizing the absolute deviation. The optimal solution should be the same for both objectives (apart from the issue of existence of multiple optimal solutions). The linear model can look like:
$ontext
Linear version
$offtext
*
* data
*
set i /i1*i200/;
scalars
k /40/
target /3.1415926535/
;
parameter p(i);
p(i) = uniform(-100,100);
*
* model
*
binary variable x(i);
variable z,s,d1,d2;
positive variable d1,d2;
equation e1,e2,e3,obj;
e1.. s =e= sum(i, p(i)*x(i));
e2.. sum(i,x(i)) =e= k;
e3.. d1-d2 =e= s-target;
obj.. z =e= d1+d2;
model m /all/;
option optcr=0, mip=cplex;
solve m minimizing z using mip;
*
* do some reporting
*
option x:1;
display p,x.l;
scalar value,sqdeviation,num;
value = sum(i, p(i)*x.l(i));
sqdeviation = sqr(value - target);
num = sum(i, x.l(i));
display target,value,sqdeviation,num;
The result are now:
---- 42 VARIABLE x.L i11 1.0, i19 1.0, i63 1.0, i92 1.0, i101 1.0, i102 1.0, i103 1.0, i104 1.0, i105 1.0 ---- 48 PARAMETER target = 3.142 |
This solves faster and seems numerically a little bit more stable.
No comments:
Post a Comment