Given a set of numbers with mean μ₀, find a subset of at most K numbers that has a mean μ that is as close as possible to μ₀. Of course we try immediately to cast this as an optimization problem. The obvious MINLP formulation would be:
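The model itself is not reproduced here; a sketch of such a MINLP, with given numbers v_i, binary selection variables x_i, and variables *cnt* and μ (symbol names assumed from the surrounding text), could look like:

```latex
\begin{align*}
\min\;& (\mu - \mu_0)^2 \\
\text{count:}\quad & \mathit{cnt} = \sum_i x_i \\
\text{avg:}\quad & \mu = \frac{\sum_i v_i x_i}{\mathit{cnt}} \\
& 1 \le \mathit{cnt} \le K \\
& x_i \in \{0,1\}
\end{align*}
```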

There are a number of interesting aspects. Not all of them are completely trivial.

We added a lower bound of 1 to *cnt* (the size of the subset). This way we can safely divide by *cnt*. We could replace the division by a multiplication (this is the usual approach to protect against 'division by zero'), but in this case this is not needed anymore. Also note that a mean or average of zero numbers is not really defined, so this lower bound really makes sense.

This model contains two non-linearities: (1) the objective is quadratic and (2) we have a division in constraint *avg*. Note that if we would only consider subsets of *exactly* size K, our problem becomes much simpler. In that case *cnt* would not be a variable but rather a constant parameter, and the *avg* constraint would automatically become linear.

An obvious but interesting question is whether we can linearize this model. The quadratic objective can be approximated by using an absolute value measure. Actually, there is a monotonic transformation between the quadratic and the absolute-value criterion. This means we will reach the same optimal solution using the absolute-value method as using the quadratic objective.
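One standard way to model an absolute-value objective linearly (a sketch; the auxiliary variable z is an assumption, not necessarily the post's notation) is to minimize an upper bound on both signed deviations:

```latex
\begin{align*}
\min\;& z \\
& z \ge \mu - \mu_0 \\
& z \ge \mu_0 - \mu
\end{align*}
```

At the optimum, z = |μ − μ₀|, since the minimization pushes z down onto the larger of the two right-hand sides.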

The division is more complicated. However, we can note that we can rewrite equations *count* and *avg* as follows:
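The rewritten equations are not shown here; multiplying the *avg* constraint through by *cnt* presumably gives something along these lines:

```latex
\sum_i v_i x_i \;=\; \mu \cdot \mathit{cnt} \;=\; \mu \sum_i x_i \;=\; \sum_i \mu\, x_i
```

The remaining nonlinearity is now a product μ·x_i of a continuous and a binary variable, which is a form we know how to linearize.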

This can be linearized as follows:
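The linearized constraints themselves are not reproduced; a standard reformulation of the product y_i = μ·x_i (the variable y_i and the constants M₁, M₂ are assumptions) is:

```latex
\begin{align*}
\sum_i v_i x_i &= \sum_i y_i \\
\mu^{LO}\, x_i \;\le\; y_i &\;\le\; \mu^{UP}\, x_i \\
\mu - M_1 (1 - x_i) \;\le\; y_i &\;\le\; \mu + M_2 (1 - x_i)
\end{align*}
```

When x_i = 1 the sandwich constraints force y_i = μ; when x_i = 0 the bound constraints force y_i = 0.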

As usual it is important to find good, tight values for the big-M's. In this case we can find reasonably good values:

where μ^LO and μ^UP are bounds on μ, e.g. μ^LO = min_i {*v*_i}, μ^UP = max_i {*v*_i}.

Now let's put everything together:
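The assembled model is not reproduced here, and solving it requires a MIP solver. For small instances, though, the same problem can be brute-forced directly, which is a useful sanity check on any solver result (the data below is made up for illustration):

```python
from itertools import combinations

def best_subset(values, mu0, K):
    """Enumerate all subsets of size 1..K and return the one whose
    mean is closest to the target mu0 (brute force, exponential in K)."""
    best = None  # (deviation, subset) of the best candidate so far
    for k in range(1, K + 1):
        for sub in combinations(values, k):
            dev = abs(sum(sub) / len(sub) - mu0)
            if best is None or dev < best[0]:
                best = (dev, sub)
    return best[1]

# Hypothetical data, chosen only to exercise the function.
vals = [2.0, 3.5, 5.0, 7.5, 9.0, 11.0]
print(best_subset(vals, mu0=6.1, K=3))  # → (2.0, 7.5, 9.0), mean ≈ 6.167
```

Note that this only scales to small n and K; for larger instances the linearized MIP above is the practical route.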

Here are some results:

We see that for K=5 we actually just use 4 numbers.