Comments on Yet Another Math Programming Consultant: Linearizing an average

There is nothing complicated about that. Something...

2021-05-01T16:52:00.979-04:00

There is nothing complicated about that. Something like prob += z.

Hi Erwin, first of all, amazing work and thank you...

2021-05-01T16:28:34.493-04:00

Hi Erwin, first of all, amazing work and thank you for sharing all this knowledge!! I have a similar problem here, where i want to minimize the average. I understood most of your article, however i have no clue how to implement it on pulp (python), are you familiar with pulp coding structure? how would it be defined the objective function input for this example of yours ?

Oh my gosh, thank you! Super-helpful to see the m...

2020-10-05T23:09:19.700-04:00

Oh my gosh, thank you! Super-helpful to see the model!! (In the meantime, I think I've just about programmed my version in, but now I can check against your model to make sure that I am handling the variables correctly!!)

I have added the code (using the GAMS Modeling Lan...

2020-10-05T19:03:20.291-04:00

I have added the code (using the GAMS Modeling Language). Regarding your questions: (1) in my model mu(j) is a free variable but that can be tightened to what you suggest, (2) y(i,j) can be zero even if the lowerbound on mu(j) is non-zero. We need to be careful about this. (3) No.

Now that I have tried programming some of the abov...

2020-10-05T13:21:01.530-04:00

Now that I have tried programming some of the above into my own code, I realize how much of this algorithm I actually still don't fully understand! I have a couple of follow up questions, if you don't mind. My understanding starts falling apart shortly after the following equation is written, just above the blue box summarizing the complete linearized model:

sum(i, mu(j)*x(i,j)) = sum(i, p(i)*x(i,j))

At this point, you state that since mu(j)*x(i,j) is a product of a binary and continuous variable, we know how to linearize. At that point, I referred to your other blog: http://yetanothermathprogrammingconsultant.blogspot.com/2008/05/multiplication-of-continuous-and-binary.html

In that other blog post, I think that what you are calling "z" is what you call "y" in this post, and the "x" referenced there is actually "mu" here, and the delta referenced there would be "x(i,j)" here. "U" in this post represents the upper bound of mu. Using that other post for guidance, we should be able to rewrite y(i,j) = mu(j)*x(i,j) as:

(1) min{0, min_mu} <= y(i,j) <= max{0, U}
(2) min_mu * x(i,j) <= y(i,j) <= U * x(i,j)
(3) mu(j) - U * (1 - x(i,j)) <= y(i,j) <= mu(j) - min_mu * (1 - x(i,j))

The above are the equations in the box at the top of your other post ("Multiplication of a continuous and a binary variable").

But we're not talking about plain 'ole mu(j)*x(i,j) but rather the sum over this product, e.g., sum(i, mu(j)*x(i,j)) ... don't we need to sum the above inequalities over "i" to get the final constraints for the problem presented in this post? This was the first place where I got a bit confused (the complete linearized model presented in this post at the bottom in the blue box do not seem to have been summed over "i". But maybe an inequality summed over i is mathematically identical to the set of the constituent inequalities ... yeah, that's true as long as each thing being summed is equal to zero or a positive number, I think??).

The 2nd point of confusion is in comparing the above equations I just translated from your other blog post and the blue box equations in this post. Let me list them and label them below, to be crystal clear:

(A) y(i,j) <= U*x(i,j)
(B) y(i,j) <= mu(j)
(C) y(i,j) >= mu(j) - U*(1-x(i,j))

(A) is the right hand side of inequality #2 above;
(B) is the right hand side of inequality #3, BUT ONLY IF min_mu (lower bound of mu) is set to zero
(C) is the left hand side of inequality #3

My questions are:
* Are you indeed assuming that the lower bound of mu is zero?
* Why would the lower bound not instead be set to the lowest value within the set of prices, just like U is set by the highest value among the prices?
* Does using zero as the lower bound of mu cause any problems and/or have advantages over using the actual lowest value in the set of prices (which would be 0.671 in your example)?

Finally, I was wondering if you could verify that you used 8.563 as the value for U? And ... would be super-helpful if I could actually see the model/code that produced the results you show at the bottom of this post in the grey boxed. (Because I am still struggling with what is a constraint, what is something that gets minimized in the objective function ... all things that could be cleared up easily by seeing actual code). Just wondering if you would be willing to share the source code on your blog?

Thanks!

awesome! Thanks for the quick reply. I'll tr...

2020-10-04T20:54:56.530-04:00

awesome! Thanks for the quick reply. I'll try to "digest" your last suggestion and incorporate into the code. I'm not quite sure what "identical" in the context of bins means, but I am sure such will become obvious once I ponder your suggestion.

I again just want to express my gratitude for your writing this post! (back in 2017, a time when I would have had no clue what this post was talking about, lol!)

If all bins are identical, you can just use a para...

2020-10-04T20:26:10.081-04:00

If all bins are identical, you can just use a parameter (constant) used(j) with the first minused entries equal to 1 and the others 0.

I would write it as sum(i, x(i,j)) >= used(j), ...

2020-10-04T20:14:34.377-04:00

I would write it as sum(i, x(i,j)) >= used(j), sum(j, used(j)) >= minused with used(j) a binary variable. Note that some additional used bins may have used(j)=0.

Your approach should work just fine. Best thing to...

2020-10-04T19:50:45.068-04:00

Your approach should work just fine. Best thing to convince yourself, is to try out the formulation with a small data set. Just like I did for this post. Such an experiment should not take much time.

Thank you for a brilliant and straightforward disc...

2020-10-04T19:30:12.335-04:00

Thank you for a brilliant and straightforward discussion regarding how to approach this problem. I am so happy that I found your blog! As it happens, I've spent a solid 3+ days banging my head on the desk trying to figure out how to code the problem of minimizing the spread of averages across "buckets", and my approach was getting me nowhere (instead of treating the average value as a variable whose values are to be populated by the solver, I was trying to algorithmically compute the average values and then do a minimization on either the "excess" or "shortage" variables that are required for the "bucket sum" to match up to the average value. The problem I was wrangling with was how to deal with the fact that my average value computation (which used an incremental averaging/running sum technique to keep things nice and linear) was not suitable for computing the average of a set of numbers in which zeros are present and should not be allowed to "dilute" the mean value. Even that situation would have been OK, if all "buckets" were guaranteed to have the same number of nonzero values, but such was not the case for the problem I am solving. Your approach is sheer genius! I hope I can understand it for at least as long as it takes to transfer some of the fundamental methodology to my own problem!

You very helpfully show the result for when all buckets have at least 1 price, and the other extreme when no buckets are required to be filled (which results in a "trivial" solution of only 1 bucket getting filled and fully meeting all constraints, lol!)
*Question*: What if one were to require some *MINIMUM* number of buckets to be filled, but not necessarily *ALL* of them? So for example, require:

\sum_i x_{i,j} + SUMMEDPRICES * ( 1 - isBucketUsed{j} ) \ge 1

and

\sum_i x_{i,j} - SUMMEDPRICES * isBucketUsed{j} \le 0

where:

SUMMEDPRICES is the sum over all prices, eg SUMMEDPRICES = \sum_i p_{i}

isBucketUsed_{j} would be a binary variable equal to 1 when the bucket has one or more prices already assigned to it, and 0 otherwise

And to insure that some required minimum number of buckets (minBucketsUsed) were assigned prices:

\sum_j isBucketUsed{j} \ge minBucketsUsed

Would this strategy work for the case that falls somewhere in between the 2 extremes that you present? Or have I just messed up the linear aspect of the problem by introducing another binary variable and additional constraints? (I have only been working with this kind of modeling for about 2 weeks now, so I am still very new at working with this kind of problem!) Thanks for any advice you are willing to give on my approach for this "in-between" situation (which is actually the situation that I am struggling with in the problem that I am trying to solve for my own application!)