Saturday, December 7, 2019

Elementwise vs matrix multiplication


Introduction


In this post I want to demonstrate some interesting issues in how CVXR and CVXPY deal with elementwise vs matrix multiplication and compare that to the host languages (R and Python, respectively). CVXR and CVXPY deviate a bit from what their host languages do. When implementing optimization models with these tools, it is important to be aware of these details.

Elementwise and matrix multiplication 


There are two often used methods to perform the multiplication of matrices (and vectors). The first is simply elementwise multiplication: \[c_{i,j} = a_{i,j} \cdot b_{i,j}\] In mathematics, this is sometimes referred to as the Hadamard product. The notation is typically: \[C = A \circ B\] but sometimes we see: \[C = A \odot B\] This product only works when \(A\) and \(B\) have the same shape. I.e. \[\begin{matrix} C&=&A&\circ&B\\ (m \times n) &&(m \times n)&&(m \times n)\end{matrix}\]

The standard matrix multiplication \(C=A\cdot B\) or \[c_{i,j} = \sum_k a_{i,k} b_{k,j}\] has a different rule for conformance:  \[\begin{matrix} C&=&A&\cdot&B\\ (m \times n) &&(m \times k)&&(k \times n)\end{matrix}\] Most of the time, the dot operator is dropped and we write \(C = A B\).

In optimization modeling both forms are used a lot.

R, CVXR


R has two operators for multiplication:
  • * for elementwise multiplication
  • %*% for matrix multiplication

A vector is considered as a column vector (i.e. an \(n\)-vector is like a \(n \times 1\) matrix).  There is one special thing in R, as shown here:


> m <- 2
> n <- 3
> A <- matrix(c(1,2,3,4,5,6),m,n)
> A
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> 
> x <- c(1,2,3)
>
 
#
# matrix multiplication
# 
> A %*% x
     [,1]
[1,]   22
[2,]   28
> 

#
# elementwise multiplication
#
> A * A
     [,1] [,2] [,3]
[1,]    1    9   25
[2,]    4   16   36

#
# but this also works
#
> A * x
     [,1] [,2] [,3]
[1,]    1    9   10
[2,]    4    4   18
>


The last multiplication is surprising as \(A\) is a \(2 \times 3\) matrix and \(x\) is different in shape. Well, R may extend and recycle vectors to make them as large as needed. In this case \(x\) is duplicated and then considered as a  \(2 \times 3\) matrix. More or less like:


> x
[1] 1 2 3
> matrix(x,2,3)
     [,1] [,2] [,3]
[1,]    1    3    2
[2,]    2    1    3


The modeling tool CVXR follows R notation and implements both * and %*% (for elementwise and matrix multiplication). However, CVXR is not implementing the extending and recycling of vectors that are too small.

Concept: recycling


Just to emphasize the concept of recycling. If an operation requires two vectors of the same length, R may make the shorter vector longer by recycling (duplicating). Here is an example:


> a <- 1:10
> b <- 1:2
> c <- a + b
> c
 [1]  2  4  4  6  6  8  8 10 10 12


In this example the vector a has elements 1 through 10. The vector b is too short, so it is recycled. When added to a, b is functionally equal to rep(c(1,2),5). When multiples of b do not exactly fit a, we effectively have a fractional duplication number. E.g. when we use b <- 1:3, we get a message:
Warning message:
In a + b : longer object length is not a multiple of shorter object length

This recycling trick is somewhat unique to R (I don't know about other languages doing this).

Example


In [2] the product: \[b_{i,j} = a_{i,j} \cdot x_i \] is implemented in R with the elementwise product:


> m <- 2
> n <- 3
> A <- matrix(c(1,2,3,4,5,6),m,n)
> A
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> x <- c(1,2)
> B <- A*x
> B
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    4    8   12


This again uses recycling of \(x\). CVXR is not doing this automatically. We can see this here:


> library(CVXR)
> x <- Variable(m)
> B <- Variable(m,n)
> e <- rep(1,n)
> problem <- Problem(Minimize(0),
+                    list(x == c(1,2), 
+                         B == A * x ))
Error in sum_shapes(lapply(object@args, function(arg) { : 
  Incompatible dimensions


Here \(x\) is now a CVXR variable. As the vector \(x\) is not recycled, we end up with two different shapes and elementwise multiplication is refused. So how do we do something like this in CVXR?

The recycling operation:


> x
[1] 1 2
> matrix(x,m,n)
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    2    2


can be expressed in matrix notation as: \[x \cdot e^T\] where \(e\) is a (column) vector of ones. This is sometimes called an outerproduct. I.e. we can write our assignment as \[B = A \circ (x \cdot e^T)\] In a CVXR model this can look like:


> library(CVXR)
> x <- Variable(m)
> B <- Variable(m,n)
> e <- rep(1,n)
> problem <- Problem(Minimize(0),
+                    list(x == c(1,2), 
+                         B == A * (x %*% t(e))))
> sol <- solve(problem)
> sol$status
[1] "optimal"
> sol$getValue(x)
     [,1]
[1,]    1
[2,]    2
> sol$getValue(B)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    4    8   12


Note: the constraint B == A * (x %*% t(e)) has both a matrix multiplication and a elementwise multiplication. This is rather funky.

Conclusion: if your matrix operations rely on recycling, you will need to rework things a bit to have this work correctly in CVXR. CVXR does not do recycling.

Python, CVXPY


In the previous section we saw that there are subtle differences between R's and CVXR's elementwise multiplication semantics.

Let's now look at Python and CVXPY.

Since Python 3.5 we have two multiplication operators:

  • * for elementwise multiplication 
  • @ for matrix multiplication 

CVXPY has different rules:

  • *, @  and matmul for matrix multiplication
  • multiply for elementwise multiplication

Example



import numpy as np
import cvxpy as cp

#
# In Python/numpy * indicates elementwise multiplication
#
A = np.array([[1,2],[3,4]])
B = np.array([[1,1],[2,2]])
C = A*B
print(A)
# output:
# [[ 1 2]
#  [ 3 4]]
print(C)
# output:
# [[ 1 2]
#  [ 6 8]]

#
# In CVXPY * indicates matrix multiplication
#
A = cp.Variable((2,2))
C = cp.Variable((2,2))
prob = cp.Problem(cp.Minimize(0),
                  [A == [[1,2],[3,4]],
                   C == A*B])
prob.solve(verbose=True)
print(A.value)
# output:
# [[1. 3.]
#  [2. 4.]]
print(C.value)
# [[ 7.  7.]
#  [10. 10.]]

Here we see some differences between Python/Numpy and CVXPY. First the interpretation of Python lists (the values for A) is different. And secondly: the semantics of * are different. This may cause some confusion.

References


  1. CVXR, Convex Optimization in R, https://cvxr.rbind.io/
  2. CVXR Elementwise multiplication of matrix with vector, https://stackoverflow.com/questions/59224555/cvxr-elementwise-multiplication-of-matrix-with-vector

No comments:

Post a Comment