set.seed(123) nO <- 10L ## number of observations nA <- 100L ## number of assets mData <- array(rnorm(nO * nA, sd = 0.05), dim = c(nO, nA)) #Creating sample stock observations library("quadprog") aMat <- array(1, dim = c(1,nA)) bVec <- 1 zeros <- array(0, dim = c(nA,1)) solQP <- solve.QP(cov(mData), zeros, t(aMat), bVec, meq = 1) #Minimize optimization solQP$solution
I added the line that explicitly sets the seed to make the runs reproducible.
The model is a bit difficult to decipher in this R code, but it is a simplified traditional MV model:
Mean-Variance Portfolio Optimization Model |
---|
\[\begin{align} \min\> & \color{darkred}x'\color{darkblue}Q\color{darkred}x - \color{darkblue}\lambda \bar{\color{darkblue}r}'\color{darkred}x\\ & \sum_i \color{darkred}x_i = 1 \\ & \color{darkred}x_i \ge 0\end{align}\] |
Instead of having the expected returns in the objective, we also see versions of this model with a linear constraint that puts a minimum value on the expected return of the portfolio.
> solQP <- solve.QP(cov(mData), zeros, t(aMat), bVec, meq = 1) #Minimize optimization Error in solve.QP(cov(mData), zeros, t(aMat), bVec, meq = 1) : matrix D in quadratic function is not positive definite! >
Solution #1
> Q <- cov(mData) > eigen(Q)$values [1] 3.977635e-02 3.789523e-02 3.258164e-02 3.093782e-02 2.605389e-02 2.466463e-02 2.127899e-02 [8] 1.897506e-02 1.627900e-02 1.871945e-17 1.454315e-17 8.114384e-18 5.109196e-18 4.392427e-18 [15] 3.680766e-18 2.728485e-18 2.649302e-18 2.192073e-18 1.937634e-18 1.914487e-18 1.856811e-18 [22] 1.449321e-18 1.391434e-18 1.211216e-18 1.196721e-18 1.127119e-18 9.776146e-19 9.670197e-19 [29] 8.257462e-19 7.572554e-19 6.283522e-19 6.030820e-19 4.056541e-19 3.610128e-19 3.604800e-19 [36] 2.770106e-19 2.530797e-19 2.263314e-19 2.133854e-19 2.089700e-19 2.024130e-19 2.006560e-19 [43] 1.688435e-19 9.412675e-20 7.033666e-20 7.012531e-20 5.942729e-20 5.673501e-20 4.444257e-20 [50] 4.259787e-20 3.124233e-21 -2.017011e-20 -2.269857e-20 -3.034668e-20 -3.075493e-20 -3.262642e-20 [57] -4.345710e-20 -6.351674e-20 -7.347496e-20 -7.739535e-20 -8.065131e-20 -9.318415e-20 -1.004645e-19 [64] -1.048992e-19 -1.185271e-19 -1.315539e-19 -1.362383e-19 -1.376705e-19 -1.456396e-19 -1.697860e-19 [71] -1.700317e-19 -1.745068e-19 -1.794480e-19 -1.835296e-19 -1.960933e-19 -2.017457e-19 -2.332259e-19 [78] -2.377082e-19 -3.232830e-19 -3.468225e-19 -3.844707e-19 -4.310383e-19 -4.467493e-19 -4.540707e-19 [85] -4.977900e-19 -6.159103e-19 -6.338311e-19 -6.484161e-19 -6.489508e-19 -6.551306e-19 -6.664641e-19 [92] -7.276688e-19 -7.422977e-19 -7.927766e-19 -1.106209e-18 -1.707077e-18 -1.742870e-18 -2.465438e-18 [99] -1.859346e-17 -3.473638e-17
Most of the eigenvalues are basically zero with quite a few just a bit negative. A positive-definite matrix would have only positive values. A positive semi-definite (PSD) matrix would allow positive and zero eigenvalues but no negative ones.
> Q <- nearPD(cov(mData))$mat > solQP <- solve.QP(Q, zeros, t(aMat), bVec, meq = 1) > solQP$solution [1] 0.007861808 0.004649581 0.012677143 0.009135964 0.011926439 0.013624409 0.007916747 0.004307772 [9] 0.011808486 0.011085302 0.010229029 0.013080240 0.010085504 0.010998879 0.011099784 0.008520520 [17] 0.007451730 0.012424371 0.012677330 0.007013515 0.009070013 0.010422999 0.011041407 0.008768456 [25] 0.007594051 0.010845828 0.008091472 0.010219700 0.017100764 0.005583947 0.011981802 0.007366402 [33] 0.013770522 0.009063828 0.010744898 0.007836608 0.009580356 0.007065588 0.008219370 0.010592597 [41] 0.011775598 0.011112051 0.005555992 0.014596684 0.016019481 0.008870919 0.010940604 0.005662200 [49] 0.007056997 0.012274299 0.006510091 0.012764160 0.008258651 0.009945869 0.009891680 0.007318763 [57] 0.014960558 0.008255708 0.009655097 0.010054760 0.007309219 0.008761360 0.011616297 0.010300254 [65] 0.012043733 0.011963495 0.009991699 0.010353212 0.013896332 0.011619403 0.007160011 0.007202327 [73] 0.008530819 0.010918658 0.004754215 0.008877062 0.008526937 0.010251250 0.008190446 0.006754629 [81] 0.013209816 0.014849748 0.012213860 0.008888433 0.018188014 0.009756690 0.014786487 0.014893706 [89] 0.009737739 0.013256532 0.009990737 -0.001155889 0.009264992 0.007733719 0.014505131 0.011040198 [97] 0.012859850 0.008987104 0.008458713 0.004497735
> eigen(Q)$values
[1] 3.977635e-02 3.789522e-02 3.258164e-02 3.093782e-02 2.605389e-02 2.466462e-02 2.127899e-02 1.897506e-02
[9] 1.627900e-02 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10
[17] 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10
[25] 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10
[33] 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10
[41] 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10
[49] 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10
[57] 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10
[65] 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10
[73] 3.977635e-10 3.977635e-10 3.977635e-10 3.977635e-10 3.977634e-10 3.977634e-10 3.977634e-10 3.977634e-10
[81] 3.977634e-10 3.977634e-10 3.977634e-10 3.977634e-10 3.977634e-10 3.977634e-10 3.977634e-10 3.977634e-10
[89] 3.977634e-10 3.977634e-10 3.977634e-10 3.977634e-10 3.977634e-10 3.977634e-10 3.977634e-10 3.977634e-10
[97] 3.977634e-10 3.977634e-10 3.977633e-10 3.977633e-10
Solution #2
> Q <- cov(mData) + diag(1e-6,nA)
Fixing the model
- We want to pay attention to returns. So let's add a constraint that the expected portfolio returns are at least 0.035. This also has the effect that just a few assets will be in our solution. Which makes them easier to compare. The constraint can be stated as \[\sum_i \bar{r}_i \ge 0.035\] where \[\bar{r}_i = \frac{\sum_t r_{t,i}}{T}\] are the mean returns. Here \(T\) indicates the number of observations, and \(r_{t,i}\) are the returns.
- We also want to explicitly state that variables are non-negative: \(x_i \ge 0\). This is called "no short-selling",
Portfolio Model algebraic formulation | matrix notation |
---|---|
\[\begin{align} \min & \sum_{i,j} \color{darkblue}q_{i,j} \color{darkred}x_i \color{darkred}x_j \\ & \sum_i \color{darkred}x_i = 1 \\ & \sum_i \bar{\color{darkblue}r}_i \color{darkred}x_i \ge 0.035\\ & \color{darkred}x_i \ge 0\end{align}\] | \[\begin{align} \min \>& \color{darkred}x' \color{darkblue}Q \color{darkred}x \\ & \color{darkblue}e'\color{darkred}x = 1 \\ & \bar{\color{darkblue}r}' \color{darkred}x \ge 0.035\\ & \color{darkred}x \ge 0\end{align}\] |
> v <- c(0,0,3,0,0,0,1,0,0,0,2,0,0,0,0) > v [1] 0 0 3 0 0 0 1 0 0 0 2 0 0 0 0 > names(v) <- paste("v",1:length(v),sep="") > v[v>0] v3 v7 v11 3 1 2
> Q <- nearPD(cov(mData))$mat > aMat <- rbind( + c(rep(1,nA)), # sum(x)=1 + colMeans(mData), # sum(mu(i)*x(i))>=0.035 + diag(1,nA)) # for bounds x(i)>=0 > bVec <- c(1, + 0.035, + rep(0,nA)) > solQP <- solve.QP(Q, zeros, t(aMat), bVec, meq = 1) > xsol <- solQP$solution > names(xsol) = paste("x",1:100,sep="") > xsol[xsol>1e-4] x30 x75 x88 0.1465990 0.5314489 0.3219522
> library(CVXR) > x <- Variable(nA) > w <- Variable(nO) > prob <- Problem( + Minimize(quad_form(x,cov(mData))), + list(sum(x) == 1, + t(colMeans(mData)) %*% x >= 0.035, + x >= 0) + ) > result <- solve(prob, verbose=T) ----------------------------------------------------------------- OSQP v0.6.0 - Operator Splitting QP Solver (c) Bartolomeo Stellato, Goran Banjac University of Oxford - Stanford University 2019 ----------------------------------------------------------------- problem: variables n = 100, constraints m = 102 nnz(P) + nnz(A) = 5350 settings: linear system solver = qdldl, eps_abs = 1.0e-05, eps_rel = 1.0e-05, eps_prim_inf = 1.0e-04, eps_dual_inf = 1.0e-04, rho = 1.00e-01 (adaptive), sigma = 1.00e-06, alpha = 1.60, max_iter = 10000 check_termination: on (interval 25), scaling: on, scaled_termination: off warm start: on, polish: on, time_limit: off iter objective pri res dua res rho time 1 0.0000e+00 1.00e+00 1.00e+02 1.00e-01 2.88e-03s 200 8.7535e-04 5.96e-04 1.36e-06 1.75e-02 9.38e-03s 400 9.2707e-04 5.09e-04 1.16e-06 1.75e-02 1.41e-02s 600 9.7703e-04 4.36e-04 9.94e-07 1.75e-02 1.99e-02s 800 1.0239e-03 3.72e-04 8.50e-07 1.75e-02 2.58e-02s 1000 1.0671e-03 3.18e-04 7.27e-07 1.75e-02 3.11e-02s 1200 1.1062e-03 2.72e-04 6.21e-07 1.75e-02 3.76e-02s 1400 1.1413e-03 2.33e-04 5.31e-07 1.75e-02 4.35e-02s 1600 1.1726e-03 1.99e-04 4.54e-07 1.75e-02 4.90e-02s 1800 1.2001e-03 1.70e-04 3.88e-07 1.75e-02 5.51e-02s 2000 1.2243e-03 1.45e-04 3.32e-07 1.75e-02 6.20e-02s 2200 1.2455e-03 1.24e-04 2.84e-07 1.75e-02 6.81e-02s 2400 1.2639e-03 1.06e-04 2.43e-07 1.75e-02 7.33e-02s 2600 1.2799e-03 9.09e-05 2.07e-07 1.75e-02 7.83e-02s 2800 1.2938e-03 7.77e-05 1.77e-07 1.75e-02 8.51e-02s 3000 1.3058e-03 6.65e-05 1.52e-07 1.75e-02 9.20e-02s 3200 1.3161e-03 5.68e-05 1.30e-07 1.75e-02 1.00e-01s 3400 1.3251e-03 4.86e-05 1.11e-07 1.75e-02 1.08e-01s 3600 1.3327e-03 4.15e-05 9.48e-08 1.75e-02 1.21e-01s 3800 1.3394e-03 3.55e-05 8.11e-08 1.75e-02 1.28e-01s 4000 1.3450e-03 3.04e-05 6.93e-08 1.75e-02 1.35e-01s 4200 1.3499e-03 2.60e-05 5.92e-08 1.75e-02 1.43e-01s 4400 1.3541e-03 2.22e-05 5.07e-08 1.75e-02 1.50e-01s 4550 1.3568e-03 1.97e-05 4.50e-08 1.75e-02 1.56e-01s plsh 1.3790e-03 3.33e-16 1.21e-15 --------- 1.58e-01s status: solved solution polish: successful number of iterations: 4550 optimal objective: 0.0014 run time: 1.58e-01s optimal rho estimate: 2.58e-02 > xsol <- result$getValue(x) > names(xsol) = paste("x",1:100,sep="") > xsol[xsol>1e-4] x30 x75 x88 0.1465989 0.5314489 0.3219522 >
Here we use our covariance matrix directly. Both CVXR and OSQP accept it, even though it has tiny negative numbers. However, the OSQP documentation warns that this may be a bit dangerous [3]:
We recommend the user to check the convexity of their problem before passing it to OSQP! If the user passes a non-convex problem we do not assure the solver will be able to detect it.
OSQP will try to detect non-convex problems by checking if the residuals diverge or if there are any issues in the initial factorization (if a direct method is used). It will detect non-convex problems when one or more of the eigenvalues of
P
are “clearly” negative, i.e., whenP + sigma * I
is not positive semidefinite. However, it might fail to detect non-convexity whenP
has slightly negative eigenvalues, i.e., whenP + sigma * I
is positive semidefinite andP
is not.
Solution #3
Means-adjusted-returns Portfolio Optimization Model |
---|
\[\begin{align} \min\> & \sum_t \color{darkred}w_t^2 \\ &\color{darkred}w_t = \sum_i (\color{darkblue}r_{t,i} - \bar{\color{darkblue}r}_i)\color{darkred}x_i \\ & \sum_i \color{darkred}x_i = 1 \\ & \sum_i \bar{\color{darkblue}r}_i \color{darkred}x_i \ge 0.035\\ & \color{darkred}x_i \ge 0\end{align}\] |
Some of the advantages of this model are:
- The quadratic objective is benign. It is sparse and safely positive definite.
- The data is smaller than a full-blown covariance matrix given that the number of observations is smaller than the number of instruments.
- The number of quadratic variables is smaller (again, because \(T<N\)).
> library(CVXR) > x <- Variable(nA) > w <- Variable(nO) > adjRet <- mData - matrix(colMeans(mData),nrow=nO,ncol=nA,byrow=T) > prob <- Problem( + Minimize(sum_squares(w)), + list(w == adjRet %*% x, + sum(x)==1, + t(colMeans(mData)) %*% x >= 0.035, + x >= 0) + ) > result <- solve(prob, verbose=T) ----------------------------------------------------------------- OSQP v0.6.0 - Operator Splitting QP Solver (c) Bartolomeo Stellato, Goran Banjac University of Oxford - Stanford University 2019 ----------------------------------------------------------------- problem: variables n = 110, constraints m = 112 nnz(P) + nnz(A) = 1320 settings: linear system solver = qdldl, eps_abs = 1.0e-05, eps_rel = 1.0e-05, eps_prim_inf = 1.0e-04, eps_dual_inf = 1.0e-04, rho = 1.00e-01 (adaptive), sigma = 1.00e-06, alpha = 1.60, max_iter = 10000 check_termination: on (interval 25), scaling: on, scaled_termination: off warm start: on, polish: on, time_limit: off iter objective pri res dua res rho time 1 0.0000e+00 1.00e+00 1.00e+02 1.00e-01 1.15e-03s 200 5.3515e-03 1.49e-03 4.04e-05 1.00e-01 4.33e-03s 400 7.0787e-03 7.87e-04 6.50e-06 1.00e-01 8.41e-03s 600 7.3521e-03 7.13e-04 5.90e-06 1.00e-01 1.23e-02s 800 7.6413e-03 6.45e-04 5.34e-06 1.00e-01 1.61e-02s 1000 7.9373e-03 5.84e-04 4.84e-06 1.00e-01 1.93e-02s 1200 8.2334e-03 5.29e-04 4.38e-06 1.00e-01 2.27e-02s 1400 8.5244e-03 4.79e-04 3.96e-06 1.00e-01 2.64e-02s 1600 8.8067e-03 4.33e-04 3.59e-06 1.00e-01 3.00e-02s 1800 9.0778e-03 3.92e-04 3.25e-06 1.00e-01 3.34e-02s 2000 9.3358e-03 3.55e-04 2.94e-06 1.00e-01 3.69e-02s 2200 9.5798e-03 3.22e-04 2.66e-06 1.00e-01 4.05e-02s 2400 9.8091e-03 2.91e-04 2.41e-06 1.00e-01 4.43e-02s 2600 1.0024e-02 2.64e-04 2.18e-06 1.00e-01 4.81e-02s 2800 1.0224e-02 2.39e-04 1.98e-06 1.00e-01 5.12e-02s 3000 1.0410e-02 2.16e-04 1.79e-06 1.00e-01 5.48e-02s 3200 1.0582e-02 1.96e-04 1.62e-06 1.00e-01 5.84e-02s 3400 1.0741e-02 1.77e-04 1.47e-06 1.00e-01 6.19e-02s 3600 1.0887e-02 1.60e-04 1.33e-06 1.00e-01 6.55e-02s 3800 1.1022e-02 1.45e-04 1.20e-06 1.00e-01 6.90e-02s 4000 1.1145e-02 1.31e-04 1.09e-06 1.00e-01 7.53e-02s 4200 1.1259e-02 1.19e-04 9.85e-07 1.00e-01 8.18e-02s 4400 1.1362e-02 1.08e-04 8.92e-07 1.00e-01 8.95e-02s 4600 1.1457e-02 9.75e-05 8.07e-07 1.00e-01 9.44e-02s 4800 1.1544e-02 8.83e-05 7.31e-07 1.00e-01 1.04e-01s 5000 1.1623e-02 7.99e-05 6.62e-07 1.00e-01 1.09e-01s 5200 1.1695e-02 7.24e-05 5.99e-07 1.00e-01 1.13e-01s 5400 1.1761e-02 6.55e-05 5.42e-07 1.00e-01 1.20e-01s 5600 1.1821e-02 5.93e-05 4.91e-07 1.00e-01 1.26e-01s 5800 1.1876e-02 5.37e-05 4.45e-07 1.00e-01 1.31e-01s 6000 1.1925e-02 4.86e-05 4.02e-07 1.00e-01 1.37e-01s 6200 1.1970e-02 4.40e-05 3.64e-07 1.00e-01 1.45e-01s 6400 1.2011e-02 3.98e-05 3.30e-07 1.00e-01 1.58e-01s 6600 1.2049e-02 3.61e-05 2.99e-07 1.00e-01 1.72e-01s 6800 1.2082e-02 3.27e-05 2.70e-07 1.00e-01 1.81e-01s 7000 1.2113e-02 2.96e-05 2.45e-07 1.00e-01 1.90e-01s 7200 1.2141e-02 2.68e-05 2.22e-07 1.00e-01 1.98e-01s 7400 1.2166e-02 2.42e-05 2.01e-07 1.00e-01 2.01e-01s 7600 1.2189e-02 2.19e-05 1.82e-07 1.00e-01 2.05e-01s 7800 1.2210e-02 1.99e-05 1.64e-07 1.00e-01 2.08e-01s plsh 1.2411e-02 6.66e-16 1.28e-15 --------- 2.09e-01s status: solved solution polish: successful number of iterations: 7800 optimal objective: 0.0124 run time: 2.09e-01s optimal rho estimate: 3.27e-01 > xsol <- result$getValue(x) > names(xsol) = paste("x",1:100,sep="") > xsol[xsol>1e-4] x30 x75 x88 0.1465989 0.5314489 0.3219522
> Q <- nearPD(cov(mData))$mat > D <- rbind( + cbind(Q,matrix(0,nrow=nA,ncol=nO)), + cbind(matrix(0,nrow=nO,ncol=nA),diag(1e-6,nO)) + ) > aMat <- rbind( + cbind(adjRet,diag(-1,nO)), # sum(i,adjRet(t,i)*x(i))-w(t)=0 + c(rep(1,nA),rep(0,nO)), # sum(x)=1 + c(colMeans(mData),rep(0,nO)), # sum(mu(i)*x(i))>=0.035 + cbind(diag(1,nA),matrix(0,nrow=nA,ncol=nO)) # for bounds x(i)>=0 + ) > bVec <- c(rep(0,nO), + 1, + 0.035, + rep(0,nA)) > zeros <- rep(0,nA+nO) > solQP <- solve.QP(D, zeros, t(aMat), bVec, meq = nO+1) > xsol <- solQP$solution[1:nA] > names(xsol) = paste("x",1:100,sep="") > xsol[xsol>1e-4] x30 x75 x88 0.1465990 0.5314489 0.3219522
Conclusion
- It is not very suited for more complex models: the matrix oriented input format is just too cumbersome.
- The quadratic coefficient matrix must be strictly positive definite.
- All variables are quadratic, linear variables are not available.
- The models are more readable, assuming you are familiar with matrix notation.
- We can formulate models with linear variables without a problem.
- Models with a quadratic coefficient matrix that are borderline positive semi-definite are accepted by both CVXR and the solver I used: OSQP.
- nearPD(A) to find a positive-definite matrix close to A.
- Printing a sparse vector using names.
References
- R minimum variance portfolio: solve non invertible matrix, https://stackoverflow.com/questions/64067591/r-minimum-variance-portfolio-solve-non-invertible-matrix
- Covariance Matrix not positive definite in portfolio models, https://yetanothermathprogrammingconsultant.blogspot.com/2018/04/covariance-matrix-not-positive-definite.html
- OSQP, Status values and errors, https://osqp.org/docs/interfaces/status_values.html
No comments:
Post a Comment