Regression I |
---|
\[\begin{align}\min_{\color{darkred}w}\>& ||\color{darkblue}y-\color{darkblue}X\color{darkred}w||^2_2 \end{align}\] |
In Python code, this can look like:
import cvxpy as cp import numpy as np N = 200000 M = 100 X = np.random.normal(0, 1, size=(N, M)) y = np.random.normal(0, 1, size=N) w = cp.Variable(M) prob = cp.Problem(cp.Minimize(cp.sum_squares(y - X @ w))) prob.solve() print("status:",prob.status) print("obj:",prob.value)
Unfortunately, this is giving the following runtime error:
[ec2-user@ip-172-30-0-79 etc]$ python3 ls0.py Traceback (most recent call last): File "ls0.py", line 12, in <module> prob.solve() File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/problems/problem.py", line 289, in solve return solve_func(self, *args, **kwargs) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/problems/problem.py", line 567, in _solve self._construct_chains(solver=solver, gp=gp) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/problems/problem.py", line 510, in _construct_chains raise e File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/problems/problem.py", line 501, in _construct_chains self._intermediate_chain.apply(self) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/reductions/chain.py", line 65, in apply problem, inv = r.apply(problem) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/reductions/qp2quad_form/qp2symbolic_qp.py", line 60, in apply return super(Qp2SymbolicQp, self).apply(problem) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/reductions/canonicalization.py", line 58, in apply problem.objective) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/reductions/canonicalization.py", line 96, in canonicalize_tree canon_arg, c = self.canonicalize_tree(arg) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/reductions/canonicalization.py", line 99, in canonicalize_tree canon_expr, c = self.canonicalize_expr(expr, canon_args) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/reductions/canonicalization.py", line 108, in canonicalize_expr return self.canon_methods[type(expr)](expr, args) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/reductions/qp2quad_form/atom_canonicalizers/quad_over_lin_canon.py", line 29, in quad_over_lin_canon return SymbolicQuadForm(t, eye(affine_expr.size)/y, expr), [affine_expr == t] File "/home/ec2-user/.local/lib/python3.7/site-packages/numpy/lib/twodim_base.py", line 201, in eye m = zeros((N, M), dtype=dtype, order=order) MemoryError: Unable to allocate array with shape (200000, 200000) and data type float64 [ec2-user@ip-172-30-0-79 etc]$
This is interesting: CVXPY seems to allocate a \(200,000\times 200,000\) matrix here. Actually it is an identity matrix! This can be seen from the name eye. This seems to be related to the reformulation into a QP model.
To get this a bit more under control, lets make the size a bit smaller and use a memory profiler. At least we get some information about the memory usage.
[ec2-user@ip-172-30-0-79 etc]$ python3 -m memory_profiler ls1.py ----------------------------------------------------------------- OSQP v0.6.0 - Operator Splitting QP Solver (c) Bartolomeo Stellato, Goran Banjac University of Oxford - Stanford University 2019 ----------------------------------------------------------------- problem: variables n = 20100, constraints m = 20000 nnz(P) + nnz(A) = 2040000 settings: linear system solver = qdldl, eps_abs = 1.0e-05, eps_rel = 1.0e-05, eps_prim_inf = 1.0e-04, eps_dual_inf = 1.0e-04, rho = 1.00e-01 (adaptive), sigma = 1.00e-06, alpha = 1.60, max_iter = 10000 check_termination: on (interval 25), scaling: on, scaled_termination: off warm start: on, polish: on, time_limit: off iter objective pri res dua res rho time 1 0.0000e+00 4.38e+00 2.98e+04 1.00e-01 1.03e+00s 50 1.9804e+04 2.13e-09 4.01e-07 1.00e-01 1.65e+00s plsh 1.9804e+04 4.01e-15 1.03e-12 -------- 2.45e+00s status: solved solution polish: successful number of iterations: 50 optimal objective: 19804.0226 run time: 2.45e+00s optimal rho estimate: 1.35e-02 status: optimal obj: 19804.022648294507 Filename: ls1.py Line # Mem usage Increment Line Contents ================================================ 4 50.188 MiB 50.188 MiB @profile 5 def f(): 6 # N = 200000 7 50.188 MiB 0.000 MiB N = 20000 8 50.188 MiB 0.000 MiB M = 100 9 10 50.188 MiB 0.000 MiB np.random.seed(123) 11 65.477 MiB 15.289 MiB X = np.random.normal(0, 1, size=(N, M)) 12 65.707 MiB 0.230 MiB y = np.random.normal(0, 1, size=N) 13 14 65.707 MiB 0.000 MiB w = cp.Variable(M) 15 65.707 MiB 0.000 MiB prob = cp.Problem(cp.Minimize(cp.sum_squares(y - X @ w))) 16 412.086 MiB 346.379 MiB prob.solve(verbose=True) 17 412.086 MiB 0.000 MiB print("status:",prob.status) 18 412.086 MiB 0.000 MiB print("obj:",prob.value)
We see indeed this is solved as QP model (it is solved by the QP solver OSQP), and in the solve statement we have this spike in memory usage.
The allocation is measured in MiBs or Mebibytes. A mebibyte is equal to \(2^{20}=1,048,576\) bytes.
It is probably a bad idea to allocate this identity matrix like this as a fully allocated matrix. The better approach would be not to use this matrix at all. A second best approach would be to make this identity matrix a sparse matrix.
Approach 1: use a conic programming solver
This is an easy fix. Just use a solver like ECOS:
[ec2-user@ip-172-30-0-79 etc]$ python3 -m memory_profiler ls1.py ECOS 2.0.7 - (C) embotech GmbH, Zurich Switzerland, 2012-15. Web: www.embotech.com/ECOS It pcost dcost gap pres dres k/t mu step sigma IR | BT 0 +0.000e+00 -2.601e-05 +2e+04 5e-01 7e-06 1e+00 1e+04 --- --- 1 2 - | - - 1 -9.312e-01 +2.050e-02 +3e+02 2e-02 1e-07 1e+00 2e+02 0.9829 1e-04 1 2 2 | 0 0 2 -5.192e+00 +2.947e+00 +1e+02 7e-03 5e-08 8e+00 7e+01 0.6186 7e-02 2 3 3 | 0 0 3 +6.518e+02 +8.866e+02 +3e+02 6e-02 4e-07 2e+02 1e+02 0.1800 9e-01 2 2 2 | 0 0 4 +9.646e+02 +9.849e+02 +4e+00 1e-03 8e-09 2e+01 2e+00 0.9826 1e-04 7 7 6 | 0 0 5 +1.130e+03 +1.144e+03 +7e-01 2e-03 1e-09 1e+01 4e-01 0.8836 2e-02 2 3 2 | 0 0 6 +1.493e+03 +1.521e+03 +3e-01 1e-02 4e-10 3e+01 2e-01 0.9890 1e-01 1 1 1 | 0 0 7 +2.057e+03 +2.080e+03 +5e-02 8e-03 9e-11 2e+01 3e-02 0.9271 5e-02 1 2 1 | 0 0 8 +2.134e+03 +2.152e+03 +4e-02 4e-03 4e-11 2e+01 2e-02 0.8136 2e-01 2 2 2 | 0 0 9 +2.244e+03 +2.256e+03 +3e-02 4e-02 2e-11 1e+01 1e-02 0.7107 1e-01 1 1 1 | 0 0 10 +1.995e+03 +2.004e+03 +4e-03 6e-02 2e-11 9e+00 4e-03 0.0005 8e-01 1 1 1 | 0 0 11 +6.609e+02 +6.615e+02 +8e+00 8e-03 2e-12 6e-01 4e+00 0.9890 8e-01 0 0 0 | 0 0 12 +6.293e+02 +6.423e+02 +1e+01 5e-03 2e-12 1e+01 8e+00 0.5325 3e-01 1 1 1 | 0 0 13 +5.873e+02 +7.017e+02 +3e+01 2e-02 3e-12 1e+02 2e+01 0.4675 5e-01 2 3 3 | 0 0 14 +7.549e+02 +1.687e+03 +4e+00 7e-03 4e-12 9e+02 3e+00 0.9890 1e-01 1 1 1 | 0 0 15 +1.292e+03 +1.300e+03 +2e+00 2e-03 3e-12 8e+00 9e-01 0.8813 2e-01 3 2 2 | 0 0 16 +2.141e+03 +2.149e+03 +2e-01 2e-02 2e-12 8e+00 9e-02 0.9890 5e-03 1 1 1 | 0 0 17 +2.819e+03 +2.827e+03 +2e-02 4e-02 4e-12 8e+00 1e-02 0.9052 3e-02 1 1 1 | 0 0 18 +2.757e+03 +2.774e+03 +5e-02 2e-02 2e-12 2e+01 2e-02 0.9746 4e-01 1 2 2 | 0 0 19 +2.615e+03 +2.631e+03 +1e-02 8e-02 2e-12 2e+01 8e-03 0.1350 2e-01 1 1 1 | 0 0 20 +2.002e+03 +2.009e+03 +2e-03 6e-02 3e-13 7e+00 3e-03 0.0029 9e-01 1 1 1 | 0 0 21 +2.884e+03 +2.959e+03 +2e-03 8e-02 3e-12 8e+01 9e-03 0.7080 7e-01 0 0 0 | 0 0 22 +2.861e+03 +2.934e+03 +2e-03 9e-02 3e-12 7e+01 9e-03 0.0007 9e-01 1 1 1 | 0 0 23 +1.847e+03 +1.857e+03 +3e-02 5e-02 2e-13 1e+01 2e-02 0.0026 1e+00 0 0 0 | 0 0 24 +6.545e+02 +6.548e+02 +1e+01 9e-03 2e-12 3e-01 6e+00 0.9890 6e-01 0 0 0 | 0 0 25 +7.003e+02 +7.142e+02 +2e+01 4e-03 2e-12 1e+01 1e+01 0.6078 2e-01 1 1 0 | 0 0 26 +6.232e+02 +7.771e+02 +5e+01 2e-02 3e-12 2e+02 3e+01 0.5706 4e-01 2 3 3 | 0 0 27 +5.916e+02 +1.396e+03 +8e+00 3e-03 2e-12 8e+02 6e+00 0.9890 1e-01 1 1 1 | 0 0 28 +1.468e+03 +1.658e+03 +1e+00 2e-03 3e-12 2e+02 7e-01 0.9890 9e-02 1 1 1 | 0 0 29 +2.046e+03 +2.089e+03 +4e-01 1e-02 4e-13 4e+01 2e-01 0.9890 3e-02 1 1 1 | 0 0 30 +1.720e+03 +1.744e+03 +7e-02 5e-02 3e-14 2e+01 6e-02 0.1873 2e-01 1 1 1 | 0 0 31 +6.708e+02 +6.722e+02 +2e+01 7e-03 2e-12 1e+00 1e+01 0.9890 6e-01 0 0 0 | 0 0 32 +5.985e+02 +6.158e+02 +5e+01 3e-03 9e-13 2e+01 3e+01 0.9260 3e-01 1 1 1 | 0 0 33 +7.045e+02 +9.047e+02 +1e+02 2e-02 2e-12 2e+02 6e+01 0.6591 5e-01 1 2 3 | 0 0 34 +4.246e+02 +8.139e+02 +3e+01 2e-03 1e-13 4e+02 3e+01 0.9890 2e-01 1 1 1 | 0 0 35 +1.140e+03 +1.436e+03 +7e-01 3e-03 7e-13 3e+02 3e+00 0.9600 1e-02 1 1 1 | 0 0 36 +1.597e+03 +1.707e+03 +1e+00 4e-03 3e-14 1e+02 9e-01 0.9890 2e-01 2 2 2 | 0 0 37 +2.184e+03 +2.226e+03 +1e-01 5e-03 1e-12 4e+01 1e-01 0.9890 6e-02 1 2 2 | 0 0 38 +2.540e+03 +2.556e+03 +2e-01 4e-03 2e-12 2e+01 1e-01 0.9890 2e-01 1 1 2 | 0 0 39 +3.055e+03 +3.065e+03 +5e-02 6e-03 2e-12 1e+01 3e-02 0.9890 8e-02 1 2 2 | 0 0 40 +3.450e+03 +3.458e+03 +4e-02 5e-03 2e-12 8e+00 2e-02 0.9890 1e-01 1 1 2 | 0 0 41 +3.932e+03 +3.939e+03 +1e-02 7e-03 2e-12 6e+00 8e-03 0.9890 1e-01 1 1 2 | 0 0 42 +4.347e+03 +4.352e+03 +1e-02 6e-03 2e-12 6e+00 5e-03 0.9890 1e-01 1 1 2 | 0 0 43 +4.796e+03 +4.801e+03 +5e-03 8e-03 2e-12 5e+00 3e-03 0.9890 1e-01 1 1 2 | 0 0 44 +5.206e+03 +5.211e+03 +3e-03 7e-03 2e-12 4e+00 2e-03 0.9890 1e-01 1 1 2 | 0 0 45 +5.633e+03 +5.637e+03 +2e-03 1e-02 2e-12 4e+00 1e-03 0.9890 1e-01 1 1 2 | 0 0 46 +6.028e+03 +6.032e+03 +1e-03 9e-03 2e-12 4e+00 8e-04 0.9890 1e-01 1 1 2 | 0 0 47 +6.430e+03 +6.433e+03 +9e-04 1e-02 2e-12 3e+00 5e-04 0.9890 1e-01 1 1 2 | 0 0 48 +6.811e+03 +6.814e+03 +7e-04 1e-02 2e-12 3e+00 4e-04 0.9890 1e-01 1 1 2 | 0 0 49 +7.188e+03 +7.191e+03 +5e-04 1e-02 2e-12 3e+00 3e-04 0.9890 1e-01 1 1 2 | 0 0 50 +7.551e+03 +7.553e+03 +3e-04 1e-02 2e-12 2e+00 2e-04 0.9890 1e-01 1 1 2 | 0 0 51 +7.908e+03 +7.910e+03 +2e-04 2e-02 2e-12 2e+00 1e-04 0.9890 2e-01 1 1 2 | 0 0 52 +8.251e+03 +8.253e+03 +2e-04 2e-02 2e-12 2e+00 1e-04 0.9890 2e-01 1 1 2 | 0 0 53 +8.586e+03 +8.588e+03 +1e-04 2e-02 2e-12 2e+00 8e-05 0.9890 2e-01 1 1 2 | 0 0 54 +8.911e+03 +8.913e+03 +1e-04 2e-02 2e-12 2e+00 6e-05 0.9890 2e-01 1 1 2 | 0 0 55 +9.228e+03 +9.230e+03 +8e-05 2e-02 2e-12 2e+00 4e-05 0.9890 2e-01 1 1 2 | 0 0 56 +9.534e+03 +9.535e+03 +6e-05 2e-02 2e-12 1e+00 4e-05 0.9890 2e-01 1 1 2 | 0 0 57 +9.779e+03 +9.780e+03 +5e-05 1e-01 2e-12 1e+00 3e-05 0.8357 2e-01 1 1 1 | 0 0 58 +9.766e+03 +9.768e+03 +2e-05 3e-01 2e-12 1e+00 1e-05 0.0001 9e-01 1 1 1 | 0 0 59 +3.518e+03 +3.517e+03 -6e+00 7e+00 9e-13 -5e-01 -3e+00 0.0000 1e+00 1 1 1 | 0 0 Unreliable search direction detected, recovering best iterate (58) and stopping. Close to PRIMAL INFEASIBLE (within feastol=5.7e-06). Runtime: 20.262543 seconds. status: infeasible_inaccurate obj: None Filename: ls1.py Line # Mem usage Increment Line Contents ================================================ 4 50.312 MiB 50.312 MiB @profile 5 def f(): 6 # N = 200000 7 50.312 MiB 0.000 MiB N = 20000 8 50.312 MiB 0.000 MiB M = 100 9 10 50.312 MiB 0.000 MiB np.random.seed(123) 11 65.578 MiB 15.266 MiB X = np.random.normal(0, 1, size=(N, M)) 12 65.836 MiB 0.258 MiB y = np.random.normal(0, 1, size=N) 13 14 65.836 MiB 0.000 MiB w = cp.Variable(M) 15 65.836 MiB 0.000 MiB prob = cp.Problem(cp.Minimize(cp.sum_squares(y - X @ w))) 16 76.824 MiB 10.988 MiB prob.solve(solver=cp.ECOS,verbose=True) 17 76.824 MiB 0.000 MiB print("status:",prob.status) 18 76.824 MiB 0.000 MiB print("obj:",prob.value)
The results are mixed. We certainly don't see this crazy memory allocation anymore. It is now a very small 10 MiB. However the solver is not numerically stable enough to solve this problem.
There is another solver that comes with CVXPY that we can try:
[ec2-user@ip-172-30-0-79 etc]$ python3 -m memory_profiler ls1.py ---------------------------------------------------------------------------- SCS v2.1.1 - Splitting Conic Solver (c) Brendan O'Donoghue, Stanford University, 2012 ---------------------------------------------------------------------------- Lin-sys: sparse-direct, nnz in A = 2000002 eps = 1.00e-04, alpha = 1.50, max_iters = 5000, normalize = 1, scale = 1.00 acceleration_lookback = 0, rho_x = 1.00e-03 Variables n = 101, constraints m = 20002 Cones:soc vars: 20002, soc blks: 1 WARN: aa_init returned NULL, no acceleration applied. Setup time: 7.44e-01s ---------------------------------------------------------------------------- Iter | pri res | dua res | rel gap | pri obj | dua obj | kap/tau | time (s) ---------------------------------------------------------------------------- 0| 1.22e+18 7.87e+18 1.00e+00 -9.64e+18 4.27e+21 4.18e+21 1.92e-02 100| 8.68e+17 7.02e+17 5.09e-01 1.71e+21 5.25e+21 3.54e+21 6.67e-01 200| 8.04e+17 5.33e+17 4.23e-01 2.16e+21 5.33e+21 3.16e+21 1.31e+00 300| 7.37e+17 4.39e+17 3.72e-01 2.39e+21 5.22e+21 2.83e+21 1.96e+00 400| 6.73e+17 3.74e+17 3.35e-01 2.51e+21 5.04e+21 2.53e+21 2.59e+00 500| 6.14e+17 3.23e+17 3.05e-01 2.57e+21 4.83e+21 2.26e+21 3.22e+00 600| 5.60e+17 2.82e+17 2.80e-01 2.59e+21 4.60e+21 2.01e+21 3.85e+00 700| 5.11e+17 2.48e+17 2.57e-01 2.58e+21 4.37e+21 1.79e+21 4.50e+00 800| 4.67e+17 2.20e+17 2.36e-01 2.56e+21 4.14e+21 1.58e+21 5.15e+00 900| 4.27e+17 1.96e+17 2.17e-01 2.52e+21 3.92e+21 1.40e+21 5.79e+00 1000| 3.91e+17 1.74e+17 1.99e-01 2.47e+21 3.70e+21 1.23e+21 6.46e+00 1100| 3.58e+17 1.56e+17 1.81e-01 2.42e+21 3.49e+21 1.07e+21 7.09e+00 1200| 3.28e+17 1.40e+17 1.64e-01 2.36e+21 3.29e+21 9.29e+20 7.72e+00 1300| 3.01e+17 1.26e+17 1.48e-01 2.30e+21 3.10e+21 7.98e+20 8.36e+00 1400| 2.77e+17 1.14e+17 1.31e-01 2.24e+21 2.92e+21 6.78e+20 9.00e+00 1500| 2.55e+17 1.03e+17 1.15e-01 2.18e+21 2.75e+21 5.67e+20 9.65e+00 1600| 2.35e+17 9.31e+16 9.88e-02 2.12e+21 2.59e+21 4.65e+20 1.03e+01 1700| 2.17e+17 8.44e+16 8.25e-02 2.06e+21 2.43e+21 3.71e+20 1.09e+01 1800| 2.00e+17 7.67e+16 6.62e-02 2.01e+21 2.29e+21 2.84e+20 1.16e+01 1900| 1.85e+17 6.98e+16 4.97e-02 1.95e+21 2.15e+21 2.03e+20 1.23e+01 2000| 1.72e+17 6.37e+16 3.30e-02 1.89e+21 2.02e+21 1.29e+20 1.29e+01 2100| 1.59e+17 5.81e+16 1.60e-02 1.84e+21 1.90e+21 5.98e+19 1.36e+01 2200| 8.80e-03 1.43e-01 1.60e-05 1.20e+04 1.20e+04 1.94e-16 1.42e+01 2300| 6.45e-03 1.37e-01 1.62e-05 1.23e+04 1.23e+04 6.83e-17 1.49e+01 2400| 6.25e-03 1.34e-01 1.40e-05 1.26e+04 1.26e+04 2.15e-16 1.55e+01 2500| 6.06e-03 1.30e-01 1.21e-05 1.28e+04 1.28e+04 2.25e-16 1.62e+01 2600| 5.89e-03 1.27e-01 1.04e-05 1.30e+04 1.30e+04 7.78e-17 1.68e+01 2700| 5.71e-03 1.24e-01 8.98e-06 1.33e+04 1.33e+04 8.14e-17 1.75e+01 2800| 5.55e-03 1.21e-01 7.70e-06 1.35e+04 1.35e+04 8.46e-17 1.82e+01 2900| 5.39e-03 1.18e-01 6.56e-06 1.37e+04 1.37e+04 8.78e-17 1.88e+01 3000| 5.24e-03 1.15e-01 5.57e-06 1.39e+04 1.39e+04 9.04e-17 1.95e+01 3100| 5.09e-03 1.12e-01 4.68e-06 1.41e+04 1.41e+04 9.40e-17 2.01e+01 3200| 4.94e-03 1.09e-01 3.90e-06 1.42e+04 1.42e+04 2.91e-16 2.08e+01 3300| 4.80e-03 1.07e-01 3.21e-06 1.44e+04 1.44e+04 9.96e-17 2.14e+01 3400| 4.67e-03 1.04e-01 2.60e-06 1.46e+04 1.46e+04 1.03e-16 2.21e+01 3500| 4.54e-03 1.01e-01 2.05e-06 1.48e+04 1.48e+04 3.18e-16 2.27e+01 3600| 4.41e-03 9.87e-02 1.57e-06 1.49e+04 1.49e+04 1.09e-16 2.33e+01 3700| 4.29e-03 9.62e-02 1.14e-06 1.51e+04 1.51e+04 1.12e-16 2.40e+01 3800| 4.16e-03 9.37e-02 7.64e-07 1.52e+04 1.52e+04 1.14e-16 2.47e+01 3900| 4.05e-03 9.12e-02 4.28e-07 1.54e+04 1.54e+04 1.17e-16 2.53e+01 4000| 3.93e-03 8.88e-02 1.30e-07 1.55e+04 1.55e+04 1.20e-16 2.60e+01 4100| 3.82e-03 8.65e-02 1.34e-07 1.57e+04 1.57e+04 1.23e-16 2.66e+01 4200| 3.72e-03 8.42e-02 3.67e-07 1.58e+04 1.58e+04 1.25e-16 2.73e+01 4300| 3.61e-03 8.20e-02 5.72e-07 1.59e+04 1.59e+04 1.28e-16 2.80e+01 4400| 3.51e-03 7.98e-02 7.52e-07 1.60e+04 1.60e+04 3.92e-16 2.86e+01 4500| 3.41e-03 7.77e-02 9.11e-07 1.62e+04 1.62e+04 1.33e-16 2.93e+01 4600| 3.32e-03 7.56e-02 1.05e-06 1.63e+04 1.63e+04 1.36e-16 3.00e+01 4700| 3.22e-03 7.35e-02 1.17e-06 1.64e+04 1.64e+04 4.15e-16 3.07e+01 4800| 3.13e-03 7.15e-02 1.27e-06 1.65e+04 1.65e+04 1.41e-16 3.13e+01 4900| 3.04e-03 6.96e-02 1.36e-06 1.66e+04 1.66e+04 4.30e-16 3.20e+01 5000| 2.96e-03 6.77e-02 1.44e-06 1.67e+04 1.67e+04 4.37e-16 3.27e+01 ---------------------------------------------------------------------------- Status: Solved/Inaccurate Hit max_iters, solution may be inaccurate Timing: Solve time: 3.27e+01s Lin-sys: nnz in L factor: 2025055, avg solve time: 5.82e-03s Cones: avg projection time: 3.02e-05s Acceleration: avg step time: 7.51e-07s ---------------------------------------------------------------------------- Error metrics: dist(s, K) = 0.0000e+00, dist(y, K*) = 3.6380e-12, s'y/|s||y| = -6.2180e-16 primal res: |Ax + s - b|_2 / (1 + |b|_2) = 2.9568e-03 dual res: |A'y + c|_2 / (1 + |c|_2) = 6.7689e-02 rel gap: |c'x + b'y| / (1 + |c'x| + |b'y|) = 1.4408e-06 ---------------------------------------------------------------------------- c'x = 16701.6612, -b'y = 16701.6131 ============================================================================ status: optimal_inaccurate obj: 16701.661202598487 Filename: ls1.py Line # Mem usage Increment Line Contents ================================================ 4 50.438 MiB 50.438 MiB @profile 5 def f(): 6 # N = 200000 7 50.438 MiB 0.000 MiB N = 20000 8 50.438 MiB 0.000 MiB M = 100 9 10 50.438 MiB 0.000 MiB np.random.seed(123) 11 65.699 MiB 15.262 MiB X = np.random.normal(0, 1, size=(N, M)) 12 65.957 MiB 0.258 MiB y = np.random.normal(0, 1, size=N) 13 14 65.957 MiB 0.000 MiB w = cp.Variable(M) 15 65.957 MiB 0.000 MiB prob = cp.Problem(cp.Minimize(cp.sum_squares(y - X @ w))) 16 76.953 MiB 10.996 MiB prob.solve(solver=cp.SCS,verbose=True) 17 76.953 MiB 0.000 MiB print("status:",prob.status) 18 76.953 MiB 0.000 MiB print("obj:",prob.value)
That solver also has problems. In this case we ran out of iterations. We can allow for more iterations by adding the argument max_iters=nnnn to the solve statement. SCS is using a first order algorithm. These algorithms typically use a very large number of iterations.
Approach 2: use a different formulation
We can use a different formulation:
Regression II |
---|
\[\begin{align}\min_{\color{darkred}w,\color{darkred}r}\>& ||\color{darkred}r||^2_2 \\ & \color{darkred}r = \color{darkblue}y - \color{darkblue}X\color{darkred}w \end{align}\] |
Here we add a bunch of (free) variables \(r\) for the residuals, and also a bunch of linear constraints. The payback is that we have a much simpler quadratic objective. In addition it is 100% positive (semi-) definite. A reasonable implementation can look like:
import cvxpy as cp import numpy as np # N = 200000 N = 20000 M = 100 X = np.random.normal(0, 1, size=(N,M)) y = np.random.normal(0, 1, size=(N,1)) w = cp.Variable((M,1)) r = cp.Variable((N,1)) prob = cp.Problem(cp.Minimize(r.T @ r), [r == y - X @ w]) prob.solve(solver=cp.OSQP,verbose=True) print("status:",prob.status) print("obj:",prob.value)
This should work, but it doesn't. We see:
[ec2-user@ip-172-30-0-79 etc]$ python3 ls2a.py Traceback (most recent call last): File "ls2a.py", line 14, in <module> prob.solve(solver=cp.OSQP,verbose=True) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/problems/problem.py", line 289, in solve return solve_func(self, *args, **kwargs) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/problems/problem.py", line 567, in _solve self._construct_chains(solver=solver, gp=gp) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/problems/problem.py", line 510, in _construct_chains raise e File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/problems/problem.py", line 499, in _construct_chains construct_intermediate_chain(self, candidate_solvers, gp=gp) File "/home/ec2-user/.local/lib/python3.7/site-packages/cvxpy/reductions/solvers/intermediate_chain.py", line 70, in construct_intermediate_chain raise DCPError("Problem does not follow DCP rules. Specifically:\n" + append) cvxpy.error.DCPError: Problem does not follow DCP rules. Specifically: The objective is not DCP. Its following subexpressions are not: var1.T * var1
Bummer. This is a perfectly convex problem!
Some alternative formulations to generate a QP do not help:
- cp.sum_squares(r) yields a large allocation when forming a QP
- sum([r[i]**2 for i in range(N)] is very slow and very memory hungry
- quad_form with an identity matrix is not efficient
So I had no luck in being able to generate a QP problem for OSQP without very large memory allocations.
This did not work out so well. I am not really able to solve relatively large, but easy least squares problems using standard formulations.
Conclusion
- CVXPY thinks \(r^Tr = \sum_i r_i^2\) is not convex.
- We cannot always generate large problems for the OSQP QP solver due to large dense memory allocations in CVXPY. I probably would consider this a bug: it is never a good idea to physically create and allocate a large dense identity matrix in any somewhat serious code.
- CVXPY can generate large instances for the conic solvers ECOS and SCS. But they have some troubles solving this problem.
- A commercial solver like Cplex has no problem with this; 2 iterations, 0.2 seconds.
- We can conclude that CVXPY (and its collection of standard solvers) is best suited for smaller and medium-sized problems. When tackling large problems you may hit some issues. Of course, many interesting and relevant problems are not that big and can be solved conveniently using CVXPY.
References
- Least squares problem run out of memory, https://stackoverflow.com/questions/59315300/least-squares-problem-run-out-of-memory-in-cvxpy
Interestingly, the problem gets solved if you call ECOS directly (ecos requires G and h such that h-Gx in K, dim_q is the dimension of the second order cone, and the first element of h-Gx is norm bounded by the other elements (typically it is the last element)):
ReplyDeleteimport ecos
c = np.zeros((M+1,))
c[0] = 1
G = np.zeros((N+1,M+1))
G[0,0] = -1
G[1:, 1:] = X
G = sp.sparse.csc_matrix(G)
h = np.pad(y.flatten(), (1, 0), 'constant')
dims = {'q':[N+1]}
ecos.solve(c,G,h,dims)
Thanks. That is an interesting experiment.
Delete