Wednesday, March 7, 2018

Square system solver: Barzilai-Borwein spectral method in R

I was playing with some large sparse square systems of non-linear equations (actually resulting from some spreadsheets). R has a package called BB [1] which is based on [2,3]. Technically: this is Barzilai-Borwein spectral method. It is both derivative-free and matrix-free (only needs a few vectors of length \(n\)).

For some problems it works rather nicely:


n <- 100
x0 <- -runif(n)

fbroyden <- function(x) {
  n <- length(x)
  F <- rep(NA,n)
  F[1] <- x[1]*(3-0.5*x[1]) - 2*x[2] + 1
  i <- 2:(n-1)
  F[i] <- x[i]*(3-0.5*x[i]) - x[i-1] - 2*x[i+1] + 1
  F[n] <- x[n]*(3-0.5*x[n]) - x[n-1] + 1
  F
} 

library(BB)
dfsane(par=x0,fn=fbroyden,control=list(triter=1))

This solves very quickly especially given BB is written in pure R (no C or Fortran):


Iteration:  0  ||F(x0)||:  1.441474 
iteration:  1  ||F(xn)|| =   11.23496 
iteration:  2  ||F(xn)|| =   7.900779 
iteration:  3  ||F(xn)|| =   6.542131 
iteration:  4  ||F(xn)|| =   3.139281 
iteration:  5  ||F(xn)|| =   4.031394 
iteration:  6  ||F(xn)|| =   5.183635 
iteration:  7  ||F(xn)|| =   0.5650939 
iteration:  8  ||F(xn)|| =   0.1870383 
iteration:  9  ||F(xn)|| =   0.09536832 
iteration:  10  ||F(xn)|| =   0.04728994 
iteration:  11  ||F(xn)|| =   0.0231393 
iteration:  12  ||F(xn)|| =   0.01204457 
iteration:  13  ||F(xn)|| =   0.01253904 
iteration:  14  ||F(xn)|| =   0.004658623 
iteration:  15  ||F(xn)|| =   0.00671189 
iteration:  16  ||F(xn)|| =   0.002562044 
iteration:  17  ||F(xn)|| =   0.0007190788 
iteration:  18  ||F(xn)|| =   0.000526917 
iteration:  19  ||F(xn)|| =   0.0004205647 
iteration:  20  ||F(xn)|| =   0.0003359958 
iteration:  21  ||F(xn)|| =   0.0002087812 
iteration:  22  ||F(xn)|| =   2.471228e-05 
iteration:  23  ||F(xn)|| =   7.605672e-06 
iteration:  24  ||F(xn)|| =   1.304184e-05 
iteration:  25  ||F(xn)|| =   3.089383e-05 
iteration:  26  ||F(xn)|| =   6.874306e-05 
iteration:  27  ||F(xn)|| =   3.30784e-05 
iteration:  28  ||F(xn)|| =   1.16808e-05 
iteration:  29  ||F(xn)|| =   3.541111e-06 
iteration:  30  ||F(xn)|| =   8.387173e-07 
$par
  [1] -1.0323920 -1.3150464 -1.3887103 -1.4076713 -1.4125364 -1.4137837 -1.4141034 -1.4141853 -1.4142063 -1.4142117 -1.4142131
 [12] -1.4142135 -1.4142135 -1.4142135 -1.4142135 -1.4142136 -1.4142135 -1.4142136 -1.4142136 -1.4142136 -1.4142136 -1.4142136
 [23] -1.4142136 -1.4142136 -1.4142135 -1.4142136 -1.4142135 -1.4142136 -1.4142135 -1.4142136 -1.4142135 -1.4142136 -1.4142135
 [34] -1.4142136 -1.4142136 -1.4142136 -1.4142135 -1.4142136 -1.4142136 -1.4142136 -1.4142136 -1.4142136 -1.4142135 -1.4142136
 [45] -1.4142136 -1.4142136 -1.4142136 -1.4142136 -1.4142136 -1.4142136 -1.4142136 -1.4142135 -1.4142136 -1.4142135 -1.4142136
 [56] -1.4142136 -1.4142136 -1.4142136 -1.4142136 -1.4142136 -1.4142136 -1.4142136 -1.4142135 -1.4142136 -1.4142136 -1.4142136
 [67] -1.4142136 -1.4142136 -1.4142135 -1.4142136 -1.4142136 -1.4142135 -1.4142135 -1.4142135 -1.4142135 -1.4142135 -1.4142134
 [78] -1.4142132 -1.4142128 -1.4142121 -1.4142107 -1.4142080 -1.4142027 -1.4141924 -1.4141722 -1.4141328 -1.4140561 -1.4139064
 [89] -1.4136144 -1.4130448 -1.4119339 -1.4097678 -1.4055460 -1.3973251 -1.3813439 -1.3503811 -1.2907820 -1.1775120 -0.9675106
[100] -0.5965290

$residual
[1] 8.387173e-08

$fn.reduction
[1] 14.41473

$feval
[1] 31

$iter
[1] 30

$convergence
[1] 0

$message
[1] "Successful convergence"

There is always to say something. First note how clever the test function was implemented. By setting up the vector i we don't need to do any loops. But also note that the test function is not correctly evaluated with \(n=2\). In R the sequence "\(2:1\)" results in counting backwards. It generates the numbers \([2\> 1]\).

The second thing: the first number \(||F(x^0)||\) looks a bit funny.  Indeed we have:


> f <- fbroyden(x0)
> norm(f,type="2")
[1] 16.11595
> 

Looking at the source I see:


    F0 <- normF <- sqrt(sum(F * F))

    if (trace) 
        cat("Iteration: ", 0, " ||F(x0)||: ", F0/sqrt(n), "\n")

I do not understand the reason for the division by \(\sqrt{n}\). It is only doing this for the initial value. Subsequent log lines are correctly printing \(||F(x)||\).

One possible reason is that the termination criterion is:
\[ ||F(x)|| \le \mathit{tol} \cdot \sqrt{n} \]
so the author wanted to compare \(\displaystyle\frac{||F(x^0)||}{\sqrt{n}}\) with \(\mathit{tol}\).  Obviously this first log line is still mislabeled.

References


  1. Varadhan, Ravi, and Paul D. Gilbert. 2009. “BB: An R Package for Solving a Large System of Nonlinear Equations and for Optimizing a High-Dimensional Nonlinear Objective Function.” Journal of Statistical Software 32 (4): 1–26. http://www.jstatsoft.org/v32/i04/.
  2. Cruz, William La, and Marcos Raydan. 2003. “Nonmonotone Spectral Methods for Large-Scale Nonlinear Systems.” Optimization Methods and Software 18: 583–99.
  3. Cruz, William La, Jose Mario Martınez, and Marcos Raydan. 2006. “Spectral Residual Method Without Gradient Information for Solving Large-Scale Nonlinear Systems of Equations.” Mathematics of Computation 75: 1449–66.