I was running some neural networks with a backpropagation algorithm using a rather simple optimization method (gradient descent). This actually ran pretty decently. Now let's see if we can formulate something like this in GAMS. I tried something like:
I know the formulation is not great, but maybe high-performance sparse solvers would help me out here and still do a decent job. Unfortunately, the results were very poor: most solvers ran out of memory, and MINOS was still running after 3 hours.
The basic problem is that we really only want THETA2 and THETA3 as variables, with everything else treated as temporary intermediate values. That would make the problem very much smaller (the thetas do not depend on i, the observation number in the training set). With AMPL we could probably have used defined variables to make this work, but GAMS does not offer this.
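To make the size difference concrete, here is a rough Python sketch (not the GAMS model itself). It assumes a three-layer network with sigmoid activations, where `theta2` maps the inputs to the hidden layer and `theta3` maps the hidden layer to the outputs. The function names, the layer sizes, and the assumption that the full formulation carries one variable per hidden and output activation per observation are all mine, for illustration only.

```python
import math


def sigmoid(z):
    """Logistic activation (an assumption; the model may use another)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(theta2, theta3, x):
    """Forward pass for ONE observation x.

    a2 and a3 are plain temporaries recomputed per observation;
    only theta2 and theta3 are unknowns of the optimization problem.
    """
    a1 = [1.0] + list(x)  # prepend bias unit
    a2 = [1.0] + [sigmoid(sum(w * a for w, a in zip(row, a1)))
                  for row in theta2]
    a3 = [sigmoid(sum(w * a for w, a in zip(row, a2)))
          for row in theta3]
    return a3


def count_unknowns(n_in, n_hidden, n_out, n_obs):
    """Return (reduced, full) variable counts.

    reduced: only the weights are variables.
    full:    weights plus one variable per hidden/output activation
             per observation (hypothetical full-space formulation).
    """
    reduced = n_hidden * (n_in + 1) + n_out * (n_hidden + 1)
    full = reduced + n_obs * (n_hidden + n_out)
    return reduced, full


if __name__ == "__main__":
    # Hypothetical sizes: 4 inputs, 3 hidden units, 2 outputs, 1000 observations
    print(count_unknowns(4, 3, 2, 1000))
```

With these made-up sizes the weights-only problem has 23 unknowns, while the formulation that keeps every activation as a variable has 5023, and that second number grows linearly with the number of observations.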