Friday, December 24, 2010

Gmail problems

I use Gmail almost exclusively and I am very happy with how it works. Recently an interesting bug surfaced. Attachments can be downloaded individually or combined into a single zip file. It turns out Gmail adds some bytes to the zipped files (405 bytes in the example below). After unzipping I see:

C:\Users\erwin\Downloads>dir "AM RFP IFPRI.docx"/s
Volume in drive C is HP
Volume Serial Number is 8AA3-C6D2

Directory of C:\Users\erwin\Downloads

12/22/2010  01:03 PM           194,667 AM RFP IFPRI.docx
               1 File(s)        194,667 bytes

Directory of C:\Users\erwin\Downloads\rfpburundi

12/24/2010  12:06 PM           195,072 AM RFP IFPRI.docx
               1 File(s)        195,072 bytes

     Total Files Listed:
               2 File(s)        389,739 bytes
               0 Dir(s)  315,865,411,584 bytes free

C:\Users\erwin\Downloads>

The first file was downloaded as a single file; the second one was unzipped from the zip file. After a few nasty warning messages about corrupted files, Word is actually able to open and view the second file as well.
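To see where the two copies start to differ, a small byte-level comparison helps. The following is a minimal Python sketch; the function names `compare_bytes` and `compare_files` are made up for this illustration, and it only reports sizes and the first differing offset (it does not try to explain what the extra bytes are).

```python
def compare_bytes(a: bytes, b: bytes) -> dict:
    """Summarize how two byte strings differ (hypothetical helper)."""
    common = min(len(a), len(b))
    # Offset of the first byte that differs within the common prefix, if any.
    first_diff = next((i for i in range(common) if a[i] != b[i]), None)
    # If the common prefix is identical, the difference starts where the
    # shorter input ends.
    if first_diff is None and len(a) != len(b):
        first_diff = common
    return {
        "size_a": len(a),
        "size_b": len(b),
        "extra_bytes": len(b) - len(a),
        "first_difference": first_diff,
    }


def compare_files(path_a: str, path_b: str) -> dict:
    """Read both files completely and compare their contents."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return compare_bytes(fa.read(), fb.read())
```

Calling `compare_files` on the individually downloaded file and the unzipped copy should report `extra_bytes` equal to 405 for the sizes in the listing above. If `first_difference` equals the size of the smaller file, the extra bytes were simply appended at the end.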

See also: http://www.google.com/support/forum/p/gmail/thread?tid=7d4f236a3538ebbc&hl=en

Tuesday, December 21, 2010

A scheduling problem

For a scheduling problem that I could not solve in one go I tried to develop a small algorithm. Basically: schedule the m<n jobs that come first, then fix these m jobs and schedule the next m jobs, and so on. This seems to work with Cplex. The algorithm looks like:

loop(k,

   j(j0) = kj(k,j0);

   solve sched minimizing makespan using mip;
   report(k,'obj') = sched.objval;
   report(k,'modelstat') = sched.modelstat;
   report(k,'solvestat') = sched.solvestat;

   xstart.fx(j,t) = xstart.l(j,t);

);

In each iteration the algorithm solves the next batch of ten jobs while the batches solved earlier stay fixed:

jobs   1-10     11-20    21-30    31-40    41-50
k1     solved
k2     fixed    solved
k3     fixed    fixed    solved
k4     fixed    fixed    fixed    solved
k5     fixed    fixed    fixed    fixed    solved
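The same staircase can be generated for any number of jobs and any batch size. This is only a Python sketch of the bookkeeping, not of the MIP model itself; the helper names `batch_labels` and `batch_plan` are invented here.

```python
def batch_labels(n_jobs: int, batch_size: int) -> list:
    """Labels such as '1-10', '11-20', ... for consecutive job batches."""
    return [
        f"{lo}-{min(lo + batch_size - 1, n_jobs)}"
        for lo in range(1, n_jobs + 1, batch_size)
    ]


def batch_plan(n_jobs: int, batch_size: int) -> list:
    """One row per iteration: earlier batches are 'fixed', the current
    batch is 'solved', later batches are not in the model yet ('')."""
    n_batches = len(batch_labels(n_jobs, batch_size))
    return [
        ["fixed"] * k + ["solved"] + [""] * (n_batches - k - 1)
        for k in range(n_batches)
    ]


if __name__ == "__main__":
    # 50 jobs in batches of 10, as in the table above.
    print("   ", *(f"{s:8}" for s in batch_labels(50, 10)))
    for k, row in enumerate(batch_plan(50, 10), start=1):
        print(f"k{k} ", *(f"{s:8}" for s in row))
```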

For Cplex we get:

----    180 PARAMETER report 

           obj   modelstat   solvestat

k1    1480.000       1.000       1.000
k2    2940.000       1.000       1.000
k3    4400.000       1.000       1.000
k4    5860.000       1.000       1.000
k5    7320.000       1.000       1.000

However for Gurobi we get:

----    180 PARAMETER report 

           obj   modelstat   solvestat

k1    1480.000       1.000       1.000
k2    2940.000       1.000       1.000
k3    4400.000       1.000       1.000
k4                  19.000       1.000
k5                  19.000       1.000

Model status 19 means "infeasible, no solution". It turns out that fixing the solution of one iteration causes the next iteration to be infeasible. We can actually isolate step k3 and do:

   solve m minimizing makespan using mip;
   xstart.fx(j,t) = xstart.l(j,t);
   solve m minimizing makespan using mip;

This makes the second model infeasible, i.e. Gurobi does not like its own solution! In this case it is a question of tolerances: the final solution is slightly infeasible, and fixing the variables at exactly those levels is enough to make the next solve fail. We can fix this by:

loop(k,

   j(j0) = kj(k,j0);

   solve m minimizing makespan using mip;
   report(k,'obj') = m.objval;
   report(k,'modelstat') = m.modelstat;
   report(k,'solvestat') = m.solvestat;

   xstart.fx(j,t) = round(xstart.l(j,t));

);

Rounding is safe here because we know all job step times are in whole seconds. Now both Cplex and Gurobi can solve the problem just fine.
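The rounding step can be illustrated outside GAMS. In the Python sketch below, the dictionary keyed by (job, step) stands in for the levels `xstart.l(j,t)`; the function name `snap_to_integers` and the tolerance check are additions for this illustration (the GAMS code above simply calls `round`). The check guards against silently rounding a value that is not close to a whole number.

```python
def snap_to_integers(levels: dict, tol: float = 1e-6) -> dict:
    """Round each level to the nearest whole number before fixing it.

    levels maps (job, step) to a solver level. The tolerance is an
    assumption of this sketch, not a solver setting.
    """
    snapped = {}
    for key, value in levels.items():
        nearest = round(value)
        if abs(value - nearest) > tol:
            raise ValueError(f"level {value!r} for {key} is not near an integer")
        snapped[key] = float(nearest)
    return snapped


# A solver may return 1479.9999997 instead of 1480: harmless as a solution,
# but troublesome when used as an exact bound in the next solve.
levels = {("job1", "t1"): 1479.9999997, ("job2", "t1"): 20.0000002}
fixed_values = snap_to_integers(levels)
```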

A reasonably good solution looks like:

[Figure: 50plates2]

By coincidence, our algorithmic batch size is the same as the batch size in a good solution (this is not required for the algorithm to work).

The solution is somewhat sensitive to where we put the breaks when decomposing the model. This may indicate that a rolling horizon algorithm would work better (see http://yetanothermathprogrammingconsultant.blogspot.com/2008/06/rolling-horizon-implementation-in-gams.html).
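One common form of a rolling horizon uses overlapping windows: each solve covers more jobs than are fixed afterwards, so the jobs near a break get a second chance to move. The Python sketch below only generates the windows. The window of 20 jobs and the step of 10 are illustrative choices, `rolling_windows` is a made-up name, and this is not necessarily the scheme used in the linked post.

```python
def rolling_windows(n_jobs: int, window: int, step: int):
    """Yield (fixed_upto, free_from, free_to) with 1-based job numbers.

    Jobs 1..fixed_upto are kept fixed, jobs free_from..free_to are
    (re)optimized. Because step < window, consecutive windows overlap.
    """
    if not 1 <= step <= window:
        raise ValueError("need 1 <= step <= window")
    start = 1
    while start <= n_jobs:
        end = min(start + window - 1, n_jobs)
        yield (start - 1, start, end)
        if end == n_jobs:
            break
        start += step


for fixed_upto, lo, hi in rolling_windows(50, window=20, step=10):
    print(f"fix jobs 1..{fixed_upto}, optimize jobs {lo}..{hi}")
```

With `step` equal to `window` this reduces to the non-overlapping batches used above.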

Tuesday, December 14, 2010

FTP client

Not many people seem to know this: Windows Explorer can be used as a simple FTP client by entering a URL of the following format in the address bar (the text box where the directory is normally shown):

ftp://login:password@ftp.xxxx.com

Just drag files and whole directories into the window and they will be transferred.

Note: clicking on a URL like that starts Internet Explorer instead, which only gives a “read-only” view of the FTP site.
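A login or password that contains characters such as `@`, `:` or `/` would break a URL of this form. Standard URL syntax handles this by percent-encoding those characters. The Python sketch below builds such a URL; `build_ftp_url` is a hypothetical helper, and whether Windows Explorer decodes every encoded character correctly is an assumption I have not verified.

```python
from urllib.parse import quote


def build_ftp_url(login: str, password: str, host: str, path: str = "") -> str:
    """Build ftp://login:password@host[/path] with percent-encoded credentials."""
    # safe="" also encodes '/', which would otherwise end the host part early.
    user = quote(login, safe="")
    pwd = quote(password, safe="")
    url = f"ftp://{user}:{pwd}@{host}"
    if path:
        url += "/" + quote(path.lstrip("/"))
    return url


print(build_ftp_url("login", "password", "ftp.xxxx.com"))
print(build_ftp_url("erwin", "p@ss:w/rd", "ftp.example.com", "/pub/data"))
```

Keep in mind that the password is part of the URL in plain text, so it may end up in the address bar history.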

Sunday, December 5, 2010

SQL Injection

From http://www.fsf.org/blogs/sysadmin/savannah-and-www.gnu.org-downtime:

Wed Nov 24 12:59 UTC -- On the evening before Thanksgiving, an IP located in Tbilisi, Georgia started an attack targeting the savannah.nongnu.org website. The perpetrators used SQL injection attacks to download the entire database of usernames and hashed passwords, and we should assume anything else in the Savannah MySQL database.

One way to prevent this is to use only prepared SQL statements when accessing the database. This is the strategy I have used in http://yetanothermathprogrammingconsultant.blogspot.com/2010/10/running-large-models-using-web.html. (Other approaches often mentioned are using an appropriate quoting mechanism, e.g. through mysql_real_escape_string, and/or disabling multi-statements.)

I browsed through a few recent books on PHP for this project, and I was amazed how much text is devoted to security: about a third of the material seemed to be about it. It truly sends shivers down your spine. Programming web applications is becoming more and more a question of being paranoid all the time. This is not really a good development, as it takes the fun out of things and also reduces productivity (extra work on code that does not implement any new and possibly exciting features for regular, non-hostile users). From a different perspective one could also say that the development tools are still too primitive: they do not shield developers from these security issues, so developers cannot concentrate on the real task at hand.
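The difference between pasting user input into the SQL text and using a prepared (parameterized) statement can be shown in a few lines. The project mentioned above used PHP and MySQL; the sketch below uses Python with an in-memory SQLite database purely for illustration, and the table, rows, and function names are made up.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password_hash TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "hash1"), ("bob", "hash2")],
    )
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL text itself.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the input is passed as data and is never parsed as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = make_demo_db()
attack = "x' OR '1'='1"
print(len(find_user_unsafe(conn, attack)))  # every row in the table leaks
print(find_user_safe(conn, attack))         # no user has this literal name
```

With the parameterized version the attack string is just an odd user name that matches nothing, so no escaping logic is needed in the application code.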

Wednesday, December 1, 2010

Collaboration tools

For different projects I have been using some of the following document sharing tools:

  • Dropbox. This allows you to have a directory that is shared between different users. Advantage: on Windows this just looks like a network drive and you can easily copy files to and from the Dropbox directory. Disadvantages: just files. Also, to Windows it looks like a local disk, so by default dragging means “move” instead of “copy”. Site: http://www.dropbox.com/.
  • Wiki-type environments. I have experience with PmWiki (http://www.pmwiki.org) and MediaWiki (http://www.mediawiki.org). Advantage: good support for HTML presentation of rich text. Disadvantage: somewhat complicated to set up and to secure; file upload facilities may be limited depending on the hosting environment.
  • Live.com. Microsoft's sharing environment. Nice online editing facilities for Word documents.
  • My current favorite: sites.google.com. Clean and well thought out. Has most of the facilities I need. Easily shared with just the people you want to share it with. Limits: 20 GB files, and 100 GB per site; I had to park some of the larger data files somewhere else.