Tuesday, July 31, 2012

Statistical Fraud Detection

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2114571

It is hard to fabricate data that looks genuinely random. In one case the author describes, the means of a variable differed substantially across conditions, yet, surprisingly, the standard deviations were almost identical. That is a possible sign of invented data. The author uses simulation to estimate how likely such closely matching standard deviations would be to arise by chance.
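The simulation idea can be sketched roughly as follows. This is only an illustration and not necessarily the paper's exact procedure: it assumes normally distributed data, equal cell sizes, and a common true SD estimated by pooling the reported ones. The function name `sd_spread_p_value` and its parameters are made up for this example.

```python
import random
import statistics


def sd_spread_p_value(observed_sds, n_per_cell, n_sims=2000, seed=0):
    """Estimate how often chance alone gives SDs at least as similar as observed.

    Assumptions (mine, for illustration): normal data, equal cell sizes,
    and one shared true SD across all conditions.
    """
    rng = random.Random(seed)
    k = len(observed_sds)

    # Pool the reported SDs into a single estimate of the shared true SD.
    pooled_sd = (sum(s ** 2 for s in observed_sds) / k) ** 0.5

    # How spread out the reported SDs are (the SD of the SDs).
    observed_spread = statistics.stdev(observed_sds)

    hits = 0
    for _ in range(n_sims):
        # Draw one fresh sample per condition and record its SD.
        # The condition means do not affect the SD, so 0.0 is used throughout.
        sim_sds = [
            statistics.stdev([rng.gauss(0.0, pooled_sd) for _ in range(n_per_cell)])
            for _ in range(k)
        ]
        # Count simulations whose SDs agree at least as closely as the reported ones.
        if statistics.stdev(sim_sds) <= observed_spread:
            hits += 1

    return hits / n_sims


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the paper.
    print(sd_spread_p_value([2.00, 2.01, 1.99], n_per_cell=15))
```

A very small returned proportion means that honest random sampling would rarely produce standard deviations that similar, which is the kind of red flag described above.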

This study has led to the retraction of some papers and to people leaving their academic jobs. See: http://www.nature.com/news/uncertainty-shrouds-psychologist-s-resignation-1.10968.