ORI: Methods and tools for scalable figure reuse detection with statistical certainty reporting





Fraudulent reuse of scientific figures is an increasingly common problem that damages the public perception of science. The Office of Research Integrity (ORI) carefully reviews whistleblowers’ accusations, taking a reactive approach to investigating this type of misconduct. Recently, Acuna, Brookes, and Kording (2018) used machine learning to detect figure reuse in PubMed Open Access articles that share the same junior or senior scientist. They estimated that around 0.6% of the papers were very likely fraudulent. Current image manipulation investigations, however, are reactive, originate from whistleblowers, do not scale across authors, and do not yield statistically supported verdicts.

In this project, we propose to dramatically scale up automated detection of figure reuse across articles, in collaboration with ORIs and active researchers in the area. We also propose to develop statistical methods to support conclusions about figure reuse. Once the project is completed, we hope that the tools and techniques we develop will become standard practice. We expect our research to significantly reduce the acceptance of publications containing manipulated images, and therefore to reduce the incidence of one of the most damaging forms of scientific misconduct.