Your Paper has been Accepted, Rejected, or whatever: Automatic Generation of Scientific Paper Reviews




Alberto Bartoli, Andrea De Lorenzo, Eric Medvet, Fabiano Tarlao


International Cross Domain Conference and Workshop (CD-ARES), held in Salzburg (Austria)




Abstract

Peer review is widely viewed as an essential step for ensuring the scientific quality of a work and is a cornerstone of scholarly publishing. On the other hand, the actors involved in the publishing process are often driven by incentives which may, and increasingly do, undermine the quality of published work, especially in the presence of unethical conduct. In this work we investigate the feasibility of a tool capable of automatically generating fake reviews for a given scientific paper. While a tool of this kind cannot possibly deceive any rigorous editorial procedure, it could nevertheless find a role in several questionable scenarios and magnify the scale of scholarly fraud. A key feature of our tool is that it is built upon a small knowledge base, which is very important in our context due to the difficulty of finding large amounts of scientific reviews. We experimentally assessed our method with tens of human subjects. We presented to these subjects a mix of genuine and machine-generated reviews and we measured the ability of our proposal to actually deceive the subjects' judgment. The results highlight the ability of our method to produce reviews that often look credible and may subvert the decision.


Examples of generated reviews

For the paper Detecting Android Malware using Sequences of System Calls:

This paper proposes a system to detect artifacts based on network related information. However, i am curious about the observation (shown in killer 1b) that sv and dc appear more similar to each other ‘after driver’ despite having never played recorder-to-head (as far as we are told). It would be good if you can also talk about the importance of establishing some good shared benchmarks.

For the paper Publication Venue Recommendation based on Paper Abstract:

The submission describes a system (a fairly complicated one) which attempts to learn lexical edges from positive and negative examples. However, as there is only one method from both movies tested, it could as well be only these two instances that behave differently than all others. It would be useful to identify key assumptions in the modeling to which the predictions are sensitive yet uncertainty is high as im research assignments.