PubPeer – A universal comment and review layer for scholarly papers?

Lately I’ve had a plethora of discussions with colleagues concerning the possible benefits of a reddit-like “democratic review layer”, which would index all scholarly papers and let authenticated users post reviews subject to karma. We’ve navel-gazed about various implementations, ranging from a full-on reddit clone to a wiki or even a full-blown torrent tracker with rated comments and mass piracy. So you can imagine I was pleasantly surprised to see that someone actually went ahead and put together a simple app to do exactly that.

Image

PubPeer states that its mission is to “create an online community that uses the publication of scientific results as an opening for fruitful discussion.” Users create accounts using an academic email address and must have at least one first-author publication to join. Once registered, any user can leave anonymous comments on any article, which are themselves subject to up/down votes and replies.

My first action was of course to search for my own name:

Image

Hmm, no comments. Let’s fix that:

Image

Hah! Peer review is easy! Just kidding, I deleted this comment after testing to see if it was possible. Ostensibly authors can comment on their own papers so that they can reply to other comments, but it does raise the concern that you can leave whatever ratings you like on your own papers. In theory, with enough users, good comments will be quickly distinguished from bad, regardless of who makes them. In theory…

This is what an article looks like in PubPeer with a few comments:

Image

Pretty simple: any paper can be found in the database, and users then leave comments associated with those papers. On the one hand I really like the simplicity and usability of PubPeer. I think any endeavor along these lines must very much follow the twitter design mentality of doing one (and only one) thing really well. I also like the use of threaded comments and up/down votes, but I would like to see child comments being subject to votes as well. I’m not sure if I favor the anonymous approach the developers went for, but I can see costs and benefits to both public and anonymous comments, so I don’t have any real suggestions there.

What I found really interesting was just to see this idea in practice. While I’ve discussed it endlessly, a few previously unforeseen worries leaped out right away. After browsing a few articles it seems that most of the comments are pretty negative and nit-picky. Considering that most early adopters of such a system are likely to be graduate students, this isn’t too surprising. For one thing there is no such entity as a perfect paper, and graduate students are often fans of this kind of boilerplate nit-pick, the ticks and fleas of any paper. If comments add mostly doubt and negativity to papers, it seems like the whole commenting process would become a lot of extra work for little author pay-off, since no matter what, your article is going to end up looking bad.

In a traditional review, a paper’s flaws and merits are assessed privately and then the final (if accepted) paper is generally put forth as a polished piece of research that stands on its own merits. If a system like PubPeer were popular, becoming highly commented would almost certainly mean having tons of nitpicky and highly negative comments associated with that manuscript. This could skew reader perceptions: highly commented PubPeer articles might receive fewer citations regardless of their actual quality.

So that bit seems very counter-productive to me and I am not sure of the solution. It might be something similar to establishing light top-down comment moderation and a sort of “reddiquette” or user code of conduct that emphasizes fair and balanced comments (no sniping). Or, perhaps my “worry” isn’t actually troubling at all. Maybe such a system would be substantially self-policing and refreshing, shifting us from an obsession with ‘perfect papers’ to an understanding that no paper (or review) should be judged on anything but its own merits. Given the popularity of pun threads on reddit, I’m not convinced the wholly democratic solution will work. Whatever the result, as with most solutions to scholarly publishing, it seems clear that if PubPeer is to add substantial value to peer review then a critical mass of active users is the crucial missing ingredient.

What do you think? I’d love to hear your thoughts in the comments.

How to reply to #icanhazpdf in 3 seconds

Yesterday my friend Hauke and I theorized about a kind of dream scenario- a totally distributed, easy to use, publication liberation system. This is perhaps not feasible at this point [1]. Today we’re going to present something that will be useful right now. The essential goal here is to make it so that anyone, anywhere, can access the papers they need in a timely manner. The idea is to take advantage of existing strategies and tools to streamline paper sharing as much as possible. Folks already do this- every day on twitter or in private, requests for papers are made and fulfilled. Our goal is to completely streamline this process down to a few clicks of your mouse. That way a small but dedicated group of folks – the Papester Collective – can ensure that #icanhazpdf requests are fulfilled almost instantly. This is a work in progress. Leave comments on how to improve and further streamline this system and join the collective!

SHORT VERSION: HOW TO GET A PAPER BEHIND A PAYWALL QUICKLY

Tweet (for example): “#icanhazpdf http://dx.doi.org/10.1523/JNEUROSCI.4568-12.2013”

Click: Here you can find more detailed instructions.

HOW TO JOIN THE COLLECTIVE AND START SERVING REQUESTS

SHORT INSTRUCTIONS AND REQUIRED SOFTWARE:

  1. Twitter: monitor #icanhazpdf requests.
  2. Zotero and the Zotero browser plugin: after clicking on a DOI link or abstract page, just click the ‘Save to Zotero’ button to auto-grab PDFs.
  3. Zotfile: automatically copies newly saved Zotero PDF files to your public Dropbox folder.
  4. Dropbox: cloud storage system to seamlessly share files with anyone without login.
  5. Dropbox linker: automatically adds links from your public folder to your clipboard.
  6. Reply to request tweets: paste the URL from your clipboard and, if you want, add #papester.

That’s it! Now you can just click request links, click the Zotero get PDF button, and CTRL+V a dropbox direct download link in response!
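For anyone who would rather script the bookkeeping around steps 1 and 6, here is a minimal sketch in plain Python. It only pulls the DOI out of a request tweet and assembles the reply text; it does not talk to Twitter, Zotero, or Dropbox. The function names, the `example_user` handle, and the `example.org` link are made up for illustration, and the DOI pattern is a rough approximation rather than a full validator.

```python
import re

# Rough DOI shape: "10.", a registrant code, a slash, then a suffix.
# This is an approximation for illustration, not a complete DOI validator.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"”]+")


def extract_doi(tweet_text):
    """Return the first DOI-like string found in a tweet, or None."""
    match = DOI_PATTERN.search(tweet_text)
    if match is None:
        return None
    # Drop punctuation that often trails a link at the end of a sentence.
    return match.group(0).rstrip(".,;")


def compose_reply(requester, share_url, tag_papester=True):
    """Build the reply for step 6; share_url is whatever link is on your clipboard."""
    reply = f"@{requester} {share_url}"
    if tag_papester:
        reply += " #papester"
    return reply


tweet = "#icanhazpdf http://dx.doi.org/10.1523/JNEUROSCI.4568-12.2013"
print(extract_doi(tweet))
print(compose_reply("example_user", "https://example.org/paper.pdf"))
```

The grabbing and linking in between is still done by the Zotero, Zotfile, and Dropbox tools listed above.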

Click: Here you can find more detailed instructions.

1. The fundamental problem: uploading huge repositories of scientific papers is not sensible for now. It’s too much data (50 million papers * 0.5-1.5 megabytes each comes to ~25-75 terabytes), and demand is spread more evenly across papers than across traditionally shared files like music. For instance, there are 100 million songs at 3.5 MB each, and it is already difficult to find exotic songs online; songs have decent availability only because sharing concentrates on a few favourites, which is not the case with papers. Also, fewer people will share papers than songs, which makes it even more difficult to sustain a complete repository. Thus, we need a system that fulfills requests individually.

Disclaimer: Please make sure you only share papers with friends who also have the copyrights to the papers you share.

Could a papester button irreversibly break down the research paywall?

A friend just sent me the link to the Aaron Swartz Memorial JSTOR liberator. We started talking about it and it led to a pretty interesting idea.

As soon as I saw this it clicked: we need papester. We need a simple browser plugin that can recognize, download and re-upload any research document automatically (think zotero) to BitTorrent (this was Aaron’s original idea, just crowdsourced). These would then be automatically turned into torrents with an associated magnet link. The plugin would interact with a lightweight torrent client, using a set limit of your bandwidth (say 5%) to constantly seed back any files you have in your (zotero) library folder. Also, it would automatically use part of the bandwidth to seed missing papers (first working through a queue of DOIs of papers that were searched for by others and then just for any missing paper in reverse chronological order), so that over time all papers would be on BitTorrent. The links would be archived by google; any search engine could then find them and the plug-in would show the PDF download link.

Once this system is in place, a pirate-bay/reddit mash-up could help sort the magnet links as a meta-data rich papester torrent tracker. Users could post comments and reviews, which would themselves be subject to karma. Over time a sorting algorithm could give greater weight to reviews from authors who consistently review unretracted papers, creating a kind of front page where “hot” would give you the latest research and “lasting” would give you timeless classics. Separating the sorting mechanism – which can essentially be any tracker – and the rating/meta-data system ensures that neither can be easily brought down. If users wish they could compile independent trackers for particular topics or highly rated papers, form review committees, and request new experiments to address flagged issues in existing articles. In this way we would ensure not only an everlasting and loss-protected research database, but irreversibly push academic publishing into an open-access and democratic review system. Students and people without access to scientific knowledge could easily find forgotten classics and the latest buzz with a simple sort. We need a “research-reddit” rating layer - why not solve Open Access and peer review in one go?
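To make the weighting idea a little more concrete, here is one possible sketch in Python. Everything in it is an assumption for illustration: the `Review` fields, the 1-to-5 rating scale, and the choice to weight each reviewer by the smoothed fraction of their previously reviewed papers that were never retracted. It is just one way such a sorting algorithm could start, not a worked-out design.

```python
from dataclasses import dataclass


@dataclass
class Review:
    rating: float            # assumed scale: 1 (poor) to 5 (excellent)
    reviewed_total: int      # papers this reviewer has reviewed before
    reviewed_retracted: int  # how many of those were later retracted


def reviewer_weight(review):
    """Smoothed share of the reviewer's past papers that remain unretracted.

    Add-one smoothing gives a brand-new reviewer a neutral weight of 0.5.
    """
    unretracted = review.reviewed_total - review.reviewed_retracted
    return (unretracted + 1) / (review.reviewed_total + 2)


def paper_score(reviews):
    """Weighted mean rating; reviewers with a clean record count for more."""
    if not reviews:
        return 0.0
    total_weight = sum(reviewer_weight(r) for r in reviews)
    weighted_sum = sum(reviewer_weight(r) * r.rating for r in reviews)
    return weighted_sum / total_weight


reviews = [
    Review(rating=5, reviewed_total=8, reviewed_retracted=0),
    Review(rating=1, reviewed_total=8, reviewed_retracted=8),
]
print(round(paper_score(reviews), 2))
```

A “hot” or “lasting” front page would then sort on this score combined with paper age, which is left out here.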

Is this feasible? There are about 50 million papers in existence[1]. If we estimate about 500 kilobytes on average per paper, that’s 25 million MB of data, or 25 terabytes. While that may sound like a lot, remember that most torrent trackers already list much more data than this and that available bandwidth increases annually. If we can archive a ROM of every videogame created, why not papers? The entire collection of magnet links could take up as little as 1GB of data, making it easy to periodically back up the archive, ensure the system is resilient to take-downs, and re-seed less known or sought after papers. Just imagine it- all of our knowledge stored safely in a completely open collection, backed by the power of the swarm, organized by reviews, comments, and ratings, accessible to all. It would revolutionize the way we learn and share knowledge.
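The back-of-the-envelope numbers are easy to check. The sketch below uses decimal units and the figures from the paragraph above; the one added assumption is how a magnet link is stored. A BitTorrent v1 info-hash is a 20-byte SHA-1 digest, so storing only the raw hashes gives the low-end figure, while storing each link as hex text costs roughly three times as much.

```python
PAPERS = 50_000_000          # estimate of papers in existence [1]
AVG_PAPER_BYTES = 500_000    # about 500 kB per paper, as estimated above

# A BitTorrent v1 info-hash is a 20-byte SHA-1 digest.
INFO_HASH_BYTES = 20
# The same hash written out as a text magnet link: prefix plus 40 hex characters.
MAGNET_TEXT_BYTES = len("magnet:?xt=urn:btih:") + 2 * INFO_HASH_BYTES

archive_tb = PAPERS * AVG_PAPER_BYTES / 1e12
index_raw_gb = PAPERS * INFO_HASH_BYTES / 1e9
index_text_gb = PAPERS * MAGNET_TEXT_BYTES / 1e9

print(f"papers: {archive_tb:.0f} TB")
print(f"magnet index, raw hashes: {index_raw_gb:.0f} GB")
print(f"magnet index, text links: {index_text_gb:.0f} GB")
```

Either way the index is tiny compared with the papers themselves, which is what makes it cheap to back up.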

Of course there would be ruthless resistance to this sort of thing from publishers. It would be important to take steps to protect yourself, perhaps through Tor. The small size of the files would facilitate better encryption. When universities inevitably move to block uploads, tags could be used to later upload acquired files quickly on a public-wifi hotspot. There are other benefits as well- currently there are untold numbers of classic papers available online in reference only. What incentive is there for libraries to continue scanning these? A papester-backed uploader karma system could help bring thousands of these documents irreversibly into the fold. Even in the case that publishers found some way to stifle the system, as with Napster, the damage would be done. Just as we were pushed irrevocably towards new forms of music consumption – direct download, streaming, donate-to-listen – big publishers would be forced toward an open access model to recover costs. Finally, such a system might move us closer to a self-publishing arXiv model. In the case that you couldn’t afford open access, you could self-publish your own PDF to the system. User reviews and ratings could serve as a first layer of feedback for you to improve the article. The idea or data – with your name behind it – would be out there fast and free.

edit:

Another cool feature would be a DOI search. When a user searches for a paper that isn’t available, papester would automatically add that paper to a request queue.
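A request queue like this is simple to sketch. The Python below is only an illustration of the idea, with made-up class and method names: a search for an available DOI succeeds, a search for a missing one queues it exactly once, and seeders work through the queue in the order requests arrived.

```python
from collections import deque


class RequestQueue:
    """Hypothetical request queue: missing DOIs wait here until someone seeds them."""

    def __init__(self, available_dois):
        self.available = set(available_dois)
        self.pending = deque()   # first requested, first served
        self.queued = set()      # avoids queuing the same DOI twice

    def search(self, doi):
        """Return True if the paper is available; otherwise queue it and return False."""
        if doi in self.available:
            return True
        if doi not in self.queued:
            self.queued.add(doi)
            self.pending.append(doi)
        return False

    def next_request(self):
        """Hand the oldest outstanding request to a seeder, or None if the queue is empty."""
        if not self.pending:
            return None
        doi = self.pending.popleft()
        self.queued.discard(doi)
        return doi

    def mark_available(self, doi):
        """Record that a paper has been uploaded and no longer needs requesting."""
        self.available.add(doi)


queue = RequestQueue(available_dois=["10.1087/20100308"])
print(queue.search("10.1087/20100308"))
print(queue.search("10.1523/JNEUROSCI.4568-12.2013"))
print(queue.next_request())
```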

edit2/disclaimer:

This is a thought experiment about an illegal solution and its possible consequences and benefits. Do with it what you will but recognize the gap between the theoretical and the actual!

1. Arif Jinha (2010). Article 50 million: an estimate of the number of scholarly articles in existence. Learned Publishing, 23 (3), 258-263. DOI: 10.1087/20100308. A free pre-print is available from the author.

Researchers begin posting article PDFs to twitter in #pdftribute to Aaron Swartz

Yesterday, as I was completing my morning coffee and internet ritual, @le_feufollet broke the sad news to me of Aaron Swartz’s death. Aaron was a leader online, a brilliant coder and developer, and sadly a casualty in the fight for freedom of information. He was essential in the development of two tools I use every day (RSS and Reddit), and though his guerilla attempt to upload all papers on JSTOR was perhaps unstrategic, it was certainly noble enough in cause. Before his death Aaron was facing nearly 35 years in prison for his role in the JSTOR debacle, an insane penalty for attempting to share information. We don’t know why Aaron chose to take his life, but when @la_feufollet and I tried to brainstorm a tribute to him, my first thought was a guerilla PDF uploading campaign in honor of his fight for open access. I’m not much of an organizer, so I posted in one of the many rising reddit threads and hoped for the best:

Image

My posts on reddit are usually ignored, so I went about my business and assumed it was the last I’d hear of it. It was amazing to wake up this morning and see that redditors had responded strongly to the idea and that a flood of tweets tagged #pdftribute had appeared:

Image

As far as I can tell, early this morning Anonymous took the initiative and tweeted with the hashtag. Eva Vivalt and Jessica Richman thought up the #pdftribute hashtag and helped bring Anonymous onboard. Currently there are hundreds, perhaps thousands, of authors posting their PDFs. It’s amazing to see that the original promise of the internet – the spread of ideas – is thriving. Lately I’ve been feeling a bit pessimistic, worried that the net was becoming an overly gamed, astroturf-ridden meme-preserve for advertisers to groom to their financial needs. It’s great to see that the most exciting power of our newfound connectivity, driving ideas to spread freely and have impact without the restrictions of traditional hierarchical barriers, continues to thrive. I hope #pdftribute lives on in both force and spirit, and that we can all begin working toward a world in which all publicly funded research is available to anyone with net access.

UPDATE 13/1/13 4:00 EST:

For those of you who don’t feel comfortable violating your copyright, but want to join in #pdftribute, your best bet is to check the specifics of your publisher agreement. Most journals allow you to upload a pre-print manuscript to your personal website. Then you can go ahead and tweet the link to your website or the individual pre-print PDFs. Jonathan Eisen has a helpful list of 10 ways to post your papers on twitter here.

Otherwise, hide in the swarm today as a show of support for Aaron. By standing together we show that the future of research publishing is freedom of information. But tomorrow remember that we need to push through real copyright reform. You can start by reading Aaron’s wonderful Guerilla Open Access Manifesto. If you are ready to commit to open access, you can sign the petition at http://thecostofknowledge.com/. There was also a We The People petition demanding legislation requiring journals to use an open-access publishing model. Woops, that petition has expired- start a new one! As Matthew Green put it, let’s push for an Aaron Swartz copyright reform act.

UPDATE 1:

Some nice folks have put together a link scraper to collect PDFs tagged #pdftribute here:

Screen shot 2013-01-13 at 7.46.58 PM

UPDATE 2:

If I may make a humble suggestion- it may be useful to follow a specific format for sharing your papers. This will make them easier to find later, and for journalists to compile some sharing stats. Here is my suggested example.

Screen shot 2013-01-13 at 2.12.23 PM

UPDATE 3:

Eva Vivalt reports #pdftribute getting 500 tweets/hr, >2.5 million impressions!

Screen shot 2013-01-13 at 1.46.05 PM