ArXangel: an arXiv recommender

The Greek letter “chi”. It is illegal to build any service for preprints without using this letter.

When I started out in publishing, my first job was to find scientists and ask whether they would volunteer to peer-review the thousands of research papers coming into my work queue. The scientists were always a pleasure to work with, but it was tough, repetitive work and the automation tools available were not much help.

At some point, it occurred to me that there might be a way to match referees to papers without having to do any work at all. Laziness is a wonderful motivator, so I went ahead and wrote what was perhaps the worst algorithm I could have come up with at the time. All it did was:

  • count the words in a research paper and then
  • find other research papers with the same words in similar proportions. (The assumption being that an author of a similar paper might be a suitable referee.)

For anyone working in Natural Language Processing (NLP): in the jargon of your field, this was basically a weighted bag-of-words model with cosine similarity.

To anyone not working in NLP: that means it was rubbish.
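
For the curious, here is a minimal sketch of what a weighted bag-of-words matcher with cosine similarity can look like. It is not the code I wrote back then, and it is not what the app runs. The IDF weighting is only one common choice of "weighted", and every name in it (tokenize, idf_weights, weighted_vector, cosine_similarity, rank_referees) is made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def idf_weights(token_lists):
    """Smoothed inverse document frequency: rarer words get larger weights.

    Assumption: IDF is just one plausible weighting scheme for the word counts.
    """
    n_docs = len(token_lists)
    # Count the number of papers each word appears in (not total occurrences).
    doc_freq = Counter(word for tokens in token_lists for word in set(tokens))
    return {
        word: math.log((1 + n_docs) / (1 + df)) + 1.0
        for word, df in doc_freq.items()
    }


def weighted_vector(tokens, idf):
    """Sparse vector of word count times IDF weight.

    Words that never occur in the corpus are dropped.
    """
    counts = Counter(tokens)
    return {word: n * idf[word] for word, n in counts.items() if word in idf}


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b[word] for word, weight in a.items() if word in b)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_referees(manuscript, papers):
    """Rank the authors of existing papers by similarity to a manuscript.

    papers is a list of (author, text) pairs; the result is a list of
    (author, score) pairs, most similar first.
    """
    token_lists = [tokenize(text) for _, text in papers]
    idf = idf_weights(token_lists)
    query = weighted_vector(tokenize(manuscript), idf)
    scored = [
        (author, cosine_similarity(query, weighted_vector(tokens, idf)))
        for (author, _), tokens in zip(papers, token_lists)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling rank_referees with the text of a new manuscript and a list of (author, text) pairs returns the authors ordered by how closely their paper's word proportions match the manuscript. A paper that shares no words with the manuscript scores zero, which is a good illustration of why this approach is so crude: synonyms count for nothing.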

Unfortunately, while the potential to save time was clear to me, I could never get anyone to take the idea seriously, so it sat on the back burner for many years.

Recently, the idea of content-based referee recommenders has come into vogue and most of them are based on more advanced versions of the same principle: count the words in papers and look for the authors of statistically similar ones. This is great, because it means I can claim to be a pioneer of this field. (I appreciate that this is a bit like claiming to be a pioneer of ocean exploration after taking a particularly long bath, but no one else knows that, right?)

There’s a well-known problem in data science: data scientists can make data useful, but making data usable is another thing altogether. It’s our curse.

So, in the spirit of making data usable, I’ve built a little webapp which will recommend referees for arXiv preprints. Hopefully, this can reduce some of the manual effort and tedium in locating referees.

If you’re a researcher, arXangel might also interest you: it can recommend journals where you might publish your own preprints and will also show you similar articles to any preprint (which might help with a literature survey, for example).

Feel free to play around with the app, and do let me know if you have any questions or feedback.

Written by

Data scientist working in research communication. #webapps #python #machinelearning #ai
