Let’s do a thought experiment. Imagine there are just 2 journal publishers in the world: P1 and P2.
In year 1:
Then, in year 2:
Here’s the weird thing. P1 has grown very rapidly in absolute terms, but the market share of P1 has declined from 100% to 99%. This might seem a little odd, but it’s really just a feature of any market where new entrants are common.
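The arithmetic is easy to check with a quick sketch. The article counts below are hypothetical, chosen only so that P1's share comes out at the 100% and 99% quoted above. They are not the figures from the thought experiment itself.

```python
# Hypothetical article counts per publisher (illustrative only).
year_1 = {"P1": 100, "P2": 0}
year_2 = {"P1": 198, "P2": 2}


def market_share(counts, publisher):
    """Return a publisher's share of all published articles, as a percentage."""
    total = sum(counts.values())
    return 100 * counts[publisher] / total


def growth(before, after, publisher):
    """Return a publisher's change in absolute output, as a percentage."""
    return 100 * (after[publisher] - before[publisher]) / before[publisher]


p1_growth = growth(year_1, year_2, "P1")
p1_share_before = market_share(year_1, "P1")
p1_share_after = market_share(year_2, "P1")

print(f"P1 output grew by {p1_growth:.0f}%")
print(f"P1 market share went from {p1_share_before:.0f}% to {p1_share_after:.0f}%")
```

With these made-up counts, P1 nearly doubles its output and still loses a percentage point of share, simply because the new entrant started from zero.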
A common explanation proposed for this is that women have taken on greater domestic responsibilities during these times and this has prevented them from doing research. This may indeed be true, but I think there is something else going on here.
It is well known among academic publishers that, as soon as COVID lockdowns started across the globe in early March, the number of research papers being submitted for peer review surged…
ArXangel.net has a few features which I’ve covered in past blog posts, but the latest feature is a sign-in function.
There are lots of services out there for searching for preprints or even for building recommendations (e.g. input some keywords, or pick a list…
The most surprising thing I’ve learned from overseeing the peer-review process is that science is an inherently subjective pursuit.
I’d always known science as the realm of the cold-hard-fact, the empirical observation and of steadfast, unyielding logic (of course it is!). But I’ve learned that, as long as science is a human endeavour, opinion is there too.
Don’t believe me? Imagine you are asked to referee a paper which rests on certain assumptions. How do you feel about those?
Data science in the realm of scholarly publishing is challenging for this reason: subjectivity finds its way into the data, too.
If you are editing a journal and your journal has a lot of overlap with arXiv, then ArXangel.net can offer a few useful services.
It’s common practice for editors to search arXiv for new preprints that fit their journal. If a preprint looks interesting, then the editor might invite the authors to submit the preprint to their journal.
With ArXangel, you no longer have to perform that search manually. ArXangel can show you suitable preprints for your journal in a feed.
For example, here is the feed for a journal called Neurocomputing: https://arxangel.net/journal_feed/?journal_name=Neurocomputing
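The Neurocomputing link passes the journal title as a `journal_name` query parameter. Here is a minimal sketch of building such a link in Python. The helper `journal_feed_url` is hypothetical, and it is only an assumption that other journal titles follow the same URL pattern on the site.

```python
from urllib.parse import urlencode

# Base URL taken from the Neurocomputing example.
FEED_BASE_URL = "https://arxangel.net/journal_feed/"


def journal_feed_url(journal_name):
    """Build a feed URL for a journal title (hypothetical helper)."""
    # urlencode escapes spaces and punctuation in the title.
    return FEED_BASE_URL + "?" + urlencode({"journal_name": journal_name})


print(journal_feed_url("Neurocomputing"))
```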
This list shows articles which are similar to…
It’s been a while since the last update on ArXangel. The main reason is that I was very busy updating ArXangel. The site has a few new features which I think might be of interest.
The old standard functionality still exists. You can still use the site to:
In a recent post, I introduced ArXangel, a hobby project of mine which recommends referees for arXiv preprints. It’s a very simple application. All it does is take an arXiv preprint, find similar published papers and then list the authors of those papers as potential referees. This is what might be called a ‘high accuracy’ approach to recommendation, in that it is only supposed to find people with the right expertise and ignores other considerations.
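To make the idea concrete, here is a toy sketch of that pipeline. It is not ArXangel's actual code: the similarity measure (cosine similarity over raw word counts), the function names and the paper records are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count its words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_referees(preprint_abstract, published_papers, top_n=2):
    """List the authors of the top_n published papers most similar to the preprint."""
    query = bag_of_words(preprint_abstract)
    ranked = sorted(
        published_papers,
        key=lambda paper: cosine_similarity(query, bag_of_words(paper["abstract"])),
        reverse=True,
    )
    referees = []
    for paper in ranked[:top_n]:
        for author in paper["authors"]:
            if author not in referees:  # avoid listing the same person twice
                referees.append(author)
    return referees


# Made-up records, for illustration only.
papers = [
    {"authors": ["A. Author"], "abstract": "Neural networks for image classification"},
    {"authors": ["B. Author", "C. Author"], "abstract": "Soil chemistry in wheat farming"},
    {"authors": ["D. Author"], "abstract": "Training deep neural networks on images"},
]

print(suggest_referees("Deep neural networks for image recognition", papers))
```

Note that the ranking here depends on textual similarity alone, so the only thing it can tell us about a candidate is whether their published work looks like the preprint.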
That’s what we want, right? The right expertise?
This might sound like a stupid question to anyone who hasn’t spent a lot of…
When I started out in publishing, the first job I was given was searching for scientists and asking if they would volunteer to peer-review the thousands of research papers that were coming into my work-queue. The scientists were always a pleasure to work with, but it was tough, repetitive work and the automation tools available were not much help.
At some point, it occurred to me that there might be a way to match referees to papers without having to do any work at all. Laziness is a wonderful motivator, so I went ahead and wrote what was perhaps the…
In a recent blog post, we saw how we can use data to define a semantic space for covid-19-related papers and get a nice visualisation of that space, like this:
You can see that the coronavirus papers (which come from the CORD-19 dataset by Semantic Scholar) and the other papers (a random sample of PubMed) occupy quite different regions of the space. That’s good, because it means that we can build a machine-learning classifier to discriminate between those 2 datasets.
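As a rough sketch of what such a classifier could look like, here is a tiny logistic regression trained on two well-separated groups of 2-D points. The choice of logistic regression is an assumption (any probabilistic classifier would do), and the coordinates are made up to stand in for positions in the semantic space. They are not taken from CORD-19 or PubMed.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(points, labels, learning_rate=0.1, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(points[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(points, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / len(points) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(points)
    return weights, bias


def predict_probability(point, weights, bias):
    """Probability that a point belongs to the coronavirus group (label 1)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, point)) + bias)


# Made-up 2-D coordinates standing in for documents in the semantic space.
covid_points = [(2.0, 2.1), (2.5, 1.8), (1.9, 2.6), (2.2, 2.4)]
other_points = [(-2.0, -1.9), (-2.4, -2.2), (-1.8, -2.5), (-2.1, -2.0)]
points = covid_points + other_points
labels = [1] * len(covid_points) + [0] * len(other_points)

weights, bias = train_logistic_regression(points, labels)
print(predict_probability((2.0, 2.0), weights, bias))    # near the first group
print(predict_probability((-2.0, -2.0), weights, bias))  # near the second group
```

Because the two groups sit in different regions, a new point close to the first group gets a probability close to 1 and a point close to the second group gets a probability close to 0.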
That classifier is essentially a tool which can assign a probability that any document in this space is relevant to…
One thing that helps with initial exploration of data is to visualise the data and look for clusters and other patterns. It is relatively trivial to do.
Unfortunately, visualising the CORD-19 data on its own does not put the data into context. For that, we need to compare with a much larger dataset. Most of the world’s medical science journals (at least, the good ones) are indexed by a service called PubMed. …
Data scientist working in research communication. #webapps #python #machinelearning #ai