Which is better: maths or physics? Some arXiv text analysis.

Imagine that we have 2 research papers: one is on mathematics and the other is on astrophysics. Which is better?

Image credit: ‘TheBusyBrain’ Flickr, license: CC-BY 2.0

It’s subjective, of course. And while arXiv users from different tribes may have already made their minds up on the matter, let’s see if we can make an objective judgment using citation data.

We might try to measure the impact that each paper has had by simply comparing their citation counts. Unfortunately, that’s not terribly fair, because different fields of research have different citation rates. A large, fast-moving field like astrophysics might have high citation rates, while mathematics, which is smaller and progresses more slowly, might have low ones.

It might be helpful if we could standardise citation counts by field so that we can make an apples-with-apples comparison.

I’m going to look at an approach to this based on topic modelling. If you’ve never heard of topic modelling, I’ve added a brief introduction in an appendix.

I’ve started by creating a 100-topic model of arXiv. Once we have such a model, each of our articles will have one ‘primary topic’: the topic most closely associated with it. If we assign each article to its primary topic, it’s a lot like grouping our articles into their respective fields.

Now we can compare the citation rates of these fields.
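
As a rough sketch of that grouping step (not the code used for this post), the snippet below picks each article’s primary topic as the one with the largest weight and then averages citation counts per topic. The article records, topic weights and citation counts are all invented for illustration, and `primary_topic` and `mean_citations_by_topic` are hypothetical helper names.

```python
from collections import defaultdict
from statistics import mean

# Invented example data: per-article topic weights and citation counts.
articles = [
    {"id": "a1", "topic_weights": {18: 0.7, 62: 0.2, 54: 0.1}, "citations": 35},
    {"id": "a2", "topic_weights": {18: 0.6, 42: 0.4}, "citations": 25},
    {"id": "a3", "topic_weights": {54: 0.8, 33: 0.2}, "citations": 1},
    {"id": "a4", "topic_weights": {54: 0.5, 40: 0.3, 18: 0.2}, "citations": 3},
]


def primary_topic(topic_weights):
    """Return the topic id with the largest weight for one article."""
    return max(topic_weights, key=topic_weights.get)


def mean_citations_by_topic(articles):
    """Group articles by primary topic and average their citation counts."""
    groups = defaultdict(list)
    for article in articles:
        topic = primary_topic(article["topic_weights"])
        groups[topic].append(article["citations"])
    return {topic: mean(counts) for topic, counts in groups.items()}


citation_rates = mean_citations_by_topic(articles)
print(citation_rates)
```

With real data, the topic weights would come from the topic model rather than being typed in by hand.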

Note that we can get this far without much manual work — almost the whole process is automatable. This may be food for thought if you ever need to categorise a bunch of articles by subject area and can’t be bothered to read them.

Interpreting the model

  • Look at the right side of the x-axis. We can see that topics 33, 40, 54 have the lowest mean citations. If we scroll down to our topic model in the appendix, we can see that these topics are associated with mathematics.
  • At the left side of the x-axis we see topics 42, 18, 62 have high citations. These topics appear to be associated with astrophysics.
  • This is in line with our expectations; we already knew that maths had lower citation rates than astrophysics.
  • But it illustrates an important point: we can’t directly compare astrophysics papers with mathematics papers. An average astrophysics paper in this dataset might have ~30 citations, and would therefore be considered a highly-cited outlier if it were compared with a distribution of maths papers where the average paper has only ~2 citations.

Levelling the field

It’s very straightforward to take a model like this and standardise it.

Here, I have used scikit-learn’s RobustScaler, which by default centres values on the median and scales them by the interquartile range. This gives us new citation-count values for each article which are more easily compared. Now an average astrophysics paper is no better cited than an average mathematics paper, but outliers in either field are easy to spot.
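
To make the idea concrete, here is a minimal standard-library sketch of the same kind of scaling, applied separately to each field. It is meant to mirror what RobustScaler does with its default settings (subtract the median, divide by the 25th-to-75th percentile range), but it is not the scikit-learn implementation, and the citation counts and the `robust_scale` helper are invented for illustration.

```python
from statistics import median, quantiles


def robust_scale(values):
    """Centre values on the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        iqr = 1.0  # constant group: avoid dividing by zero
    med = median(values)
    return [(v - med) / iqr for v in values]


# Invented citation counts, grouped by field (i.e. by primary topic).
citations_by_topic = {
    "astrophysics": [10, 20, 30, 40, 150],
    "mathematics": [0, 1, 2, 3, 12],
}

scaled = {
    topic: robust_scale(counts)
    for topic, counts in citations_by_topic.items()
}

for topic, values in scaled.items():
    print(topic, values)
```

In this toy data the median paper in each field scales to 0, while the most-cited paper in each field stands out at 6.0 (astrophysics) and 5.0 (mathematics), even though their raw counts are 150 and 12.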

Now we can make a much fairer comparison of maths papers and astrophysics papers!

Some things to consider

I hear that getting a job in academia is tough. There is a huge amount of competition for jobs and not enough of them to go around. It’s disappointing to learn that researchers applying for jobs are often judged on citation counts, h-indices and even journal impact factors. These are not generally valid measures of achievement or ability and can be taken out of context.

Context matters!

APPENDIX

Topic modelling

If topic modelling is unfamiliar to you, it goes something like this:

  • Take some documents: in this case, titles and abstracts of arXiv preprints from 2012.
  • Try to find groups of words that tend to appear together inside documents.
  • For example: we might find that the words ‘black’, ‘hole’, ‘theory’, ‘horizon’, ‘entropy’, and ‘thermodynamics’ all tend to appear together in papers about black hole theory.
  • We can then infer that a paper is about black hole theory even if it just has the words ‘horizon’, ‘entropy’, and ‘thermodynamics’ in it.

These groups of words are called ‘topics’. The underlying assumption of a topic model is that topics are made of words and documents are made of topics.
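
One common way to write that assumption down (the notation here is mine, not taken from any particular library) is as a mixture: the probability of seeing word $w$ in document $d$ is a sum over the $K$ topics, with $K = 100$ for the model in this post.

```latex
p(w \mid d) = \sum_{k=1}^{K} p(w \mid k)\, p(k \mid d)
```

Here $p(w \mid k)$ describes how a topic is made of words, and $p(k \mid d)$ describes how a document is made of topics. The ‘primary topic’ of a document is then the $k$ with the largest $p(k \mid d)$.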

So, our topic model will be built using arXiv’s 2012 content. The full model is shown below, but here’s a snippet:

(Topic 41, ‘0.033*”mass” + 0.028*”model” + 0.022*”higg” + 0.014*”coupl” + 0.013*”standard_model” + 0.012*”decai” + 0.012*”new” + 0.011*”gev” + 0.011*”higg_boson” + 0.011*”ar”’),

(Topic 42, ‘0.043*”halo” + 0.028*”profil” + 0.021*”satellit” + 0.018*”simul” + 0.016*”stream” + 0.015*”tidal” + 0.012*”escap” + 0.012*”galact” + 0.012*”milki_wai” + 0.011*”veloc”’),

(Topic 43, ‘0.084*”black_hole” + 0.053*”binari” + 0.030*”hole” + 0.027*”mass” + 0.024*”neutron_star” + 0.023*”omega” + 0.017*”extrem” + 0.017*”bubbl” + 0.015*”horizon” + 0.014*”collaps”’),

If you’re familiar with physics, it should be clear that Topic 41 comes from particle physics, Topic 42 is something to do with galaxy formation and Topic 43 looks like numerical relativity.

Full topic model

This model was created using Gensim’s LDA module.

Some words will appear shortened. This is due to ‘stemming’, where we reduce the number of unique words in our set of documents by converting ‘force’, ‘forcing’ and ‘forced’ all into the one ‘stem’ word ‘forc’. This should mean that there are fewer variants of the same word in our dataset, but the downside is that sometimes those different versions of a word do matter (consider the different contexts of those 3 words in physics).
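
For a feel of what stemming does, here is a deliberately tiny suffix-stripper. It is a toy written for this example, not the stemmer used to build the model, and it only knows three suffixes. `toy_stem` is a hypothetical name.

```python
def toy_stem(word):
    """Strip a few common English suffixes. A toy, not a real stemmer."""
    for suffix in ("ing", "ed", "e"):
        # Only strip if at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


words = ["force", "forcing", "forced"]
stems = [toy_stem(w) for w in words]
print(stems)
```

All three words collapse to the same stem, ‘forc’, which is exactly the loss of distinction described above.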

You will also see some ‘ngrams’, i.e. sequences of words which we treat as one word. The canonical example is ‘New York’, which is 2 words that we treat as one word. LDA ignores word order, so without this step the two words would be treated as unrelated tokens. Below, you can see some examples where ngrams are detected and joined with an underscore, like ‘neutrino_mass’ or ‘liquid_crystal’.
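
The joining step can be sketched in a few lines. The version below is a simplification that joins any adjacent pair of words seen at least `min_count` times. Real phrase detectors typically use a more careful score than a raw count, and the documents, the threshold and the `join_frequent_bigrams` helper are all made up for this example.

```python
from collections import Counter


def join_frequent_bigrams(docs, min_count=2):
    """Join adjacent word pairs seen at least `min_count` times across docs."""
    counts = Counter(
        (a, b) for doc in docs for a, b in zip(doc, doc[1:])
    )
    joined_docs = []
    for doc in docs:
        out, i = [], 0
        while i < len(doc):
            pair = tuple(doc[i : i + 2])
            if len(pair) == 2 and counts[pair] >= min_count:
                out.append("_".join(pair))  # e.g. 'neutrino_mass'
                i += 2
            else:
                out.append(doc[i])
                i += 1
        joined_docs.append(out)
    return joined_docs


# Invented, already-tokenised documents.
docs = [
    ["neutrino", "mass", "hierarchy"],
    ["heavy", "neutrino", "mass"],
    ["liquid", "crystal", "phase"],
]
joined = join_frequent_bigrams(docs)
print(joined)
```

Here ‘neutrino mass’ appears twice and gets joined, while ‘liquid crystal’ appears only once and is left as two separate words. A larger corpus would give it enough occurrences to be joined too.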

[(0, ‘0.268*”function” + 0.041*”weight” + 0.026*”seri” + 0.019*”delta” + 0.018*”branch” + 0.018*”valu” + 0.017*”exponenti” + 0.017*”sum” + 0.013*”ar” + 0.013*”zero”’),

(1, ‘0.056*”correct” + 0.040*”factor” + 0.032*”calcul” + 0.030*”contribut” + 0.020*”effect” + 0.016*”ar” + 0.014*”result” + 0.012*”order” + 0.011*”product” + 0.010*”lead”’),

(2, ‘0.092*”mix” + 0.072*”neutrino” + 0.029*”hierarchi” + 0.018*”neutrino_mass” + 0.018*”heavi” + 0.013*”ar” + 0.012*”dirac” + 0.011*”mass” + 0.010*”nu” + 0.009*”flavor”’),

(3, ‘0.028*”manifold” + 0.028*”surfac” + 0.021*”singular” + 0.017*”point” + 0.015*”geometri” + 0.015*”geometr” + 0.015*”complex” + 0.013*”ar” + 0.012*”thi” + 0.011*”smooth”’),

(4, ‘0.070*”site” + 0.052*”structur” + 0.048*”distort” + 0.046*”protein” + 0.021*”bind” + 0.020*”hexagon” + 0.012*”mn” + 0.012*”earthquak” + 0.011*”residu” + 0.010*”avalanch”’),

(5, ‘0.185*”index” + 0.036*”squeez” + 0.030*”nemat” + 0.029*”len” + 0.025*”braid” + 0.022*”red” + 0.019*”groupoid” + 0.018*”blue” + 0.017*”liquid_crystal” + 0.017*”transact”’),

(6, ‘0.303*”metric” + 0.029*”cp” + 0.024*”ch” + 0.018*”squar” + 0.016*”lorentzian” + 0.014*”r_d” + 0.013*”distanc” + 0.012*”factori” + 0.012*”magic” + 0.012*”curvaton”’),

(7, ‘0.092*”forward” + 0.063*”front” + 0.046*”backward” + 0.036*”mm” + 0.032*”ep” + 0.019*”ne” + 0.017*”ilc” + 0.016*”12c” + 0.013*”cdot” + 0.012*”ture”’),

(8, ‘0.133*”imag” + 0.050*”object” + 0.038*”reconstruct” + 0.024*”bodi” + 0.023*”us” + 0.019*”shape” + 0.015*”segment” + 0.012*”techniqu” + 0.009*”filter” + 0.009*”wavelet”’),

(9, ‘0.121*”decomposit” + 0.114*”tensor” + 0.047*”fusion” + 0.046*”discontinu” + 0.032*”mesh” + 0.029*”finit_element” + 0.024*”multiscal” + 0.020*”su” + 0.017*”decompos” + 0.015*”cube”’),

(10, ‘0.093*”popul” + 0.057*”speci” + 0.054*”spread” + 0.033*”model” + 0.026*”persist” + 0.016*”infect” + 0.016*”diseas” + 0.013*”epidem” + 0.013*”competit” + 0.012*”twin”’),

(11, ‘0.019*”chiral” + 0.014*”fermion” + 0.014*”ar” + 0.014*”model” + 0.013*”mass” + 0.013*”symmetri” + 0.013*”quark” + 0.011*”effect” + 0.010*”break” + 0.010*”qcd”’),

(12, ‘0.068*”particl” + 0.045*”wave” + 0.021*”ar” + 0.019*”instabl” + 0.019*”plasma” + 0.018*”propag” + 0.017*”veloc” + 0.015*”densiti” + 0.014*”potenti” + 0.012*”effect”’),

(13, ‘0.036*”network” + 0.015*”us” + 0.014*”data” + 0.013*”ar” + 0.010*”thi” + 0.009*”system” + 0.009*”research” + 0.009*”inform” + 0.008*”applic” + 0.008*”thi_paper”’),

(14, ‘0.074*”sequenc” + 0.057*”select” + 0.026*”gene” + 0.019*”genet” + 0.016*”genom” + 0.013*”plate” + 0.013*”mutat” + 0.012*”divers” + 0.012*”void” + 0.010*”popul”’),

(15, ‘0.046*”pt” + 0.036*”hermitian” + 0.029*”plot” + 0.025*”fault” + 0.024*”antenna” + 0.021*”harvest” + 0.017*”rb” + 0.016*”st” + 0.016*”ptsymmetr” + 0.015*”nonhermitian”’),

(16, ‘0.157*”curv” + 0.073*”compress” + 0.050*”elast” + 0.038*”stress” + 0.037*”strain” + 0.032*”shear” + 0.024*”modulu” + 0.020*”load” + 0.017*”moduli” + 0.017*”kappa”’),

(17, ‘0.057*”fluid” + 0.042*”turbul” + 0.037*”vortex” + 0.031*”vortic” + 0.026*”flow” + 0.025*”dissip” + 0.017*”viscos” + 0.016*”granular” + 0.016*”cascad” + 0.015*”veloc”’),

(18, ‘0.057*”galaxi” + 0.047*”cluster” + 0.017*”mass” + 0.014*”ar” + 0.012*”sampl” + 0.012*”star_format” + 0.011*”redshift” + 0.009*”us” + 0.008*”we_find” + 0.008*”thi”’),

(19, ‘0.178*”stabil” + 0.115*”stabl” + 0.030*”tail” + 0.027*”unstabl” + 0.025*”secondord” + 0.020*”second_order” + 0.014*”perturb” + 0.013*”opinion” + 0.013*”critic” + 0.012*”consensu”’),

(20, ‘0.059*”period” + 0.047*”frequenc” + 0.033*”observ” + 0.031*”variabl” + 0.025*”time” + 0.017*”dure” + 0.017*”radio” + 0.016*”arrai” + 0.016*”flare” + 0.016*”chang”’),

(21, ‘0.066*”field” + 0.026*”spacetim” + 0.024*”graviti” + 0.016*”thi” + 0.015*”gravit” + 0.014*”theori” + 0.014*”relativist” + 0.014*”vacuum” + 0.013*”effect” + 0.013*”ar”’),

(22, ‘0.035*”channel” + 0.017*”node” + 0.016*”rate” + 0.015*”network” + 0.013*”scheme” + 0.012*”ar” + 0.012*”receiv” + 0.012*”perform” + 0.012*”transmiss” + 0.011*”propos”’),

(23, ‘0.069*”scale” + 0.055*”chain” + 0.053*”model” + 0.044*”loop” + 0.031*”expon” + 0.024*”defect” + 0.017*”conform” + 0.015*”polym” + 0.015*”ar” + 0.014*”critic”’),

(24, ‘0.123*”temperatur” + 0.057*”thermal” + 0.034*”heat” + 0.022*”increas” + 0.020*”cool” + 0.016*”decreas” + 0.011*”effect” + 0.010*”ic” + 0.010*”crystal” + 0.009*”at_low”’),

(25, ‘0.078*”method” + 0.069*”problem” + 0.030*”approxim” + 0.017*”thi” + 0.017*”solv” + 0.016*”approach” + 0.015*”us” + 0.014*”numer” + 0.013*”ar” + 0.012*”set”’),

(26, ‘0.034*”rotat” + 0.031*”magnet_field” + 0.018*”solar” + 0.016*”magnet” + 0.016*”ar” + 0.015*”observ” + 0.009*”model” + 0.009*”shock” + 0.009*”field” + 0.008*”structur”’),

(27, ‘0.070*”graph” + 0.029*”set” + 0.024*”number” + 0.020*”edg” + 0.018*”tree” + 0.016*”vertic” + 0.013*”thi” + 0.013*”degre” + 0.013*”ar” + 0.012*”color”’),

(28, ‘0.099*”bar” + 0.018*”qso” + 0.017*”median” + 0.016*”21cm” + 0.014*”loos” + 0.013*”euler_characterist” + 0.012*”smg” + 0.012*”p_q” + 0.011*”isotropi” + 0.010*”ast”’),

(29, ‘0.076*”coher” + 0.062*”puls” + 0.052*”transfer” + 0.043*”molecul” + 0.035*”excit” + 0.029*”synchron” + 0.019*”filter” + 0.011*”molecular” + 0.011*”vibrat” + 0.011*”polariton”’),

(30, ‘0.081*”water” + 0.042*”24n” + 0.041*”depth” + 0.039*”c24" + 0.033*”cs” + 0.027*”27d24" + 0.024*”textit” + 0.017*”245ceta24" + 0.016*”polariz” + 0.015*”pd”’),

(31, ‘0.019*”survei” + 0.013*”us” + 0.013*”ar” + 0.013*”data” + 0.011*”observ” + 0.011*”dust” + 0.011*”telescop” + 0.010*”sourc” + 0.009*”distanc” + 0.008*”present”’),

(32, ‘0.053*”pattern” + 0.031*”neuron” + 0.029*”wall” + 0.028*”big” + 0.025*”spike” + 0.025*”domain_wall” + 0.024*”input” + 0.024*”s24" + 0.021*”neural_network” + 0.020*”train”’),

(33, ‘0.124*”conjectur” + 0.044*”twist” + 0.030*”number” + 0.027*”modular” + 0.022*”prove” + 0.020*”odd” + 0.017*”24e” + 0.014*”arithmet” + 0.013*”toric” + 0.011*”refin”’),

(34, ‘0.412*”state” + 0.046*”fraction” + 0.019*”topolog” + 0.017*”edg” + 0.014*”ar” + 0.013*”bound” + 0.011*”degeneraci” + 0.011*”quantum_hall” + 0.007*”majorana” + 0.007*”box”’),

(35, ‘0.020*”layer” + 0.017*”surfac” + 0.016*”graphen” + 0.015*”superconduct” + 0.012*”structur” + 0.010*”film” + 0.010*”metal” + 0.009*”materi” + 0.009*”ar” + 0.009*”electron”’),

(36, ‘0.027*”game” + 0.021*”strategi” + 0.018*”thi” + 0.014*”ar” + 0.014*”agent” + 0.012*”question” + 0.012*”individu” + 0.009*”mechan” + 0.008*”inform” + 0.008*”player”’),

(37, ‘0.110*”equat” + 0.103*”solut” + 0.026*”nonlinear” + 0.016*”ar” + 0.014*”global” + 0.013*”problem” + 0.013*”exist” + 0.010*”obtain” + 0.010*”gener” + 0.010*”case”’),

(38, ‘0.126*”distribut” + 0.046*”statist” + 0.043*”invari” + 0.033*”relat” + 0.032*”law” + 0.022*”ensembl” + 0.020*”thermodynam” + 0.020*”gener” + 0.014*”deriv” + 0.011*”thi”’),

(39, ‘0.163*”expans” + 0.047*”inclus” + 0.028*”cumul” + 0.026*”calcul” + 0.026*”green_function” + 0.014*”flag” + 0.014*”approxim” + 0.011*”asymptot_expans” + 0.011*”carlo” + 0.011*”hydrodynam”’),

(40, ‘0.133*”group” + 0.035*”represent” + 0.026*”finit” + 0.015*”subgroup” + 0.015*”gener” + 0.013*”24g24" + 0.012*”noncommut” + 0.012*”free” + 0.012*”action” + 0.012*”ar”’),

(41, ‘0.033*”mass” + 0.028*”model” + 0.022*”higg” + 0.014*”coupl” + 0.013*”standard_model” + 0.012*”decai” + 0.012*”new” + 0.011*”gev” + 0.011*”higg_boson” + 0.011*”ar”’),

(42, ‘0.043*”halo” + 0.028*”profil” + 0.021*”satellit” + 0.018*”simul” + 0.016*”stream” + 0.015*”tidal” + 0.012*”escap” + 0.012*”galact” + 0.012*”milki_wai” + 0.011*”veloc”’),

(43, ‘0.084*”black_hole” + 0.053*”binari” + 0.030*”hole” + 0.027*”mass” + 0.024*”neutron_star” + 0.023*”omega” + 0.017*”extrem” + 0.017*”bubbl” + 0.015*”horizon” + 0.014*”collaps”’),

(44, ‘0.145*”alpha” + 0.103*”beta” + 0.096*”inhomogen” + 0.042*”fractal” + 0.032*”selfsimilar” + 0.027*”245cmu24" + 0.011*”digraph” + 0.011*”quantum_electrodynam” + 0.011*”wedg” + 0.011*”4manifold”’),

(45, ‘0.035*”decai” + 0.027*”measur” + 0.023*”product” + 0.023*”ar” + 0.016*”data” + 0.013*”event” + 0.012*”us” + 0.011*”result” + 0.010*”gamma” + 0.010*”search”’),

(46, ‘0.169*”beam” + 0.055*”momentum” + 0.032*”colour” + 0.024*”green” + 0.021*”reflect” + 0.019*”seed” + 0.017*”sea” + 0.012*”inject” + 0.012*”incid” + 0.011*”orbit_angular”’),

(47, ‘0.106*”gap” + 0.096*”frame” + 0.072*”perfect” + 0.055*”diagon” + 0.046*”mirror” + 0.035*”skew” + 0.021*”rectangular” + 0.019*”impuls” + 0.016*”dft” + 0.012*”swarm”’),

(48, ‘0.159*”polar” + 0.087*”photon” + 0.050*”transvers” + 0.040*”asymmetri” + 0.026*”longitudin” + 0.021*”virtual” + 0.020*”diffract” + 0.020*”helic” + 0.012*”clock” + 0.010*”twophoton”’),

(49, ‘0.090*”torsion” + 0.035*”fano” + 0.033*”glass” + 0.029*”firstli” + 0.025*”secondli” + 0.024*”calabiyau” + 0.023*”ln” + 0.021*”slit” + 0.019*”gl” + 0.017*”threefold”’),

(50, ‘0.030*”physic” + 0.024*”theori” + 0.020*”discuss” + 0.020*”thi” + 0.018*”review” + 0.016*”ar” + 0.013*”model” + 0.012*”string” + 0.010*”understand” + 0.009*”new”’),

(51, ‘0.078*”univers” + 0.071*”model” + 0.044*”cosmolog” + 0.024*”paramet” + 0.018*”matter” + 0.016*”dark_energi” + 0.016*”observ” + 0.015*”evolut” + 0.011*”thi” + 0.009*”shell”’),

(52, ‘0.025*”matrix” + 0.022*”number” + 0.021*”24n24" + 0.020*”matric” + 0.015*”let” + 0.014*”ar” + 0.011*”sequenc” + 0.011*”24p24" + 0.011*”show_that” + 0.011*”24k24"’),

(53, ‘0.035*”entropi” + 0.033*”hamiltonian” + 0.019*”quantiz” + 0.016*”ar” + 0.016*”gener” + 0.015*”formul” + 0.015*”thi” + 0.013*”term” + 0.013*”transform” + 0.013*”theori”’),

(54, ‘0.062*”algebra” + 0.017*”gener” + 0.017*”ring” + 0.016*”categori” + 0.015*”modul” + 0.014*”thi” + 0.013*”ar” + 0.011*”construct” + 0.011*”ideal” + 0.011*”varieti”’),

(55, ‘0.070*”process” + 0.033*”time” + 0.029*”stochast” + 0.029*”diffus” + 0.019*”discret” + 0.019*”converg” + 0.017*”moment” + 0.013*”distribut” + 0.013*”ar” + 0.012*”limit”’),

(56, ‘0.080*”spin” + 0.071*”magnet” + 0.018*”magnet_field” + 0.016*”order” + 0.014*”ferromagnet” + 0.013*”field” + 0.011*”effect” + 0.010*”anisotropi” + 0.010*”antiferromagnet” + 0.008*”superconduct”’),

(57, ‘0.052*”activ” + 0.050*”respons” + 0.033*”concentr” + 0.028*”model” + 0.021*”reaction” + 0.018*”mechan” + 0.018*”molecular” + 0.018*”chemic” + 0.017*”biolog” + 0.014*”membran”’),

(58, ‘0.069*”algorithm” + 0.045*”comput” + 0.028*”code” + 0.024*”us” + 0.023*”effici” + 0.017*”implement” + 0.014*”perform” + 0.012*”simul” + 0.012*”present” + 0.012*”error”’),

(59, ‘0.061*”model” + 0.036*”estim” + 0.034*”data” + 0.022*”us” + 0.016*”sampl” + 0.016*”ar” + 0.015*”paramet” + 0.014*”method” + 0.011*”analysi” + 0.010*”propos”’),

(60, ‘0.028*”emiss” + 0.025*”xrai” + 0.021*”observ” + 0.017*”line” + 0.014*”ar” + 0.014*”sourc” + 0.013*”agn” + 0.011*”region” + 0.009*”detect” + 0.008*”thi”’),

(61, ‘0.243*”space” + 0.063*”topolog” + 0.043*”lattic” + 0.024*”compact” + 0.018*”properti” + 0.018*”set” + 0.016*”24x24" + 0.015*”continu” + 0.014*”subspac” + 0.014*”measur”’),

(62, ‘0.052*”star” + 0.015*”ar” + 0.013*”planet” + 0.012*”stellar” + 0.011*”mass” + 0.011*”observ” + 0.009*”thi” + 0.009*”orbit” + 0.008*”abund” + 0.007*”model”’),

(63, ‘0.083*”optim” + 0.042*”problem” + 0.020*”algorithm” + 0.018*”cost” + 0.015*”control” + 0.013*”minim” + 0.012*”decis” + 0.011*”polici” + 0.011*”constraint” + 0.010*”maxim”’),

(64, ‘0.059*”24q24" + 0.048*”exciton” + 0.036*”circular” + 0.026*”insert” + 0.026*”merg” + 0.026*”fibr” + 0.022*”exit” + 0.020*”rod” + 0.019*”binomi” + 0.019*”kolmogorov”’),

(65, ‘0.061*”interpol” + 0.038*”cme” + 0.037*”nova” + 0.030*”eject” + 0.027*”mc” + 0.025*”bias” + 0.020*”larg_deviat” + 0.016*”erupt” + 0.016*”hadamard” + 0.014*”hash”’),

(66, ‘0.153*”cell” + 0.043*”visual” + 0.037*”recurr” + 0.026*”cellular” + 0.023*”self” + 0.018*”pl” + 0.015*”tower” + 0.014*”discrimin” + 0.014*”tissu” + 0.014*”nodal”’),

(67, ‘0.028*”theorem” + 0.020*”proof” + 0.018*”thi” + 0.017*”us” + 0.017*”theori” + 0.016*”gener” + 0.012*”formal” + 0.012*”logic” + 0.011*”notion” + 0.010*”present”’),

(68, ‘0.060*”ball” + 0.040*”prime” + 0.029*”van_der” + 0.025*”tip” + 0.022*”crack” + 0.022*”waal” + 0.022*”retard” + 0.021*”middl” + 0.020*”outlier” + 0.018*”stein”’),

(69, ‘0.064*”energi” + 0.024*”ar” + 0.024*”scatter” + 0.021*”nuclear” + 0.019*”calcul” + 0.016*”interact” + 0.014*”reson” + 0.014*”nuclei” + 0.013*”us” + 0.012*”neutron”’),

(70, ‘0.206*”famili” + 0.072*”causal” + 0.063*”ergod” + 0.041*”tilt” + 0.040*”member” + 0.022*”ab” + 0.017*”bernoulli” + 0.016*”ribbon” + 0.015*”cauchi” + 0.015*”orthogon_polynomi”’),

(71, ‘0.038*”mode” + 0.033*”optic” + 0.028*”atom” + 0.027*”reson” + 0.017*”light” + 0.015*”frequenc” + 0.015*”caviti” + 0.013*”trap” + 0.012*”laser” + 0.012*”oscil”’),

(72, ‘0.164*”system” + 0.104*”dynam” + 0.024*”control” + 0.015*”model” + 0.014*”time” + 0.012*”ar” + 0.011*”coupl” + 0.010*”studi” + 0.008*”behavior” + 0.007*”two”’),

(73, ‘0.027*”forc” + 0.021*”simul” + 0.015*”ar” + 0.014*”model” + 0.013*”liquid” + 0.012*”surfac” + 0.011*”structur” + 0.010*”interfac” + 0.010*”size” + 0.009*”solid”’),

(74, ‘0.088*”phi” + 0.047*”dna” + 0.045*”textur” + 0.035*”top” + 0.032*”evapor” + 0.032*”sort” + 0.025*”ladder” + 0.023*”partner” + 0.014*”precipit” + 0.011*”prolong”’),

(75, ‘0.073*”line” + 0.069*”doubl” + 0.067*”configur” + 0.042*”angl” + 0.040*”theta” + 0.030*”sign” + 0.028*”stack” + 0.025*”bundl” + 0.024*”moduli_space” + 0.021*”face”’),

(76, ‘0.045*”measur” + 0.026*”signal” + 0.019*”detect” + 0.019*”detector” + 0.019*”us” + 0.016*”sensit” + 0.016*”design” + 0.015*”experi” + 0.014*”nois” + 0.012*”perform”’),

(77, ‘0.038*”ion” + 0.030*”structur” + 0.028*”pressur” + 0.021*”24_224" + 0.018*”calcul” + 0.016*”electron” + 0.016*”ar” + 0.014*”compound” + 0.014*”245calpha24" + 0.013*”bond”’),

(78, ‘0.143*”quantum” + 0.031*”entangl” + 0.023*”state” + 0.022*”measur” + 0.018*”classic” + 0.015*”qubit” + 0.012*”thi” + 0.011*”ar” + 0.010*”us” + 0.010*”system”’),

(79, ‘0.138*”variat” + 0.043*”price” + 0.037*”market” + 0.030*”risk” + 0.029*”model” + 0.024*”return” + 0.020*”option” + 0.018*”barrier” + 0.016*”stop” + 0.016*”volatil”’),

(80, ‘0.214*”threshold” + 0.059*”percol” + 0.042*”satur” + 0.040*”ir” + 0.026*”polaris” + 0.021*”minima” + 0.019*”multifract” + 0.019*”maxima” + 0.018*”mf” + 0.016*”lda”’),

(81, ‘0.065*”conserv” + 0.063*”cone” + 0.042*”principl” + 0.031*”rough” + 0.024*”conserv_law” + 0.024*”implicit” + 0.022*”cylindr” + 0.020*”rai” + 0.018*”maximum” + 0.012*”cap”’),

(82, ‘0.122*”test” + 0.026*”text” + 0.019*”24b” + 0.018*”sensor” + 0.017*”us” + 0.016*”wa” + 0.015*”student” + 0.013*”document” + 0.012*”present” + 0.011*”facil”’),

(83, ‘0.077*”dark_matter” + 0.025*”constraint” + 0.020*”annihil” + 0.020*”dark” + 0.017*”direct” + 0.017*”dm” + 0.016*”cmb” + 0.015*”background” + 0.015*”detect” + 0.015*”power_spectrum”’),

(84, ‘0.064*”24d” + 0.039*”fm” + 0.037*”landscap” + 0.028*”disloc” + 0.028*”selforgan” + 0.020*”soc” + 0.017*”artifact” + 0.016*”bilay_graphen” + 0.016*”forbidden” + 0.012*”motif”’),

(85, ‘0.042*”ga” + 0.040*”disk” + 0.017*”disc” + 0.016*”format” + 0.016*”core” + 0.015*”model” + 0.013*”accret” + 0.012*”inner” + 0.012*”mass” + 0.010*”radial”’),

(86, ‘0.107*”3d” + 0.091*”2d” + 0.075*”revers” + 0.071*”potenti” + 0.053*”adiabat” + 0.023*”1d” + 0.019*”casimir” + 0.015*”dirac_equat” + 0.013*”cellular_automata” + 0.012*”flip”’),

(87, ‘0.060*”theori” + 0.033*”action” + 0.023*”limit” + 0.016*”amplitud” + 0.014*”thi” + 0.013*”us” + 0.012*”comput” + 0.011*”gaug_theori” + 0.010*”partit_function” + 0.010*”correl_function”’),

(88, ‘0.044*”electron” + 0.023*”current” + 0.022*”conduct” + 0.020*”transport” + 0.020*”charg” + 0.017*”effect” + 0.015*”devic” + 0.011*”tunnel” + 0.010*”electr” + 0.010*”quantum_dot”’),

(89, ‘0.104*”pulsar” + 0.043*”24r” + 0.039*”video” + 0.038*”recombin” + 0.025*”psr” + 0.021*”shadow” + 0.020*”l_evi” + 0.018*”aa” + 0.016*”spot” + 0.016*”seismic”’),

(90, ‘0.041*”secur” + 0.037*”protocol” + 0.034*”kei” + 0.026*”phy” + 0.025*”et_al” + 0.021*”cloud” + 0.020*”attack” + 0.019*”rev” + 0.014*”comment” + 0.013*”commun”’),

(91, ‘0.080*”lambda” + 0.047*”content” + 0.045*”tau” + 0.039*”mu” + 0.027*”social” + 0.025*”social_network” + 0.025*”user” + 0.022*”influenc” + 0.020*”topic” + 0.018*”onlin”’),

(92, ‘0.157*”random” + 0.070*”formula” + 0.034*”diagram” + 0.026*”probabl” + 0.022*”gaussian” + 0.020*”asymptot” + 0.018*”symbol” + 0.012*”24l24" + 0.012*”measur” + 0.011*”gener”’),

(93, ‘0.116*”cover” + 0.073*”interv” + 0.065*”length” + 0.053*”word” + 0.049*”translat” + 0.048*”languag” + 0.023*”polygon” + 0.018*”automata” + 0.015*”filtrat” + 0.011*”letter”’),

(94, ‘0.047*”oper” + 0.046*”bound” + 0.023*”domain” + 0.020*”inequ” + 0.018*”result” + 0.017*”condit” + 0.016*”spectral” + 0.016*”ar” + 0.015*”estim” + 0.014*”prove”’),

(95, ‘0.143*”correl” + 0.116*”flow” + 0.052*”fluctuat” + 0.052*”jet” + 0.025*”collis” + 0.014*”epsilon” + 0.012*”correl_between” + 0.011*”initi” + 0.010*”depend” + 0.009*”24v”’),

(96, ‘0.054*”phase” + 0.020*”transit” + 0.018*”model” + 0.017*”interact” + 0.016*”phase_transit” + 0.012*”disord” + 0.012*”system” + 0.012*”studi” + 0.010*”order” + 0.010*”critic”’),

(97, ‘0.164*”rule” + 0.056*”truncat” + 0.051*”rho” + 0.024*”xi” + 0.022*”ontolog” + 0.021*”team” + 0.016*”revis” + 0.016*”elect” + 0.015*”mismatch” + 0.014*”sigma”’),

(98, ‘0.035*”sourc” + 0.025*”spectrum” + 0.022*”energi” + 0.019*”observ” + 0.019*”flux” + 0.018*”gammarai” + 0.013*”peak” + 0.013*”emiss” + 0.011*”grb” + 0.011*”cosmic_rai”’),

(99, ‘0.117*”map” + 0.106*”integr” + 0.087*”polynomi” + 0.037*”deform” + 0.023*”transform” + 0.022*”ar” + 0.021*”gener” + 0.012*”ident” + 0.012*”dual” + 0.009*”cyclic”’)]

Written by

Data scientist working in research communication. #webapps #python #machinelearning #ai
