The InfluenceRanking™ Engine: The Nuts and Bolts of Our Ranking Technology
Looking Under the Hood
This article is technical but not overly technical. It provides enough explanation and justification to make plain that our InfluenceRanking engine aptly measures academic influence. At the same time, it does not provide full details of our engine, whose precise algorithms are proprietary. It looks under the hood without taking the engine apart. For a more user-friendly introduction to the InfluenceRanking engine, see our methodology page.
Scoring Influence by Person and Discipline
The InfluenceRanking engine calculates a numerical influence score for people, institutions, and disciplinary programs. It performs this calculation by drawing from Wikipedia/data, Crossref, and an ever-growing body of data reflecting academic achievement and merit.
The InfluenceRanking engine starts by measuring the influence of a given person in a given discipline. Note that by disciplines, we also include subdisciplines (indeed, any widely recognized way of organizing and subdividing fields of study). So, for example, someone like Alan Guth, who is influential in the discipline of physics, became influential through his contributions to the physics subdiscipline of cosmology.
Once the InfluenceRanking engine assigns an influence score to a person for a given discipline, those scores can be cumulated in various ways:
- By cumulating influence scores for a given person across all disciplines, one calculates an overall influence score for that person.
- By cumulating influence scores for the multiple persons at a particular institution across a given discipline, one calculates the influence of that institution in that discipline.
- By cumulating influence scores for the multiple persons at a particular institution across all disciplines, one calculates the overall influence of that institution.
These cumulated influence scores, suitably normalized, can then be compared and ordered, inducing influence rankings for particular disciplinary programs at institutions as well as overall influence rankings for persons and institutions.
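The three cumulations above, followed by normalization and ordering, can be sketched in a few lines of Python. This is only an illustration and not the proprietary engine: the records, the use of a plain sum as the cumulation rule, and the scale-to-maximum normalization are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical records: (person, institution, discipline, score).
# Names and numbers are placeholders, not real InfluenceRanking data.
records = [
    ("Person A", "School X", "physics", 8.0),
    ("Person A", "School X", "mathematics", 2.0),
    ("Person B", "School X", "physics", 5.0),
    ("Person C", "School Y", "physics", 4.0),
]

def cumulate(rows, key):
    """Sum the scores of all rows that share the same key (assumed rule)."""
    totals = defaultdict(float)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)

def normalize(totals):
    """Scale totals so the largest is 1.0 (one possible normalization)."""
    top = max(totals.values())
    return {k: v / top for k, v in totals.items()}

def rank(totals):
    """Order keys from highest to lowest score."""
    return sorted(totals, key=totals.get, reverse=True)

# Person across all disciplines.
person_overall = cumulate(records, key=lambda r: r[0])
# Institution within a given discipline.
institution_by_discipline = cumulate(records, key=lambda r: (r[1], r[2]))
# Institution across all disciplines.
institution_overall = cumulate(records, key=lambda r: r[1])

institution_ranking = rank(normalize(institution_overall))
```

Because all three cumulations reuse the same person-by-discipline records, only the grouping key changes from one to the next.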
The Bootstrapping of Influence
At its most fundamental level, academic influence is always a relation between person and discipline. Leading figures in the academy trade in ideas, and those ideas become influential for and within particular disciplines. Einstein’s accomplishments did not make him influential as such. Rather, he contributed to the discipline of physics, and it was within the community of physicists that his work was recognized and even lauded. His overall influence therefore emerged out of his discipline-specific influence.
In assessing influence by discipline for persons, the InfluenceRanking engine avoids making influence a popularity contest. The point is not simply to count the number of mentions a person receives. Rather, what is crucial to the InfluenceRanking engine is tracking the intersection of name mentions and discipline mentions, rewarding the proximity of these intersecting mentions, and keeping track of relevant connecting hyperlinks and the amount of digital space devoted to the mentions. Such intersectional mentions become the “fine grains” that, when cumulated, yield the influence scores that in turn, when suitably normalized, yield a person’s InfluenceRanking vis-a-vis the discipline in question.
Once the InfluenceRanking engine is able to provide reliable calculations for measuring influence by discipline, the other derived influence scores become fairly straightforward, essentially bootstrapping influence scores for persons by discipline to influence scores (1) for persons across disciplines, (2) for institutions by discipline, and (3) for institutions across disciplines (corresponding to the three bullet points above).
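To make the idea of proximity-weighted intersecting mentions concrete, here is a minimal Python sketch. It is not the engine's actual algorithm: the function name, the single-word matching, the window size, and the 1 / (1 + distance) weighting are all assumptions chosen for illustration. Hyperlinks and the amount of space devoted to a mention are left out entirely.

```python
import re

def co_mention_score(text, name, discipline, window=5):
    """Score a text by how closely a name and a discipline are mentioned.

    Hypothetical scoring rule: every (name, discipline) pair of mentions
    that lies within `window` tokens contributes 1 / (1 + distance), so
    nearby pairs count for more than distant ones.
    """
    tokens = re.findall(r"\w+", text.lower())
    name_positions = [i for i, t in enumerate(tokens) if t == name.lower()]
    discipline_positions = [
        i for i, t in enumerate(tokens) if t == discipline.lower()
    ]
    score = 0.0
    for i in name_positions:
        for j in discipline_positions:
            distance = abs(i - j)
            if distance <= window:
                score += 1.0 / (1 + distance)
    return score

near = co_mention_score("Einstein reshaped physics", "Einstein", "physics")
far = co_mention_score(
    "Einstein played violin and sailed small boats and also studied some physics",
    "Einstein",
    "physics",
)
```

In this sketch a text that merely names a person without the discipline nearby contributes nothing, which is the sense in which raw mention counts alone do not drive the score.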
Commensurability and Noise-to-Signal Reduction
This broad-strokes characterization of the InfluenceRanking engine is accurate as far as it goes, but there are many details to be worked out. These details fall into two categories: commensurability and noise-to-signal reduction. Commensurability here means making sure that the diverse text documents and structured data on which the InfluenceRanking engine depends contribute to influence scores in a consistent way, without unduly weighting one set of data over another. Noise-to-signal reduction means making sure that we are picking up on signals of true academic influence and not drowning them in noise that may indicate influence in some more general sense but that is irrelevant to the academy.
Regarding commensurability, consider that the InfluenceRanking engine scores influence with Wikipedia through a statistical document analysis but with Crossref through a citation index approach. Combining influence scores from both data sources requires making them commensurable so that we’re not comparing apples and oranges.
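One common way to make scores from different sources comparable is to convert each source's raw values to percentile ranks before combining them. The Python sketch below illustrates that idea only; the article does not say how the engine achieves commensurability, and the raw numbers, the equal weighting, and the function names are assumptions.

```python
from bisect import bisect_right

def percentile_ranks(raw):
    """Map each raw score to the fraction of scores at or below it."""
    ordered = sorted(raw.values())
    n = len(ordered)
    return {k: bisect_right(ordered, v) / n for k, v in raw.items()}

def combine(*sources):
    """Average the percentile ranks from each source (equal weights assumed).

    A person missing from a source is treated as scoring 0 there.
    """
    ranked = [percentile_ranks(s) for s in sources]
    people = set().union(*ranked)
    return {p: sum(r.get(p, 0.0) for r in ranked) / len(ranked) for p in people}

# Hypothetical raw scores on very different scales.
wikipedia_raw = {"Person A": 0.02, "Person B": 0.10, "Person C": 0.05}
crossref_raw = {"Person A": 12000, "Person B": 4500, "Person C": 300}

combined = combine(wikipedia_raw, crossref_raw)
```

After the conversion both sources live on the same 0 to 1 scale, so a citation count in the thousands cannot swamp a document-analysis score that is a small fraction.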
Likewise, regarding noise-to-signal reduction, consider that celebrities, such as politicians, musicians, artists, and actors, tend to score high on influence even though most cannot reasonably be said to exemplify academic influence. Terrorists and mass murderers have also developed a disconcerting habit of trying to slip past the InfluenceRanking engine: see our article “Influence, Infamy, and the Case of Osama bin Laden.”
Quality Assurance and Fine-Tuning of the Engine
Accordingly, even though the InfluenceRanking engine is a sophisticated set of algorithms, it relies on a critical level of human quality control to fine-tune its results. This is not to say that we intervene in the algorithm, sticking in our fingers and disrupting its results. That would undercut the objectivity of our influence-based rankings, and we don’t go there.
Instead, our data scientists have spent years developing and refining the InfluenceRanking engine. As with any powerful automated ranking engine, careful attention to subtle details is what keeps the rankings at AcademicInfluence.com from spinning out of control. To date, the algorithm has incorporated nearly 150 additional subtle refinements to keep the engine running smoothly. Some tweaks are as straightforward as finding effective ways to disambiguate names of academic influencers. Others require constricting the type of data that can disrupt signals of academic influence (such as celebrity, as noted above).
With the InfluenceRanking engine, AcademicInfluence.com has created a powerful database of academic persons and academic institutions that goes beyond simply determining who or what is academically influential now. For instance, by focusing on a certain range of birth years, users can parse this database to explore historical trends in education.
Take a historian of mathematics who wants to find the most influential school for mathematics in the late 1800s. By constraining the birth years to range between 1800 and 1900, that historian will find that the school with the highest InfluenceRanking in the discipline of mathematics is the University of Göttingen.
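A query of this kind amounts to filtering person records by birth year and discipline before cumulating scores by institution. The Python sketch below shows the shape of such a query under stated assumptions: the record layout, the function name, and every name and number in the sample data are hypothetical placeholders, not entries from the AcademicInfluence.com database.

```python
from collections import defaultdict

# Hypothetical person records with placeholder names and scores.
people = [
    {"name": "Person 1", "birth_year": 1826, "institution": "School P",
     "discipline": "mathematics", "score": 9.0},
    {"name": "Person 2", "birth_year": 1862, "institution": "School P",
     "discipline": "mathematics", "score": 7.0},
    {"name": "Person 3", "birth_year": 1845, "institution": "School Q",
     "discipline": "mathematics", "score": 10.0},
    {"name": "Person 4", "birth_year": 1960, "institution": "School Q",
     "discipline": "mathematics", "score": 12.0},
    {"name": "Person 5", "birth_year": 1850, "institution": "School Q",
     "discipline": "physics", "score": 20.0},
]

def top_institution(records, discipline, birth_from, birth_to):
    """Return the institution with the highest summed score among people
    in `discipline` born between birth_from and birth_to (inclusive),
    or None if nobody matches."""
    totals = defaultdict(float)
    for person in records:
        in_range = birth_from <= person["birth_year"] <= birth_to
        if in_range and person["discipline"] == discipline:
            totals[person["institution"]] += person["score"]
    if not totals:
        return None
    return max(totals, key=totals.get)

historical_leader = top_institution(people, "mathematics", 1800, 1900)
```

Changing only the birth-year bounds re-runs the same cumulation over a different cohort, which is what lets the same data answer both present-day and historical questions.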
Finally, if you are unable to find a particular academic individual at AcademicInfluence.com, or if a school you want to explore is assigned no InfluenceRanking on our site, bear in mind that the InfluenceRanking engine and the databases on which it depends are continually being updated, expanded, and refined. We are always striving to improve the InfluenceRanking engine.