To find the people and schools with the most academic influence, we look to Wikipedia. For each person, we count how often the person is mentioned on a Wikipedia page (NameMentions) and how often a discipline is mentioned on that same page (DisciplineMentions). We multiply the two numbers to obtain PageInfluence, then sum PageInfluence over all pages to obtain the person’s InfluenceRanking™. To calculate a school’s influence, we combine the InfluenceRanking™ of all the people who attended the school, worked for it, or link to it from within the first 20 links of their Wikipedia page.
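The per-person calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PageCounts` class and the function names are hypothetical, and the sketch assumes the mention counts for each page have already been extracted.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PageCounts:
    """Mention counts for one person on one Wikipedia page (hypothetical structure)."""
    name_mentions: int
    discipline_mentions: int


def page_influence(page: PageCounts) -> int:
    # PageInfluence = NameMentions x DisciplineMentions for a single page.
    return page.name_mentions * page.discipline_mentions


def influence_ranking(pages: Iterable[PageCounts]) -> int:
    # The person's score is the sum of PageInfluence over all pages.
    return sum(page_influence(p) for p in pages)


# Made-up counts for three pages: 3*4 + 1*10 + 5*0
example_pages = [PageCounts(3, 4), PageCounts(1, 10), PageCounts(5, 0)]
print(influence_ranking(example_pages))  # prints 22
```

Note that a page with no DisciplineMentions contributes nothing, however often the person is named on it.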
However, before we arrive at each school’s total academic influence, we constrain the people, the pages, and the disciplines.
First, we constrain people by including only those listed in Wikidata. Specifically, we filter Wikidata for humans (Q15978631) in 15 languages (en, eo, pl, fr, br, de, it, en-gb, ilo, es, en-ca, af, an, ay, and bcl). Thus, only NameMentions for people in Wikidata’s list of humans are used to calculate a school’s influence. Also, because some professions are overrepresented on Wikipedia, we exclude four types of people from our calculations: politicians born in or after 1900, musicians, artists, and actors.
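As a rough illustration of the profession-based exclusion, the rule could be expressed as a small predicate. The function name, the profession labels, and the handling of a missing birth year (such politicians are kept here) are all assumptions made for this sketch, not details of our pipeline.

```python
from typing import Optional, Set

# Professions excluded outright (labels are illustrative, not Wikidata identifiers).
ALWAYS_EXCLUDED = {"musician", "artist", "actor"}
POLITICIAN_CUTOFF_YEAR = 1900


def is_included(professions: Set[str], birth_year: Optional[int]) -> bool:
    """Return True if a person passes the profession filter."""
    if professions & ALWAYS_EXCLUDED:
        return False
    # Politicians are excluded only if born in or after 1900.
    # Assumption: a politician with an unknown birth year is kept.
    if ("politician" in professions
            and birth_year is not None
            and birth_year >= POLITICIAN_CUTOFF_YEAR):
        return False
    return True


print(is_included({"politician"}, 1850))             # True
print(is_included({"politician"}, 1900))             # False
print(is_included({"physicist", "musician"}, 1800))  # False
```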
Second, we constrain the pages by ignoring three types of pages: disambiguation pages, pages that start with “List of” or “Timeline of”, and pages that contain the word “bibliography”. We also ignore pages about albums, singles, films, television shows, episodes, and seasons, and video games. These types of pages and templates have so many NameMentions and DisciplineMentions that including them would skew our results.
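A page filter along these lines might look like the following sketch. It assumes each page comes with a title, its text, and a `kind` label (for example "film" or "disambiguation") that some earlier step has already assigned; that label, the function name, and the case-insensitive match on “bibliography” in either title or text are assumptions for illustration.

```python
EXCLUDED_TITLE_PREFIXES = ("List of", "Timeline of")

# Page kinds ignored entirely (labels are hypothetical).
EXCLUDED_KINDS = {
    "disambiguation", "album", "single", "film",
    "television show", "television episode", "television season",
    "video game",
}


def is_counted(title: str, text: str, kind: str = "") -> bool:
    """Return True if mentions on this page should be counted."""
    if kind in EXCLUDED_KINDS:
        return False
    if title.startswith(EXCLUDED_TITLE_PREFIXES):
        return False
    # Assumption: "bibliography" is matched case-insensitively in title or text.
    if "bibliography" in title.lower() or "bibliography" in text.lower():
        return False
    return True


print(is_counted("Emmy Noether", "German mathematician"))  # True
print(is_counted("List of physicists", ""))                # False
print(is_counted("Blue", "", kind="album"))                # False
```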
Third, we constrain disciplines by calculating a person’s InfluenceRanking™ in a discipline only if that person’s profession (as specified by Wikidata or as found in the first paragraph of their Wikipedia page) is related to the discipline. For example, a biophysicist could have influence within biophysics or the larger discipline of physics, but not in art history.
Lastly, we fine-tune our InfluenceRanking™ calculation. First, we give triple value to NameMentions and DisciplineMentions found in a Wikipedia page title. Second, if two people have the same name and we are not able to distinguish between the two (e.g. if the page isn’t about one of them or doesn’t link to one of them), then we credit the NameMentions to the person with the longer Wikipedia page. Third, we make sure not to credit a Jr. for their father’s NameMentions or a Sr. for their son’s NameMentions.
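The title weighting can be illustrated with a short sketch. It uses naive case-insensitive substring counting, which is a simplification chosen for this example, and it assumes the page body is counted separately from the title; the function name and constant are hypothetical.

```python
TITLE_WEIGHT = 3  # mentions in a page title count three times


def weighted_mentions(term: str, title: str, body: str) -> int:
    """Count mentions of `term`, giving triple value to those in the title."""
    needle = term.lower()
    title_hits = title.lower().count(needle)
    body_hits = body.lower().count(needle)
    return TITLE_WEIGHT * title_hits + body_hits


# One title mention (worth 3) plus two body mentions.
print(weighted_mentions("Noether", "Emmy Noether",
                        "Noether proved Noether's theorem."))  # prints 5
```

The same weighting would apply to DisciplineMentions, so a page titled after both a person and a discipline raises both factors of PageInfluence.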
Using this set of constraints and fine-tuning, we have created a useful database of InfluenceRanking™ for people and schools. Additionally, since we find each person’s birth year in Wikidata or on Wikipedia, users can parse this database in many interesting ways. For example, math historians who want to find the most influential school for mathematics in the late 1800s can use our Advanced Search option to constrain the birth years to, say, between 1800 and 1880. They will find, as we mentioned in our About page, that the school with the highest InfluenceRanking™ in the discipline of mathematics is the University of Göttingen.
If you are unable to find a particular person on our website, or if a school you want to explore doesn’t have an InfluenceRanking™ yet, you may be encouraged to know that a new algorithm is forthcoming. This new InfluenceRanking™ algorithm will draw on journal citation data from Crossref.org’s 100,000,000+ content items and on the page ranks found at CommonCrawl.org. It will ensure that all schools, and many more influencers, are ranked on our website. Check back soon for more details!