Microsoft Academic - FAQ

Why a new site?

Microsoft Academic Search was a research project that ended in 2012, and no new content has been added to that site since 2012. Microsoft Academic is a new service that has been built from the ground up in partnership with the Bing team to be much more scalable, responsive, and compatible with modern web browsers.

What's new?

This new service puts a knowledge-driven, semantic-inference-based search and recommendation framework front and center. In addition, a new data structure and graph engine have been developed to support real-time intent recognition and knowledge serving. One illustrative feature is semantic query suggestions, which identify authors, topics, journals, conferences, etc., as you type and offer ways to refine your search based on the data in the underlying academic knowledge graph. You can also refine your results using the filters on the search results page. Because we are built on top of Bing's web crawling infrastructure, we are able to discover and index new academic papers in a more scalable manner. The Microsoft Academic Graph now has over 150 million entities and billions of relationships, and it is still growing!

We are also adopting an open approach in developing the service, and we invite community participation. We like to think of what we have developed as community property. As such, we are opening up our academic knowledge as a downloadable dataset and making key building blocks available as cloud-based services through Microsoft Cognitive Services. You are welcome to try out the Academic Knowledge API instead of downloading the massive dataset over the internet. We invite you to make and vote on new feature suggestions via the voting mechanism in the Microsoft Academic Forum.

What will happen to the old site?

It will be retired soon, but we are keeping it around for a bit longer so that you can compare it with Microsoft Academic and let us know which features are most important to you. To add your feature suggestions and ideas, go to the Microsoft Academic Forum.

How do we calculate citation counts?

Due to the noisy nature of large-scale scholarly data available on the Web, a publication’s true citation count is not identical to a simple count of the citing documents indexed by any given scholarly database. Thanks to the huge quantity of publications in the Microsoft Academic Graph, we are able to estimate a more accurate citation count for each publication. The citation count shown per publication is an estimate produced by a statistical model that takes advantage of both the local statistics of individual publications and the global statistics of the entire academic graph. The article “The Number of Scholarly Documents on the Public Web” by Madian Khabsa and C. Lee Giles provides another good example of statistical estimation based on the corpus in Microsoft Academic Search.
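
As an illustration of the general kind of statistical estimation mentioned above, the Khabsa and Giles article estimates the size of a corpus with a capture/recapture approach. The sketch below shows the classic Lincoln-Petersen estimator in Python. It is not the citation count model used by Microsoft Academic, which this FAQ does not describe in detail; the function name and all the numbers are hypothetical and chosen only for illustration.

```python
def lincoln_petersen(found_by_a, found_by_b, found_by_both):
    """Estimate the total population size from two overlapping samples.

    found_by_a    -- number of items seen by the first source
    found_by_b    -- number of items seen by the second source
    found_by_both -- number of items seen by both sources
    """
    if found_by_both <= 0:
        # With no overlap the estimator is undefined.
        raise ValueError("the two samples must share at least one item")
    return found_by_a * found_by_b / found_by_both


# Hypothetical numbers: two indexes cover 100 and 50 documents of some
# collection and have 10 documents in common.
estimated_total = lincoln_petersen(100, 50, 10)
print(estimated_total)
```

The smaller the overlap relative to the two samples, the larger the estimated total, because a small overlap suggests that each source sees only a small share of the whole.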

Why doesn’t the new site have the graph visualizations?

The visualizations included on the old site were based on Silverlight, which does not run on all browsers. But why should we be the only ones who can experiment with innovative approaches to big data visualization? Please use the Academic Knowledge API to try out your ideas. Be sure to let us and the world know what you have done.

What about the citation context?

We will be introducing new experiences surrounding the citation context and more generally the entity relationships in the near future. Please stay tuned!

How are Bing and Microsoft Academic different?

If you are in the United States, Bing includes academic papers within its web results and other academic data in Bing's Snapshot feature on the right side of the search results page. For example, a search for 'latent semantic analysis' on Bing will show academic papers along with other web results, and the Snapshot will provide a definition of the topic and show related people, conferences, and topics. Because the academic features are integrated with general web search, you will probably have to express your search intent more explicitly, using queries like "papers by <author>" or "papers about <subject>", to retrieve publications.

Bing in China has gone one step further and provides a dedicated vertical search for academic results, so that generic web content, such as consumer news, is not mixed with academic results.

Microsoft Academic is a new non-commercial service with a dedicated academic experience, and one that allows us to quickly try out new features and serves as an experimentation platform. Your feedback is critical. If you have exciting online experiments that we can help facilitate, please let us know, too!

How do I report errors and ask questions?

Use the Microsoft Academic Forum to let us know if you see errors or bugs, or have suggestions for new features. 

How do I make my publications discoverable on your site?

To have your journal included in Microsoft Academic, first make sure that your publications are indexed by Bing. Use the Bing Webmaster Tools to ensure that Bing is properly indexing your site through the use of the robots and sitemaps protocols. Second, to improve the discoverability and inclusion of your content, be sure to follow the web standards for HTML meta tags for academic content.
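
As a rough illustration of the meta tag advice above, the following Python sketch generates per-paper HTML meta tags for a page's head section. This FAQ does not say which tag names Microsoft Academic reads, so the citation_* names used here (a convention commonly used by scholarly publishers) are an assumption, and the function name and sample values are hypothetical.

```python
from html import escape


def citation_meta_tags(title, authors, publication_date,
                       journal_title=None, pdf_url=None):
    """Return <meta> lines describing one paper.

    The citation_* tag names are an assumed convention, not something
    this FAQ specifies.
    """
    fields = [("citation_title", title)]
    # One citation_author tag per author, in author order.
    fields += [("citation_author", author) for author in authors]
    fields.append(("citation_publication_date", publication_date))
    if journal_title:
        fields.append(("citation_journal_title", journal_title))
    if pdf_url:
        fields.append(("citation_pdf_url", pdf_url))
    # Escape values so quotes and ampersands cannot break the markup.
    return "\n".join(
        '<meta name="{}" content="{}">'.format(name, escape(value, quote=True))
        for name, value in fields
    )


# Hypothetical paper used only to show the output shape.
print(citation_meta_tags(
    title="Graphs & Search",
    authors=["Ann Author", "Bob Writer"],
    publication_date="2016/01/15",
    journal_title="Journal of Examples",
    pdf_url="https://example.org/papers/graphs-and-search.pdf",
))
```

Each paper page would carry its own set of tags, so that a crawler can read the title, authors, and date without parsing the PDF.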

Why don't you report h-index, EI/SCI and Journal impact factors?

The research literature has provided abundant evidence that these metrics are at best a rough approximation of research impact and scholarly influence, dating from an era when sizable publication and citation data were scarce and expensive to obtain. It is our hope that, by leveraging the scale of Microsoft's web crawler in Bing and making the dataset publicly available to the research community at large, we can jointly usher in an era of greater research activity into new avenues for evaluating research. As a start, Microsoft Research is teaming up with the web search, data mining, and information retrieval communities in organizing WSDM Cup 2016, KDD Cup 2016, and the TREC Open Search track. Please join our efforts, and let us know how we can jointly do better.

Why don't many papers have PDF download links?

We link to a PDF copy when we can, but some papers are available in HTML, and sometimes the links we have go to a paper 'landing page' from which you can get to the PDF version. If there isn’t a View Link or View PDF option, it means that we know about a paper because it has been cited by a paper in the graph, but we have not yet located a copy online.

Why are author affiliations sometimes different from those appearing in the papers?

Often, author affiliations are deduced by parsing the publications in PDF form, where author and affiliation relationships can be denoted in creative ways that our machine learning algorithm may fail to generalize to. In such cases, the entity conflation algorithm that creates the knowledge graph may sometimes use the last known affiliation of an author or leave the relationship blank. Please let us know when our algorithm makes mistakes so that we can train our machine learning system to perform better.

Why don't some papers have a venue?

Our current data model only allows each paper to have one venue of publication. At this time, if a paper has multiple venues (e.g., joint conferences or conference proceedings published in a journal) and online mentions of the paper do not give our machine learning algorithm a clear enough indication to confidently choose a definite answer, the venue field will be left blank. We are continuously improving the algorithms and modeling techniques.

How do I cite the results from the new site or its API?

Please cite the following paper:

Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. 2015. An Overview of Microsoft Academic Service (MAS) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW '15 Companion). ACM, New York, NY, USA, 243-246. DOI=

How do I add or edit information about an author, publication, journal, or conference? 

During the preview period, we do not allow end-users to add or edit the data on the site, but we are working on this feature.    
