It is standard practice to represent documents as embeddings. Text embeddings based on deep networks such as BERT capture a paper's content, while graph embeddings based on node2vec and graph neural networks (GNNs) capture its citation graph.

We evaluate these embeddings and show that combinations of text and citation embeddings outperform either alone on standard benchmarks of downstream tasks. The embeddings support a range of applications: ranked retrieval, recommender systems, and routing papers to reviewers.
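One simple way to combine the two modalities is to normalize each embedding and concatenate them, so that similarity reflects both text and citations. The sketch below is illustrative only: the vectors are toy stand-ins for SPECTER (text) and ProNE (citation) embeddings, and concatenation is just one possible combination scheme, not necessarily the exact method evaluated here.

```python
import numpy as np

def l2_normalize(v):
    """Scale a vector to unit length so cosine similarity is a dot product."""
    return v / np.linalg.norm(v)

def combine(text_emb, graph_emb):
    """Concatenate normalized text and citation embeddings.

    Normalizing each part first keeps one modality from dominating
    the similarity just because its vectors have larger norms.
    """
    return l2_normalize(
        np.concatenate([l2_normalize(text_emb), l2_normalize(graph_emb)])
    )

# Toy vectors standing in for a text embedding and a citation embedding.
paper_a = combine(np.array([0.2, 0.9, 0.1]), np.array([0.7, 0.1]))
paper_b = combine(np.array([0.3, 0.8, 0.2]), np.array([0.6, 0.2]))

# Cosine similarity of the combined vectors (both are unit length).
similarity = float(np.dot(paper_a, paper_b))
```

Because each half is normalized before concatenation, the combined similarity is the average of the text similarity and the citation similarity, which is a common and reasonable default when neither modality should dominate.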

How to use Better Together

  1. Enter a corpus ID, or a query for a paper or an author
  2. Select one of several embeddings
  3. View a list of similar papers with links to Semantic Scholar paper pages
  4. Explore citation counts or other similar papers
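Under the hood, the steps above amount to a nearest-neighbor lookup in the chosen embedding space. A minimal sketch, using a hypothetical in-memory table of corpus IDs to vectors (in the demo these would come from SPECTER or ProNE):

```python
import numpy as np

# Hypothetical embedding table: corpus ID -> vector, normalized to unit
# length so that a dot product gives cosine similarity.
embeddings = {
    "1001": np.array([0.9, 0.1, 0.4]),
    "1002": np.array([0.8, 0.2, 0.5]),
    "1003": np.array([0.1, 0.9, 0.2]),
}
embeddings = {k: v / np.linalg.norm(v) for k, v in embeddings.items()}

def similar_papers(query_id, k=2):
    """Rank every other paper by cosine similarity to the query paper."""
    q = embeddings[query_id]
    scores = {
        pid: float(np.dot(q, v))
        for pid, v in embeddings.items()
        if pid != query_id
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

top = similar_papers("1001")
```

Swapping in a different embedding table (text-based versus citation-based) changes the neighbors returned, which is exactly the choice exposed in step 2.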

Similarities are based on embeddings (SPECTER, ProNE [default]). Some embeddings are better for text (abstracts) and others for context (citations). More coming soon.

Based on JSALT-2023. See the final report video for a detailed presentation.


