
a-smart-company/auto-survey

Automated literature surveys




Getting Started

Get a Semantic Scholar API key

The first thing to do is to request an API key for Semantic Scholar. Note that this can only be used for research purposes. Here are some suggested answers for the form:

> How do you plan to use Semantic Scholar API in your project? (50 words or more)*

Generate literature surveys using large language models with relevant papers in context, using the `auto-survey` Python package. It rewrites the desired research topic into 10 different queries, queries the /paper/search endpoint with each of them, and feeds the resulting papers to a language model to generate a literature survey. It is only for my own private use.

> Which endpoints do you plan to use?

The /paper/search endpoint.

> How many requests per day do you anticipate using?

Around 100 requests per day.

Once you have the key, create a file called `.env` in your current directory with the following content:

SEMANTIC_SCHOLAR_API_KEY="<your key here>"

If you already have a `.env` file, just append the line above to it.

Set up an LLM API key

Next, you need to set up an API key for the large language model (LLM) that you want to use. The default model is gpt-4.1-mini from OpenAI, which requires you to have an OpenAI API key, and again add it to your .env file:

OPENAI_API_KEY="<your key here>"
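Putting the two steps together, a minimal `.env` file can be created from the shell like this (the key values are placeholders you need to replace with your actual keys):

```shell
# Append both API keys to a .env file in the current directory.
# Replace the placeholder values with your actual keys.
cat >> .env <<'EOF'
SEMANTIC_SCHOLAR_API_KEY="<your key here>"
OPENAI_API_KEY="<your key here>"
EOF
```

Using `>>` rather than `>` means any existing `.env` entries are preserved.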

Installing and Running

First, install the pandoc and weasyprint packages, which are used to generate the final PDFs. On macOS you can do this with Homebrew:

brew install pandoc weasyprint

On Ubuntu you can install them with apt:

sudo apt install pandoc weasyprint
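Before generating a survey, you can sanity-check that both tools ended up on your PATH. This is an optional helper, not part of the package:

```shell
# Report whether each required PDF tool is installed and on the PATH.
for cmd in pandoc weasyprint; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found"
  else
    echo "$cmd: missing"
  fi
done
```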

Then, the easiest way to use the auto-survey package is as a uv tool. You can start generating a literature survey using the following command:

uvx auto-survey "<your topic here>"

This both installs the package and creates the literature survey, which typically takes about 10 minutes. With the default model, it costs about $0.05 per survey.

You can see all the available options by running the following command:

uvx auto-survey --help

Using Different Model Providers

The package supports all of LiteLLM's providers, including OpenAI, Anthropic, Google, xAI, local models, and more. You can simply set the --model argument to the model you want to use. For example, to use Claude Sonnet 4.5 from Anthropic, use

uvx auto-survey "<your topic here>" --model "claude-sonnet-4-5"

Some providers require you to prefix the model ID with the provider name. For instance, to use the Grok-3-mini model from xAI, you need to use

uvx auto-survey "<your topic here>" --model "xai/grok-3-mini"

Each provider requires its own model ID prefix and environment variables. See the LiteLLM provider documentation for the details of the provider you want to use.
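As a sketch, these are the environment variable names LiteLLM reads for a few common providers (names per the LiteLLM provider documentation; the values are placeholders, and you would normally put the relevant line in your `.env` file rather than exporting it):

```shell
# Environment variables LiteLLM reads for common providers
# (values are placeholders; only set the one for your provider).
export ANTHROPIC_API_KEY="<your key here>"   # Anthropic (Claude models)
export GEMINI_API_KEY="<your key here>"      # Google (Gemini models)
export XAI_API_KEY="<your key here>"         # xAI (Grok models)
```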

Custom Inference API

You can also run the package against a custom inference API. In that case, set the --base-url argument to the URL of the inference API, and set the --api-key-env-var argument to the name of the environment variable that contains the API key for that server. This variable must again be set in the .env file:

<value-of-api-key-env-var>="<your key here>"

Lastly, when using a custom inference API, you also need to use a custom prefix, depending on what kind of inference server you're running. If it is running with vLLM, you need to use the hosted_vllm/ prefix, for instance, and Ollama models use the ollama_chat/ prefix. See the LiteLLM provider documentation for more information on which prefixes to use.
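Putting those pieces together, a hypothetical invocation against a local vLLM server might look like the following (the model ID, port, and environment variable name are all examples, not defaults of the package):

```shell
# Hypothetical: a vLLM server on localhost:8000 serving a Llama model,
# with its API key stored in the VLLM_API_KEY variable in .env.
uvx auto-survey "<your topic here>" \
  --model "hosted_vllm/meta-llama/Llama-3.1-8B-Instruct" \
  --base-url "http://localhost:8000/v1" \
  --api-key-env-var "VLLM_API_KEY"
```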
